I need to extract 24 tar.gz files that arrive in my S3 bucket and upload the extracted contents to another S3 bucket using Lambda or Glue. The solution should be serverless, and the total size of all 24 files will be at most 1 GB. Is there any way I can achieve that? Below is the Lambda function, which uses an S3 event-based trigger to extract the files, but I am not able to get the result I want.
import boto3
import botocore
import tarfile
from io import BytesIO

s3_client = boto3.client('s3')

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']
    input_tar_file = s3_client.get_object(Bucket=bucket, Key=key)
    input_tar_content = input_tar_file['Body'].read()
    with tarfile.open(fileobj=BytesIO(input_tar_content)) as tar:
        for tar_resource in tar:
            if tar_resource.isfile():
                inner_file_bytes = tar.extractfile(tar_resource).read()
                s3_client.upload_fileobj(BytesIO(bytes_content), Bucket=bucket, Key=uncompressed_key)
It fails with an error saying bytes_content is not defined. Is it possible to use Lambda or Glue to solve this problem? Any help would be much appreciated.
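
For reference, here is a minimal sketch of what the loop seems to be aiming for. It is not a definitive implementation. The error comes from the upload call using bytes_content, while the extracted data is stored in inner_file_bytes; uncompressed_key is also never assigned. In this sketch, DEST_BUCKET, the helper name extract_tar_to_s3, and the destination key layout (archive name without ".tar.gz", then the member name) are all hypothetical choices. It also assumes each archive fits in the function's configured memory, since the whole object is read at once.

```python
import io
import tarfile
from urllib.parse import unquote_plus

# Hypothetical destination bucket name; replace with the real one.
DEST_BUCKET = "my-extracted-bucket"


def extract_tar_to_s3(s3_client, src_bucket, key, dest_bucket):
    """Upload every regular file inside a tar archive to dest_bucket.

    Returns the list of destination keys that were written.
    """
    # Read the whole archive into memory (assumes it fits in Lambda memory).
    body = s3_client.get_object(Bucket=src_bucket, Key=key)["Body"].read()

    # Assumed key layout: "<archive name without .tar.gz>/<member name>".
    prefix = key[:-len(".tar.gz")] if key.endswith(".tar.gz") else key

    uploaded = []
    # With no explicit mode, tarfile detects the compression automatically.
    with tarfile.open(fileobj=io.BytesIO(body)) as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            inner_file_bytes = tar.extractfile(member).read()
            dest_key = f"{prefix}/{member.name}"
            # Pass the same variable that holds the extracted bytes.
            s3_client.upload_fileobj(
                Fileobj=io.BytesIO(inner_file_bytes),
                Bucket=dest_bucket,
                Key=dest_key,
            )
            uploaded.append(dest_key)
    return uploaded


def lambda_handler(event, context):
    # Imported here so the helper above can be exercised without AWS access.
    import boto3

    s3_client = boto3.client("s3")
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys in S3 event notifications are URL-encoded.
    key = unquote_plus(record["s3"]["object"]["key"])
    return extract_tar_to_s3(s3_client, bucket, key, DEST_BUCKET)
```

Writing to a separate bucket, as described in the question, also avoids the function being triggered again by its own uploads, which could happen if the output went back into the source bucket.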