Python AWS Boto3: How do I read files from an S3 bucket?

Question

Using Boto3, my Python script downloads files from an S3 bucket, reads them, and writes their contents to a file called blank_file.txt.

My question is: how would this work once the script runs as an AWS Lambda function?

MD · Answer 1 · Dec 7, 2018

Best answer

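You can use the following code:

import boto3

s3 = boto3.resource('s3')
# bucketname is the name of the bucket; itemname is the object's key (its path within the bucket)
obj = s3.Object(bucketname, itemname)
body = obj.get()['Body'].read()  # the object's contents, as bytes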
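
Note that read() returns bytes rather than a string. A minimal sketch of turning the body into text, assuming the object is UTF-8 encoded; the helper name read_text and the bucket and key in the usage comment are placeholders, not part of the original answer:

```python
def read_text(s3_client, bucket, key, encoding="utf-8"):
    """Return the contents of one S3 object as a string.

    s3_client is expected to be a boto3 S3 client, i.e. boto3.client("s3").
    """
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # Body is a streaming object; read() returns the whole payload as bytes.
    return response["Body"].read().decode(encoding)


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   text = read_text(boto3.client("s3"), "my-bucket", "folder/file.txt")
```

Passing the client in as an argument lets it be created once and reused, which also makes the helper easy to try with a stand-in client.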

answered Dec 7, 2018 by Nitesh

selected Dec 9, 2020 by MD

What is itemname here?

commented May 15, 2019 by anonymous

As far as I know, the itemname here is the file that is being fetched and read by the function.

commented May 15, 2019 by Vishal

itemname is Key (string) -- Key of the object to get.

commented Jan 24, 2020 by Mayur

Thanks, @Mayur for your contribution.

commented Jan 24, 2020 by Edureka
• 2,960 points

Thanks it solved my issue.

commented Dec 28, 2020 by anonymous

Archana · Answer 2 · Aug 29, 2018

By default, AWS Lambda provides 512 MB of ephemeral storage at /tmp (the size is configurable, up to 10,240 MB). You can use that directory to store the downloaded S3 files or to create new ones. The relevant calls are shown below.

s3client.download_file(bucket_name, obj.key, '/tmp/' + filename)
...
blank_file = open('/tmp/blank_file.txt', 'w')

The working directory used by Lambda is /var/task and it is a read-only filesystem. You will not be able to create files in it.
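
Putting this together for the original question, here is a minimal sketch that downloads every object in a bucket to /tmp and concatenates the contents into /tmp/blank_file.txt. The bucket name, the copy_bucket_to_file helper, and the assumption that the objects are small enough to fit in /tmp are illustrative, not from the original post:

```python
import os

BUCKET_NAME = "my-bucket"  # assumption: replace with your bucket name
WORK_DIR = "/tmp"          # the writable directory in Lambda


def copy_bucket_to_file(s3_client, bucket_name, work_dir=WORK_DIR):
    """Download each object to work_dir and append its bytes to blank_file.txt."""
    out_path = os.path.join(work_dir, "blank_file.txt")
    with open(out_path, "wb") as blank_file:
        paginator = s3_client.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=bucket_name):
            for item in page.get("Contents", []):
                key = item["Key"]
                if key.endswith("/"):
                    continue  # skip zero-byte "folder" placeholders
                # Flatten the key so nested prefixes need no subdirectories.
                local_path = os.path.join(work_dir, key.replace("/", "_"))
                s3_client.download_file(bucket_name, key, local_path)
                with open(local_path, "rb") as downloaded:
                    blank_file.write(downloaded.read())
    return out_path


# Usage inside a Lambda handler (boto3 is available in the Python runtime):
#   import boto3
#   s3client = boto3.client("s3")
#   def lambda_handler(event, context):
#       return {"output": copy_bucket_to_file(s3client, BUCKET_NAME)}
```

The function's execution role needs s3:ListBucket and s3:GetObject permissions on the bucket for this to work.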

To learn more about AWS Lambda and its features, see: https://www.youtube.com/watch?v=XjPUyGKRjZs

answered Aug 29, 2018 by Archana
• 4,170 points

score +2 · Answer 3 · Dec 7, 2018

You can download the file from the S3 bucket like this:

import boto3
bucketname = 'my-bucket' # replace with your bucket name
filename = 'my_image_in_s3.jpg' # replace with your object key
s3 = boto3.resource('s3')
s3.Bucket(bucketname).download_file(filename, 'my_localimage.jpg')

answered Dec 7, 2018 by Jino

Kalgi · Answer 4 · Dec 10, 2018

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('test-bucket')
# Iterate over every object in the bucket and read its contents
for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()

answered Dec 10, 2018 by Saptdvip

Does this read all the objects in the bucket? If so, is there a way to send all of them to an SQS queue from the same Lambda function?

commented Dec 4, 2019 by anonymous

Yes, you can! Have a look at this:

https://github.com/tesera/lambda-s3-to-sqs
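
As a rough sketch of that idea in Python, the function below sends one SQS message per object, carrying the bucket and key rather than the object body, since SQS limits message size. The function name, the message layout, and the queue URL in the usage comment are placeholders:

```python
import json


def send_keys_to_sqs(s3_client, sqs_client, bucket_name, queue_url):
    """Send one SQS message per object in the bucket; return the count sent."""
    sent = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        for item in page.get("Contents", []):
            message = {"bucket": bucket_name, "key": item["Key"], "size": item["Size"]}
            sqs_client.send_message(QueueUrl=queue_url, MessageBody=json.dumps(message))
            sent += 1
    return sent


# Usage inside a Lambda handler (requires boto3 and AWS credentials):
#   import boto3
#   send_keys_to_sqs(boto3.client("s3"), boto3.client("sqs"), "test-bucket",
#                    "https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>")
```

A consumer of the queue can then fetch each object by its bucket and key when it processes the message.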

commented Dec 4, 2019 by Kalgi
• 52,350 points

score 0 · Answer 5 · Dec 10, 2018

This is the code I found; it can be used to read a file from an S3 bucket in a Lambda function:

import boto3

s3 = boto3.client('s3')  # created once and reused across invocations

def lambda_handler(event, context):
    data = s3.get_object(Bucket='my_s3_bucket', Key='main.txt')
    contents = data['Body'].read()  # the file's contents, as bytes
    print(contents)

score 0 · Answer 6 · Dec 10, 2018

You can use this Node.js handler to read the file:

// readFile, readFileContent and onError are helper functions defined elsewhere (not shown)
exports.handler = (event, context, callback) => {
    var bucketName = process.env.bucketName;
    var keyName = event.Records[0].s3.object.key;
    readFile(bucketName, keyName, readFileContent, onError);
};
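
For comparison, a Python sketch of the same idea: when the function is triggered by an S3 event notification, the bucket and key can be taken from the event itself. Object keys arrive URL-encoded in these events, so they are decoded first. The helper name objects_from_event is illustrative:

```python
from urllib.parse import unquote_plus


def objects_from_event(event):
    """Return (bucket, key) pairs for every record in an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in the event (for example, spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


# Usage inside a Lambda handler (requires boto3 and AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   def lambda_handler(event, context):
#       for bucket, key in objects_from_event(event):
#           body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
#           print(key, len(body))
```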

awsdbaexpert · Answer 7 · Mar 30, 2019

All of the answers are partly right, but none completely answers the specific question the OP asked. I'm assuming that the output file is also written to a second S3 bucket, since they are using Lambda. This code also holds everything in an in-memory object, so the function's memory limit needs to be considered:

import boto3
import io

# Buckets
inbucket = 'my-input-bucket'
outbucket = 'my-output-bucket'

s3 = boto3.resource('s3')
outfile = io.StringIO()

# Print out bucket names (optional)
for bucket in s3.buckets.all():
    print(bucket.name)

# Pull data from every file in the input bucket
bucket = s3.Bucket(inbucket)
for obj in bucket.objects.all():
    x = obj.get()['Body'].read().decode()
    print(x)
    outfile.write(x)  # collect the contents for the output file

# Generate the output file and close the buffer
outobj = s3.Object(outbucket, 'outputfile.txt')
outobj.put(Body=outfile.getvalue())
outfile.close()

import boto3

s3 = boto3.resource('s3')
BUCKET_NAME = 'your-bucket-name'  # e.g. deletemetesting11
# for key in s3.buckets(BUCKET_NAME).Key:
allFiles = s3.Bucket(BUCKET_NAME).objects.all()
for file in allFiles:
    print(file.key)

Jun 7, 2020

Gitika · Answer 8 · Jul 1, 2020

This question can be answered with a short demonstration of how to read S3 data from Python. Here is a small example:

import boto3
client = boto3.client('s3')  # low-level functional API
resource = boto3. ...
import pandas as pd
obj = client.get_object(Bucket='my-bucket', Key='path/to/my/table.csv')
grid_sizes = pd. ...
from io import BytesIO
obj = client. ...
my_bucket. ...
files = list(my-bucket.

score 0 · Answer 9 · Jul 1, 2020

Before going further, first check whether the file is actually being downloaded from the S3 bucket. Here is how to download a file from an S3 bucket with Boto3.

To set up and run this example, you must first:

Configure your AWS credentials, as described in Quickstart.
Create an S3 bucket and upload a file to the bucket.
Replace the BUCKET_NAME and KEY values in the code snippet with the name of your bucket and the key for the uploaded file.
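
A minimal sketch of such a snippet, under stated assumptions: the helper name download_and_check and the local file name are illustrative, and treating error code "404" as "object not found" follows the pattern used in the Boto3 documentation example rather than anything in this thread:

```python
import os

BUCKET_NAME = "my-bucket"    # replace with the name of your bucket
KEY = "my_image_in_s3.jpg"   # replace with the key of the uploaded file


def download_and_check(s3_client, bucket_name, key, local_path):
    """Download one object and return its size in bytes, or None if it is missing."""
    try:
        s3_client.download_file(bucket_name, key, local_path)
    except s3_client.exceptions.ClientError as err:
        if err.response["Error"]["Code"] == "404":
            return None  # the object does not exist
        raise  # e.g. permission problems: let the caller see them
    # Confirm that the file really landed on disk.
    return os.path.getsize(local_path)


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   size = download_and_check(boto3.client("s3"), BUCKET_NAME, KEY, "my_local_image.jpg")
#   print("missing" if size is None else f"downloaded {size} bytes")
```

In a Lambda function, the local path would need to be under /tmp, for example /tmp/my_local_image.jpg.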