My requirements are as follows.
Every two hours, all the messages in an SQS queue are read, collected, and processed.
As part of processing, a file containing information from the SQS messages is created and then uploaded to an SFTP server.
I used an AWS Lambda function to implement step 1 of the process. My Lambda has an SQS trigger. I've set the batch window to 2 hours and the batch size to 50. My assumption was that the Lambda would be invoked every two hours, that it would receive 50 messages at once, and that I would create one file for every 50 records.
However, I observed that my Lambda function is triggered with a varying number of messages (sometimes 50, sometimes 20, sometimes 5, etc.) even though I have configured the batch size as 50.
After reading some documentation, I came to understand (though I am not sure) that Lambda spawns 5 long-polling connections to read from SQS, and that this is what causes the function to be triggered with a varying number of messages.
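A minimal sketch of how the batch size per invocation can be observed, assuming the standard SQS event shape in which messages arrive under the `Records` key. The handler name and the return value are illustrative only, not my production code.

```python
def handler(event, context):
    # An SQS-triggered invocation delivers its batch under "Records".
    records = event.get("Records", [])

    # Logging the count per invocation makes the varying batch sizes
    # visible in CloudWatch Logs.
    print(f"received {len(records)} records")

    return {"batch_size": len(records)}
```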
My questions are:
- Is my assumption about 5 parallel connections being established correct? If yes, is there a way I can control it? I want this to happen in a single thread/connection.
- If the above is not possible, what alternatives do I have? I do not want one file created for every few records. I want one file generated every two hours containing all the messages in the queue.
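
One alternative I can imagine, sketched here under assumptions rather than as a confirmed solution: drop the SQS trigger, invoke the function on a two-hour schedule (for example via an EventBridge rule), and have the function drain the queue itself before writing a single file. In the sketch below, `sqs` is assumed to be a boto3 SQS client (`boto3.client("sqs")`), `upload` is a hypothetical callable standing in for the SFTP upload, and `max_empty_polls` is an arbitrary stopping heuristic. It also assumes the queue's visibility timeout is longer than the whole run and that the run fits within the Lambda timeout.

```python
def drain_queue(sqs, queue_url, max_empty_polls=3):
    """Poll until several consecutive receives come back empty."""
    messages = []
    empty_polls = 0
    while empty_polls < max_empty_polls:
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
            WaitTimeSeconds=5,       # long polling reduces empty responses
        )
        batch = resp.get("Messages", [])
        if not batch:
            empty_polls += 1
            continue
        empty_polls = 0
        messages.extend(batch)
    return messages


def delete_messages(sqs, queue_url, messages):
    """Delete in chunks of 10, the limit for a batch delete."""
    for start in range(0, len(messages), 10):
        chunk = messages[start:start + 10]
        sqs.delete_message_batch(
            QueueUrl=queue_url,
            Entries=[
                {"Id": str(i), "ReceiptHandle": m["ReceiptHandle"]}
                for i, m in enumerate(chunk)
            ],
        )


def process_queue(sqs, queue_url, upload):
    """Build one file from everything in the queue, upload it, then delete."""
    messages = drain_queue(sqs, queue_url)
    if not messages:
        return 0
    # One line per message body; the real file format is up to the processing step.
    content = "\n".join(m["Body"] for m in messages)
    upload(content)  # hypothetical SFTP upload step
    # Delete only after the upload succeeded, so a failed run can be retried.
    delete_messages(sqs, queue_url, messages)
    return len(messages)
```

This would produce one file per scheduled run regardless of how many messages are waiting, but I am not sure whether it is the recommended pattern, and a production version would also need to check the `Failed` entries returned by the batch delete.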