The AWS documentation on retry behaviour should answer your first question: https://docs.aws.amazon.com/lambda/latest/dg/retries-on-errors.html
Since SNS invokes Lambda asynchronously, this is the part to take note of:
Asynchronous invocation – Asynchronous events are queued before being used to invoke the Lambda function. If AWS Lambda is unable to fully process the event, it will automatically retry the invocation twice, with delays between retries. If you have specified a Dead Letter Queue for your function, then the failed event is sent to the specified Amazon SQS queue or Amazon SNS topic. If you don’t specify a Dead Letter Queue (DLQ), which is not required and is the default setting, then the event will be discarded. For more information, see Dead Letter Queues.
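If you would rather keep failed events than have them discarded, you can attach a DLQ to the function. Here is a minimal sketch with boto3, assuming the function and the SQS queue already exist; the function name and queue ARN in the usage comment are hypothetical placeholders, and the helper names are my own:

```python
def dlq_config_request(function_name, target_arn):
    """Build the keyword arguments for Lambda's update_function_configuration.

    target_arn is the ARN of the SQS queue or SNS topic that should
    receive events Lambda could not process after its retries.
    """
    return {
        "FunctionName": function_name,
        "DeadLetterConfig": {"TargetArn": target_arn},
    }


def attach_dlq(lambda_client, function_name, target_arn):
    """Attach the dead letter target using a boto3 Lambda client."""
    request = dlq_config_request(function_name, target_arn)
    return lambda_client.update_function_configuration(**request)


# Usage (hypothetical names, needs AWS credentials):
#   import boto3
#   attach_dlq(
#       boto3.client("lambda"),
#       "my-function",
#       "arn:aws:sqs:us-east-1:123456789012:my-dlq",
#   )
```

The function's execution role also needs permission to write to the target (for example `sqs:SendMessage` for an SQS queue), otherwise the configuration update is rejected.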
As for your second question:
Use Amazon CloudWatch.
These are the metrics of interest to you:
- AWS/Lambda - Invocations
- AWS/SNS - NumberOfMessagesPublished
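Both metrics can also be read programmatically. This is a rough sketch using boto3's CloudWatch `get_metric_statistics`, assuming the standard `FunctionName` and `TopicName` dimensions; the resource names in the usage comment are hypothetical and the helper names are my own:

```python
from datetime import datetime, timedelta, timezone


def metric_request(namespace, metric_name, dimension_name, dimension_value,
                   hours=1, period=300):
    """Build get_metric_statistics arguments for the last `hours` hours."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return {
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [{"Name": dimension_name, "Value": dimension_value}],
        "StartTime": start,
        "EndTime": end,
        "Period": period,          # seconds per datapoint, a multiple of 60
        "Statistics": ["Sum"],
    }


def metric_sum(cloudwatch_client, namespace, metric_name,
               dimension_name, dimension_value, hours=1, period=300):
    """Return the metric's total over the window, summed across datapoints."""
    request = metric_request(namespace, metric_name,
                             dimension_name, dimension_value, hours, period)
    response = cloudwatch_client.get_metric_statistics(**request)
    return sum(point["Sum"] for point in response["Datapoints"])


# Usage (hypothetical resource names, needs AWS credentials):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   invoked = metric_sum(cw, "AWS/Lambda", "Invocations",
#                        "FunctionName", "my-function")
#   published = metric_sum(cw, "AWS/SNS", "NumberOfMessagesPublished",
#                          "TopicName", "my-topic")
```

Comparing the two totals over the same window gives a rough indication of whether every published message led to an invocation. Treat it as approximate, since the two metrics are reported independently and will not always line up exactly.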
I hope this helps.