Using AWS Lambda with S3 and DynamoDB

2022-07-19

What is AWS Lambda?

AWS Lambda is a service that runs your code in response to events, without you having to manage the servers it runs on.

Why Lambda?

You could build event-based processing yourself on top of SQS or SNS, but Lambda makes it easier, and it also sends the function's stdout to CloudWatch Logs.

Using Lambda with S3 and DynamoDB:

Here we configure a Lambda function so that whenever an object is created in an S3 bucket, the function downloads that file and records its filename in a DynamoDB table.

Prerequisites:

Permissions to read objects from S3 and to put items into DynamoDB. We assume you have already created a DynamoDB table whose key is the filename.
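If the table does not exist yet, it could be created with boto3. The following is a minimal sketch rather than the setup used for this post: the attribute name `key` matches the item written by the example handler further down, while the on-demand billing mode and the helper names `build_table_definition` and `create_table` are assumptions made for illustration.

```python
def build_table_definition(table_name):
    """Return create_table arguments for a table keyed on the filename."""
    return {
        "TableName": table_name,
        # 'key' is a string attribute that will hold the filename.
        "AttributeDefinitions": [{"AttributeName": "key", "AttributeType": "S"}],
        # Use it as the partition (HASH) key of the table.
        "KeySchema": [{"AttributeName": "key", "KeyType": "HASH"}],
        # Assumption: on-demand billing, so no capacity units have to be chosen.
        "BillingMode": "PAY_PER_REQUEST",
    }


def create_table(table_name):
    """Create the table. Needs AWS credentials that allow dynamodb:CreateTable."""
    import boto3  # imported lazily so the definition can be inspected without boto3

    client = boto3.client("dynamodb")
    return client.create_table(**build_table_definition(table_name))
```

Calling `create_table('dynamodb_table_name')` once from a machine with suitable credentials would create the table that the handler writes to.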

1. Go to the AWS console, open AWS Lambda, and click "Create a Lambda function".
2. You will see blueprints (sample code) for different languages. Choose s3-get-object-python.
3. Select S3 as the event source type and choose the desired bucket.
4. Set the event type to 'created', since we only want to capture events when objects are created, then click Next.
5. Give your function a name. You will now see the sample code, which uses the boto3 library; boto3 is available by default. If you need other libraries, create a zip file containing the main code file and all required libraries.
   When you upload a zip file whose main code file is main_file.py and whose handler function is lambda_handler, the 'Handler' option should be set to: main_file.lambda_handler.
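Example main.py file:

import traceback
from urllib.parse import unquote_plus

import boto3

# Create the clients once, outside the handler, so warm invocations reuse them.
s3_client = boto3.client('s3')
dynamodb_client = boto3.client('dynamodb')

table_name = 'dynamodb_table_name'


def lambda_handler(event, context):
    # The S3 event carries the bucket name and the URL-encoded object key.
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    key = unquote_plus(event['Records'][0]['s3']['object']['key'])
    # Keys ending in '/' are folder placeholders, not files.
    if not key.endswith('/'):
        try:
            file_name = key.split('/')[-1]
            # /tmp is the writable directory inside the Lambda environment.
            s3_client.download_file(bucket_name, key, '/tmp/' + file_name)
            # 'key' is the table's key attribute; it stores the filename.
            item = {'key': {'S': file_name}}
            dynamodb_client.put_item(TableName=table_name, Item=item)
        except Exception:
            # Print the full traceback so it ends up in CloudWatch Logs.
            print(traceback.format_exc())
    return (bucket_name, key)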
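Step 5 mentions uploading a zip file when extra libraries are needed. One way to build such an archive is sketched below using only the standard library. It assumes the main code file and the installed libraries already sit together in one directory; the function name `build_deployment_zip` and the paths in the usage note are hypothetical.

```python
import os
import zipfile


def build_deployment_zip(source_dir, zip_path):
    """Zip everything under source_dir, keeping paths relative to it.

    The main code file must end up at the root of the archive so that the
    'Handler' setting can refer to it. Write zip_path outside source_dir,
    otherwise the archive would try to include itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store the path relative to source_dir, not the absolute path.
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path
```

For example, `build_deployment_zip('build', 'function.zip')` would package a `build` directory that contains the main code file plus any libraries installed next to it.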

6. For the role, select the S3 execution role. Make sure the role also allows putting items into your DynamoDB table, otherwise the put_item call will fail.
7. Leave all other options at their defaults and click Next.
8. You can enable the event source now, but it is recommended not to until you have tested that the code works. Leave it disabled and create the function.
Now that the Lambda function is configured, test it: click Test, which shows the sample test data, replace the bucket name with your own (the object key must refer to a file that exists in that bucket), and click Test again.
The result is displayed at the bottom, where you can check the output for syntax errors or bugs.
Once you are confident that the code works, enable the event source.
Now go to your bucket and create a file. As soon as the file is created, Lambda invokes the function associated with the bucket. You can check the output in CloudWatch Logs.
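To make the shape of the test data more concrete, here is a small sketch that builds only the fields the handler reads and extracts the bucket and key from them. A real S3 event contains many more fields, and the helper names `make_s3_put_event` and `extract_object`, as well as the bucket and key values, are made up for illustration. S3 URL-encodes object keys in event notifications, so the sketch decodes the key with `unquote_plus`.

```python
from urllib.parse import unquote_plus


def make_s3_put_event(bucket_name, key):
    """Build the minimal part of an S3 'object created' event used by the handler."""
    return {
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": bucket_name},
                    "object": {"key": key},
                },
            }
        ]
    }


def extract_object(event):
    """Return (bucket, key) from the first record, decoding the URL-encoded key."""
    record = event["Records"][0]["s3"]
    return record["bucket"]["name"], unquote_plus(record["object"]["key"])


event = make_s3_put_event("my-test-bucket", "uploads/my+report.txt")
print(extract_object(event))  # ('my-test-bucket', 'uploads/my report.txt')
```

Running the extraction locally like this is a quick way to check key handling before enabling the event source.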

Note:

  • The number of concurrent executions of a Lambda function depends on how long the function takes and on the number of events per second. For example, if 10 events are triggered per second and the function takes 3 seconds to complete, there will be about 10 * 3 = 30 concurrent executions.
  • A function can run for at most 900 seconds (15 minutes). The default timeout is 3 seconds.
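The concurrency estimate from the first note can be written as a one-line calculation. This is only a rough steady-state estimate (event rate multiplied by duration), and the function name `estimated_concurrency` is made up for this sketch.

```python
import math


def estimated_concurrency(events_per_second, duration_seconds):
    """Estimate concurrent executions as event rate times function duration."""
    return math.ceil(events_per_second * duration_seconds)


print(estimated_concurrency(10, 3))  # 30
```

Rounding up keeps the estimate conservative when the product is not a whole number.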