Using AWS Lambda with S3 and DynamoDB
What is AWS Lambda?
Simply put, it's a service that runs your code in response to certain events.
Why Lambda?
You could also build event-based computation on SQS or SNS, but Lambda makes it easier, and it also sends whatever your code writes to stdout to CloudWatch Logs.
Using Lambda with S3 and DynamoDB:
Here we configure a Lambda function so that whenever an object is created in the S3 bucket, the function downloads that file and logs its filename to our DynamoDB table.
Prerequisites:
Access to S3 and DynamoDB for put and execute operations. We assume that you have already created a DynamoDB table whose key is the filename.
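If the table does not exist yet, it could be created along these lines. This is only a sketch under assumptions: the helper names table_definition and create_table are made up for illustration, the key attribute is called "key" to match the item written by the handler, and on-demand billing is just one possible choice.

```python
def table_definition(table_name, key_name="key"):
    """Build create_table arguments for a table keyed by file name."""
    return {
        "TableName": table_name,
        # The file name is stored as a string attribute.
        "AttributeDefinitions": [{"AttributeName": key_name, "AttributeType": "S"}],
        # A single partition (HASH) key, no sort key.
        "KeySchema": [{"AttributeName": key_name, "KeyType": "HASH"}],
        # Assumption: on-demand billing, so no capacity settings are needed.
        "BillingMode": "PAY_PER_REQUEST",
    }


def create_table(table_name, key_name="key"):
    """Create the table and wait until it is ready (needs AWS credentials)."""
    import boto3  # imported here so table_definition works without boto3

    client = boto3.client("dynamodb")
    client.create_table(**table_definition(table_name, key_name))
    client.get_waiter("table_exists").wait(TableName=table_name)
```

Calling create_table('dynamodb_table_name') would then produce a table that matches the table_name used in main.py.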
Example main.py file:
import traceback
from urllib.parse import unquote_plus

import boto3

s3_client = boto3.client('s3')
dynamodb_client = boto3.client('dynamodb')

table_name = 'dynamodb_table_name'


def lambda_handler(event, context):
    # The S3 event carries the bucket name and the URL-encoded object key.
    record = event['Records'][0]
    bucket_name = record['s3']['bucket']['name']
    key = unquote_plus(record['s3']['object']['key'])

    # Keys ending in '/' are folder placeholders, not files.
    if not key.endswith('/'):
        try:
            file_name = key.split('/')[-1]
            # /tmp is the writable scratch directory inside Lambda.
            s3_client.download_file(bucket_name, key, '/tmp/' + file_name)
            # Record the filename under the table's key attribute.
            item = {'key': {'S': file_name}}
            dynamodb_client.put_item(TableName=table_name, Item=item)
        except Exception:
            # Anything printed here ends up in CloudWatch Logs.
            print(traceback.format_exc())

    return (bucket_name, key)
6. For the role, select the S3 execution role.
7. Leave all other options at their defaults and click Next.
8. You can enable the event source now, but it's better to wait until you have tested that the code works. Leave it disabled for now and create the function.
Now that the Lambda function is configured, test it: click Test, which shows the sample test data, and replace the bucket name with the one you want to use. Then click Test again.
The result is displayed at the bottom, where you can check the output for syntax errors or bugs.
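For reference, the part of the test data that the handler actually reads is small. The sketch below builds a trimmed-down event and pulls the bucket, key and filename back out of it. The helper names make_test_event and extract_location and the bucket name my-example-bucket are hypothetical, and the real sample event in the console contains many more fields.

```python
from urllib.parse import quote_plus, unquote_plus


def make_test_event(bucket_name, key):
    """Build a trimmed S3 'object created' event with only the fields used here."""
    return {
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": bucket_name},
                    # S3 URL-encodes object keys in event payloads.
                    "object": {"key": quote_plus(key, safe="/")},
                },
            }
        ]
    }


def extract_location(event):
    """Return (bucket, decoded key, file name) from an S3 event."""
    record = event["Records"][0]
    bucket_name = record["s3"]["bucket"]["name"]
    key = unquote_plus(record["s3"]["object"]["key"])
    return bucket_name, key, key.split("/")[-1]


event = make_test_event("my-example-bucket", "uploads/my report.txt")
print(extract_location(event))
# -> ('my-example-bucket', 'uploads/my report.txt', 'my report.txt')
```

A key with a space in it is a useful test case, because it arrives encoded (here as uploads/my+report.txt) and has to be decoded before the object can be downloaded.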
Once you are confident that the code is fine, you can enable the event source.
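The event source can be enabled from the console, but the same thing could also be done from a script. The following is a rough sketch rather than the required way: the function names and the statement ID "s3-invoke" are made up for illustration, and the function ARN has to be supplied by you.

```python
def notification_configuration(function_arn):
    """Describe an S3 trigger that fires the function on every object creation."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    }


def enable_event_source(bucket_name, function_arn):
    """Let S3 invoke the function, then attach the trigger (needs AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3

    # S3 must be allowed to invoke the function before the trigger is accepted.
    boto3.client("lambda").add_permission(
        FunctionName=function_arn,
        StatementId="s3-invoke",  # hypothetical ID, must be unique per function
        Action="lambda:InvokeFunction",
        Principal="s3.amazonaws.com",
        SourceArn="arn:aws:s3:::" + bucket_name,
    )
    # Note: this call replaces any notification configuration already on the bucket.
    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration=notification_configuration(function_arn),
    )
```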
Now go to your bucket and create a file. As soon as the file is created, Lambda invokes the function associated with the bucket. You can check the output in CloudWatch Logs.
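Besides reading the logs, you can confirm the end result by looking for the new item in the table. Below is a minimal sketch, assuming the key attribute is named "key" as in main.py; wait_for_item and log_group_name are hypothetical helpers, and the polling is there because the function runs shortly after the upload, not instantly.

```python
import time


def wait_for_item(dynamodb_client, table_name, file_name, attempts=10, delay=1.0):
    """Poll the table until the item for file_name appears, or give up."""
    for _ in range(attempts):
        response = dynamodb_client.get_item(
            TableName=table_name,
            Key={"key": {"S": file_name}},
        )
        # get_item only includes "Item" in the response when the key exists.
        if "Item" in response:
            return response["Item"]
        time.sleep(delay)
    return None


def log_group_name(function_name):
    """Name of the CloudWatch Logs group Lambda writes to by default."""
    return "/aws/lambda/" + function_name


# Example usage (needs AWS credentials; names are placeholders):
#   import boto3
#   boto3.client("s3").upload_file("report.txt", "my-example-bucket", "uploads/report.txt")
#   print(wait_for_item(boto3.client("dynamodb"), "dynamodb_table_name", "report.txt"))
```

If wait_for_item returns None, the log group given by log_group_name is the first place to look for a traceback.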
Note: