Reactive Applications with AWS Lambda

Sometimes you may find yourself needing a cron script to clean a file, or to watch a directory of images and create preview thumbnails when they arrive on the server. Processes like these suffer from the same limitation: they require you to run a script repeatedly, polling until you get a “successful” result.

This is problematic because it forces the developer to write redundancy checks in the code instead of focusing on the core problem. Moreover, file-watching utilities generally notify when a file is created, not when it has finished being written. All of these problems must be accounted for, and they result in more complexity, overhead, and development time.

This is where event-driven programming can greatly reduce your development overhead. Part of that is maintaining a centralized data lake for all of your raw files. Data lakes generally expose an event API for easy management of, and access to, files within the lake. In our case, Amazon S3 is the data lake of choice, and thanks to AWS Lambda we can hook into the S3 event API with minimal effort for simple use cases like cleaning files.

What is Lambda?

Lambda is an event-driven compute service. It allows developers to create micro-applications or scripts that run on demand without the need to manage the underlying server. Simply upload your code and bind it to certain Amazon events, and the program will run every time those events fire.

The file below is a complete Lambda program that scales on demand. This type of provisioning is extremely useful when processing many files at once: because executions run in parallel, we can process hundreds of files concurrently, at a fraction of the cost of maintaining a full-time server to host these scripts.

app.js

// Import the AWS SDK and create an S3 client
var aws = require('aws-sdk');
var s3 = new aws.S3();

// Simple wrapper that extracts the bucket and key from the
// S3 Lambda event. After the file is downloaded, the "done"
// callback is executed.
function downloadS3File(event, done) {
  var record = event.Records[0];
  s3.getObject({
    Bucket: record.s3.bucket.name,
    // Keys arrive URL-encoded, with spaces as "+"
    Key: decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '))
  }, done);
}

// "exports.handler" is the application entry point that Lambda
// uses to start the program. Every time an S3 Lambda event
// fires, this callback is executed. "context" carries runtime
// information; "callback" reports the result back to Lambda.
exports.handler = function(event, context, callback) {
  downloadS3File(event, function(error, file) {
    if (error) {
      console.error('Error:', error);
      return callback(error);
    }
    // Do something with the file…
    callback(null);
  });
};
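To try the bucket and key extraction without deploying anything, a hand-built event can be passed to a small helper. The sketch below assumes the usual S3 notification shape (Records[].s3.bucket.name and Records[].s3.object.key). The helper name s3LocationsFromEvent, the sample bucket, and the sample key are made up for illustration. Object keys arrive URL-encoded with spaces written as plus signs, so the helper decodes them.

```javascript
// Hypothetical helper: list every bucket/key pair found in an S3 event.
function s3LocationsFromEvent(event) {
  return (event.Records || []).map(function (record) {
    return {
      bucket: record.s3.bucket.name,
      // Keys are URL-encoded in the event; "+" stands for a space.
      key: decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '))
    };
  });
}

// Made-up sample event containing only the fields the helper reads.
var sampleEvent = {
  Records: [
    {
      s3: {
        bucket: { name: 'example-bucket' },
        object: { key: 'uploads/my+photo%281%29.jpg' }
      }
    }
  ]
};

console.log(s3LocationsFromEvent(sampleEvent));
```

Unlike the handler above, which reads only Records[0], this version maps over every record, which may be useful if a single event ever carries more than one.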

How it works

This simple script downloads the file that triggered the event on S3 and lets you perform arbitrary actions on it. The idea behind Lambda is that you can process data quickly and on demand. To enforce this, every Lambda function has a mandatory timeout, after which it is forcefully terminated. The longest a Lambda function can run is 15 minutes (the limit was 60 seconds when the service launched). It’s therefore imperative not to treat Lambda like a full-scale queue for long-running processes, because it simply won’t work.
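One way to stay inside the timeout is to check how much time is left before starting each unit of work. The sketch below relies on getRemainingTimeInMillis(), which the Node.js Lambda context object provides. The names processWithinBudget, processItem, and fakeContext, along with the 5000 ms default margin, are assumptions chosen for illustration.

```javascript
// Process items one at a time, stopping before the timeout is reached.
// `context` is the Lambda context object; `processItem` is a
// hypothetical synchronous worker supplied by the caller.
function processWithinBudget(items, context, processItem, safetyMarginMs) {
  var margin = safetyMarginMs === undefined ? 5000 : safetyMarginMs;
  var done = [];
  for (var i = 0; i < items.length; i++) {
    if (context.getRemainingTimeInMillis() < margin) {
      break; // not enough time left; leave the rest for another run
    }
    done.push(processItem(items[i]));
  }
  return { done: done, remaining: items.slice(done.length) };
}

// Local stand-in for the Lambda context, for experimenting off AWS.
function fakeContext(budgetMs) {
  var deadline = Date.now() + budgetMs;
  return {
    getRemainingTimeInMillis: function () {
      return deadline - Date.now();
    }
  };
}
```

Inside a handler, the items left in `remaining` could be handed to a later invocation rather than risking a forced exit halfway through.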

Lambda also provides 512 MB of temporary disk space per execution by default, which is useful if you need to buffer a larger file. It’s worth noting that consecutive executions can share the same temporary disk space when Lambda reuses an execution environment, so take care when relying on files written to the /tmp directory.
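Because scratch files can outlive a single execution, one defensive pattern is to give each invocation its own file name and delete the file when the work is finished. The sketch below is a minimal illustration, not a prescribed approach: scratchPath and withScratchFile are hypothetical helpers, requestId is assumed to come from context.awsRequestId inside a handler, and os.tmpdir() is used so the code also runs locally (on Lambda the writable directory is /tmp).

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Build a scratch path that is unique to one invocation.
function scratchPath(requestId, fileName) {
  return path.join(os.tmpdir(), requestId + '-' + path.basename(fileName));
}

// Write a buffer to disk, hand the path to `work`, and always clean up,
// even if `work` throws.
function withScratchFile(requestId, fileName, data, work) {
  var filePath = scratchPath(requestId, fileName);
  fs.writeFileSync(filePath, data);
  try {
    return work(filePath);
  } finally {
    fs.unlinkSync(filePath);
  }
}
```

A handler might call withScratchFile(context.awsRequestId, key, file.Body, resize), where resize stands for whatever synchronous processing step needs a file on disk.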

With the new trend toward event-driven programming, we are seeing wider adoption across enterprise platforms such as AWS. This makes sense when you consider the cost savings from not having to design complicated solutions or maintain extra servers.

Lambda will continue to mature as more people use it, and the development team at Koddi is excited to see where this project will go and what new features are to come.