This article assumes that you have some knowledge about:
- AWS Lambda functions
A while back, when AWS announced Lambda support for containers, my first reaction was to dismiss it as something to be used only as a last resort, typically if you really wanted to use a language not supported out of the box, such as Rust or PHP. I also assumed that there would be some disadvantages, like slower start time and some additional work to create your own container, but I didn’t look further into it.
It has now been 3 years since this feature was released, and I decided it was time to revisit the subject and see how true my assumptions were, and whether containers for Lambda had other uses or limitations that I hadn’t considered at the time.
So let’s get to it and see what we can learn!
Standard Lambda deployment
Let’s start with reviewing how a standard Lambda packaging and deployment normally works.
The usual steps are as follows:
- package your code, usually as a Zip file
- deploy your code, via one of the many tools available (SAM, Cloudformation, Serverless, Terraform…)
- configure one or more triggers to execute your Lambda function (SQS, API Gateway…)
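The function code itself stays trivial in this model; a minimal sketch (the file and function names below are illustrative, not prescribed):

```python
# lambda_function.py - a minimal handler for a standard Zip deployment
import json

def handler(event, context):
    # Echo the triggering event back; a real function would do actual work here
    return {
        "statusCode": 200,
        "body": json.dumps({"received": event}),
    }
```

Packaged as a Zip (for instance `zip function.zip lambda_function.py`), this is all a standard deployment needs before wiring up triggers.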
This simple process has several advantages:
- very little management overhead
- offers Lambda Layers to provide common dependencies and extensions
- no need for any Docker experience
It also comes with some limitations:
- limited runtime choices
- no control over what is eventually present alongside your executable (in the Firecracker MicroVM used by AWS)
- limited package and dependency sizes (50 MB for the Zip package, or 250 MB once decompressed for the Lambda + Layers)
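Those size limits are easy to check locally before a deployment fails; a small sketch:

```python
import os
import zipfile

# Zip deployment limits: 50 MB for the uploaded archive,
# 250 MB uncompressed for the function plus all its Layers
MAX_ZIPPED_MB = 50
MAX_UNZIPPED_MB = 250

def check_package(path):
    """Return the zipped/unzipped sizes (in MB) of a deployment package
    and whether it fits within the standard Lambda limits."""
    zipped_mb = os.path.getsize(path) / 1024 ** 2
    with zipfile.ZipFile(path) as zf:
        unzipped_mb = sum(info.file_size for info in zf.infolist()) / 1024 ** 2
    fits = zipped_mb <= MAX_ZIPPED_MB and unzipped_mb <= MAX_UNZIPPED_MB
    return zipped_mb, unzipped_mb, fits
```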
Currently, the supported runtimes are:
- Node.js: 12 to 18
- Python: 3.7 to 3.10
- Java: 8, 11 or 17
- Ruby: 2.7
- .NET 7 and .NET Core 3.1
Let’s now have a look at what Lambda containers offer.
Deployment via Docker container
Why use it?
Before looking at what this type of deployment entails, let’s examine why an alternative Lambda deployment method might be worth considering. Here are the most common use cases for which a standard Lambda deployment would be inadequate, at least in its base form:
- Runtime: An unsupported runtime (PHP, Rust…), or an unsupported runtime version, either too recent (Java 19…) or too old (legacy code using Node.js 10…)
- Custom binaries: The need for specific binaries not included inside the default Lambda MicroVM, for instance PDF processing libraries
- Performance: A special container setup that allows the code to load and run much faster than the default Lambda (barebone Go image…)
- Dependency size: A very large set of dependencies, for instance for data processing with Pandas (it can easily exceed the 250 MB decompressed limit for the Lambda + Layers)
- Deployment package size: A very large deployment package (exceeding the 50 MB maximum)
- Dataset size: A very large dataset needs to be included and used in every Lambda
- Legal: Regulatory or legal requirements, which demand limiting the Lambda to a simple orchestrator and controlling what goes into the container
How does it work?
The only step that changes compared to a standard Lambda deployment is the packaging step. But the change is drastic and adds a lot of complexity.
Here are the steps:
- package the code
- choose a base image among the following options (more details in the docs)
a. base image with runtimes included
b. base linux image with Lambda components but no runtime
c. any Docker image (Debian, Alpine…) as long as it supports the Lambda Runtime API
- write a Dockerfile based on that image
- build your container
- push it to ECR
- provide the ECR image URL in the Lambda config
- deploy the code
- configure the execution triggers
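Steps 2 to 4 often amount to a short Dockerfile; here is a minimal sketch for a Python function, assuming option (a) and a handler living in a file named app.py:

```dockerfile
# AWS-provided base image: runtime + Lambda Runtime Interface Client included
FROM public.ecr.aws/lambda/python:3.10

# Copy the function code where the base image expects it
COPY app.py ${LAMBDA_TASK_ROOT}

# Install dependencies alongside the code
COPY requirements.txt .
RUN pip install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"

# Handler reference, in module.function form
CMD ["app.handler"]
```

The image is then built and pushed to ECR with the usual `docker build` and `docker push` commands before being referenced in the Lambda configuration.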
As expected, the operational overhead is a lot more significant when using a Docker image. However, with complexity comes flexibility, since any OS base and Docker image can be used; you can even start completely from scratch (literally, if the Scratch Docker image is used).
Note that the container is limited to 10 GB in size, which is still much more than what is allowed with standard Lambdas.
This additional operational overhead is however not the only disadvantage when using containers for Lambdas.
What are the downsides?
Using containers instead of the standard Zip archive has several negative implications, which affect both the management and the execution performance of Lambdas.
- Operational overhead: As mentioned previously, the build process and the maintenance of the Lambda will be much more complex than the standard process.
- Extra costs: ECR is the only container registry allowed, and using it will result in additional storage costs (VPC transfer inside the same region is free, so no additional network charges).
- Lambda Layers: Lambda Layers cannot be used with containers, so no shared libraries, and the potential cold start speed boost they can provide via caching does not apply either.
- Lambda Extensions: Lambda Extensions are a type of Lambda Layer and thus not usable either. This also means that AWS managed extensions such as the Parameters and Secrets extension cannot be used.
- Slow cold start: If the container is relatively large, the cold start performance will be significantly impacted, just as large Zip files affect standard Lambda cold start.
Note that it is however possible to find solutions to compensate or mitigate some of those issues.
For instance, the slow cold start for large containers can be improved by using one of the AWS base images. This seems counter-intuitive at first, since they are larger than most base Linux Docker images, but those images are proactively cached by the Lambda service, which means they will not need to be downloaded from ECR.
As we have seen, in some cases the container for Lambdas option looks viable and can even seem necessary. But some of the advantages it provides can be obtained by other means that will not preclude the usage of the standard Lambda packaging and will avoid most of the operational overhead.
If the main reason for using containers is to use a runtime not officially supported, then consider instead using a Lambda Layer, since it can be used to include a custom runtime. This can solve the problem without any of the previously described downsides.
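For reference, a custom runtime is just an executable named bootstrap (shipped in the Layer) that polls the Lambda Runtime API in a loop. A stripped-down sketch of that loop, using only the Python standard library (the error-reporting path is omitted for brevity):

```python
import json
import os
import urllib.request

def runtime_loop(handler):
    """Minimal custom-runtime event loop against the Lambda Runtime API:
    fetch the next invocation, run the handler, post the result back."""
    api = os.environ["AWS_LAMBDA_RUNTIME_API"]
    base = f"http://{api}/2018-06-01/runtime/invocation"
    while True:
        # Long-poll for the next event to process
        with urllib.request.urlopen(f"{base}/next") as resp:
            request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
            event = json.loads(resp.read())
        # Return the handler result for this specific request id
        result = json.dumps(handler(event)).encode()
        urllib.request.urlopen(
            urllib.request.Request(f"{base}/{request_id}/response", data=result)
        )
```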
As with runtimes, custom binaries can be integrated via Lambda Layers. See this article for an example of how this can be achieved.
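Binaries shipped this way are unpacked under /opt inside the execution environment, so invoking them is straightforward (the binary name below is illustrative):

```python
import subprocess

def run_layer_binary(args):
    """Invoke a binary shipped in a Lambda Layer: a Layer containing
    bin/mytool is unpacked to /opt/bin/mytool at runtime."""
    return subprocess.run(
        ["/opt/bin/mytool", *args],
        capture_output=True, text=True, check=True,
    ).stdout
```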
If the goal is instead to include a common dataset, consider storing the dataset in S3 and downloading it from there (using an S3 VPC Gateway endpoint removes the network fees at no extra cost).
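A common pattern is to cache the download in /tmp, so only the first (cold) invocation pays for it. A sketch, with illustrative bucket and key names (boto3 is imported lazily so the download only happens inside the Lambda environment):

```python
import os

BUCKET = "my-data-bucket"               # illustrative name
DATASET_KEY = "datasets/reference.bin"  # illustrative key
LOCAL_PATH = "/tmp/reference.bin"       # /tmp persists across warm invocations

def load_dataset():
    """Download the shared dataset on the first (cold) invocation,
    then reuse the local copy while the container stays warm."""
    if not os.path.exists(LOCAL_PATH):
        import boto3  # available by default in the Lambda runtime
        boto3.client("s3").download_file(BUCKET, DATASET_KEY, LOCAL_PATH)
    return LOCAL_PATH
```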
If the problem is that the deployment package is too big (> 50 MB), consider uploading the oversized package to S3 and referencing that S3 location in your Lambda configuration.
Using containers can fit the bill for specific edge cases, but it implies significant operational overhead and disadvantages. The difficulties that push us toward this option can often be overcome with the standard Lambda deployment strategy combined with some simple techniques, so it is best to consider all the alternatives, and to compare cost and maintenance complexity, before settling on this feature.
If you eventually decide that the extra complexity and downsides are worth it for your use case, make sure to test your containers to verify that they will function properly in the AWS Lambda environment via the Lambda Runtime Interface Emulator (RIE).
Thank you for reading! And I hope this helps you make the best decision possible.