Making NumPy (and other problematic Python packages) work seamlessly with AWS Lambda using Serverless Framework
Introduction
AWS Lambda is a great solution for running Python code in a serverless manner. Unfortunately, managing dependencies for your serverless Python applications can often be a struggle. If you've previously seen errors like "No module named 'numpy.core._multiarray_umath'" or "Unable to import module 'lambda_function': /var/task/lib/nacl/_sodium.abi3.so", then you know very well how frustrating it can be. In today's post, I want to present a reliable way of building Python projects for AWS Lambda, especially with dependencies that have C extensions, like NumPy.
Prerequisites
We will be using Serverless Framework to build our sample application. In addition to that, you need to have node, npm, and docker installed on your machine. If you need help setting up Serverless Framework, please refer to the getting started guide. For docker, please see the official documentation.
Creating our service and using serverless-python-requirements plugin to solve our problems
In order to make things simple, we will start from a very basic Python service. Let's run the following command that will bootstrap our initial project:
sls create --template aws-python --path serverless-numpy
It will create a project with the following structure:
serverless-numpy
├── README.md
├── handler.py
└── serverless.yml
Now, let's try to add NumPy as a dependency in our project. First, we will create a requirements.txt file with the following contents:
numpy==1.23.4
Then, let's modify our handler.py file to import and use numpy:
import numpy as np


def hello(event, context):
    print(np.__version__)
    sample_array = np.array([[1., 2., 3.], [4., 5., 6.]])
    print(sample_array)
We're not ready to deploy yet. First, we need to configure our project so that it automatically packages all required dependencies. For that, we will use the serverless-python-requirements plugin.
We will need to install it using npm. In the process, we will also create a package.json file in our project. We will use that file to manage installations of plugins for Serverless Framework:
npm init -y
npm install --save-dev serverless-python-requirements
We also need to instruct Serverless Framework to use serverless-python-requirements. We can do that by adding it to the serverless.yml configuration file:
service: serverless-numpy

frameworkVersion: '3'

provider:
  name: aws
  runtime: python3.8

functions:
  hello:
    handler: handler.hello

plugins:
  - serverless-python-requirements
Now we can try to deploy our service by running the following command:
sls deploy
After the deployment is finished, let's try to invoke our function:
sls invoke --function hello --log
Bummer, in my case, the invocation failed and instead I saw a very unpleasant (and lengthy) error:
START
[ERROR] Runtime.ImportModuleError: Unable to import module 'handler':
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.
We have compiled some common reasons and troubleshooting tips at:
https://numpy.org/devdocs/user/troubleshooting-importerror.html
Please note and check the following:
* The Python version is: Python3.8 from "/var/lang/bin/python3.8"
* The NumPy version is: "1.23.4"
and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.
Original error was: No module named 'numpy.core._multiarray_umath'
Traceback (most recent call last):END RequestId: 194a6352-a15f-4113-a596-806d94f7b334
END Duration: 2.58 ms (init: 152.29 ms) Memory Used: 40 MB
Environment: darwin, node 16.16.0, framework 3.23.0 (local) 3.22.0v (global), plugin 6.2.2, SDK 4.3.2
Credentials: Local, "datalake" profile
Docs: docs.serverless.com
Support: forum.serverless.com
Bugs: github.com/serverless/serverless/issues
Error:
Invoked function failed
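One frequent cause of this kind of import failure is that the compiled parts of a package were built for a different platform than the one Lambda runs on. As a rough check, the sketch below compares the build machine with a Lambda target. It assumes Lambda runs Linux on either x86_64 or arm64; the function name matches_lambda and the alias table are made up for this illustration, and a match is a necessary condition rather than a guarantee that the packaged binaries will load.

```python
import platform

# Map common platform.machine() spellings onto the two architecture
# names AWS Lambda uses. This table is an assumption for illustration.
_ARCH_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def matches_lambda(system, machine, lambda_arch="x86_64"):
    """Return True if (system, machine) is Linux on the given Lambda architecture."""
    normalized = _ARCH_ALIASES.get(machine.lower())
    return system == "Linux" and normalized == lambda_arch


if __name__ == "__main__":
    here = (platform.system(), platform.machine())
    print(here, "matches Lambda x86_64:", matches_lambda(*here))
```

On an M1 Mac running a native Python build, platform.system() typically reports Darwin and platform.machine() reports arm64, so the check fails on the operating system alone.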
The error above can be caused by many things. In my case, it's because I'm building the project on an M1 Mac. If you're not building your project on Linux, it's highly likely that you'll run into a similar error. Even on Linux you can run into the same problem, since AWS Lambda now supports two different processor architectures (x86_64 and arm64) and your build machine may not match the one your function uses. Luckily, we can use serverless-python-requirements together with docker to solve our problem. In order to do that, we need to add the following to our serverless.yml configuration:
custom:
  pythonRequirements:
    dockerizePip: true
    dockerImage: public.ecr.aws/sam/build-python3.8:latest-x86_64
The configuration above instructs serverless-python-requirements to build and package all dependencies in an isolated, reproducible docker container that is very similar to the environment that AWS Lambda uses. We also explicitly specify which container image should be used in the process. If you use a different Python version or have configured a different architecture for your Lambda functions, you might need to adjust it. The list of available container image repositories can be found here.
Note: There is an upcoming version of serverless-python-requirements that will automatically resolve the image and architecture, so providing dockerImage might not be needed anymore.
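Until then, the image reference has to be adjusted by hand. Judging from the value used above, it is composed of the runtime name and an architecture suffix, and the small helper below spells that pattern out. The helper name build_image is hypothetical, and the arm64 tag is an assumption extrapolated from the x86_64 one, so it is worth checking against the image repository list before relying on it.

```python
def build_image(runtime, architecture="x86_64"):
    """Compose a build image reference following the pattern used in this post.

    Only the python3.8 / x86_64 combination appears in the post itself;
    other combinations are assumed to follow the same naming scheme.
    """
    if not runtime.startswith("python"):
        raise ValueError(f"not a Python runtime: {runtime!r}")
    if architecture not in ("x86_64", "arm64"):
        raise ValueError(f"unknown Lambda architecture: {architecture!r}")
    return f"public.ecr.aws/sam/build-{runtime}:latest-{architecture}"


if __name__ == "__main__":
    print(build_image("python3.8"))
    print(build_image("python3.9", "arm64"))
```

The returned string is what would go into the dockerImage setting, next to a matching runtime in the provider section.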
Let's try to deploy our function and invoke it again:
sls deploy
sls invoke --function hello --log
Now, we can invoke our function without issues and we see the following output:
START
1.23.4
[[1. 2. 3.]
 [4. 5. 6.]]
END Duration: 3.83 ms (init: 710.20 ms) Memory Used: 78 MB
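To double-check what actually ended up in a deployment package, one option is to look at the file names of the compiled extensions inside the zip archive. CPython tags extension modules with the platform they were built for, for example .cpython-38-darwin.so on macOS versus .cpython-38-x86_64-linux-gnu.so on x86_64 Linux. The sketch below lists tagged extensions that do not carry the expected suffix. The function name, the default suffix (Python 3.8 on x86_64), and the archive path in the usage comment are assumptions for illustration, and extensions without a platform tag (such as .abi3.so files) are not checked.

```python
import zipfile

# Assumed target: CPython 3.8 on x86_64 Linux, matching the runtime used here.
EXPECTED_SUFFIX = ".cpython-38-x86_64-linux-gnu.so"


def mismatched_extensions(zip_path, expected_suffix=EXPECTED_SUFFIX):
    """Return platform-tagged extension modules that lack the expected suffix."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    return sorted(
        name
        for name in names
        if ".cpython-" in name
        and name.endswith(".so")
        and not name.endswith(expected_suffix)
    )


# Hypothetical usage; the artifact path is an assumption:
# print(mismatched_extensions(".serverless/serverless-numpy.zip"))
```

An empty list means no tagged extension was built for a different platform. Any entry ending in something like .cpython-38-darwin.so points to a package that was built on the host machine rather than inside the container.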
Summary
In a few simple steps, we were able to create a basic but reliable starter template for projects that need to bundle dependencies with C extensions, such as numpy, for AWS Lambda. If you'd like to use it as a base for your applications, it's available here. Thanks for reading!