Making NumPy (and other problematic Python packages) work seamlessly with AWS Lambda using Serverless Framework
Introduction
AWS Lambda is a great solution for running Python code in a serverless manner. Unfortunately, managing dependencies for your serverless Python applications can often be a struggle. If you've previously seen errors like No module named 'numpy.core._multiarray_umath' or Unable to import module 'lambda_function': /var/task/lib/nacl/_sodium.abi3.so, then you know very well how frustrating it can be. In today's post, I want to present a reliable way of building Python projects for AWS Lambda, especially ones with dependencies that have C-extensions, like NumPy.
Prerequisites
We will be using Serverless Framework to build our sample application. In addition to that, you need to have node, npm, and docker installed on your machine. If you need help setting up Serverless Framework, please refer to the getting started guide. For docker, please see the official documentation.
Creating our service and using serverless-python-requirements plugin to solve our problems
In order to make things simple, we will start from a very basic Python service. Let's run the following command that will bootstrap our initial project:
sls create --template aws-python --path serverless-numpy
It will create a project with the following structure:
serverless-numpy
├── README.md
├── handler.py
└── serverless.yml
Now, let's try to add NumPy as a dependency in our project. First, we will create a requirements.txt file with the following contents:
numpy==1.23.4
Then, let's modify our handler.py file to import and use numpy:
import numpy as np

def hello(event, context):
    # Print the bundled NumPy version and a small array to prove the import works
    print(np.__version__)
    sample_array = np.array([[1., 2., 3.], [4., 5., 6.]])
    print(sample_array)
We're not ready for deployment yet. First, we need to configure our project so that it automatically packages all needed dependencies. For that, we will use the serverless-python-requirements plugin.
We will need to install it using npm. In the process, we will also create a package.json file in our project. We will use that file to manage the Serverless Framework plugins we install:
npm init -y
npm install --save-dev serverless-python-requirements
We also need to instruct Serverless Framework to use serverless-python-requirements. We can do that by adding it to the serverless.yml configuration file:
service: serverless-numpy
frameworkVersion: '3'
provider:
  name: aws
  runtime: python3.8
functions:
  hello:
    handler: handler.hello
plugins:
  - serverless-python-requirements
Now we can try to deploy our service by running the following command:
sls deploy
After deployment is finished, let's try to invoke our function:
sls invoke --function hello --log
Bummer, in my case, the invocation failed and instead I saw a very unpleasant (and lengthy) error:
START
[ERROR] Runtime.ImportModuleError: Unable to import module 'handler':
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.
We have compiled some common reasons and troubleshooting tips at:
https://numpy.org/devdocs/user/troubleshooting-importerror.html
Please note and check the following:
* The Python version is: Python3.8 from "/var/lang/bin/python3.8"
* The NumPy version is: "1.23.4"
and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.
Original error was: No module named 'numpy.core._multiarray_umath'
Traceback (most recent call last):
END RequestId: 194a6352-a15f-4113-a596-806d94f7b334
END Duration: 2.58 ms (init: 152.29 ms) Memory Used: 40 MB
Environment: darwin, node 16.16.0, framework 3.23.0 (local) 3.22.0v (global), plugin 6.2.2, SDK 4.3.2
Credentials: Local, "datalake" profile
Docs: docs.serverless.com
Support: forum.serverless.com
Bugs: github.com/serverless/serverless/issues
Error:
Invoked function failed
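The module named in the message, numpy.core._multiarray_umath, is a compiled extension, so one way to narrow down an error like this is to look at which compiled files actually ended up in the deployment package. The sketch below is only a heuristic and not part of Serverless Framework or the plugin. It relies on the convention that CPython extension filenames usually carry a platform tag (for example cpython-38-darwin.so on macOS versus cpython-38-x86_64-linux-gnu.so on Linux). The function names are hypothetical, and the package path .serverless/serverless-numpy.zip is an assumption based on the service name used in this post.

```python
import os
import zipfile
from pathlib import PurePosixPath


def platform_of(member_name):
    """Guess the build platform of a packaged file from its filename tag."""
    base = PurePosixPath(member_name).name
    if base.endswith(".pyd"):
        return "windows"
    if not base.endswith(".so"):
        return "not-an-extension"
    if "linux" in base:
        return "linux"
    if "darwin" in base:
        return "darwin"
    # Files such as _sodium.abi3.so carry no platform tag, so we cannot tell.
    return "unknown"


def foreign_extensions(member_names):
    """Return the compiled extensions that were clearly not built for Linux."""
    return [n for n in member_names if platform_of(n) in ("darwin", "windows")]


def inspect_package(zip_path):
    """List suspicious extensions inside a deployment package."""
    with zipfile.ZipFile(zip_path) as package:
        return foreign_extensions(package.namelist())


if __name__ == "__main__":
    # Assumed location of the package built by `sls deploy` or `sls package`.
    path = os.path.join(".serverless", "serverless-numpy.zip")
    if os.path.exists(path):
        for name in inspect_package(path):
            print("not built for Linux:", name)
```

If this prints any file, the package contains binaries that the Linux-based Lambda runtime cannot load. An empty result is not a guarantee, since untagged files and architecture mismatches between two Linux builds are not detected by this filename check alone.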
The error above can be caused by many things. In my case, it's because I'm building the project on an M1 Mac. If you're not building your project on Linux, it's highly likely that you'll run into a similar error. Even on Linux you can run into the same problem, since AWS Lambda can now use two different processor architectures. Luckily, we can use serverless-python-requirements together with docker to solve our problem. In order to do that, we need to add the following to our serverless.yml configuration:
custom:
  pythonRequirements:
    dockerizePip: true
    dockerImage: public.ecr.aws/sam/build-python3.8:latest-x86_64
The configuration above instructs serverless-python-requirements to build and package all dependencies in an isolated, reproducible docker container that is very similar to the environment that AWS Lambda uses. We also explicitly specify which container image should be used in the process. If you use a different Python version or have configured a different architecture for your Lambda functions, you might need to adjust it. The list of available container image repositories can be found here.
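To make the "adjust it" part concrete, here is a small sketch of how the image name could be derived from the Python version and the Lambda architecture. The naming pattern is extrapolated from the single image used in this post, so treat it as an assumption and check the repository list before relying on it. The helper sam_build_image is a hypothetical name, not something the plugin provides.

```python
# Lambda currently offers these two instruction set architectures.
SUPPORTED_ARCHITECTURES = ("x86_64", "arm64")


def sam_build_image(python_version, architecture="x86_64"):
    """Build an image name following the pattern of the one used above.

    Assumption: other Python versions and architectures follow the same
    naming scheme as public.ecr.aws/sam/build-python3.8:latest-x86_64.
    """
    if architecture not in SUPPORTED_ARCHITECTURES:
        raise ValueError(f"unsupported architecture: {architecture!r}")
    return f"public.ecr.aws/sam/build-python{python_version}:latest-{architecture}"


print(sam_build_image("3.8"))
# prints: public.ecr.aws/sam/build-python3.8:latest-x86_64
```

The important part is that both pieces have to match your function: the Python version must match the runtime in serverless.yml, and the architecture must match the one configured for the function.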
Note: There is an upcoming version of serverless-python-requirements that will automatically resolve the image and architecture, so providing dockerImage might not be needed anymore.
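The mismatch described earlier, between the machine that builds the package and the Lambda runtime, can also be checked with a few lines of Python. This is a rough sketch under the assumption that a plain pip install is only safe when the build machine runs Linux on the same architecture as the function. The name needs_docker_build is hypothetical.

```python
import platform

# Lambda runs Linux on one of these two architectures.
LAMBDA_ARCHITECTURES = {"x86_64", "arm64"}

# Operating systems report the same CPU family under different names.
MACHINE_ALIASES = {"amd64": "x86_64", "aarch64": "arm64"}


def needs_docker_build(lambda_arch, system=None, machine=None):
    """Return True when the build machine differs from the Lambda target.

    system and machine default to the values of the current host; they are
    parameters only so the check can be tried with other combinations.
    """
    if lambda_arch not in LAMBDA_ARCHITECTURES:
        raise ValueError(f"unsupported architecture: {lambda_arch!r}")
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()
    machine = MACHINE_ALIASES.get(machine, machine)
    return system != "Linux" or machine != lambda_arch


# An M1 Mac building for an x86_64 function, as in this post:
print(needs_docker_build("x86_64", system="Darwin", machine="arm64"))
# prints: True
```

A False result is not a guarantee that a local build will work, because differences in system libraries between your Linux distribution and the Lambda environment can still matter. That is one more reason to keep dockerizePip enabled regardless.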
Let's try to deploy our function and invoke it again:
sls deploy
sls invoke --function hello --log
Now, we can invoke our function without issues and we see the following output:
START
1.23.4
[[1. 2. 3.]
[4. 5. 6.]]
END Duration: 3.83 ms (init: 710.20 ms) Memory Used: 78 MB
Summary
In a few simple steps, we were able to create a basic but reliable starter template for projects that need to bundle dependencies with C-extensions, such as numpy, for AWS Lambda. If you'd like to use it as a base for your applications, it's available here. Thanks for reading!