Making NumPy (and other problematic Python packages) work seamlessly with AWS Lambda using Serverless Framework

Introduction

AWS Lambda is a great solution for running Python code in a serverless manner. Unfortunately, managing dependencies for your serverless Python applications can often be a struggle. If you've previously seen errors like No module named 'numpy.core._multiarray_umath' or Unable to import module 'lambda_function': /var/task/lib/nacl/_sodium.abi3.so, then you know very well how frustrating it can be. In today's post, I want to present a reliable way of building Python projects for AWS Lambda, especially those with dependencies that have C extensions, like NumPy.

Prerequisites

We will be using Serverless Framework to build our sample application. In addition to that, you need to have node, npm, and docker installed on your machine. If you need help setting up Serverless Framework, please refer to the getting started guide. For docker, please see the official documentation.

Creating our service and using serverless-python-requirements plugin to solve our problems

In order to make things simple, we will start from a very basic Python service. Let's run the following command that will bootstrap our initial project:

sls create --template aws-python --path serverless-numpy

It will create a project with the following structure:

serverless-numpy
├── README.md
├── handler.py
└── serverless.yml

Now, let's try to add NumPy as a dependency in our project. First, we will create a requirements.txt file with the following contents:

numpy==1.23.4

Then, let's modify our handler.py file to import and use numpy:

import numpy as np


def hello(event, context):
    print(np.__version__)

    sample_array = np.array([[1., 2., 3.], [4., 5., 6.]])
    print(sample_array)

We're not ready to deploy yet. First, we need to configure our project so that it automatically packages all needed dependencies. For that, we will use the serverless-python-requirements plugin.

We will need to install it using npm. In the process, we will also create a package.json file in our project. We will use that file to manage installations of plugins for Serverless Framework:

npm init -y
npm install --save-dev serverless-python-requirements

We also need to instruct Serverless Framework to use serverless-python-requirements. We can do that by adding it to the plugins section of the serverless.yml configuration file:

service: serverless-numpy

frameworkVersion: '3'

provider:
  name: aws
  runtime: python3.8

functions:
  hello:
    handler: handler.hello

plugins:
  - serverless-python-requirements

Now we can try to deploy our service by running the following command:

sls deploy

After the deployment is finished, let's try to invoke our function:

sls invoke --function hello --log

Bummer, in my case, the invocation failed and instead I saw a very unpleasant (and lengthy) error:

START
[ERROR] Runtime.ImportModuleError: Unable to import module 'handler':

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

* The Python version is: Python3.8 from "/var/lang/bin/python3.8"
* The NumPy version is: "1.23.4"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: No module named 'numpy.core._multiarray_umath'

Traceback (most recent call last):END RequestId: 194a6352-a15f-4113-a596-806d94f7b334
END Duration: 2.58 ms (init: 152.29 ms) Memory Used: 40 MB

Environment: darwin, node 16.16.0, framework 3.23.0 (local) 3.22.0v (global), plugin 6.2.2, SDK 4.3.2
Credentials: Local, "datalake" profile
Docs: docs.serverless.com
Support: forum.serverless.com
Bugs: github.com/serverless/serverless/issues

Error:
Invoked function failed

The error above can be caused by many things. In my case, it's because I'm building the project on an M1 Mac, so pip installs NumPy binaries compiled for macOS rather than for the Linux environment that AWS Lambda runs. If you're not building your project on Linux, it's highly likely that you'll run into a similar error. Even on Linux you can run into the same problem, since AWS Lambda can now use two different processor architectures (x86_64 and arm64). Luckily, we can use serverless-python-requirements together with docker to solve our problem. In order to do that, we need to add the following to our serverless.yml configuration:

custom:
  pythonRequirements:
    dockerizePip: true
    dockerImage: public.ecr.aws/sam/build-python3.8:latest-x86_64

The configuration above instructs serverless-python-requirements to build and package all dependencies in an isolated, reproducible docker container that is very similar to the environment that AWS Lambda uses. We also explicitly specify which container image should be used in the process. If you use a different Python version or have configured a different architecture for your Lambda functions, you might need to adjust it. The list of available container image repositories can be found here.
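If you need to adjust the image, the name used above appears to follow a simple pattern: the Python version goes into the repository name and the architecture into the tag. Here is a minimal sketch of that pattern. The helper name sam_build_image is hypothetical, and the arm64 tag is an assumption based on the x86_64 one, so please confirm the exact name against the image repository list before relying on it.

```python
# Hypothetical helper: builds an image name following the pattern of the
# one used in serverless.yml above. Not part of any library.
SUPPORTED_ARCHITECTURES = ("x86_64", "arm64")


def sam_build_image(python_version, architecture="x86_64"):
    """Return a dockerImage value for the given Python version and architecture."""
    if architecture not in SUPPORTED_ARCHITECTURES:
        raise ValueError(f"unsupported architecture: {architecture!r}")
    # Assumption: the tag is always "latest-<architecture>".
    return f"public.ecr.aws/sam/build-python{python_version}:latest-{architecture}"


if __name__ == "__main__":
    print(sam_build_image("3.8"))
```

Remember that the architecture has to match what your functions are configured to run on, otherwise you will end up with the same import error as before.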

Note: There is an upcoming version of serverless-python-requirements that will automatically resolve the image and architecture, so providing dockerImage might not be needed anymore.
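As a rough rule of thumb, packages with C-extensions installed directly on your machine are only usable on AWS Lambda when the machine runs Linux on the same processor architecture as the function. The sketch below expresses that rule in Python. The function names are hypothetical, and it deliberately ignores other factors (such as the Python version and system library versions), so treat it as an illustration rather than a complete check.

```python
import platform

# Different operating systems report the same architecture under different names.
_ARCH_ALIASES = {"amd64": "x86_64", "aarch64": "arm64"}


def normalize_arch(machine):
    """Map names like 'AMD64' or 'aarch64' to 'x86_64' or 'arm64'."""
    lowered = machine.lower()
    return _ARCH_ALIASES.get(lowered, lowered)


def needs_docker_build(system, machine, lambda_arch="x86_64"):
    """Return True when locally built binaries would not match the Lambda target."""
    if system != "Linux":
        return True  # macOS and Windows binaries cannot be loaded on Lambda
    return normalize_arch(machine) != normalize_arch(lambda_arch)


if __name__ == "__main__":
    # Check the machine this script is running on against an x86_64 function.
    print(needs_docker_build(platform.system(), platform.machine()))
```

On an M1 Mac like mine, platform.system() reports Darwin, so the check says that a docker build is needed regardless of the architecture.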

Let's try to deploy our function and invoke it again:

sls deploy
sls invoke --function hello --log

Now, we can invoke our function without issues and we see the following output:

START
1.23.4
[[1. 2. 3.]
 [4. 5. 6.]]
END Duration: 3.83 ms (init: 710.20 ms) Memory Used: 78 MB
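If you'd like to double-check what ended up in the deployment package, one option is to look at the compiled extension files inside the zip archive. The sketch below lists them and flags the ones whose file names suggest a macOS or Windows build. It assumes that the archive is located at .serverless/serverless-numpy.zip, and the function names are hypothetical. The name check is only a heuristic: some files (for example the ones ending with .abi3.so) carry no platform tag at all, so they cannot be judged this way.

```python
import zipfile

NATIVE_SUFFIXES = (".so", ".pyd", ".dylib")


def list_native_extensions(zip_path):
    """Return the names of compiled extension files found in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [name for name in archive.namelist() if name.endswith(NATIVE_SUFFIXES)]


def looks_non_linux(name):
    """Heuristic: True if the file name carries a macOS or Windows marker."""
    filename = name.rsplit("/", 1)[-1].lower()
    if filename.endswith((".pyd", ".dylib")):
        return True
    return any(tag in filename for tag in ("darwin", "win_amd64", "win32"))


# Example usage (assumed path, adjust to your project):
#   suspicious = [n for n in list_native_extensions(".serverless/serverless-numpy.zip")
#                 if looks_non_linux(n)]
#   print(suspicious)
```

An empty list is what we would hope to see after building with dockerizePip enabled.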

Summary

In a few simple steps, we were able to create a basic, but reliable starter template for projects that might need to bundle dependencies with C-extensions such as numpy for AWS Lambda. If you'd like to use that as a base for your applications, it's available here. Thanks for reading!