If you need to turn noisy audio into studio-quality audio with no background noise, DeepFilterNet3 is an excellent choice.
Unfortunately, setting up a PyTorch-based process on AWS Lambda is not particularly straightforward. This guide aims to make it so.
If you want to skip the explanation and go straight to the code, you can find it on Github.
DeepFilterNet is an open-source, low-complexity speech enhancement framework that uses deep filtering techniques for real-time noise suppression in full-band audio (48kHz). Developed primarily by Hendrik Schröter and collaborators, the project has evolved through multiple iterations—DeepFilterNet, DeepFilterNet2, and DeepFilterNet3—each refining the model to improve efficiency, accuracy, and real-time processing capabilities, particularly for embedded devices.
Initially launched to provide an efficient solution for speech enhancement without the computational overhead of traditional methods, DeepFilterNet combines the power of deep learning with signal processing techniques like short-time Fourier transform (STFT). The framework is implemented in both Rust and Python, supporting Linux, macOS, and Windows. It also offers integration as a virtual noise suppression microphone via LADSPA and PipeWire, demonstrating its versatility in real-time applications.
Its evolution is marked by continuous improvements, such as multi-frame filtering for hearing aids and perceptually motivated models, reflecting the project’s commitment to enhancing audio clarity in diverse environments and devices.
DeepFilter can run on either GPU or CPU; the CPU path uses the PyTorch CPU build.
In this guide, we won't bother using GPU as we'll be running the process on AWS Lambda. If you want to use GPU, follow the instructions here to use CUDA.
To run DeepFilter, you will need the model configuration and checkpoint files.

Create a new folder called `models` in the root of your project, and add the model configuration and checkpoint files to the `models` folder:
```
models
├── config.ini
└── checkpoints
    └── model_120.ckpt.best
```
You can get the model configuration and checkpoint files here. Alternatively, you can download and unzip the model files from DeepFilterNet.
In the root of your project, create a file called `Dockerfile`.
Start from a Python 3.10 runtime.
```dockerfile
FROM public.ecr.aws/lambda/python:3.10
```
You need Python 3.10 because DeepFilterNet3 requires it; on later versions, pip will fail to find installation candidates for PyTorch 2.0.
Next, install system dependencies using `yum`. If you are using Amazon Linux 2023, you will need to use `dnf` instead of `yum`.
```dockerfile
RUN yum update -y && \
    yum install -y \
    git \
    wget \
    tar \
    xz \
    gcc \
    gcc-c++ \
    make \
    openssl-devel \
    bzip2-devel \
    libffi-devel \
    zlib-devel \
    pkg-config \
    && yum clean all
```
Next, install HDF5 from source. HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data.
```dockerfile
ENV HDF5_VERSION=1.12.2

RUN wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.12/hdf5-${HDF5_VERSION}/src/hdf5-${HDF5_VERSION}.tar.gz && \
    tar -xzf hdf5-${HDF5_VERSION}.tar.gz && \
    cd hdf5-${HDF5_VERSION} && \
    ./configure --prefix=/usr/local/hdf5 && \
    make && \
    make install && \
    cd .. && \
    rm -rf hdf5-${HDF5_VERSION} hdf5-${HDF5_VERSION}.tar.gz
```
Then set the environment variables for HDF5.
```dockerfile
# Set HDF5 environment variables
ENV HDF5_DIR=/usr/local/hdf5 \
    HDF5_LIBDIR=/usr/local/hdf5/lib \
    HDF5_INCLUDEDIR=/usr/local/hdf5/include \
    LD_LIBRARY_PATH=/usr/local/hdf5/lib:$LD_LIBRARY_PATH \
    PATH=/usr/local/hdf5/bin:$PATH
```
You will need to install Rust and Cargo so that the Rust-based components of DeepFilterNet can compile.
You can do so with the following commands:
```dockerfile
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
```
Next, you will need to install a static build of FFmpeg.

Static builds are preferred because they bundle all of the dependencies in a single executable, whereas dynamic builds require you to install the dependencies separately. On Amazon Linux, which sits somewhere between CentOS and Fedora, that can leave you installing a lot of incompatible dependencies just to compile FFmpeg, so a static build is much easier.
Thankfully, John Van Sickle maintains static builds of FFmpeg that you can download here.
Add the following to your Dockerfile:
```dockerfile
RUN wget https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-arm64-static.tar.xz && \
    tar xvf ffmpeg-release-arm64-static.tar.xz && \
    mv ffmpeg-*-arm64-static/ffmpeg /usr/local/bin/ && \
    mv ffmpeg-*-arm64-static/ffprobe /usr/local/bin/ && \
    rm -rf ffmpeg-*-arm64-static*
```
Next, use `pip` to install PyTorch and torchaudio with CPU support.
```dockerfile
RUN pip install torch==2.0.0 torchaudio==2.0.1 -f https://download.pytorch.org/whl/cpu/torch_stable.html
```
Next, install the Python dependencies that don't require compilation.
```dockerfile
RUN pip install numpy pydub boto3 deepfilternet
```
Set additional environment variables and print out key information to ensure the build process is working correctly.
```dockerfile
# Set additional environment variables for the build process
ENV RUSTFLAGS="-L ${HDF5_LIBDIR}" \
    LIBHDF5_LIBDIR=${HDF5_LIBDIR} \
    LIBHDF5_INCLUDEDIR=${HDF5_INCLUDEDIR}

# Debug: Print out key information
RUN echo "HDF5_DIR: $HDF5_DIR" && \
    echo "HDF5_LIBDIR: $HDF5_LIBDIR" && \
    echo "HDF5_INCLUDEDIR: $HDF5_INCLUDEDIR" && \
    echo "PKG_CONFIG_PATH: $PKG_CONFIG_PATH" && \
    ls -l $HDF5_LIBDIR && \
    ls -l $HDF5_INCLUDEDIR
```
Set the workdir and copy the function code, modules, and model files to the task root.
```dockerfile
# Set back to Lambda task root
WORKDIR ${LAMBDA_TASK_ROOT}

# Copy function code
COPY main.py ${LAMBDA_TASK_ROOT}/
COPY modules/ ${LAMBDA_TASK_ROOT}/modules/
COPY models/ /opt/deepfilter_models/
```
```dockerfile
CMD ["main.lambda_handler"]
```
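The Dockerfile above copies in a `main.py` exposing `lambda_handler`; the actual handler lives in the linked repo. As a rough sketch of what it might look like, using DeepFilterNet's Python API, and assuming an event that names an S3 bucket and key (the event shape and the `clean/` output key scheme here are illustrative assumptions, not the repo's exact code):

```python
# Sketch of main.py for the Lambda container. Assumes the deepfilternet
# package is installed in the image and the model files were copied to
# /opt/deepfilter_models, as in the Dockerfile above.
import os

MODEL_BASE_DIR = "/opt/deepfilter_models"

def output_key(input_key: str) -> str:
    # Derive an output object key like "clean/<name>.wav" from the input key
    # (naming scheme is an illustrative assumption).
    name, _ = os.path.splitext(os.path.basename(input_key))
    return f"clean/{name}.wav"

def lambda_handler(event, context):
    # Heavy imports are kept inside the handler so the module can be
    # loaded without the dependencies installed.
    import boto3
    from df.enhance import enhance, init_df, load_audio, save_audio

    s3 = boto3.client("s3")
    bucket, key = event["bucket"], event["key"]

    # Download the noisy file to Lambda's writable /tmp storage.
    local_in = f"/tmp/{os.path.basename(key)}"
    s3.download_file(bucket, key, local_in)

    # Load the model and run enhancement at the model's sample rate.
    model, df_state, _ = init_df(model_base_dir=MODEL_BASE_DIR)
    audio, _ = load_audio(local_in, sr=df_state.sr())
    enhanced = enhance(model, df_state, audio)

    # Write the cleaned audio back to the same bucket.
    local_out = "/tmp/enhanced.wav"
    save_audio(local_out, enhanced, df_state.sr())
    s3.upload_file(local_out, bucket, output_key(key))
    return {"statusCode": 200, "outputKey": output_key(key)}
```

In practice you would also want error handling around the download and enhancement steps, since Lambda retries can otherwise reprocess the same object.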
Build the container with the following command:
```shell
docker build -t deepfilter-lambda .
```
If you are using ECR, authenticate Docker with your registry, tag the image with the repository URI, and push it:

```shell
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com
docker tag deepfilter-lambda:latest <account-id>.dkr.ecr.<region>.amazonaws.com/deepfilter-lambda:latest
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/deepfilter-lambda:latest
```
Lastly, deploy the AWS Lambda container with the Serverless Framework.
```yaml
functions:
  cleanAudio:
    image: {accountID}.dkr.ecr.eu-west-2.amazonaws.com/deepfilter-lambda:latest
    timeout: 600
    memorySize: 2048
    ephemeralStorageSize: 4096
    provisionedConcurrency: 1
```
Provisioned concurrency is a feature that allows you to reserve a certain number of concurrent executions for your Lambda function. This can help you ensure that your function has enough resources to handle the expected load and avoid cold starts.
If your API needs to be extremely responsive, even in the event of cold starts, you should use a provisioned concurrency value greater than 0, though beware that you will be charged per provisioned concurrency unit per hour. If you do not need this level of responsiveness, you can omit the `provisionedConcurrency` parameter.
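For context, the `functions` block above typically sits inside a fuller `serverless.yml`. A minimal sketch might look like this (the service name and region are placeholders, and `arm64` is an assumption that matches the arm64 static FFmpeg build used earlier):

```yaml
# Minimal surrounding serverless.yml sketch; names are placeholders.
service: deepfilter-api

provider:
  name: aws
  region: eu-west-2
  architecture: arm64   # matches the arm64 FFmpeg static build

functions:
  cleanAudio:
    image: {accountID}.dkr.ecr.eu-west-2.amazonaws.com/deepfilter-lambda:latest
    timeout: 600
    memorySize: 2048
    ephemeralStorageSize: 4096
    provisionedConcurrency: 1
```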
You should not attempt to run large or long files through this API: Lambda caps execution time at 15 minutes, memory at 10,240 MB, and ephemeral storage at 10 GB, and CPU-only inference on long audio can easily run into those limits.
If you need to clean long files, use a more powerful compute option, such as a Graviton3- or Graviton4-based EC2 instance, or consider using AWS Batch.
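One way to enforce this limit is to reject overly long inputs before running enhancement. A minimal sketch of such a guard (the 300-second cap is an arbitrary assumption, not a measured limit; tune it to your memory size and observed processing speed):

```python
# Illustrative guard: refuse audio too long to process safely within
# Lambda's limits. MAX_DURATION_SECONDS is an assumed cap, not a
# value from the guide.
MAX_DURATION_SECONDS = 300.0

def check_duration(num_samples: int, sample_rate: int,
                   max_seconds: float = MAX_DURATION_SECONDS) -> float:
    """Return the clip duration in seconds, raising if it exceeds the cap."""
    duration = num_samples / sample_rate
    if duration > max_seconds:
        raise ValueError(
            f"Audio is {duration:.1f}s long; refusing to process more than "
            f"{max_seconds:.0f}s in a single Lambda invocation."
        )
    return duration
```

You would call this on the loaded waveform, e.g. `check_duration(audio.shape[-1], df_state.sr())`, before invoking the enhancement step.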
This guide has shown you how to set up a low-cost, serverless audio cleaning API with DeepFilter and Lambda.
You can get the code for this guide on Github.
If you found it helpful, please leave a star!