NxEigen


High-performance numerical computing for Elixir on embedded systems.

NxEigen is an Nx backend that binds the Eigen C++ library to provide efficient linear algebra and tensor operations. It's specifically optimized for embedded Linux systems like the Arduino Uno Q, bringing BLAS-like performance without external dependencies.

Why NxEigen?

Perfect for running machine learning inference, signal processing, and control algorithms on edge devices.

Features

Dependencies

Required

Installation

Using Local Directories

You can specify a local installation of Eigen:

# Set environment variables before compiling
export EIGEN_DIR=/path/to/eigen

mix deps.get
mix compile

FFT Library Choice

FFT support (Nx.fft/2, Nx.ifft/2) uses a pluggable C interface defined in c_src/nx_eigen_fft.h. The interface exposes two functions:

int nx_eigen_fft_forward(const double *in, double *out, int n);
int nx_eigen_fft_inverse(const double *in, double *out, int n);

Buffers are interleaved complex doubles ([re0, im0, re1, im1, ...], 2×n doubles total). Both transforms are unnormalised; the NIF divides by n for the inverse. Return 0 on success.
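To make the buffer layout concrete, here is a minimal C++ sketch of packing and unpacking the interleaved format and of the 1/n scaling described above. The helper names (pack_interleaved, unpack_interleaved, scale_by_inverse_n) are hypothetical and are not part of NxEigen; they only illustrate the convention.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Pack complex samples into the interleaved layout [re0, im0, re1, im1, ...].
std::vector<double> pack_interleaved(const std::vector<std::complex<double>>& x) {
    std::vector<double> buf;
    buf.reserve(2 * x.size());
    for (const auto& z : x) {
        buf.push_back(z.real());
        buf.push_back(z.imag());
    }
    return buf;
}

// Unpack an interleaved buffer of 2*n doubles back into n complex samples.
std::vector<std::complex<double>> unpack_interleaved(const double* buf, int n) {
    std::vector<std::complex<double>> x;
    if (buf == nullptr || n <= 0) {
        return x;  // nothing to unpack
    }
    x.reserve(static_cast<std::size_t>(n));
    for (int k = 0; k < n; ++k) {
        x.emplace_back(buf[2 * k], buf[2 * k + 1]);
    }
    return x;
}

// Scale an unnormalised inverse transform by 1/n, mirroring the division
// that the text attributes to the NIF (hypothetical helper).
void scale_by_inverse_n(double* buf, int n) {
    for (int k = 0; k < 2 * n; ++k) {
        buf[k] /= static_cast<double>(n);
    }
}
```

A buffer for n complex samples always holds 2×n doubles, so sample k lives at indices 2k (real part) and 2k+1 (imaginary part).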

Default: FFTW3

By default, NxEigen compiles and links the FFTW3 implementation (c_src/nx_eigen_fft_fftw.cpp). Install FFTW3 on your system:

# Debian/Ubuntu
sudo apt-get install libfftw3-dev

# macOS (Homebrew)
brew install fftw

# Fedora/RHEL
sudo dnf install fftw-devel

Configuration

Two environment variables control FFT at build time:

| Variable | Values / meaning |
|----------|------------------|
| NX_EIGEN_FFT_LIB | fftw (default) · none (stubs that return errors) |
| NX_EIGEN_FFT_SO | Absolute path to a custom .so; overrides NX_EIGEN_FFT_LIB |

Examples:

# Disable FFT entirely
export NX_EIGEN_FFT_LIB=none
mix compile

# Use a custom FFT shared library
export NX_EIGEN_FFT_SO=/path/to/libmy_fft.so
mix compile

When using the CMake build path, the same variables are forwarded:

# Disable FFT via CMake
make USE_CMAKE=1 CMAKE_ARGS="-DNX_EIGEN_FFT_LIB=none"

# Custom .so via CMake
make USE_CMAKE=1 CMAKE_ARGS="-DNX_EIGEN_FFT_SO=/path/to/libmy_fft.so"

Building a custom FFT .so

Implement the two functions declared in c_src/nx_eigen_fft.h and compile them into a shared library for your target platform. Minimal example:

// my_fft.c
#include "nx_eigen_fft.h"
#include <my_platform_fft.h>  // your platform's FFT API

int nx_eigen_fft_forward(const double *in, double *out, int n) {
    // ... call your platform FFT ...
    return 0;
}

int nx_eigen_fft_inverse(const double *in, double *out, int n) {
    // ... call your platform IFFT ...
    return 0;
}

# Cross-compile for the target
aarch64-linux-gnu-gcc -shared -fPIC -o libmy_fft.so my_fft.c -lmy_platform_fft

Then build NxEigen against it:

export NX_EIGEN_FFT_SO=/path/to/libmy_fft.so
export CROSSCOMPILE=aarch64-linux-gnu-
mix compile

At runtime the NIF finds the custom .so via $ORIGIN rpath, so either place it next to priv/libnx_eigen.so or ensure it's in a standard library search path on the target.
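Before wiring in a platform FFT, it can help to have a slow reference to compare against. The sketch below implements the two interface functions as a naive O(n²) DFT. It rests on a few assumptions that are not stated in the interface description: any nonzero return value signals failure, the forward transform uses the negative-exponent sign convention (as FFTW's forward transform does), and in and out do not alias. The functions are declared directly with extern "C" rather than by including nx_eigen_fft.h so that the block is self-contained; naive_dft is a hypothetical helper.

```cpp
#include <cmath>

namespace {

// Naive O(n^2) DFT over interleaved complex doubles.
// sign = -1: forward transform, sign = +1: inverse transform. No scaling is applied.
int naive_dft(const double* in, double* out, int n, int sign) {
    if (in == nullptr || out == nullptr || n <= 0) {
        return 1;  // assumption: nonzero means failure
    }
    const double two_pi = 2.0 * std::acos(-1.0);
    for (int k = 0; k < n; ++k) {
        double re = 0.0;
        double im = 0.0;
        for (int t = 0; t < n; ++t) {
            // Reduce k*t modulo n so the angle stays small and accurate.
            const long long kt = (static_cast<long long>(k) * t) % n;
            const double angle = sign * two_pi * static_cast<double>(kt) / n;
            const double c = std::cos(angle);
            const double s = std::sin(angle);
            // (x_re + i*x_im) * (c + i*s)
            re += in[2 * t] * c - in[2 * t + 1] * s;
            im += in[2 * t] * s + in[2 * t + 1] * c;
        }
        out[2 * k] = re;
        out[2 * k + 1] = im;
    }
    return 0;
}

}  // namespace

extern "C" int nx_eigen_fft_forward(const double* in, double* out, int n) {
    return naive_dft(in, out, n, -1);
}

extern "C" int nx_eigen_fft_inverse(const double* in, double* out, int n) {
    return naive_dft(in, out, n, +1);
}
```

Because both transforms are unnormalised, running the inverse on the forward output yields n times the original samples. This is far too slow for real workloads; it is only meant as a correctness baseline for small n.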

Cross-compilation

This project builds a NIF (priv/libnx_eigen.so) via make. For cross-compilation you typically point the build at your toolchain, target OS, Eigen headers, and FFT choice through environment variables.

Example (toolchain-prefix style):

export CROSSCOMPILE=aarch64-linux-gnu-
export TARGET_OS=Linux
export EIGEN_DIR=/path/to/eigen
export NX_EIGEN_FFT_LIB=none  # or: NX_EIGEN_FFT_SO=/path/to/libmy_fft.so

mix deps.get
mix compile

If you already have a CMake toolchain file, you can also build via CMake:

make USE_CMAKE=1 CMAKE_TOOLCHAIN_FILE=/path/to/toolchain.cmake

Fully working dev-build → copy .so to a Debian arm64 target

Goal: build priv/libnx_eigen.so on your dev machine (x86_64/macOS/Linux), then copy it to the target at /home/arduino/nx_eigen/priv/libnx_eigen.so.

Key requirements:

On the target (Debian arm64), install deps:

sudo apt-get update
sudo apt-get install -y erlang-dev

Still on the target, print the exact NIF include dir you need:

erl -noshell -eval 'io:format("~s/erts-~s/include~n", [code:root_dir(), erlang:system_info(version)]), halt().'

On the dev machine, create a sysroot by copying the target's headers/libs (example using rsync over SSH):

export TARGET_HOST=arduino@your-target-hostname-or-ip
export SYSROOT=$PWD/sysroot-debian-arm64

mkdir -p "$SYSROOT"
rsync -a "$TARGET_HOST":/usr/include/ "$SYSROOT/usr/include/"
rsync -a "$TARGET_HOST":/usr/lib/ "$SYSROOT/usr/lib/"
rsync -a "$TARGET_HOST":/lib/ "$SYSROOT/lib/"

Now build the NIF on the dev machine using CMake + sysroot:

export ERL_INCLUDE_DIR="$SYSROOT/usr/lib/erlang/erts-<VERSION>/include"

# To use a custom FFT library instead, replace -DNX_EIGEN_FFT_LIB=none
# with -DNX_EIGEN_FFT_SO=/path/to/libmy_fft.so in CMAKE_ARGS.
# (A comment after a trailing backslash would end the command early.)
make SKIP_DOWNLOADS=1 USE_CMAKE=1 \
  CMAKE_TOOLCHAIN_FILE=cmake/toolchains/aarch64-linux-gnu-sysroot.cmake \
  CMAKE_BUILD_DIR="$PWD/cmake-build-aarch64" \
  CMAKE_BUILD_TYPE=Release \
  CMAKE_ARGS="-DCMAKE_SYSROOT=$SYSROOT -DNX_EIGEN_FFT_LIB=none" \
  ERL_INCLUDE_DIR="$ERL_INCLUDE_DIR"

Finally copy the result to the target:

scp priv/libnx_eigen.so "$TARGET_HOST":/home/arduino/nx_eigen/priv/

Verify on the target:

file /home/arduino/nx_eigen/priv/libnx_eigen.so
ldd  /home/arduino/nx_eigen/priv/libnx_eigen.so

Instead of exporting the build variables in your shell, you can set them in your mix.exs via make_env:

def project do
  [
    # ...
    make_env: %{
      "EIGEN_DIR" => "/path/to/eigen",
      "CROSSCOMPILE" => "aarch64-linux-gnu-",
      "TARGET_OS" => "Linux",
      "NX_EIGEN_FFT_LIB" => "none",  # or "fftw", or omit and set NX_EIGEN_FFT_SO instead
      # "NX_EIGEN_FFT_SO" => "/path/to/libmy_fft.so"  # custom FFT for the target
    }
  ]
end

Installation

From Hex (Recommended)

Add nx_eigen to your list of dependencies in mix.exs:

def deps do
  [
    {:nx, "~> 0.10"},
    {:nx_eigen, "~> 0.1.0"}
  ]
end

Precompiled binaries are automatically downloaded for the platforms listed under Supported Platforms.

No need to install FFTW separately - it's statically linked into the precompiled binaries.

These binaries are produced by GitHub Actions on version tags; see PRECOMPILATION.md for the CI matrix and release steps.

Supported Platforms

| Platform | Architectures | Notes |
|----------|---------------|-------|
| Linux (glibc) | x86_64, aarch64, riscv64 | Ubuntu, Debian, Fedora, etc. |
| Arduino Uno Q | aarch64 | Optimized with -march=armv8-a+crypto+crc |
| macOS | x86_64, aarch64 | Intel and Apple Silicon |

The Arduino Uno Q target is specifically optimized for the Qualcomm QRB2210 processor (ARM Cortex-A53) with cryptographic and CRC extensions enabled for maximum performance.
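If you build from source with the same -march flags, one way to confirm the extensions were actually enabled is to check the predefined macros from the Arm C Language Extensions (ACLE). The sketch below assumes a GCC- or Clang-style compiler; enabled_arm_extensions is a hypothetical helper, and newer toolchains may additionally expose the crypto features as __ARM_FEATURE_AES and __ARM_FEATURE_SHA2.

```cpp
#include <string>

// Report which of the two extensions named above were enabled at compile time.
// On non-ARM hosts, or without the -march flags, this returns "none".
std::string enabled_arm_extensions() {
    std::string out;
#if defined(__ARM_FEATURE_CRC32)
    out += "crc ";
#endif
#if defined(__ARM_FEATURE_CRYPTO)
    out += "crypto ";
#endif
    return out.empty() ? std::string("none") : out;
}
```

This only reflects how the translation unit was compiled; it does not probe the CPU at runtime.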

Forcing Compilation from Source

If you need to compile from source (e.g., for an unsupported platform):

# Install FFTW first
brew install fftw  # macOS
# or
sudo apt-get install libfftw3-dev  # Linux

# Then install the package
mix deps.get
mix compile

Usage

# Create tensors with the NxEigen backend
t = NxEigen.tensor([[1, 2], [3, 4]])

# All Nx operations work automatically
result = Nx.dot(t, t)
#=> #Nx.Tensor<
#=>   s64[2][2]
#=>   NxEigen.Backend
#=>   [
#=>     [7, 10],
#=>     [15, 22]
#=>   ]
#=> >

# Matrix operations use Eigen's optimized routines
a = NxEigen.tensor([[1.0, 2.0], [3.0, 4.0]], type: {:f, 32})
b = Nx.transpose(a)
result = Nx.dot(a, b)

# FFT (requires FFTW3; see FFT Library Choice in README)
fft_result = Nx.fft(NxEigen.tensor([1.0, 0.0, 0.0, 0.0]), length: 4)

Implementation Details

Efficient dot Operation

The dot implementation uses a transpose-reshape-multiply strategy:

  1. Transpose axes to [batch, free, contract] and [batch, contract, free]
  2. Use Eigen's optimized matrix multiplication for each batch
  3. No manual loops - leverages BLAS-like performance
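The steps above can be sketched in plain C++ as follows. This is not NxEigen's actual code: batched_matmul is a hypothetical function, the inputs are assumed to be already transposed into row-major [batch, free, contract] and [batch, contract, free] layouts, and a naive triple loop stands in for Eigen's matrix product so that the block compiles without Eigen.

```cpp
#include <cstddef>
#include <vector>

// a: row-major [batch, m, k]  (batch, free, contract)
// b: row-major [batch, k, n]  (batch, contract, free)
// returns: row-major [batch, m, n]
std::vector<double> batched_matmul(const std::vector<double>& a,
                                   const std::vector<double>& b,
                                   std::size_t batch, std::size_t m,
                                   std::size_t k, std::size_t n) {
    std::vector<double> c(batch * m * n, 0.0);
    if (a.size() != batch * m * k || b.size() != batch * k * n) {
        return {};  // shape mismatch: return an empty result
    }
    for (std::size_t bi = 0; bi < batch; ++bi) {
        const double* A = a.data() + bi * m * k;
        const double* B = b.data() + bi * k * n;
        double* C = c.data() + bi * m * n;
        // Stand-in for one optimized (m x k) * (k x n) product per batch.
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t p = 0; p < k; ++p) {
                for (std::size_t j = 0; j < n; ++j) {
                    C[i * n + j] += A[i * k + p] * B[p * n + j];
                }
            }
        }
    }
    return c;
}
```

With batch = 1 and both inputs set to [[1, 2], [3, 4]], this reproduces the [[7, 10], [15, 22]] result shown in the Usage section. Once the axes are in this order, every batch entry is an ordinary matrix product, which is what lets the real implementation hand each one to Eigen.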

Type System

All Nx types are supported via std::variant with runtime dispatch:
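As a rough illustration of what std::variant with runtime dispatch looks like, the C++17 sketch below stores a tensor's data as one of several typed vectors and uses std::visit to select the right code path at runtime. The TensorData alias, the set of alternatives, and both functions are hypothetical; NxEigen's actual variant and type list may differ.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical storage: exactly one typed buffer is active at a time.
using TensorData = std::variant<std::vector<std::int32_t>,
                                std::vector<std::int64_t>,
                                std::vector<float>,
                                std::vector<double>>;

// std::visit inspects the active alternative at runtime and calls the
// lambda instantiation generated for that element type.
double sum_as_double(const TensorData& data) {
    return std::visit([](const auto& values) {
        double total = 0.0;
        for (const auto v : values) {
            total += static_cast<double>(v);
        }
        return total;
    }, data);
}

// Map the active element type to an Nx-style type name.
std::string nx_type_name(const TensorData& data) {
    return std::visit([](const auto& values) -> std::string {
        using T = typename std::decay_t<decltype(values)>::value_type;
        if constexpr (std::is_same_v<T, std::int32_t>) {
            return "s32";
        } else if constexpr (std::is_same_v<T, std::int64_t>) {
            return "s64";
        } else if constexpr (std::is_same_v<T, float>) {
            return "f32";
        } else {
            return "f64";
        }
    }, data);
}
```

The generic lambda is compiled once per alternative, so each operation is written once while the element type is still chosen at runtime.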

Memory Management

Using with Arduino Uno Q

The Arduino Uno Q features a Linux microprocessor (Qualcomm QRB2210) alongside an STM32 microcontroller. NxEigen runs on the Linux side and provides:

Quick Setup (Required for Optimized Performance)

To get the Arduino Uno Q optimized binary, set these environment variables before installing:

# One-time setup on your Arduino Uno Q
cat >> ~/.bashrc << 'EOF'
export TARGET_ARCH=aarch64
export TARGET_OS=arduino-uno-q-linux
export TARGET_ABI=gnu
EOF

source ~/.bashrc

Then install normally:

cd your_project
mix deps.get  # Downloads the optimized binary automatically

Why is this needed? The Arduino Uno Q reports itself as generic aarch64-linux-gnu to Erlang. These environment variables tell the system to fetch the specifically optimized binary with hardware acceleration flags.

Without these variables: NxEigen will still work, but you'll get the generic ARM64 binary which is ~20-30% slower.

Verification

Check you have the optimized binary:

# Should show: aarch64-arduino-uno-q-linux-gnu (optimized)
ls ~/.cache/elixir_make/nx_eigen-nif-*

License

Copyright (c) 2025

Documentation

Full API documentation is available on HexDocs.

You can also generate documentation locally:

mix docs

Additional Guides