# NxEigen
An Elixir Nx backend that binds the [Eigen C++ library](https://eigen.tuxfamily.org) for efficient numerical computing on embedded systems, specifically targeting the Arduino Uno Q.
## Features
- **Complete Nx.Backend implementation** - All required callbacks implemented
- **Efficient linear algebra** - Uses Eigen's optimized matrix operations
- **FFT support** - Pluggable interface; FFTW3 by default, bring-your-own `.so` for cross-compilation
- **All Nx types** - Support for u8-u64, s8-s64, f32/f64, c64/c128
- **Embedded-friendly** - Bitwise operations, integer math, and efficient memory usage
- **No template metaprogramming nonsense** - Clean, straightforward C++ implementations
## Dependencies
### Required
- **Eigen** (≥3.4.0) - C++ template library for linear algebra
- **FFTW3** - Only needed for FFT support in the default source build; optional otherwise (see [FFT Library Choice](#fft-library-choice) below)
- **Elixir** (≥1.14)
- **Erlang/OTP** (≥25)
### Installation
#### Using Local Directories
You can specify a local installation of Eigen:
```bash
# Set environment variables before compiling
export EIGEN_DIR=/path/to/eigen
mix deps.get
mix compile
```
#### FFT Library Choice
FFT support (`Nx.fft/2`, `Nx.ifft/2`) uses a **pluggable C interface** defined in
[`c_src/nx_eigen_fft.h`](c_src/nx_eigen_fft.h). The interface exposes two functions:
```c
int nx_eigen_fft_forward(const double *in, double *out, int n);
int nx_eigen_fft_inverse(const double *in, double *out, int n);
```
Each buffer holds n complex values as interleaved doubles
(`[re0, im0, re1, im1, ...]`, 2×n doubles in total). Both transforms are
**unnormalised**; the NIF applies the division by n for the inverse.
Each function returns 0 on success.
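To make the buffer layout and the scaling convention concrete, here is a minimal sketch of the two entry points backed by a naive O(n²) DFT, written in C++ with C linkage. It is not the FFTW-backed implementation in `c_src/nx_eigen_fft_fftw.cpp`. The helper `naive_dft`, the forward sign convention (negative exponent, as in FFTW), and the use of 1 as the failure code are assumptions made for illustration. The sketch declares the functions directly rather than including `nx_eigen_fft.h` so that it compiles on its own.

```cpp
// naive_fft.cpp: illustrative O(n^2) DFT behind the two-function interface.
#include <cmath>

namespace {

// Hypothetical shared kernel. sign = -1.0 for forward, +1.0 for inverse.
// `in` and `out` each hold n interleaved complex values:
// [re0, im0, re1, im1, ...]. They must not alias (no in-place transform).
int naive_dft(const double *in, double *out, int n, double sign) {
  if (in == nullptr || out == nullptr || n <= 0) {
    return 1;  // assumed failure code; only "0 on success" is specified
  }
  const double two_pi = 6.283185307179586476925286766559;
  for (int k = 0; k < n; ++k) {
    double re = 0.0;
    double im = 0.0;
    for (int t = 0; t < n; ++t) {
      // Reduce k*t modulo n first so the angle stays small and accurate.
      const long long kt = (static_cast<long long>(k) * t) % n;
      const double angle = sign * two_pi * static_cast<double>(kt) / n;
      const double c = std::cos(angle);
      const double s = std::sin(angle);
      const double xr = in[2 * t];
      const double xi = in[2 * t + 1];
      // (xr + i*xi) * (c + i*s)
      re += xr * c - xi * s;
      im += xr * s + xi * c;
    }
    out[2 * k] = re;
    out[2 * k + 1] = im;
  }
  return 0;
}

}  // namespace

extern "C" int nx_eigen_fft_forward(const double *in, double *out, int n) {
  return naive_dft(in, out, n, -1.0);
}

// Unnormalised on purpose: there is no division by n here.
extern "C" int nx_eigen_fft_inverse(const double *in, double *out, int n) {
  return naive_dft(in, out, n, +1.0);
}
```

With this sketch, a length-4 unit impulse (`[1,0, 0,0, 0,0, 0,0]`) transforms to four ones, and passing that result through the inverse yields 4 at index 0 and 0 elsewhere, which is the value the caller is expected to divide by n.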
##### Default: FFTW3
By default, NxEigen compiles and links the FFTW3 implementation
(`c_src/nx_eigen_fft_fftw.cpp`). Install FFTW3 on your system:
```bash
# Debian/Ubuntu
sudo apt-get install libfftw3-dev
# macOS (Homebrew)
brew install fftw
# Fedora/RHEL
sudo dnf install fftw-devel
```
##### Configuration
Two environment variables control FFT at build time:
| Variable | Values / meaning |
|----------------------|----------------------------------------------------------------------|
| `NX_EIGEN_FFT_LIB` | `fftw` **(default)** · `none` (stubs that return errors) |
| `NX_EIGEN_FFT_SO` | Absolute path to a custom `.so` – **overrides** `NX_EIGEN_FFT_LIB` |
Examples:
```bash
# Disable FFT entirely
export NX_EIGEN_FFT_LIB=none
mix compile
# Use a custom FFT shared library
export NX_EIGEN_FFT_SO=/path/to/libmy_fft.so
mix compile
```
When using the CMake build path, the same variables are forwarded:
```bash
# Disable FFT via CMake
make USE_CMAKE=1 CMAKE_ARGS="-DNX_EIGEN_FFT_LIB=none"
# Custom .so via CMake
make USE_CMAKE=1 CMAKE_ARGS="-DNX_EIGEN_FFT_SO=/path/to/libmy_fft.so"
```
##### Building a custom FFT `.so`
Implement the two functions declared in `c_src/nx_eigen_fft.h` and compile
them into a shared library for your target platform. Minimal example:
```c
// my_fft.c
#include "nx_eigen_fft.h"
#include <my_platform_fft.h> // your platform's FFT API
int nx_eigen_fft_forward(const double *in, double *out, int n) {
// ... call your platform FFT ...
return 0;
}
int nx_eigen_fft_inverse(const double *in, double *out, int n) {
// ... call your platform IFFT ...
return 0;
}
```
```bash
# Cross-compile for the target
aarch64-linux-gnu-gcc -shared -fPIC -o libmy_fft.so my_fft.c -lmy_platform_fft
```
Then build NxEigen against it:
```bash
export NX_EIGEN_FFT_SO=/path/to/libmy_fft.so
export CROSSCOMPILE=aarch64-linux-gnu-
mix compile
```
At runtime the NIF finds the custom `.so` via `$ORIGIN` rpath, so either
place it next to `priv/libnx_eigen.so` or ensure it's in a standard
library search path on the target.
#### Cross-compilation
This project builds a NIF (`priv/libnx_eigen.so`) via `make`. For cross-compilation you typically want to:
- **Set a toolchain**: `CROSSCOMPILE` (prefix) or `CXX` (full path)
- **Set the target OS** (so we don't add macOS-only linker flags): `TARGET_OS=Linux|Darwin`
- **FFT**: disable with `NX_EIGEN_FFT_LIB=none`, or provide a custom `.so` with `NX_EIGEN_FFT_SO=/path/to/lib.so`
- **(If needed)** override `ERL_INCLUDE_DIR` to a matching Erlang/OTP include directory
Example (toolchain-prefix style):
```bash
export CROSSCOMPILE=aarch64-linux-gnu-
export TARGET_OS=Linux
export EIGEN_DIR=/path/to/eigen
export NX_EIGEN_FFT_LIB=none # or: NX_EIGEN_FFT_SO=/path/to/libmy_fft.so
mix deps.get
mix compile
```
If you already have a CMake toolchain file, you can also build via CMake:
```bash
make USE_CMAKE=1 CMAKE_TOOLCHAIN_FILE=/path/to/toolchain.cmake
```
#### End-to-end example: build on a dev machine, copy the `.so` to a Debian arm64 target
Goal: build `priv/libnx_eigen.so` on your dev machine (x86_64/macOS/Linux), then copy it to the target at `/home/arduino/nx_eigen/priv/libnx_eigen.so`.
Key requirements:
- The `.so` must be built for **Linux/aarch64**
- You must compile against the target's **Erlang/OTP NIF headers** (matching the target OTP version)
On the **target** (Debian arm64), install deps:
```bash
sudo apt-get update
sudo apt-get install -y erlang-dev
```
Still on the **target**, print the exact NIF include dir you need:
```bash
erl -noshell -eval 'io:format("~s/erts-~s/include~n", [code:root_dir(), erlang:system_info(version)]), halt().'
```
On the **dev machine**, create a sysroot by copying the target's headers/libs (example using rsync over SSH):
```bash
export TARGET_HOST=arduino@your-target-hostname-or-ip
export SYSROOT=$PWD/sysroot-debian-arm64
mkdir -p "$SYSROOT"
rsync -a "$TARGET_HOST":/usr/include/ "$SYSROOT/usr/include/"
rsync -a "$TARGET_HOST":/usr/lib/ "$SYSROOT/usr/lib/"
rsync -a "$TARGET_HOST":/lib/ "$SYSROOT/lib/"
```
Now build the NIF on the **dev machine** using CMake + sysroot:
```bash
export ERL_INCLUDE_DIR="$SYSROOT/usr/lib/erlang/erts-<VERSION>/include"
# To use a custom FFT library instead, replace -DNX_EIGEN_FFT_LIB=none
# with -DNX_EIGEN_FFT_SO=/path/to/libmy_fft.so
make SKIP_DOWNLOADS=1 USE_CMAKE=1 \
  CMAKE_TOOLCHAIN_FILE=cmake/toolchains/aarch64-linux-gnu-sysroot.cmake \
  CMAKE_BUILD_DIR=$PWD/cmake-build-aarch64 \
  CMAKE_BUILD_TYPE=Release \
  CMAKE_ARGS="-DCMAKE_SYSROOT=$SYSROOT -DNX_EIGEN_FFT_LIB=none" \
  ERL_INCLUDE_DIR="$ERL_INCLUDE_DIR"
```
Finally copy the result to the **target**:
```bash
scp priv/libnx_eigen.so "$TARGET_HOST":/home/arduino/nx_eigen/priv/
```
Verify on the **target**:
```bash
file /home/arduino/nx_eigen/priv/libnx_eigen.so
ldd /home/arduino/nx_eigen/priv/libnx_eigen.so
```
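Instead of exporting the build variables (`EIGEN_DIR`, `CROSSCOMPILE`, `TARGET_OS`, `NX_EIGEN_FFT_LIB`, `NX_EIGEN_FFT_SO`) in your shell, you can set them in your `mix.exs`: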
```elixir
def project do
[
# ...
make_env: %{
"EIGEN_DIR" => "/path/to/eigen",
"CROSSCOMPILE" => "aarch64-linux-gnu-",
"TARGET_OS" => "Linux",
"NX_EIGEN_FFT_LIB" => "none", # or "fftw", or omit and set NX_EIGEN_FFT_SO instead
# "NX_EIGEN_FFT_SO" => "/path/to/libmy_fft.so" # custom FFT for the target
}
]
end
```
## Installation
### From Hex (Recommended)
Add `nx_eigen` to your list of dependencies in `mix.exs`:
```elixir
def deps do
[
{:nx, "~> 0.10"},
{:nx_eigen, "~> 0.1.0"}
]
end
```
**Precompiled binaries are automatically downloaded** for supported platforms:
- Linux: x86_64, aarch64, riscv64 (glibc)
- Arduino Uno Q: aarch64 (**optimized** via `aarch64-arduino-uno-q-linux-gnu`; requires the `TARGET_ARCH`, `TARGET_OS`, and `TARGET_ABI` environment variables)
- macOS: x86_64 (Intel), aarch64 (Apple Silicon)
No need to install FFTW separately - it's statically linked into the precompiled binaries.
These binaries are produced by GitHub Actions on version tags; see [PRECOMPILATION.md](PRECOMPILATION.md) for the CI matrix and release steps.
### Supported Platforms
| Platform | Architectures | Notes |
|----------|--------------|-------|
| Linux (glibc) | x86_64, aarch64, riscv64 | Ubuntu, Debian, Fedora, etc. |
| **Arduino Uno Q** | **aarch64** | **Optimized with `-march=armv8-a+crypto+crc`** |
| macOS | x86_64, aarch64 | Intel and Apple Silicon |
The Arduino Uno Q target is specifically optimized for the Qualcomm QRB2210 processor (ARM Cortex-A53) with cryptographic and CRC extensions enabled for maximum performance.
### Forcing Compilation from Source
If you need to compile from source (e.g., for an unsupported platform):
```bash
# Install FFTW first
brew install fftw # macOS (Homebrew)
# or
sudo apt-get install libfftw3-dev # Debian/Ubuntu
# Then fetch dependencies and compile
mix deps.get
mix compile
```
## Usage
```elixir
# Create tensors with the NxEigen backend
t = NxEigen.tensor([[1, 2], [3, 4]])
# All Nx operations work automatically
result = Nx.dot(t, t)
#=> #Nx.Tensor<
#=> s64[2][2]
#=> NxEigen.Backend
#=> [
#=> [7, 10],
#=> [15, 22]
#=> ]
#=> >
# Matrix operations use Eigen's optimized routines
a = NxEigen.tensor([[1.0, 2.0], [3.0, 4.0]], type: {:f, 32})
b = Nx.transpose(a)
result = Nx.dot(a, b)
# FFT (requires a build with FFT enabled; see "FFT Library Choice")
fft_result = Nx.fft(NxEigen.tensor([1.0, 0.0, 0.0, 0.0]), length: 4)
```
## Implementation Details
### Efficient `dot` Operation
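The `dot` implementation uses a transpose-reshape-multiply strategy:
1. Transpose the operands so their axes are ordered `[batch, free, contract]` and `[batch, contract, free]`
2. Reshape each operand into a stack of 2-D matrices, one per batch
3. Multiply each pair of matrices with Eigen's optimized matrix product

The per-matrix product has no hand-written loops; it is delegated to Eigen, which gives BLAS-like performance.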
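The strategy can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not the code in the NIF: it assumes both operands have already been transposed to `[batch, free, contract]` and `[batch, contract, free]` and are stored as flat row-major arrays, and it uses a naive triple loop as a stand-in for Eigen's matrix product so that the example compiles without Eigen. The function name `batched_matmul` and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// lhs: [batch, m, k] row-major, rhs: [batch, k, n] row-major.
// Returns the [batch, m, n] row-major product, one matrix per batch.
std::vector<double> batched_matmul(const std::vector<double> &lhs,
                                   const std::vector<double> &rhs,
                                   std::size_t batch, std::size_t m,
                                   std::size_t k, std::size_t n) {
  std::vector<double> out(batch * m * n, 0.0);
  for (std::size_t b = 0; b < batch; ++b) {
    // Each batch is a contiguous 2-D slice of the flat buffer.
    const double *a = lhs.data() + b * m * k;
    const double *r = rhs.data() + b * k * n;
    double *c = out.data() + b * m * n;
    // Stand-in for the Eigen product: c (m x n) = a (m x k) * r (k x n).
    for (std::size_t i = 0; i < m; ++i) {
      for (std::size_t p = 0; p < k; ++p) {
        const double a_ip = a[i * k + p];
        for (std::size_t j = 0; j < n; ++j) {
          c[i * n + j] += a_ip * r[p * n + j];
        }
      }
    }
  }
  return out;
}
```

Calling `batched_matmul({1, 2, 3, 4}, {1, 2, 3, 4}, 1, 2, 2, 2)` returns `{7, 10, 15, 22}`, matching the `Nx.dot(t, t)` result shown under Usage.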
### Type System
All Nx types are supported via `std::variant` with runtime dispatch:
- Unsigned integers: u8, u16, u32, u64
- Signed integers: s8, s16, s32, s64
- Floating point: f32, f64
- Complex: c64, c128
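As a rough illustration of `std::variant` with runtime dispatch, the sketch below stores one `std::vector` alternative per element type and uses `std::visit` to select the matching code path at run time. It requires C++17. The alias `Storage`, the function `add`, and the subset of types shown are hypothetical and chosen for brevity; the actual NIF stores Eigen arrays and covers all the types listed above.

```cpp
#include <complex>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical storage: one alternative per element type (subset shown).
using Storage = std::variant<std::vector<std::uint8_t>,
                             std::vector<std::int64_t>,
                             std::vector<float>,
                             std::vector<double>,
                             std::vector<std::complex<float>>>;

// Element-wise addition. std::visit picks the branch for the types actually
// held at run time; mismatched types or lengths are rejected.
Storage add(const Storage &a, const Storage &b) {
  return std::visit(
      [](const auto &x, const auto &y) -> Storage {
        using X = std::decay_t<decltype(x)>;
        using Y = std::decay_t<decltype(y)>;
        if constexpr (std::is_same_v<X, Y>) {
          if (x.size() != y.size()) {
            throw std::invalid_argument("size mismatch");
          }
          X out(x.size());
          for (std::size_t i = 0; i < x.size(); ++i) {
            out[i] = static_cast<typename X::value_type>(x[i] + y[i]);
          }
          return out;
        } else {
          throw std::invalid_argument("type mismatch");
        }
      },
      a, b);
}
```

The operation is written once as a generic lambda and instantiated by the compiler for every alternative, so adding a type to the variant does not require another hand-written overload.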
### Memory Management
- Tensors stored as flat 1D arrays (`Eigen::Array<Scalar, Dynamic, 1>`)
- Shape tracked separately for N-D operations
- Automatic resource cleanup via BEAM
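Keeping the data flat and the shape separate means an N-D index has to be mapped to a position in the 1-D array. The sketch below shows one way to do that, assuming row-major order (the last axis varies fastest). The function names `row_major_strides` and `flat_index` are hypothetical and are not taken from the NIF source.

```cpp
#include <cstddef>
#include <vector>

// Strides in elements for a row-major layout: the last axis has stride 1,
// and each earlier axis strides over everything to its right.
std::vector<std::size_t> row_major_strides(
    const std::vector<std::size_t> &shape) {
  std::vector<std::size_t> strides(shape.size(), 1);
  for (std::size_t i = shape.size(); i-- > 1;) {
    strides[i - 1] = strides[i] * shape[i];
  }
  return strides;
}

// Position of an N-D index inside the flat array (Horner-style evaluation).
// `index` must have one entry per axis, each smaller than the axis size.
std::size_t flat_index(const std::vector<std::size_t> &shape,
                       const std::vector<std::size_t> &index) {
  std::size_t flat = 0;
  for (std::size_t axis = 0; axis < shape.size(); ++axis) {
    flat = flat * shape[axis] + index[axis];
  }
  return flat;
}
```

For a tensor of shape `{2, 3, 4}`, the strides are `{12, 4, 1}` and the element at index `{1, 2, 3}` sits at flat position 23, the last of the 24 elements.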
## Using with Arduino Uno Q
The Arduino Uno Q features a Linux microprocessor (Qualcomm QRB2210) alongside an STM32 microcontroller. NxEigen runs on the **Linux side** and provides:
- **Optimized binaries** with `-march=armv8-a+crypto+crc -mtune=cortex-a53` plus Cortex-A53 erratum fixes
- **Static FFTW linking** - no separate installation needed
- **Efficient numerical computing** for sensor data processing
- **Fast FFT operations** for signal processing (30-50% faster than generic ARM64)
- **Matrix operations** for control algorithms (15-25% faster)
- **Hardware acceleration** via NEON SIMD and crypto extensions
### Quick Setup (Required for Optimized Performance)
To get the Arduino Uno Q optimized binary, **set these environment variables before installing**:
```bash
# One-time setup on your Arduino Uno Q
cat >> ~/.bashrc << 'EOF'
export TARGET_ARCH=aarch64
export TARGET_OS=arduino-uno-q-linux
export TARGET_ABI=gnu
EOF
source ~/.bashrc
```
Then install normally:
```bash
cd your_project
mix deps.get # Downloads the optimized binary automatically
```
**Why is this needed?** The Arduino Uno Q reports itself as generic `aarch64-linux-gnu` to Erlang. These environment variables tell the system to fetch the specifically optimized binary with hardware acceleration flags.
**Without these variables:** NxEigen will still work, but you'll get the generic ARM64 binary which is ~20-30% slower.
### Verification
Check you have the optimized binary:
```bash
# Should show: aarch64-arduino-uno-q-linux-gnu (optimized)
ls ~/.cache/elixir_make/nx_eigen-nif-*
```
### Documentation
- **[ARDUINO_UNO_Q_QUICKSTART.md](ARDUINO_UNO_Q_QUICKSTART.md)** - TL;DR setup guide
- **[ARDUINO_UNO_Q.md](ARDUINO_UNO_Q.md)** - Complete deployment guide with examples
- **[TARGET_DETECTION_ISSUE.md](TARGET_DETECTION_ISSUE.md)** - Technical details on target detection
## License
Copyright (c) 2025
## Documentation
### Quick Links
- **[Arduino Uno Q Setup](ARDUINO_UNO_Q_QUICKSTART.md)** - Arduino Uno Q quick start guide
- **[Precompilation Guide](PRECOMPILATION.md)** - CI matrix and release steps for the precompiled binaries
- **[Testing Precompiled Binaries](TESTING_PRECOMPILED.md)** - How to test the precompiled binaries
- **[Documentation Index](DOCUMENTATION_INDEX.md)** - Complete documentation overview
### API Documentation
Documentation can be generated with [ExDoc](https://github.com/elixir-lang/ex_doc):
```bash
mix docs
```