# Building erllama

erllama is a single OTP application with a single NIF
(`erllama_nif.so`). The first compile builds the vendored
`c_src/llama.cpp/` (~3 minutes on a fast machine), then compiles the
small NIF surface and a CRC table. Subsequent builds reuse the cmake
cache and finish in seconds.

## Toolchain requirements

| | Required | Notes |
|---|---|---|
| Erlang/OTP | **28** | rebar.config declares `{minimum_otp_vsn, "28"}`. |
| rebar3 | **3.25.0+** | 3.24.x also compiles, but CI pins 3.25.0, so that is the tested baseline. |
| C++17 toolchain | clang 14+ or gcc 11+ | Apple clang as shipped on macOS works. |
| cmake | **3.20+** | llama.cpp's own minimum is 3.18; we set 3.20 for the FindErlang module. |
| pthreads | yes | Linked via CMake's `Threads::Threads`. |

Build-time dependencies are platform-specific; the recipes below
match what CI installs.
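
Before running a recipe, it can help to compare the installed tools against the table above. The following is a minimal preflight sketch, not part of erllama: `version_ge` and `check_tool` are hypothetical helper names, and the comparison assumes a `sort` that supports `-V` (GNU coreutils and current BSD/macOS `sort` do).

```bash
#!/usr/bin/env bash
# Preflight sketch: compare installed tool versions with the minimums above.
# version_ge and check_tool are illustrative helpers, not erllama tooling.

# version_ge A B: succeed when dotted version A is >= B (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME FOUND MINIMUM: print a one-line verdict, fail if too old.
check_tool() {
    if version_ge "$2" "$3"; then
        echo "ok: $1 $2 (need >= $3)"
    else
        echo "too old: $1 $2 (need >= $3)"
        return 1
    fi
}

status=0
if command -v cmake >/dev/null 2>&1; then
    # First line of `cmake --version` reads "cmake version X.Y.Z".
    check_tool cmake "$(cmake --version | awk 'NR==1 {print $3}')" 3.20 || status=1
fi
if command -v erl >/dev/null 2>&1; then
    otp=$(erl -noshell -eval 'io:format("~s", [erlang:system_info(otp_release)]), halt().')
    check_tool erlang "$otp" 28 || status=1
fi
[ "$status" -eq 0 ] || echo "one or more tools are below the required version"
```

Tools that are not installed are silently skipped here; a stricter script could treat a missing tool as a failure.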

## Linux (Ubuntu 24.04 amd64 / arm64)

```bash
sudo apt-get install -y build-essential cmake
# CI installs Erlang/OTP 28 with erlef/setup-beam; asdf (below) or a manual install also works.
asdf install erlang 28.0 && asdf local erlang 28.0
asdf install rebar 3.25.0 && asdf local rebar 3.25.0
rebar3 compile
```

OpenMP is intentionally disabled in `c_src/CMakeLists.txt`
(`set(GGML_OPENMP OFF ...)`); the system `libgomp.a` ships without
`-fPIC` on stock Ubuntu, which would break the shared NIF link with
`R_X86_64_TPOFF32 against hidden symbol gomp_tls_data`. Disabling
OpenMP at the ggml level avoids that entirely; the GPU paths
(Metal/CUDA) are unaffected.

CUDA is off by default. Enable with:

```bash
ERLLAMA_OPTS=-DGGML_CUDA=ON rebar3 compile
```

## macOS (Apple Silicon and Intel)

```bash
brew install erlang@28 rebar3 cmake
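# erlang@28 is keg-only: put it first on PATH now and in future zsh sessions.
export PATH="$(brew --prefix erlang@28)/bin:$PATH"
echo 'export PATH="$(brew --prefix erlang@28)/bin:$PATH"' >> ~/.zshrc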
rebar3 compile
```

Metal and Apple BLAS (Accelerate) are auto-detected and on by
default. An incremental compile takes about 30 s once the first ggml
build is cached.

## FreeBSD (14.2 / 14.4)

```sh
# The cached FreeBSD VM image (or a freshly-installed system) ships
# an older libpcre2 than the git package in the latest pkg repo
# expects (PCRE2_10.47 not defined). Refresh first so git can load.
pkg install -y pcre2

# erllama needs OTP 28+; the base `erlang` package is 26.x.
# erlang-runtime28 installs OTP 28 under /usr/local/lib/erlang28.
pkg install -y erlang-runtime28 cmake bash gmake git

export PATH="/usr/local/lib/erlang28/bin:/usr/local/bin:$PATH"

# llama.cpp's build-info cmake script invokes `git rev-parse`. When
# the build directory's owner differs from the user (typical inside
# CI VMs), git refuses with "dubious ownership" — allow the path.
git config --global --add safe.directory "$PWD"

# rebar3 isn't always available as a pkg; fetch it once.
fetch https://github.com/erlang/rebar3/releases/download/3.25.0/rebar3 -o rebar3
chmod +x rebar3

./rebar3 compile
```

## Erlang ERTS detection

The build needs `erl_nif.h` from the Erlang installation. erllama
uses `c_src/CMake/FindErlang.cmake` (adapted from erlang-rocksdb),
which runs `erl -noshell -eval` to read `code:lib_dir/0` /
`code:root_dir/0` and exports `ERLANG_ERTS_INCLUDE_PATH`. If the
caller pre-sets the `ERTS_INCLUDE_DIR` environment variable, that
takes precedence (useful for cross-compilation or pinned headers).
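
The precedence described above can be approximated in shell to see which directory the build is likely to use. This is a sketch of the described behaviour, not the actual logic in `FindErlang.cmake`; `resolve_erts_include` and `has_nif_header` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Sketch only: mirrors the documented precedence, not FindErlang.cmake itself.

# resolve_erts_include: print the directory expected to contain erl_nif.h.
# A pre-set ERTS_INCLUDE_DIR wins; otherwise ask the erl found on PATH.
resolve_erts_include() {
    if [ -n "${ERTS_INCLUDE_DIR:-}" ]; then
        printf '%s\n' "$ERTS_INCLUDE_DIR"
    else
        erl -noshell -eval '
            Dir = filename:join([code:root_dir(),
                                 "erts-" ++ erlang:system_info(version),
                                 "include"]),
            io:format("~s~n", [Dir]), halt().'
    fi
}

# has_nif_header DIR: succeed when DIR/erl_nif.h exists.
has_nif_header() {
    [ -f "$1/erl_nif.h" ]
}
```

Running `has_nif_header "$(resolve_erts_include)" && echo found` before `rebar3 compile` gives a quick signal on whether the header is where the build will look.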

## What the build produces

- `priv/erllama_nif.so` — the single NIF, statically linked against
  the vendored `c_src/llama.cpp` (libllama, libggml, ggml-cpu, plus
  the platform GPU/BLAS backends) and `c_src/crc32c.c`.
- `_build/default/lib/erllama/ebin/*.beam` — Erlang modules.
- `_build/cmake/` — CMake build dir; cached for incremental builds.
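
A quick way to confirm these outputs exist after a build is sketched below. `check_artifacts` is a hypothetical helper, not something erllama ships; it only tests for the paths listed above, relative to a project root passed as its first argument.

```bash
#!/usr/bin/env bash
# Sketch only: check_artifacts is an illustrative helper, not erllama tooling.

# check_artifacts [ROOT]: report any expected build output missing under ROOT.
# Returns 0 when everything is present, 1 otherwise.
check_artifacts() {
    local root=${1:-.}
    local missing=0
    [ -f "$root/priv/erllama_nif.so" ] \
        || { echo "missing: priv/erllama_nif.so"; missing=1; }
    # An unmatched glob stays literal, so ls fails when no .beam files exist.
    ls "$root"/_build/default/lib/erllama/ebin/*.beam >/dev/null 2>&1 \
        || { echo "missing: _build/default/lib/erllama/ebin/*.beam"; missing=1; }
    [ -d "$root/_build/cmake" ] \
        || { echo "missing: _build/cmake/"; missing=1; }
    return "$missing"
}
```

Calling `check_artifacts .` from the project root after `rebar3 compile` prints nothing when all three outputs are in place.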

## Common build issues

- **`'erl_nif.h' file not found`** — `ERTS_INCLUDE_DIR` is wrong.
  `FindErlang.cmake` should resolve it automatically; if it fails,
  set the env var explicitly:
  `ERTS_INCLUDE_DIR=$(erl -noshell -eval 'io:format("~s",[filename:join([code:root_dir(),"erts-"++erlang:system_info(version),"include"])]),halt().') rebar3 compile`.
- **`R_X86_64_TPOFF32 against hidden symbol gomp_tls_data`** — your
  `libgomp.a` is non-PIC. erllama's CMakeLists already sets
  `GGML_OPENMP OFF` to avoid this. If you re-enabled OpenMP, build
  a PIC `libgomp` or leave it off.
- **`PCRE2_10.47 not defined`** when running git on FreeBSD — refresh
  `pcre2` first: `pkg install -y pcre2`. The cached VM image lags
  the latest repo.
- **macOS metal init slow on first model load** — the lazy
  `llama_backend_init` runs on the first `erllama:load_model/1` call
  and discovers Metal devices. eunit cases that load a model need
  a generator timeout >5 s; see
  `test/erllama_nif_tests.erl:load_model_rejects_non_existent_path_test_/0`
  for the pattern.

## Verifying the build

```bash
rebar3 fmt --check
rebar3 compile
rebar3 xref
rebar3 dialyzer
rebar3 lint
rebar3 eunit       # 162 tests, 0 failures
rebar3 ct          # 7 stub-backend cases pass; 6 real-model cases skip
```
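
To run the same sequence unattended and stop at the first failing step, a small wrapper along these lines could be used. It is a sketch: `run_checks` and the `REBAR3` override variable are hypothetical names introduced here, and the step list simply repeats the commands above.

```bash
#!/usr/bin/env bash
# Sketch only: run_checks and REBAR3 are illustrative names.

# REBAR3 may point at a local binary (for example ./rebar3 on FreeBSD).
REBAR3=${REBAR3:-rebar3}

# run_checks: run each verification step in order, stopping at the first failure.
run_checks() {
    local step
    for step in "fmt --check" compile xref dialyzer lint eunit ct; do
        echo "==> $REBAR3 $step"
        # Unquoted on purpose so "fmt --check" splits into two arguments.
        $REBAR3 $step || { echo "failed: $step"; return 1; }
    done
    echo "all checks passed"
}
```

Invoke it as `run_checks` from the project root, or as `REBAR3=./rebar3 run_checks` when rebar3 is not on `PATH`.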

End-to-end against a real GGUF:

```bash
LLAMA_TEST_MODEL=/path/to/tinyllama-1.1b-chat.gguf \
    rebar3 ct --suite=test/erllama_real_model_SUITE
```

Without the env var the suite skips, so default `rebar3 ct` stays
green on machines without a model file.
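
Because an unset variable makes the suite skip, a typo in the path is easy to miss until the run starts. The sketch below checks the file before launching; `is_gguf` and `run_real_model_suite` are hypothetical helpers, and the only format knowledge assumed is that GGUF files begin with the four ASCII bytes `GGUF`.

```bash
#!/usr/bin/env bash
# Sketch only: is_gguf and run_real_model_suite are illustrative helpers.

# is_gguf FILE: succeed when FILE is readable and starts with the GGUF magic.
is_gguf() {
    [ -r "$1" ] && [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# run_real_model_suite: validate LLAMA_TEST_MODEL, then run the real-model suite.
run_real_model_suite() {
    local model=${LLAMA_TEST_MODEL:-}
    if [ -z "$model" ]; then
        echo "LLAMA_TEST_MODEL is unset; the suite would skip"
        return 0
    fi
    if ! is_gguf "$model"; then
        echo "not a readable GGUF file: $model"
        return 1
    fi
    LLAMA_TEST_MODEL="$model" rebar3 ct --suite=test/erllama_real_model_SUITE
}
```

The magic-byte check only catches wrong or truncated paths; it says nothing about whether the model is compatible with the vendored llama.cpp.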

## Bumping the vendored llama.cpp

See [UPDATE_LLAMA.md](../UPDATE_LLAMA.md) at the project root.