# Emily — roadmap
Active and future work. Shipped milestones live in [`PLAN.md`](https://github.com/ausimian/emily/blob/main/PLAN.md)
as the historical record; the current shape of the library is in
[`ARCHITECTURE.md`](ARCHITECTURE.md).
## Goals
* **Correctness over performance at every layer.** Every layer has
its own oracle.
* **No synchronous C++ → BEAM calls and no NIF that blocks on a
BEAM operation.** No GenServer on the hot path.
* **Bumblebee-first.** DistilBERT, Qwen3, ViT, Whisper, plus the
Bumblebee 0.7 family (NomicBERT, SmolLM3, ModernBERT).
* **Shippable at every milestone.** Backend-only mode is useful on
its own; the Defn compiler is additive.
## Non-goals
* Ahead-of-time compilation (`mlx::export` / IREE-style).
Complementary, separate effort.
* Windows or non-Apple-Silicon Linux GPU. CPU-only Linux is a
nice-to-have for CI.
* Distributed training (`mlx::distributed::*`), a native optimizer
library, FSDP / ring allreduce. Autodiff + small-scale training
loops are in scope; large-scale distributed is not.
* Drop-in replacement for EMLX. We borrow where it's clearly right,
but we're not constrained by its API.
* User-level GPU kernel JIT from Elixir (`fast::metal_kernel` /
`fast::cuda_kernel`). Orthogonal to Emily's "Nx backend, not a
framework" stance.
## Deferred to post-1.0
Each item summarises a deferred milestone; the rationale and full
revisit plan stay in `PLAN.md` so readers can find the exact scope
that was deferred and why.
* **Typed exception hierarchy** (`Emily.ShapeError`,
`Emily.DtypeError`, `Emily.MLXError`). Re-evaluate at the 2.x
line. See PLAN M19.
* **GPU interop pointers** (`from_pointer` / `to_pointer` on
`Nx.Backend`, plus a public `include/emily.h` for downstream
NIFs). Revisit when a concrete downstream consumer asks. See
PLAN M20.
* **`mix emily.doctor` extensions for source-build diagnostics.**
The Mix task itself shipped in 0.4.x for the precompiled-NIF
path; the broader source-build probe set (Xcode CLT, CMake
version skew, MLX source-tree state) is deferred until adoption
surfaces a pattern of failures that `elixir_make` errors don't
already explain. See PLAN M21.
## In-roadmap MLX capability gaps
Catalogued from the 2026-04-22 audit against MLX 0.31.1+69. Items
already shipped (`einsum`, SDPA sinks, microscaled quantization
modes) are recorded in PLAN. The remaining open items:

| # | Capability | Status | Trigger to revisit |
| --- | ------------------------------------------------------------------------- | --------------------- | ------------------------------------------------- |
| B3 | Sparse / MoE matmuls: `gather_qmm`, `gather_mm`, `block_masked_mm`, `segmented_mm` | Deferred | First MoE model target (e.g. a Qwen3-MoE variant) |
| B4b | FP8 dtype (`to_fp8` / `from_fp8`) | Blocked on Nx upstream | Nx gains FP8, or M16 surfaces a concrete user story |
| B5 | `ThreadLocalStream` / `set_default_stream` | Investigative | Spike to confirm whether it simplifies the per-worker model |

A defn-callable fallback for `Emily.Fast.einsum/2` (currently
eager-only) is also open if a user surfaces cross-backend
composability needs — see PLAN M27.
## 1.0 release
Tracking checklist:
* API docs and HexDocs reviewed for stale references — see issue #96.
* `CHANGELOG.md` accumulated across releases (it has been, since 0.3.0).
* `MAINTAINING.md` reflects the precompiled-NIF release flow (it
does, since 0.3.0).
* Worked Bumblebee + quantized-Qwen3 examples in `notebooks/`
(present and grouped in the HexDocs Notebooks section).