# Architecture
This document describes how the v0.1 OSS library compiles a TOML manifest into a
running Broadway supervision tree. It's the engineering map of the system; for
the vocabulary it uses (node, port, effect, schema, …) read
[Core concepts](concepts.md) first.
## The compile pipeline at a glance
```
┌─────────────────────────────────────────────────────────────┐
│  .bloccs TOML files (source of truth)                       │
│                                                             │
│  nodes/enrich.bloccs           networks/events.bloccs       │
└────────────┬──────────────────────────────┬─────────────────┘
             │                              │
             ▼                              ▼
    ┌──────────────────┐          ┌────────────────────┐
    │ Bloccs.Parser    │          │ Bloccs.Parser      │
    │ .parse_node/1    │          │ .parse_network/1   │
    └────────┬─────────┘          └─────────┬──────────┘
             │                              │
             │ %Bloccs.Manifest.Node{}      │ %Bloccs.Manifest.Network{}
             │                              │ (nodes loaded recursively)
             ▼                              ▼
┌──────────────────────────────────────────────────────┐
│ Bloccs.Validator                                     │
│   • edge schemas match across the wire               │
│   • DAG-only (cycle detection)                       │
│   • effects ∈ {http,db,time,random}                  │
│   • pure_core / effect_shell MFAs well-formed        │
└────────────┬─────────────────────────────────────────┘
             │ :ok | {:error, [%Issue{}]}
             ▼
┌──────────────────────────────────────────────────────┐
│ Bloccs.Compiler                                      │
│ .compile/2 → source files on disk                    │
│                                                      │
│ _build/<env>/bloccs_generated/<network>/             │
│   nodes/<id>.ex    (Broadway pipeline modules)       │
│   supervisor.ex    (Supervisor + edge table)         │
└────────────┬─────────────────────────────────────────┘
             │ Code.require_file/1
             ▼
┌──────────────────────────────────────────────────────┐
│ <Network>.Supervisor.start_link/1                    │
│  ├── Broadway pipeline per node                      │
│  │   ├── Bloccs.Producer                             │
│  │   │     (registered in Bloccs.Registry)           │
│  │   ├── Broadway.Topology.Processor                 │
│  │   │     (calls Context.new → pure_core →          │
│  │   │      effect_shell → Router.dispatch)          │
│  │   └── Broadway internals                          │
│  └── (Router registered the edge table at init)      │
└──────────────────────────────────────────────────────┘
```
Each box maps to a module under `app/bloccs/lib/bloccs/`. Anything emitted by
the compiler is a real `.ex` file you can read, diff, and step through with
the debugger — the "legible IR" thesis applies to compilation output too.
## Module map
| Module | Responsibility | File |
|---|---|---|
| `Bloccs.Schema` | Versioned schema registry (`Name@N`), `:persistent_term`-backed | `lib/bloccs/schema.ex` |
| `Bloccs.Manifest.{Node,Network,Edge,Port,Effects,Contract,…}` | Typed structs that everything downstream consumes | `lib/bloccs/manifest/{node,network}.ex` |
| `Bloccs.Parser` | TOML → typed struct. Handles node manifests + network manifests, recursive node load, structured error accumulation | `lib/bloccs/parser.ex` |
| `Bloccs.Validator` | Contract enforcement: edge schema-match, DAG-only, effect declaration, MFA well-formedness | `lib/bloccs/validator.ex` |
| `Bloccs.Node` (macro) | `use Bloccs.Node, manifest: "..."`: compile-time parse + validate, `@after_compile` arity check, AST walk warning on undeclared `ctx.effects.X` | `lib/bloccs/node.ex` |
| `Bloccs.Context` + `Bloccs.Effects` | Per-message context with `Capabilities` struct; HTTP/DB/Time/Random adapters with allowlist enforcement; mock backends | `lib/bloccs/context.ex`, `lib/bloccs/effects.ex`, `lib/bloccs/effects/*.ex` |
| `Bloccs.Producer` | Push-based GenStage producer; registered in `Bloccs.Registry` under `{network,node,in_port}` | `lib/bloccs/producer.ex` |
| `Bloccs.Router` | Edge-table dispatch + sink subscriptions for observability/testing | `lib/bloccs/router.ex` |
| `Bloccs.Compiler.{Node,Network}` | Emit Broadway pipeline + Supervisor source files | `lib/bloccs/compiler/{node,network}.ex` |
| `Bloccs.Coverage` | Structural coverage: enumerate port + edge obligations, report reached vs unreached against a trace | `lib/bloccs/coverage.ex` |
| `Bloccs.Trace` | Records a run from telemetry into a `.bloccs-trace`; the coverage reached-set is derived from it | `lib/bloccs/trace.ex` |
| `Mix.Tasks.Bloccs.{New,Validate,Compile,Run,Coverage}` | CLI surface | `lib/mix/tasks/bloccs.*.ex` |
## The unit: a node
A node is one TOML manifest plus one implementation module that points
`use Bloccs.Node` at that manifest. The macro:
1. **Reads** the `.bloccs` file at compile time via `Bloccs.Parser.parse_node/1`.
2. **Validates** via `Bloccs.Validator.validate_node/1`. A bad manifest raises a `CompileError`.
3. **Generates** module attributes (`@bloccs_manifest`, `@bloccs_declared_effects`,
etc.) plus accessor functions (`__bloccs_manifest__/0`, `__bloccs_pure_core_ref__/0`).
4. **Hooks** `@after_compile` to check that the declared `pure_core` and
`effect_shell` functions exist with the right arity, and to AST-walk the
shell looking for `ctx.effects.X` calls where `X` is not in `[effects]`.
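The AST walk in step 4 can be approximated with `Macro.prewalk/3`. The sketch below is not the library's implementation: `EffectWalkSketch` and its functions are hypothetical names, and it assumes the context variable is literally called `ctx`. It shows how `ctx.effects.X` accesses can be collected from quoted code and compared against a declared list.

```elixir
defmodule EffectWalkSketch do
  @moduledoc "Hypothetical helper, not part of Bloccs: finds `ctx.effects.X` accesses in quoted code."

  # Returns the sorted, de-duplicated axes accessed as `ctx.effects.<axis>`.
  def accessed_axes(ast) do
    {_ast, axes} =
      Macro.prewalk(ast, [], fn
        # `ctx.effects.<axis>` is a dot-access on the result of `ctx.effects`.
        {{:., _, [{{:., _, [{:ctx, _, _}, :effects]}, _, []}, axis]}, _, _} = node, acc
        when is_atom(axis) ->
          {node, [axis | acc]}

        node, acc ->
          {node, acc}
      end)

    axes |> Enum.uniq() |> Enum.sort()
  end

  # Axes used by the body but missing from the declared list.
  def undeclared(ast, declared), do: accessed_axes(ast) -- declared
end

# A made-up effect_shell body that touches two axes.
body =
  quote do
    {:ok, resp} = ctx.effects.http.get(url)
    now = ctx.effects.time.now()
    {:emit, :enriched, %{resp: resp, at: now}}
  end

# Only :http is declared, so :time is reported.
IO.inspect(EffectWalkSketch.undeclared(body, [:http]))
```

A purely syntactic match like this misses accesses made through a rebound variable or a helper function, which is one reason to treat a check of this kind as a warning rather than a proof.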
The runtime contract:
- `pure_core(message, ctx) :: {:ok, intermediate} | {:error, reason}`
- `effect_shell(intermediate, ctx) :: {:emit, port, payload} | {:error, reason}`
The pure core may perform pure computation only. The effect shell receives a
`%Bloccs.Context{}` whose `effects` field is a `%Bloccs.Effects.Capabilities{}`
struct: each declared axis (`http`, `db`, `time`, `random`) is a concrete
adapter, and each *undeclared* axis binds to a denied-capability stub whose every
function raises `Bloccs.Effects.Denied`. That's the runtime half of the
capability guarantee.
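To make the two-function contract concrete, here is a minimal standalone sketch. It does not use `Bloccs.Node` or a real `%Bloccs.Context{}`: the context is a plain map, and `ContractSketch`, `FixedClock`, and the `:enriched` port are hypothetical stand-ins chosen for illustration.

```elixir
defmodule ContractSketch do
  # Pure core: data in, data out, no I/O.
  def pure_core(%{"type" => type, "id" => id}, _ctx) when is_binary(type),
    do: {:ok, %{type: type, id: id}}

  def pure_core(_message, _ctx), do: {:error, :malformed}

  # Effect shell: the only half that touches ctx.effects.
  def effect_shell(intermediate, ctx) do
    stamped = Map.put(intermediate, :seen_at, ctx.effects.time.now())
    {:emit, :enriched, stamped}
  end

  # Chains both halves; an {:error, reason} from the core short-circuits.
  def run(message, ctx) do
    with {:ok, intermediate} <- pure_core(message, ctx) do
      effect_shell(intermediate, ctx)
    end
  end
end

defmodule FixedClock do
  # Hypothetical time adapter that returns a constant so the run is deterministic.
  def now, do: ~U[2024-01-01 00:00:00Z]
end

ctx = %{effects: %{time: FixedClock}}

IO.inspect(ContractSketch.run(%{"type" => "signup", "id" => "42"}, ctx))
IO.inspect(ContractSketch.run(%{"oops" => true}, ctx))
```

Keeping the core free of `ctx.effects` means it can be tested with plain data, while the shell is the single place where a mock adapter needs to be swapped in.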
## The composition: a network
A network manifest is a TOML file declaring `[nodes]` (each `use`-ing a node
manifest), `[[edges]]` (a list of wires), `[expose]` (maps the network's public
ports for use as a subgraph, and resolves the entry/exit ports for
`mix bloccs.run`), `[supervision]` (strategy + restart policy), and `[deploy]`
(per-node concurrency overrides).
The parser produces a `%Bloccs.Manifest.Network{}` with every referenced node
manifest loaded recursively. The validator then runs all per-node checks
plus the network-wide ones: edge endpoints exist, edge schemas match
end-to-end, the graph is acyclic, `[expose]` entries reference real ports, and
the supervision strategy is one of `:one_for_one | :one_for_all | :rest_for_one`.
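The acyclicity check can be illustrated with Kahn's algorithm. This is a sketch of the general technique, not `Bloccs.Validator`'s code: `DagCheckSketch` is a hypothetical module and edges are simplified to `{from_node, to_node}` pairs with ports omitted.

```elixir
defmodule DagCheckSketch do
  # edges: list of {from_node, to_node}.
  # Returns :ok, or {:error, {:cycle, nodes}} where nodes are those that sit
  # on a cycle or downstream of one (they can never reach in-degree zero).
  def check(edges) do
    nodes = edges |> Enum.flat_map(fn {a, b} -> [a, b] end) |> Enum.uniq()

    indegree =
      Enum.reduce(edges, Map.new(nodes, &{&1, 0}), fn {_from, to}, acc ->
        Map.update!(acc, to, &(&1 + 1))
      end)

    ready = for {node, 0} <- indegree, do: node
    peel(ready, indegree, edges)
  end

  # Nothing left to peel: any node still holding incoming edges is stuck.
  defp peel([], indegree, _edges) do
    case for({node, degree} <- indegree, degree > 0, do: node) do
      [] -> :ok
      stuck -> {:error, {:cycle, Enum.sort(stuck)}}
    end
  end

  # Remove one ready node and release the nodes it was blocking.
  defp peel([node | rest], indegree, edges) do
    {indegree, newly_ready} =
      edges
      |> Enum.filter(fn {from, _to} -> from == node end)
      |> Enum.reduce({indegree, []}, fn {_from, to}, {deg, ready} ->
        deg = Map.update!(deg, to, &(&1 - 1))
        if deg[to] == 0, do: {deg, [to | ready]}, else: {deg, ready}
      end)

    peel(rest ++ newly_ready, indegree, edges)
  end
end

IO.inspect(DagCheckSketch.check([{:ingest, :enrich}, {:enrich, :persist}, {:enrich, :notify}]))
IO.inspect(DagCheckSketch.check([{:a, :b}, {:b, :a}]))
```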
## Compilation strategy
`Bloccs.Compiler.compile/2` writes `.ex` source files into
`_build/<env>/bloccs_generated/<network_id>/`:
- One Broadway pipeline module per node (`nodes/<local_id>.ex`)
- One Supervisor module (`supervisor.ex`)
The generated supervisor's `init/1` registers the edge table with
`Bloccs.Router` before starting children, so dispatch is always wired before
any pipeline can emit.
Each generated pipeline module:
```elixir
defmodule Bloccs.Compiled.<Network>.<Node> do
  use Broadway

  def start_link(opts) do
    Broadway.start_link(__MODULE__,
      name: <pipeline_name>,
      producer: [
        module: {Bloccs.Producer, [name: <producer_name>]},
        concurrency: 1
      ],
      processors: [default: [concurrency: <from manifest [deploy]>]]
    )
  end

  def handle_message(_, msg, _) do
    manifest = <ImplModule>.__bloccs_manifest__()

    # Fresh context per message, with capabilities bound from the manifest.
    ctx =
      Bloccs.Context.new(
        effects: Bloccs.Effects.bind(manifest),
        received_at: DateTime.utc_now()
      )

    # transform/2 and execute/2 stand in here for whatever pure_core and
    # effect_shell MFAs the node manifest declares.
    with {:ok, intermediate} <- <ImplModule>.transform(msg.data, ctx),
         {:emit, port, payload} <- <ImplModule>.execute(intermediate, ctx) do
      Bloccs.Router.dispatch(<network>, <local_id>, port, payload)
      msg
    else
      {:error, reason} -> Broadway.Message.failed(msg, reason)
    end
  end
end
```
Why source files instead of `Module.create/3` in memory? Three reasons:
1. **Debuggable stack traces.** Errors point at real file paths.
2. **PR-reviewable.** A topology change shows up as a diff in
`_build/<env>/bloccs_generated/`, not as opaque bytecode.
3. **Agent-readable.** The legibility thesis applies to the compiled artifact
as much as to the manifest.
## Routing model
```
upstream node                            Router                     downstream node
──────────────────                       ─────                      ─────────────────
effect_shell ──{:emit, port, payload}──► dispatch/4
                                         │
                                         ┌─── lookup({network, node, port}) → targets
                                         │
                                         ├── for each {to_node, to_port}:
                                         │     producer = Bloccs.Registry.lookup(...)
                                         │     GenStage.cast(producer, {:push, payload})
                                         │
                                         └── for any sink subscriber on the SOURCE port:
                                               send(pid, {:bloccs_sink, network, node, port, payload})
```
The sink subscription model exists for two reasons:
- **Tests.** The integration test in `test/integration/events_test.exs`
registers itself as a sink listener on the exposed output ports and asserts
on the messages it sees.
- **Trace recording.** `Bloccs.Trace` listens on telemetry spans to record a
run into a `.bloccs-trace`, which `mix bloccs.coverage` reads back.
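The dispatch flow in the diagram can be sketched without GenStage or a registry. Everything here is an assumption made for illustration: `RouterSketch`, its function signatures, the plain maps used for producers and sinks, and the `:known` / `:in` port names. A plain `send/2` stands in for the `GenStage.cast/2` call the diagram shows.

```elixir
defmodule RouterSketch do
  # Builds %{{network, from_node, from_port} => [{to_node, to_port}]} from
  # {from_node, from_port, to_node, to_port} edge tuples.
  def edge_table(network, edges) do
    Enum.group_by(
      edges,
      fn {from_node, from_port, _to_node, _to_port} -> {network, from_node, from_port} end,
      fn {_from_node, _from_port, to_node, to_port} -> {to_node, to_port} end
    )
  end

  # producers: %{{network, node, in_port} => pid}
  # sinks:     %{{network, node, port} => [pid]}, keyed by the SOURCE port.
  def dispatch(table, producers, sinks, network, node, port, payload) do
    key = {network, node, port}

    # Fan out to every downstream producer wired to this source port.
    Enum.each(Map.get(table, key, []), fn {to_node, to_port} ->
      send(Map.fetch!(producers, {network, to_node, to_port}), {:push, payload})
    end)

    # Notify sink subscribers listening on the source port.
    Enum.each(Map.get(sinks, key, []), fn pid ->
      send(pid, {:bloccs_sink, network, node, port, payload})
    end)

    :ok
  end
end

table =
  RouterSketch.edge_table(:events, [
    {:enrich, :known, :persist, :in},
    {:enrich, :known, :notify, :in}
  ])

# The calling process plays both downstream producers and the sink subscriber.
producers = %{{:events, :persist, :in} => self(), {:events, :notify, :in} => self()}
sinks = %{{:events, :enrich, :known} => [self()]}

:ok = RouterSketch.dispatch(table, producers, sinks, :events, :enrich, :known, %{id: 1})
```

After the call, the process mailbox holds two `{:push, %{id: 1}}` messages (one per edge) and one `{:bloccs_sink, :events, :enrich, :known, %{id: 1}}` message.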
## Producer / Broadway interaction quirk
Broadway names producer processes itself
(`<pipeline_name>.Broadway.Producer_0`). To give the router a stable address
we ALSO register the canonical name in `Bloccs.Registry` when
`Bloccs.Producer` starts. This sidesteps the limitation that a BEAM process
can only carry one `Process.register`-style atom name. See
`lib/bloccs/producer.ex` for the registration logic.
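The double-naming idea can be shown with Elixir's standard `Registry`. This is a sketch under assumptions, not the code in `lib/bloccs/producer.ex`: `ProducerNameSketch`, `SketchRegistry`, and the atom name are hypothetical, and a plain `GenServer` stands in for the GenStage producer.

```elixir
defmodule ProducerNameSketch do
  use GenServer

  # `name:` plays the atom that Broadway would assign;
  # `key:` is the canonical {network, node, in_port} tuple.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, Keyword.fetch!(opts, :key),
      name: Keyword.fetch!(opts, :name)
    )
  end

  @impl true
  def init(key) do
    # Second, independent address: a Registry entry owned by this process.
    {:ok, _owner} = Registry.register(SketchRegistry, key, nil)
    {:ok, key}
  end
end

{:ok, _registry} = Registry.start_link(keys: :unique, name: SketchRegistry)

key = {:events, :persist, :in}
{:ok, pid} = ProducerNameSketch.start_link(name: :"Persist.Broadway.Producer_0", key: key)

# Both addresses resolve to the same process.
[{found, nil}] = Registry.lookup(SketchRegistry, key)
IO.inspect(found == pid and Process.whereis(:"Persist.Broadway.Producer_0") == pid)
```

Because the Registry entry is owned by the registering process, it disappears when that process exits, so a restarted producer can register the same key again.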
## Effect capability model
Two layers of guarantee, by design:
1. **Compile-time** (`Bloccs.Node` `@after_compile`): AST-walk the
`effect_shell` function, find every `ctx.effects.X.*` access, and warn if `X`
is not declared in `[effects]`. Today this is an `IO.warn/2` warning; promoting it
to a hard `CompileError` is a v0.2 goal once we trust the AST inference.
2. **Runtime** (`Bloccs.Effects.bind/1`): builds a `Capabilities` struct
where every declared axis is a real adapter and every undeclared axis is a
denied-capability stub whose every function raises `Bloccs.Effects.Denied`.
Calls to declared adapters still enforce the per-call allowlist (HTTP
host, DB scope).
The runtime layer is the load-bearing guarantee. The compile-time layer is
ergonomics — a fast feedback loop telling you "you forgot to declare that
effect."
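The runtime layer can be sketched as a binding step that picks, per axis, either an adapter module or a stub that always raises. This is an illustration under assumptions: `CapsSketch` and its nested modules are hypothetical, only the `http` axis is modeled, the capabilities are a plain map rather than the `Capabilities` struct, and the adapter never performs a real request.

```elixir
defmodule CapsSketch do
  defmodule Denied do
    # Stand-in for Bloccs.Effects.Denied.
    defexception [:axis]

    @impl true
    def message(%{axis: axis}), do: "effect #{inspect(axis)} was not declared"
  end

  defmodule DeniedHttp do
    # Stub bound to an undeclared axis: every call raises.
    def get(_url), do: raise(Denied, axis: :http)
  end

  defmodule AllowlistedHttp do
    # Hypothetical allowlist; a real adapter would read it from the manifest.
    @allowed_hosts ["api.example.com"]

    def get(url) do
      host = URI.parse(url).host

      if host in @allowed_hosts,
        do: {:ok, {:would_fetch, url}},
        else: {:error, {:host_not_allowed, host}}
    end
  end

  # Declared axes bind to an adapter; everything else binds to the raising stub.
  def bind(declared) do
    %{http: if(:http in declared, do: AllowlistedHttp, else: DeniedHttp)}
  end
end

with_http = CapsSketch.bind([:http])
without_http = CapsSketch.bind([])

IO.inspect(with_http.http.get("https://api.example.com/users/1"))
IO.inspect(with_http.http.get("https://evil.example.net/"))

denied =
  try do
    without_http.http.get("https://api.example.com/users/1")
  rescue
    e in CapsSketch.Denied -> {:denied, e.axis}
  end

IO.inspect(denied)
```

The call site is identical in both cases, which is the point of the design: code in the shell cannot tell a denied axis from a real one until it tries to use it, so an undeclared effect fails loudly instead of silently succeeding.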
## How to test the whole thing
```bash
cd app/bloccs
mix check # format + warnings + tests + dialyzer
mix test test/integration/events_test.exs # the headline end-to-end assertion
```
The integration test parses `examples/events/networks/events.bloccs`,
validates it, compiles it to source files, starts the generated supervisor,
registers itself as a sink listener on the three exposed output ports
(`persist.stored`, `notify.notified`, `deadletter.recorded`), pushes a webhook
event into `ingest.received`, and asserts the right messages arrive — a known
event fans out to `persist` + `notify`, an unknown type lands in `deadletter`,
and a replayed id is deduped by the enrich node's idempotency.
## Shipped since the initial v0.1 cut
- **Flow primitives** — filter + multi-emit (`:drop` / `{:emit, [...]}`), merge
(fan-in), `[batch]` windows, `[join]` (multi-input correlation by key), and
`[rate]` / `[delay]`.
- **Subgraph composition** — a `[nodes]` entry may `use` a network manifest; the
parser flattens it into namespaced leaf nodes.
- **`.bloccs-trace` recording + real coverage** — `Bloccs.Trace` records a run
from telemetry; `mix bloccs.coverage` reports real structural coverage.
## Deliberately not in v0.1
- **Phoenix LiveView canvas** (v0.3+) — the manifest is canonical; the
canvas is a view.
- **MCP server** (v0.2) — for agent authoring of `.bloccs` files.
- **Pro / encrypted package distribution** (post-v0.1) — modeled on Oban Pro;
separate private repo.
- **Polyglot `pure_core`** (HTTP / WASM sidecar refs) — opens up non-BEAM
cores for the optimization-camp interop story.
- **Cyclic networks** — v0.1 is DAG-only; cycles unblock the
self-reflection/iterative-refinement patterns the ACG literature
catalogues.
- **Full Dialyzer-level effect proof** — v0.1 ships runtime guarantee +
compile-time warning; hard `CompileError` is a v0.2+ goal once type-row
inference matures.