# Hooks and the turn lifecycle
## The turn as a pipeline
A turn runs as a pipeline; the mental model to steal is Plug. Each hook receives the
accumulated state and returns continue/halt, and the runtime threads them in order.
Two hook surfaces run at different points, plus one optional surface on the token
stream.
## The hook context object — `%Agentix.Turn{}`
Hooks (and `:server` tools) receive and return a thin struct, not a bare ReqLLM
`Context`:
```
%Agentix.Turn{
context: %ReqLLM.Context{}, # the conversation so far
user_message: %ReqLLM.Message{}, # the message that opened this turn
turn_ref: term, # correlates live events + audit
scope: %Agentix.Scope{} # Phoenix 1.8 Scope-style: current_user, etc.
}
```
`scope` is where ambient state lives, so `:server` tools get their context argument
from the same place hooks do — one mechanism, not two.
## Pre-message hooks (injection)
Run after a user message arrives, before the model is called. Each returns
`{:cont, turn}` or `{:halt, reason}`. This is the "inject context based on the user
message" requirement: a hook does retrieval, pulls memory, or appends `ContentPart`s
for this turn only.
### Resolved: concurrency — ordered list with an opt-in parallel group
Default to an ordered pipeline (predictable). A hook may be declared in a parallel
batch when it is independent, so retrieval calls that each do I/O don't serialize
their latency before the first token. **No dependency DAG** — that's overengineering
for v0; ordered-with-parallel-groups covers nearly everything.
**Merge rule for parallel batches:** concurrent hooks can't thread the turn, so
they are **append-only** — each returns ContentParts to add, merged in declaration
order; they may not otherwise mutate the turn. Mutation stays exclusive to ordered
hooks. Without this rule, "what does a parallel group return" becomes an
implementation-time argument.
### Resolved: tool-availability mutation — deferred to v0.1
Letting a hook change the tool list per turn is powerful but the tool list is part of
the cacheable prefix, so per-turn mutation fights prompt caching and complicates the
type model. v0: tools are fixed per agent config.
## Post-message hooks
Run after the assistant turn resolves: persistence, triggering async summarization,
memory write-back, guardrail checks. A guardrail that rejects an output and requests
a regen is a transition back into a model call — design it as an explicit loop with a
bounded retry count, not unbounded recursion.
## Stream-transformer hooks (optional, hot path)
A hook on the token stream itself: redaction, PII scrubbing, citation parsing. The
hardest surface (it runs in the hot path), but design the seam now even if
unimplemented in v0 — retrofitting it later means re-plumbing the streaming path.
## Durable vs transient output
The key property a hook declares: is its output **durable** (joins the canonical log,
future turns depend on it) or **transient** (per-turn scaffolding, re-derivable)?
- A retrieval hook is transient — augmentation for this turn only, not in the message
history. At most it lands in the optional audit record (see `04`).
- A hook writing a running summary, or appending a tool result later turns read, is
durable history and joins the log.
Same pipeline, two destinations, decided by a flag on the hook rather than by where in
the code it runs. Keeps "where does this get stored" next to the thing that knows.
## Relationship to compaction: none
Injection (pre-hooks) and reduction (compaction, see `05`) are **independent
subsystems** — no shared mechanism, trigger, or data flow. A pre-hook adds per-turn
augmentation; compaction evicts from the window. The library must not couple them.
The **only** shared thing is the token budget, computed once over the final rendered
context after both have run. If someone writes a retrieval hook that happens to read
compaction's summary output, that is application-level composition they assemble —
not something Agentix wires together.
Two rules at the seam (full layout rationale in `05`):
- **Overflow** — reduction targets `budget − injection_reserve`. A hook whose
injected content blows the reserve is a loud per-hook error, never a silent
truncation: compaction has already run and is not re-entered after hooks.
- **Placement** — injected per-turn content goes at the **tail**, adjacent to the
user message. Never before the history: that invalidates the provider cache
prefix every turn and undoes prefix-ward compaction (see `05`).
## Resolved decisions recap
- Concurrency: ordered pipeline + opt-in parallel groups, no DAG. Parallel
batches are append-only (ContentParts merged in declaration order).
- Tool-availability mutation: deferred to v0.1; tools fixed per agent in v0.
- Context object: `%Agentix.Turn{}` carrying context, user_message, turn_ref, scope.