# Observability & Custom Telemetry
This guide shows how to build custom observability middleware that emits telemetry events, OpenTelemetry spans, or any other instrumentation from your agent's LLM and tool execution lifecycle.
## Why Middleware?
Sagents provides a `callbacks/1` middleware callback that lets you hook into every LLM event — token usage, tool execution, message processing, errors, and more. Rather than building a one-size-fits-all telemetry integration, Sagents leaves this to you as a custom middleware because:
- **Your telemetry metadata is unique** — customer IDs, tenant context, billing tiers, feature flags
- **Your instrumentation stack is unique** — `:telemetry`, OpenTelemetry, StatsD, Prometheus, Datadog
- **Your event shapes are unique** — what you measure and how you label it depends on your domain
The `callbacks/1` callback gives you direct access to LLM lifecycle events, and your middleware's `init/1` config is the natural place to carry your application-specific context.
## Quick Start
Here's a minimal observability middleware that emits `:telemetry` events for token usage:
```elixir
defmodule MyApp.Middleware.Observability do
@behaviour Sagents.Middleware
@impl true
def init(opts) do
{:ok, %{service_name: Keyword.get(opts, :service_name, "agent")}}
end
@impl true
def callbacks(config) do
%{
on_llm_token_usage: fn chain, usage ->
%{model: model_name} = chain.llm
:telemetry.execute(
[:myapp, :llm, :token_usage],
%{input: usage.input, output: usage.output},
%{service: config.service_name, model: model_name}
)
end
}
end
end
```
Add it to your agent:
```elixir
{:ok, agent} = Sagents.Agent.new(%{
model: model,
middleware: [{MyApp.Middleware.Observability, service_name: "support-agent"}]
})
```
That's it. Every LLM call made by this agent will now emit a `[:myapp, :llm, :token_usage]` telemetry event with your service name attached.
## Passing Application-Specific Context
The real power of middleware-based observability is that `init/1` receives your application context and `callbacks/1` closes over it. This means every event you emit can carry metadata that is specific to your system — customer IDs, user IDs, billing tiers, or anything else.
```elixir
defmodule MyApp.Middleware.Observability do
@behaviour Sagents.Middleware
@impl true
def init(opts) do
{:ok, %{
service_name: Keyword.get(opts, :service_name, "agent"),
customer_id: Keyword.fetch!(opts, :customer_id),
user_id: Keyword.fetch!(opts, :user_id)
}}
end
@impl true
def callbacks(config) do
%{
on_llm_token_usage: fn chain, usage ->
%{model: model_name} = chain.llm
:telemetry.execute(
[:myapp, :llm, :token_usage],
%{input: usage.input, output: usage.output},
%{
service: config.service_name,
model: model_name,
customer_id: config.customer_id,
user_id: config.user_id
}
)
end,
on_tool_execution_started: fn _chain, tool_call, _function ->
:telemetry.execute(
[:myapp, :tool, :started],
%{system_time: System.system_time()},
%{
tool: tool_call.name,
customer_id: config.customer_id,
user_id: config.user_id
}
)
end,
on_tool_execution_completed: fn _chain, tool_call, _tool_result ->
:telemetry.execute(
[:myapp, :tool, :completed],
%{system_time: System.system_time()},
%{
tool: tool_call.name,
customer_id: config.customer_id,
user_id: config.user_id
}
)
end,
on_tool_execution_failed: fn _chain, tool_call, error ->
:telemetry.execute(
[:myapp, :tool, :failed],
%{system_time: System.system_time()},
%{
tool: tool_call.name,
error: inspect(error),
customer_id: config.customer_id,
user_id: config.user_id
}
)
end
}
end
end
```
When creating the agent, pass the context from your application:
```elixir
# In your Coordinator, Factory, or wherever agents are created
{:ok, agent} = Sagents.Agent.new(%{
model: model,
middleware: [
{MyApp.Middleware.Observability,
service_name: "support-agent",
customer_id: customer.id,
user_id: current_user.id},
# ... other middleware
]
})
```
Now every telemetry event carries the customer and user context, so you can slice metrics by customer, track per-user usage for billing, or correlate tool failures with specific accounts.
## Available Callback Keys
The `callbacks/1` function returns a map of callback keys to handler functions. Here is the complete list of available keys with their signatures:
### Model-Level Callbacks
| Key | Signature | When it fires |
|-----|-----------|---------------|
| `:on_llm_new_delta` | `fn chain, [delta] -> any()` | Each streaming token/delta received |
| `:on_llm_new_message` | `fn chain, message -> any()` | Complete message from LLM (non-streaming) |
| `:on_llm_ratelimit_info` | `fn chain, info_map -> any()` | Rate limit headers from provider |
| `:on_llm_token_usage` | `fn chain, token_usage -> any()` | Token usage for the request |
| `:on_llm_response_headers` | `fn chain, headers_map -> any()` | Raw HTTP response headers |
### Chain-Level Callbacks
| Key | Signature | When it fires |
|-----|-----------|---------------|
| `:on_message_processed` | `fn chain, message -> any()` | Message fully processed (streaming or not) |
| `:on_message_processing_error` | `fn chain, message -> any()` | Error while processing a message |
| `:on_error_message_created` | `fn chain, message -> any()` | Automated error response message created |
| `:on_tool_call_identified` | `fn chain, tool_call, function -> any()` | Tool call detected during streaming |
| `:on_tool_execution_started` | `fn chain, tool_call, function -> any()` | Tool begins executing |
| `:on_tool_execution_completed` | `fn chain, tool_call, tool_result -> any()` | Tool finished successfully |
| `:on_tool_execution_failed` | `fn chain, tool_call, error -> any()` | Tool execution errored |
| `:on_tool_response_created` | `fn chain, message -> any()` | Tool response message created |
| `:on_retries_exceeded` | `fn chain -> any()` | Max retries exhausted |
All handler return values are discarded. Callbacks are for observation only — they cannot modify the chain or its state.
> **Tip: Extracting the model name.** The `chain` argument is an `LLMChain` struct. You can extract the model name with `%{model: model_name} = chain.llm`. This works regardless of which chat model provider is being used (Anthropic, OpenAI, etc.), since they all have a `:model` field. This is especially useful in token usage callbacks for cost tracking across different models.
## Fan-Out Behavior
When multiple middleware declare callbacks, all handlers fire for each event. If two middleware both define `on_llm_token_usage`, both handlers execute. This means you can have separate middleware for metrics, logging, and tracing without them interfering with each other.
```elixir
{:ok, agent} = Sagents.Agent.new(%{
model: model,
middleware: [
{MyApp.Middleware.Metrics, customer_id: customer.id},
{MyApp.Middleware.AuditLog, user_id: user.id},
# ... other middleware
]
})
```
Both `Metrics` and `AuditLog` can declare `callbacks/1` and both will fire.
## Sub-Agent Propagation
By default, an agent's middleware stack is passed down to any sub-agents it spawns. This means your observability middleware automatically covers the entire agent tree — the parent agent and all of its sub-agents — without any extra configuration.
If your parent agent is configured with:
```elixir
{:ok, agent} = Sagents.Agent.new(%{
model: model,
middleware: [
{MyApp.Middleware.Observability,
service_name: "support-agent",
customer_id: customer.id,
user_id: current_user.id},
Sagents.Middleware.SubAgent,
# ... other middleware
]
})
```
When this agent spawns sub-agents, those sub-agents inherit the same middleware stack including your observability middleware. Token usage, tool execution, and errors from sub-agents all emit the same telemetry events with the same customer and user context as the parent. You get full visibility across the entire agent interaction without any additional wiring.
## Full Example: Comprehensive Observability
Here's a more complete example that covers the most useful events for production observability:
```elixir
defmodule MyApp.Middleware.Observability do
@behaviour Sagents.Middleware
require Logger
@impl true
def init(opts) do
{:ok, %{
service_name: Keyword.get(opts, :service_name, "agent"),
customer_id: Keyword.get(opts, :customer_id),
user_id: Keyword.get(opts, :user_id)
}}
end
@impl true
def callbacks(config) do
metadata = %{
service: config.service_name,
customer_id: config.customer_id,
user_id: config.user_id
}
%{
# Track token usage for cost monitoring and billing
on_llm_token_usage: fn chain, usage ->
%{model: model_name} = chain.llm
:telemetry.execute(
[:myapp, :llm, :token_usage],
%{input: usage.input, output: usage.output},
Map.put(metadata, :model, model_name)
)
end,
# Track rate limits to detect throttling
on_llm_ratelimit_info: fn _chain, info ->
:telemetry.execute(
[:myapp, :llm, :ratelimit],
info,
metadata
)
end,
# Track tool execution lifecycle
on_tool_execution_started: fn _chain, tool_call, _function ->
:telemetry.execute(
[:myapp, :tool, :started],
%{system_time: System.system_time()},
Map.put(metadata, :tool, tool_call.name)
)
end,
on_tool_execution_completed: fn _chain, tool_call, _tool_result ->
:telemetry.execute(
[:myapp, :tool, :completed],
%{system_time: System.system_time()},
Map.put(metadata, :tool, tool_call.name)
)
end,
on_tool_execution_failed: fn _chain, tool_call, error ->
Logger.warning("Tool #{tool_call.name} failed: #{inspect(error)}",
customer_id: config.customer_id
)
:telemetry.execute(
[:myapp, :tool, :failed],
%{system_time: System.system_time()},
Map.merge(metadata, %{tool: tool_call.name, error: inspect(error)})
)
end,
# Track when retries are exhausted (potential reliability issue)
on_retries_exceeded: fn _chain ->
:telemetry.execute(
[:myapp, :llm, :retries_exceeded],
%{count: 1},
metadata
)
end
}
end
end
```
### Attaching Telemetry Handlers
Wire up the telemetry events in your application startup:
```elixir
# In your Application.start/2 or a dedicated Telemetry module
:telemetry.attach_many(
"myapp-agent-metrics",
[
[:myapp, :llm, :token_usage],
[:myapp, :tool, :started],
[:myapp, :tool, :completed],
[:myapp, :tool, :failed],
[:myapp, :llm, :retries_exceeded]
],
&MyApp.Telemetry.handle_event/4,
nil
)
```
## Testing Your Observability Middleware
Test that your callbacks fire and emit the expected telemetry events:
```elixir
defmodule MyApp.Middleware.ObservabilityTest do
use ExUnit.Case
alias MyApp.Middleware.Observability
test "callbacks/1 returns expected callback keys" do
{:ok, config} = Observability.init(
service_name: "test",
customer_id: "cust-1",
user_id: "user-1"
)
callbacks = Observability.callbacks(config)
assert is_function(callbacks[:on_llm_token_usage], 2)
assert is_function(callbacks[:on_tool_execution_started], 3)
assert is_function(callbacks[:on_tool_execution_completed], 3)
assert is_function(callbacks[:on_tool_execution_failed], 3)
end
test "on_llm_token_usage emits telemetry event" do
{:ok, config} = Observability.init(
service_name: "test",
customer_id: "cust-1",
user_id: "user-1"
)
ref = :telemetry_test.attach_event_handlers(self(), [
[:myapp, :llm, :token_usage]
])
callbacks = Observability.callbacks(config)
usage = %LangChain.TokenUsage{input: 100, output: 50}
callbacks.on_llm_token_usage.(nil, usage)
assert_received {[:myapp, :llm, :token_usage], ^ref,
%{input: 100, output: 50},
%{service: "test", customer_id: "cust-1", user_id: "user-1"}}
end
end
```