# Telemetry
ExGram emits [`:telemetry`](https://hexdocs.pm/telemetry) events at key points
in its lifecycle, letting you integrate with any observability tool that speaks
the `:telemetry` protocol - including Prometheus (via
[`TelemetryMetrics`](https://hexdocs.pm/telemetry_metrics)),
[`OpenTelemetry`](https://opentelemetry.io/), or simple structured logging.
> ### OpenTelemetry {: .tip}
>
> To have OpenTelemetry traces and spans in your bots, use the [`OpentelemetryExGram`](https://hex.pm/packages/opentelemetry_ex_gram) library.
> It attaches to the events below and creates OTel spans automatically, with full
> context propagation across GenServer and spawn boundaries.
>
> See the [OpenTelemetry guide](opentelemetry.md) for setup instructions and
> available spans.
## Events overview
| Event | Description |
|---|---|
| `[:ex_gram, :bot, :init, :start\|:stop]` | Bot dispatcher initialization span |
| `[:ex_gram, :bot, :shutdown]` | Bot dispatcher shutting down |
| `[:ex_gram, :updates, :init, :start\|:stop]` | Updates worker initialization span (polling, webhook, noup, test) |
| `[:ex_gram, :updates, :shutdown]` | Updates worker shutting down |
| `[:ex_gram, :request, :start\|:stop\|:exception]` | Outbound Telegram API call |
| `[:ex_gram, :update, :start\|:stop\|:exception]` | Incoming update dispatched to the bot |
| `[:ex_gram, :handler, :start\|:stop\|:exception]` | Your `handle/2` callback invocation |
| `[:ex_gram, :middleware, :start\|:stop\|:exception]` | Each middleware in the pipeline |
| `[:ex_gram, :polling, :start\|:stop\|:exception]` | One polling cycle (fetch + dispatch) |
All events are spans and carry `duration` in `:native` time units on the `:stop` event.
The `:shutdown` events are point-in-time and carry `system_time`. Convert to milliseconds with:
```elixir
System.convert_time_unit(duration, :native, :millisecond)
```
See `ExGram.Telemetry` for the full metadata reference for each event.
## Attaching handlers
Use `:telemetry.attach/4` (one event) or `:telemetry.attach_many/4` (multiple
events) to subscribe. A handler is a 4-arity function:
`(event_name, measurements, metadata, config) -> any`.
```elixir
defmodule MyApp.Telemetry do
require Logger
def setup do
:telemetry.attach_many(
"my-app-ex-gram",
[
[:ex_gram, :bot, :init, :start],
[:ex_gram, :bot, :init, :stop],
[:ex_gram, :bot, :shutdown],
[:ex_gram, :updates, :init, :start],
[:ex_gram, :updates, :init, :stop],
[:ex_gram, :updates, :shutdown],
[:ex_gram, :request, :start],
[:ex_gram, :request, :stop],
[:ex_gram, :request, :exception],
[:ex_gram, :update, :start],
[:ex_gram, :update, :stop],
[:ex_gram, :handler, :start],
[:ex_gram, :handler, :stop],
[:ex_gram, :handler, :exception],
[:ex_gram, :middleware, :start],
[:ex_gram, :middleware, :stop],
[:ex_gram, :polling, :start],
[:ex_gram, :polling, :stop],
],
&__MODULE__.handle_event/4,
nil
)
end
def handle_event(event, measurements, metadata, _config) do
Logger.debug("Telemetry: #{inspect(event)} #{inspect(measurements)} #{inspect(metadata)}")
end
end
```
Call `MyApp.Telemetry.setup/0` in your `Application.start/2` before starting
your supervision tree.
## Tracking bot and updates lifecycle
The `:init` span and `:shutdown` events let you log when bots and their updates
workers come up or go down - useful for auditing restarts in production. The `:stop`
event carries a `duration` measurement so you can alert on slow startups:
```elixir
def handle_event([:ex_gram, :bot, :init, :start], _measurements, metadata, _) do
Logger.info("[ExGram] bot initializing bot=#{metadata.bot}")
end
def handle_event([:ex_gram, :bot, :init, :stop], measurements, metadata, _) do
duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
Logger.info("[ExGram] bot ready bot=#{metadata.bot} init_ms=#{duration_ms}")
end
def handle_event([:ex_gram, :bot, :shutdown], _measurements, metadata, _) do
Logger.warning("[ExGram] bot stopping bot=#{metadata.bot}")
end
def handle_event([:ex_gram, :updates, :init, :start], _measurements, metadata, _) do
Logger.info("[ExGram] updates worker initializing bot=#{metadata.bot} method=#{metadata.method}")
end
def handle_event([:ex_gram, :updates, :init, :stop], measurements, metadata, _) do
duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
Logger.info("[ExGram] updates worker ready bot=#{metadata.bot} method=#{metadata.method} init_ms=#{duration_ms}")
end
def handle_event([:ex_gram, :updates, :shutdown], _measurements, metadata, _) do
Logger.warning("[ExGram] updates worker stopping bot=#{metadata.bot} method=#{metadata.method}")
end
```
## Logging API requests
Log every outbound Telegram API call with its duration and result:
```elixir
def handle_event([:ex_gram, :request, :stop], measurements, metadata, _) do
duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
case metadata.result do
{:ok, _} ->
Logger.info("[ExGram] #{metadata.method} OK in #{duration_ms}ms bot=#{metadata.bot}")
{:error, error} ->
Logger.warning(
"[ExGram] #{metadata.method} ERROR in #{duration_ms}ms " <>
"code=#{error.code} message=#{error.message} bot=#{metadata.bot}"
)
end
end
def handle_event([:ex_gram, :request, :exception], measurements, metadata, _) do
duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
Logger.error(
"[ExGram] #{metadata.method} EXCEPTION in #{duration_ms}ms " <>
"kind=#{metadata.kind} reason=#{inspect(metadata.reason)} bot=#{metadata.bot}"
)
end
```
## Tracking update processing latency
Measure the full pipeline (middleware + routing) for each update:
```elixir
def handle_event([:ex_gram, :update, :stop], measurements, metadata, _) do
duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
Logger.info(
"[ExGram] update processed in #{duration_ms}ms " <>
"bot=#{metadata.bot} halted=#{metadata.halted}"
)
end
```
## Alerting on handler exceptions
The `[:ex_gram, :handler, :exception]` event fires when your `handle/2`
callback raises. Use it to alert or capture errors:
```elixir
def handle_event([:ex_gram, :handler, :exception], _measurements, metadata, _) do
Logger.error(
"[ExGram] handler raised in bot=#{metadata.bot} " <>
"handler=#{metadata.handler} kind=#{metadata.kind} " <>
"reason=#{inspect(metadata.reason)}"
)
# Forward to your error tracker (e.g. Sentry, Honeybadger)
# Sentry.capture_exception(metadata.reason, stacktrace: metadata.stacktrace)
end
```
## Prometheus metrics with `telemetry_metrics`
Add `{:telemetry_metrics, "~> 1.0"}` and `{:telemetry_poller, "~> 1.0"}` to
your app's deps, then define metrics:
```elixir
defmodule MyApp.Telemetry do
import Telemetry.Metrics
def metrics do
[
# Count bot starts and stops
counter("ex_gram.bot.init.stop.count", tags: [:bot]),
counter("ex_gram.bot.shutdown.count", tags: [:bot]),
# Distribution of bot initialization duration
distribution("ex_gram.bot.init.stop.duration",
unit: {:native, :millisecond},
tags: [:bot]
),
# Count updates worker starts and stops
counter("ex_gram.updates.init.stop.count", tags: [:bot, :method]),
counter("ex_gram.updates.shutdown.count", tags: [:bot, :method]),
# Count every API call
counter("ex_gram.request.stop.count",
tags: [:method, :bot]
),
# Distribution of API call durations
distribution("ex_gram.request.stop.duration",
unit: {:native, :millisecond},
tags: [:method, :bot],
reporter_options: [buckets: [10, 50, 100, 500, 1000, 5000]]
),
# Count API errors
counter("ex_gram.request.stop.error_count",
keep: &match?({:error, _}, &1.result),
tags: [:method, :bot]
),
# Count incoming updates
counter("ex_gram.update.stop.count",
tags: [:bot]
),
# Distribution of update processing time
distribution("ex_gram.update.stop.duration",
unit: {:native, :millisecond},
tags: [:bot]
),
# Count handler exceptions
counter("ex_gram.handler.exception.count",
tags: [:bot, :handler]
),
# Distribution of polling cycle durations
distribution("ex_gram.polling.stop.duration",
unit: {:native, :millisecond},
tags: [:bot]
),
]
end
end
```
## Middleware timing
If your bot has multiple middlewares, you can measure each one individually:
```elixir
def handle_event([:ex_gram, :middleware, :stop], measurements, metadata, _) do
duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
Logger.debug(
"[ExGram] middleware #{inspect(metadata.middleware)} in #{duration_ms}ms " <>
"bot=#{metadata.bot} halted=#{metadata.halted}"
)
end
```
## Handler vs. update events
It is worth understanding the distinction between update and handler events:
- **`[:ex_gram, :update, ...]`** - spans middleware execution and update routing.
When using async dispatch (the default), the stop event fires after the handler
_process is spawned_, not after the handler finishes. The duration reflects
middleware + routing overhead only.
- **`[:ex_gram, :handler, ...]`** - spans the actual `handle/2` callback. This
event always fires in the process executing the handler (the spawned process
for async dispatch, or the GenServer process for sync dispatch).
For end-to-end latency tracking, subscribe to `[:ex_gram, :handler, :stop]`.