# Deployment & Production
## Environment variables
Read settings from the environment at runtime in `config/runtime.exs`. `System.get_env/2` resolves values at boot; the `{:system, "VAR"}` tuple form defers resolution to the transport:
```elixir
# config/runtime.exs
import Config

if config_env() == :prod do
  # runtime.exs is a script, not a module, so defp is not allowed here —
  # use an anonymous function for broker parsing
  parse_brokers = fn str ->
    str
    |> String.split(",")
    |> Enum.map(fn hp ->
      case String.split(String.trim(hp), ":") do
        [host, port] -> {host, String.to_integer(port)}
        [host] -> {host, 9092}
      end
    end)
  end

  config :phoenix_micro,
    # to_existing_atom raises on unknown transports instead of leaking atoms
    transport: String.to_existing_atom(System.get_env("MESSAGE_TRANSPORT", "kafka")),
    transports: [
      kafka: [
        brokers: parse_brokers.(System.get_env("KAFKA_BROKERS", "localhost:9092")),
        group_id: System.get_env("KAFKA_GROUP_ID", "my_app"),
        client_id: System.get_env("KAFKA_CLIENT_ID", "my_app"),
        acks: String.to_integer(System.get_env("KAFKA_ACKS", "1"))
      ],
      nats: [
        host: System.get_env("NATS_HOST", "localhost"),
        port: String.to_integer(System.get_env("NATS_PORT", "4222")),
        queue_group: System.get_env("NATS_QUEUE_GROUP", "my_app")
      ],
      rabbitmq: [
        url: {:system, "RABBITMQ_URL"},
        exchange: System.get_env("RABBITMQ_EXCHANGE", "my_app")
      ],
      redis_streams: [
        url: {:system, "REDIS_URL"},
        consumer_group: System.get_env("REDIS_CONSUMER_GROUP", "my_app")
      ]
    ]
end
```
## Recommended production consumer stack
```elixir
defmodule MyApp.Payments.CreatedConsumer do
  use PhoenixMicro.Consumer

  topic "payments.created"
  concurrency 20
  pipeline :broadway  # explicit Broadway backpressure

  retry max_attempts: 5, base_delay: 1_000, max_delay: 60_000
  dead_letter_topic "payments.created.dlq"

  middleware [
    PhoenixMicro.Middleware.Logger,
    PhoenixMicro.Middleware.Metrics,
    {PhoenixMicro.Middleware.CircuitBreaker,
     threshold: 10,
     window_ms: 60_000,
     reset_timeout_ms: 60_000},
    {PhoenixMicro.Middleware.Idempotency,
     store: MyApp.Middleware.RedisIdempotencyStore},
    PhoenixMicro.Middleware.Tracing
  ]
end
```
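The `store:` option above points at a module you supply. The behaviour it must implement is defined by PhoenixMicro; as a sketch, assuming the contract reduces to a check-and-mark keyed by message id (the callback names here are placeholders), a Redis-backed store might look like:

```elixir
defmodule MyApp.Middleware.RedisIdempotencyStore do
  @moduledoc """
  Sketch only — `seen?/1` and `mark/2` are placeholder names; check
  PhoenixMicro's idempotency-store behaviour for the real callbacks.
  Assumes a Redix connection registered under the name :redix.
  """

  def seen?(message_id) do
    {:ok, n} = Redix.command(:redix, ["EXISTS", "idem:" <> message_id])
    n == 1
  end

  def mark(message_id, ttl_seconds \\ 86_400) do
    # SET with an expiry so old ids age out instead of accumulating forever
    {:ok, "OK"} =
      Redix.command(
        :redix,
        ["SET", "idem:" <> message_id, "1", "EX", Integer.to_string(ttl_seconds)]
      )

    :ok
  end
end
```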
## Health checks
Add to your router for load-balancer health probes:
```elixir
# router.ex
scope "/" do
  forward "/health", PhoenixMicro.Phoenix.HealthPlug
end
```
Returns HTTP 200 with JSON when healthy, HTTP 503 when degraded
(open circuit breakers or disconnected transport).
CI / pre-deploy health gate:
```bash
mix phoenix_micro.health \
--url "${APP_URL}/health" \
--exit-code \
--format json
```
Returns exit code 1 if status is not `"ok"` — useful in deploy pipelines.
## LiveDashboard
Add PhoenixMicro's metrics page to LiveDashboard:
```elixir
# router.ex (dev / staging only)
if Application.compile_env(:my_app, :dev_routes) do
  import Phoenix.LiveDashboard.Router

  scope "/dev" do
    pipe_through :browser

    live_dashboard "/dashboard",
      metrics: MyAppWeb.Telemetry,
      additional_pages: [
        phoenix_micro: PhoenixMicro.LiveDashboard.Page
      ]
  end
end
```
Shows: transport connectivity, active consumers, circuit breaker states, saga metrics,
message throughput graphs (auto-refreshes every 2 seconds).
## Telemetry in production
Wire `PhoenixMicro.Telemetry.metrics/0` into your reporter:
```elixir
# application.ex
def start(_type, _args) do
  children = [
    {TelemetryMetricsPrometheus, metrics: metrics()},
    MyAppWeb.Endpoint
  ]

  Supervisor.start_link(children, strategy: :one_for_one)
end

defp metrics do
  # Your app metrics +
  PhoenixMicro.Telemetry.metrics()
end
```
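Alongside the metrics reporter, a plain `:telemetry` handler can drive log-based alerting. The event name below is illustrative — verify it against the events PhoenixMicro actually emits:

```elixir
require Logger

:telemetry.attach(
  "phoenix-micro-consumer-errors",
  # Illustrative event name — check PhoenixMicro's documented telemetry events
  [:phoenix_micro, :consumer, :exception],
  fn _event, measurements, metadata, _config ->
    Logger.error("consumer failed",
      topic: metadata[:topic],
      duration_ms:
        System.convert_time_unit(measurements[:duration] || 0, :native, :millisecond)
    )
  end,
  nil
)
```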
## Kafka production checklist
- Set `acks: -1` (all replicas) for durability-critical topics
- Set `session_timeout_ms: 60_000` for stable group membership under load
- Monitor consumer lag — commit offset only after successful handler return
- Use separate `group_id` per application / deployment environment
- Keep `heartbeat_ms` ≤ `session_timeout_ms / 3`
```elixir
config :phoenix_micro,
  transport: :kafka,
  transports: [
    kafka: [
      url: "kafka://broker1:9092,broker2:9092,broker3:9092",
      group_id: System.get_env("KAFKA_GROUP_ID"),
      acks: -1,
      ack_timeout_ms: 15_000,
      session_timeout_ms: 60_000,
      heartbeat_ms: 15_000,
      fetch_wait_ms: 250,
      max_bytes: 5_242_880  # 5 MB
    ]
  ]
```
## Clustering considerations
- Each node runs its own consumer processes
- **Kafka:** consumer group coordinator handles partition assignment automatically
- **NATS:** queue groups distribute load automatically
- **Redis Streams:** each node needs a unique `consumer_name`
- **RabbitMQ:** competing consumers on the same queue distribute load automatically
- The circuit breaker state is **per-node** (ETS) — tune thresholds accordingly
- The outbox Relay runs **per-node** — use a distributed lock if you want single-relay:
```elixir
# Use :global or a Redlock to elect one relay per cluster.
# Register a single cluster-wide name (not one keyed by node/0),
# so exactly one node's registration succeeds at a time:
case :global.register_name(:outbox_relay, self()) do
  :yes -> :start_relay  # this node runs the relay
  :no -> :standby       # another node already holds the name
end
```
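A fuller sketch (module name hypothetical): a small elector that periodically retries the `:global` claim, so a standby node takes over the relay if the current leader dies:

```elixir
defmodule MyApp.OutboxRelayElector do
  # Sketch: tries to claim a cluster-wide :global name; the winner starts
  # the relay, losers retry in case the winner's node goes down.
  use GenServer

  @retry_ms 5_000

  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    send(self(), :elect)
    {:ok, %{leader?: false}}
  end

  @impl true
  def handle_info(:elect, %{leader?: false} = state) do
    case :global.register_name(:outbox_relay, self()) do
      :yes ->
        # This node won the election — start the relay under your
        # supervision tree here.
        {:noreply, %{state | leader?: true}}

      :no ->
        Process.send_after(self(), :elect, @retry_ms)
        {:noreply, state}
    end
  end

  def handle_info(:elect, state), do: {:noreply, state}
end
```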
## Graceful shutdown
PhoenixMicro transports trap exits and close broker connections cleanly.
Broadway drains in-flight messages before stopping.
Give consumers time to drain by raising the `:shutdown` timeout in their child specs (the default is 5 000 ms), and make sure your orchestrator's termination grace period (e.g. Kubernetes `terminationGracePeriodSeconds`) is at least as long:
```elixir
# application.ex
children = [
  MyAppWeb.Endpoint,
  # allow up to 30 s for in-flight messages to drain on shutdown
  Supervisor.child_spec(MyApp.Payments.CreatedConsumer, shutdown: 30_000)
]
```
## Docker / OTP releases
```dockerfile
FROM hexpm/elixir:1.16.3-erlang-26.2.5-alpine-3.20.0 AS build
WORKDIR /app
COPY mix.exs mix.lock ./
RUN mix deps.get --only prod
RUN MIX_ENV=prod mix deps.compile
COPY . .
RUN MIX_ENV=prod mix release
FROM alpine:3.20.0
RUN apk add --no-cache libstdc++ openssl ncurses-libs
WORKDIR /app
COPY --from=build /app/_build/prod/rel/my_app ./
ENV PHX_SERVER=true \
MESSAGE_TRANSPORT=kafka \
KAFKA_BROKERS=kafka:9092 \
KAFKA_GROUP_ID=my_app_prod
CMD ["bin/my_app", "start"]
```
## Outbox in production
Monitor the relay to catch stuck rows:
```elixir
# Add to your telemetry/alerting.
# Repo.aggregate/4 takes options (not :where), so build queries explicitly.
import Ecto.Query

def check_outbox_health(repo, schema) do
  pending =
    schema
    |> where([o], is_nil(o.relayed_at) and is_nil(o.failed_at))
    |> repo.aggregate(:count, :id)

  failed =
    schema
    |> where([o], not is_nil(o.failed_at))
    |> repo.aggregate(:count, :id)

  if pending > 1_000, do: MyApp.Alerts.warn(:outbox_backlog, %{count: pending})
  if failed > 0, do: MyApp.Alerts.error(:outbox_failures, %{count: failed})

  %{pending: pending, failed: failed}
end
```
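To run the check on a schedule, a minimal poller (sketch — `MyApp.OutboxHealth`, `MyApp.Repo`, and `MyApp.Outbox.Message` are placeholders for your app's module names):

```elixir
defmodule MyApp.OutboxHealthChecker do
  # Sketch: polls outbox health once a minute. Assumes check_outbox_health/2
  # (from above) lives in MyApp.OutboxHealth and the outbox schema is
  # MyApp.Outbox.Message — both are placeholder names.
  use GenServer

  @interval_ms 60_000

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(opts) do
    Process.send_after(self(), :check, @interval_ms)
    {:ok, opts}
  end

  @impl true
  def handle_info(:check, opts) do
    MyApp.OutboxHealth.check_outbox_health(MyApp.Repo, MyApp.Outbox.Message)
    Process.send_after(self(), :check, @interval_ms)
    {:noreply, opts}
  end
end
```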