# Deploying
OpenResponses is a standard Phoenix application. Any deployment approach that works for Phoenix works here.
## Environment variables
At minimum, set the API keys for the providers you use:
```bash
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=AIza...
SECRET_KEY_BASE=... # Phoenix requirement
PHX_HOST=your-domain.com
```
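`SECRET_KEY_BASE` must be a long random string. The conventional Phoenix way to generate one is `mix phx.gen.secret`, but any 64-character random value works; for example, with OpenSSL:

```shell
# Generate a 64-character secret for SECRET_KEY_BASE
# (`mix phx.gen.secret` is the usual Phoenix way; openssl works anywhere)
SECRET_KEY_BASE="$(openssl rand -base64 48)"
echo "${#SECRET_KEY_BASE}"   # 64
```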
## Release builds
Build a release with `mix release`:
```bash
MIX_ENV=prod mix deps.get --only prod
MIX_ENV=prod mix assets.deploy
MIX_ENV=prod mix release
```
The release is self-contained. Run it:
```bash
PHX_HOST=your-domain.com \
OPENAI_API_KEY=sk-... \
_build/prod/rel/open_responses/bin/open_responses start
```
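A release reads these variables at boot via `config/runtime.exs`, which is evaluated at startup rather than compile time. A minimal sketch of how the keys above might be wired up (the config key names here are assumptions; match them to the app's actual configuration):

```elixir
# config/runtime.exs — evaluated when the release boots
import Config

if config_env() == :prod do
  config :open_responses,
    # hypothetical key names — align with the app's real config
    openai_api_key: System.fetch_env!("OPENAI_API_KEY"),
    anthropic_api_key: System.get_env("ANTHROPIC_API_KEY")

  config :open_responses, OpenResponsesWeb.Endpoint,
    url: [host: System.get_env("PHX_HOST") || "example.com", port: 443],
    secret_key_base: System.fetch_env!("SECRET_KEY_BASE")
end
```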
## Docker
```dockerfile
FROM hexpm/elixir:1.18.3-erlang-27.3-alpine-3.21.0 AS build
WORKDIR /app
RUN mix local.hex --force && mix local.rebar --force
COPY mix.exs mix.lock ./
RUN MIX_ENV=prod mix deps.get --only prod
RUN MIX_ENV=prod mix deps.compile
COPY lib lib
COPY priv priv
COPY guides guides
COPY config config
# assumes the standard Phoenix assets/ directory; mix assets.deploy needs it
COPY assets assets
RUN MIX_ENV=prod mix assets.deploy
RUN MIX_ENV=prod mix release
FROM alpine:3.21.0 AS runtime
RUN apk add --no-cache libstdc++ openssl ncurses-libs
WORKDIR /app
COPY --from=build /app/_build/prod/rel/open_responses ./
ENV MIX_ENV=prod
ENV PHX_SERVER=true
EXPOSE 4000
ENTRYPOINT ["bin/open_responses"]
CMD ["start"]
```
## Fly.io
```bash
fly launch
fly secrets set OPENAI_API_KEY=sk-... ANTHROPIC_API_KEY=sk-ant-...
fly deploy
```
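On Fly.io, runtime environment for the release can live in `fly.toml`. A sketch of the relevant fragment, assuming the app is named `open-responses` (adjust to your app name):

```toml
# fly.toml — runtime env for the release
[env]
  PHX_HOST = "open-responses.fly.dev"
  # Fly's private DNS name for all machines in this app; consumed by dns_cluster
  DNS_CLUSTER_QUERY = "open-responses.internal"
```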
## Clustering
OpenResponses uses `Phoenix.PubSub` for event broadcasting. The default `Phoenix.PubSub.PG2` adapter already distributes messages across connected BEAM nodes; the adapter is chosen where the PubSub child is started in the supervision tree, not in config files:
```elixir
# lib/open_responses/application.ex — PG2 is the default, shown here explicitly
{Phoenix.PubSub, name: OpenResponses.PubSub, adapter: Phoenix.PubSub.PG2}
```
The real requirement in a multi-node deployment is that the nodes discover and connect to each other.
Use `dns_cluster` (already included) for automatic node discovery on platforms like Fly.io:
```elixir
config :open_responses, :dns_cluster_query, System.get_env("DNS_CLUSTER_QUERY")
```
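This query is typically consumed by a `DNSCluster` child in the application's supervision tree; the sketch below mirrors what recent Phoenix generators emit (verify against the app's `application.ex`):

```elixir
# lib/open_responses/application.ex — children list (sketch)
children = [
  # polls DNS for the configured query and connects the nodes it finds
  {DNSCluster, query: Application.get_env(:open_responses, :dns_cluster_query) || :ignore},
  # ...other children...
]
```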
## Response cache in a cluster
The default Cachex cache is per-node in-memory. In a cluster, a `previous_response_id` might reference a response stored on a different node.
For production clustering, switch to a distributed cache or enable the AshPostgres persistence layer (Phase 3):
```elixir
# Option 1: Nebulex with a distributed adapter (replacing Cachex)
# Option 2: AshPostgres — responses stored in Postgres, accessible across nodes
```
Until then, use sticky sessions to route a user's requests to the same node.
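Sticky routing depends on your load balancer. As one example, nginx can pin clients to a backend by source IP with `ip_hash` (a standard nginx directive; the node addresses below are placeholders):

```nginx
upstream open_responses {
    ip_hash;                    # same client IP -> same backend node
    server 10.0.0.1:4000;
    server 10.0.0.2:4000;
}

server {
    listen 80;
    location / {
        proxy_pass http://open_responses;
        proxy_http_version 1.1;           # needed for streaming responses
        proxy_set_header Connection "";
        proxy_buffering off;              # don't buffer SSE streams
    }
}
```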
## Scaling considerations
| Concern | Guidance |
|---|---|
| Concurrent requests | The BEAM handles thousands of simultaneous loops comfortably. No special config needed. |
| Long-running agentic loops | Prefer streaming. A non-streaming request holds the connection open, silent, until the loop finishes, which can trip proxy idle timeouts. |
| Provider rate limits | Add a rate-limiting middleware (`MyApp.Middleware.RateLimit`). |
| Memory | Each active loop holds the response state in memory. Monitor `open_responses_loop_iterations_total` to track concurrency. |
| Ollama | Run Ollama on the same host or a fast private network. GPU latency dominates. |
## Health check
The Phoenix endpoint is healthy when it responds to HTTP requests. Add a health check route:
```elixir
# router.ex
get "/health", OpenResponsesWeb.HealthController, :check
```
```elixir
defmodule OpenResponsesWeb.HealthController do
  use OpenResponsesWeb, :controller

  def check(conn, _params) do
    json(conn, %{status: "ok"})
  end
end
```
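With that route in place, the Docker image above can poll it. Alpine's busybox ships `wget`, so no extra package is needed:

```dockerfile
# Append to the runtime stage of the Dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s \
  CMD wget -q -O /dev/null http://localhost:4000/health || exit 1
```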