# Conversation History
OpenResponses maintains conversation history automatically using `previous_response_id`. You never need to replay prior messages — just reference the last response ID and OpenResponses reconstructs the full context.
## How it works
When a response completes, OpenResponses stores it in `ResponseCache` (backed by Cachex). On the next request, if `previous_response_id` is present, the loop loads the prior response and prepends its `input` and `output` to the new request's input before sending to the provider.
```
Request 2: previous_response_id = "resp_01"
                    │
                    ▼
ResponseCache.get("resp_01")
                    │
                    ▼
prev.input + prev.output + new_input
                    │
                    ▼
sent to provider
```
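The reconstruction step can be sketched as follows. This is a minimal sketch, not the actual OpenResponses source: the module and function names are assumptions for illustration; only the cache name `:response_cache` and the prepend order come from the description above.

```elixir
# Hypothetical helper showing the reconstruction order described above.
defmodule ContextSketch do
  # When a previous response ID is present, load it and prepend its
  # input and output to the new request's input.
  def build_input(%{"previous_response_id" => prev_id} = params) when is_binary(prev_id) do
    # Cachex.get/2 returns {:ok, value}; value is nil on a cache miss.
    case Cachex.get(:response_cache, prev_id) do
      {:ok, %{input: prev_input, output: prev_output}} ->
        prev_input ++ prev_output ++ params["input"]

      _miss ->
        params["input"]
    end
  end

  # No previous response referenced: send the input as-is.
  def build_input(params), do: params["input"]
end
```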
## Basic usage
```bash
# Turn 1
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "input": [{"role": "user", "content": "My favourite language is Elixir."}]
  }'
# → {"id": "resp_abc", "status": "completed", ...}

# Turn 2 — no history needed in the request
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "previous_response_id": "resp_abc",
    "input": [{"role": "user", "content": "What is my favourite language?"}]
  }'
# → Model knows it's Elixir
```
## Cache configuration
By default, responses are cached in memory for 24 hours. Responses are stored via `Cachex`, which you can configure at startup to change the TTL or cap the number of cached entries:
```elixir
# application.ex
# `expiration` uses Cachex's record helper; add `import Cachex.Spec` at the top
{Cachex,
 name: :response_cache,
 limit: 10_000,
 expiration: expiration(default: :timer.hours(24))}
```
For cross-node or cross-restart persistence (Phase 3), add `AshPostgres` as a data layer and responses will be stored durably.
## What gets cached
For each completed response, the cache stores:
- `id` — the response ID
- `model` — the model used
- `status` — terminal state (`completed`, `failed`, or `incomplete`)
- `input` — the original input sent by the client
- `output` — all output items produced by the model
- `usage` — token counts
- `created_at` — timestamp
Responses in `failed` or `incomplete` states are cached but their output may be partial.
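Put together, a cached entry looks roughly like the map below. The concrete values are illustrative, and the exact shapes of the nested maps are assumptions; only the top-level keys come from the list above.

```elixir
%{
  id: "resp_abc",
  model: "claude-sonnet-4-6",
  status: "completed",
  input: [%{role: "user", content: "My favourite language is Elixir."}],
  output: [%{type: "message", role: "assistant", content: "Nice choice!"}],
  usage: %{input_tokens: 12, output_tokens: 5},
  created_at: ~U[2025-01-01 12:00:00Z]
}
```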
## Chaining multiple turns
Each turn only needs to reference the immediately preceding response — not the entire chain. OpenResponses handles the reconstruction:
```
resp_001 ← resp_002 ← resp_003 ← resp_004 (current)
```
When processing `resp_004`, OpenResponses loads `resp_003` from cache. `resp_003`'s own context was already reconstructed when it was created, so its `input` field contains the full accumulated history up to that point.
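The accumulation can be seen in a toy example (atoms stand in for real message maps; the field names mirror the cached entry):

```elixir
# Each turn's stored input = previous input ++ previous output ++ new user message
turn1 = %{input: [:u1], output: [:a1]}
turn2 = %{input: turn1.input ++ turn1.output ++ [:u2], output: [:a2]}
turn3 = %{input: turn2.input ++ turn2.output ++ [:u3], output: [:a3]}

turn3.input
# => [:u1, :a1, :u2, :a2, :u3]
```

So loading only the immediately preceding response is enough: its stored `input` already carries every earlier turn.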
## Branching conversations
Because `previous_response_id` is just a reference, you can branch at any point:
```
resp_001
├── resp_002a (branch A)
│ └── resp_003a
└── resp_002b (branch B)
└── resp_003b
```
Both branches reference `resp_001` but diverge from there. This is useful for showing users alternative continuations or implementing undo.
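To branch as in the diagram, send two requests that both reference the same `previous_response_id`. This assumes a running server as in the earlier example; the prompts and resulting IDs are illustrative.

```bash
# Branch A — continues from resp_001
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet-4-6", "previous_response_id": "resp_001",
       "input": [{"role": "user", "content": "Summarise that formally."}]}'

# Branch B — also continues from resp_001, independently of branch A
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet-4-6", "previous_response_id": "resp_001",
       "input": [{"role": "user", "content": "Now explain it to a beginner."}]}'
```

Each response gets its own ID, so either branch can be extended later without affecting the other.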