# Getting Started
This guide walks through making your first requests after completing [Installation](installation.html).
## Your first request
Send a non-streaming request to any configured provider:
```bash
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": [
      {"role": "user", "content": "Explain the BEAM in one sentence."}
    ]
  }'
```
Response:
```json
{
  "id": "01950000-0000-0000-0000-000000000000",
  "object": "response",
  "model": "gpt-4o",
  "status": "completed",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{"type": "output_text", "text": "The BEAM is..."}],
      "status": "completed"
    }
  ],
  "usage": {}
}
```
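The same request works from any HTTP client. As a sketch using only Python's standard library (assuming the gateway is listening on `localhost:4000` as above; `build_payload` and `create_response` are illustrative helper names, not part of OpenResponses):

```python
import json
import urllib.request

def build_payload(model, text):
    """Assemble a minimal /v1/responses request body."""
    return {
        "model": model,
        "input": [{"role": "user", "content": text}],
    }

def create_response(payload, base_url="http://localhost:4000"):
    """POST the payload and return the decoded response body."""
    req = urllib.request.Request(
        f"{base_url}/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (requires a running server):
# result = create_response(build_payload("gpt-4o", "Explain the BEAM in one sentence."))
# print(result["output"][0]["content"][0]["text"])
```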
## Streaming responses
Add `"stream": true` to receive Server-Sent Events as the model generates:
```bash
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{
    "model": "gpt-4o",
    "stream": true,
    "input": [
      {"role": "user", "content": "Write a haiku about Elixir."}
    ]
  }'
```
You'll receive a stream of events:
```
event: response.created
data: {"id":"01950000...","status":"queued",...}

event: response.in_progress
data: {"type":"response.in_progress","sequence_number":0}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"Concurrent","sequence_number":3}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":" streams flow","sequence_number":4}

event: response.completed
data: {"id":"01950000...","status":"completed",...}

data: [DONE]
```
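On the client side, each event is an `event:` line followed by a `data:` line. A minimal sketch of a parser for this framing (`parse_sse` is an illustrative name; a production client would read the socket incrementally rather than a complete string):

```python
import json

def parse_sse(raw):
    """Yield (event, data) pairs from a raw SSE stream.

    `data` is the decoded JSON payload, or None for the [DONE] sentinel.
    """
    event = None
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                yield (event, None)
            else:
                yield (event, json.loads(payload))
            event = None
```

To reassemble the text, concatenate the `delta` fields of every `response.output_text.delta` event in order.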
See [Streaming](streaming.html) for the full event catalogue and client examples.
## Using tools
Define tools in the request, and the model will call them when appropriate:
```bash
curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-6",
    "input": [
      {"role": "user", "content": "What time is it in Tokyo?"}
    ],
    "tools": [
      {
        "type": "function",
        "name": "get_time",
        "description": "Get the current time in a given timezone",
        "parameters": {
          "type": "object",
          "properties": {
            "timezone": {"type": "string", "description": "IANA timezone name"}
          },
          "required": ["timezone"]
        }
      }
    ]
  }'
```
When the model decides to call `get_time`, OpenResponses emits a `function_call` item in the output. You then submit the result in a follow-up request using `previous_response_id`. See [Tool Dispatch](tool_dispatch.html) for the full flow.
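As an illustration of that round trip, a client can collect the `function_call` items, run them, and build the follow-up request. This is a sketch, not the gateway's own code; `run_tool` stands in for your dispatch logic, and the item shapes follow the Responses API format:

```python
import json

def extract_function_calls(response):
    """Pull function_call items out of a completed response's output."""
    return [item for item in response["output"] if item["type"] == "function_call"]

def build_tool_followup(response, run_tool):
    """Run each requested tool and build the follow-up request body."""
    outputs = []
    for call in extract_function_calls(response):
        result = run_tool(call["name"], json.loads(call["arguments"]))
        outputs.append({
            "type": "function_call_output",
            "call_id": call["call_id"],
            "output": json.dumps(result),
        })
    return {
        "model": response["model"],
        "previous_response_id": response["id"],
        "input": outputs,
    }
```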
## Multi-turn conversations
Use `previous_response_id` to continue a conversation. OpenResponses automatically reconstructs the full context from the cache:
```bash
# First turn
curl -X POST http://localhost:4000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"input": [{"role": "user", "content": "My name is Alice."}]
}'
# → {"id": "resp_001", ...}
# Second turn — no need to repeat the history
curl -X POST http://localhost:4000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"previous_response_id": "resp_001",
"input": [{"role": "user", "content": "What is my name?"}]
}'
```
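The follow-up payload is mechanical to derive from the previous response, as this small sketch shows (`continue_conversation` is an illustrative helper, not part of OpenResponses):

```python
def continue_conversation(previous, text):
    """Build the next turn, reusing the cached history via previous_response_id."""
    return {
        "model": previous["model"],
        "previous_response_id": previous["id"],
        "input": [{"role": "user", "content": text}],
    }
```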
See [Conversation History](conversation_history.html) for caching behaviour and TTL configuration.
## Choosing a model
The model name's prefix determines which provider adapter handles the request:
| Model prefix | Provider |
|---|---|
| `gpt-*` | OpenAI |
| `claude-*` | Anthropic |
| `gemini-*` | Google Gemini |
| `llama*`, `mistral*`, `phi*`, `qwen*` | Ollama (local) |
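The prefix rules above can be sketched as a first-match lookup. This is only an illustration of the routing behaviour; the provider identifiers and the actual adapter selection live in OpenResponses' configuration:

```python
# Assumed provider identifiers, for illustration only.
ROUTES = [
    (("gpt-",), "openai"),
    (("claude-",), "anthropic"),
    (("gemini-",), "gemini"),
    (("llama", "mistral", "phi", "qwen"), "ollama"),
]

def route(model):
    """Return the provider name for a model, or None if no prefix matches."""
    for prefixes, provider in ROUTES:
        if model.startswith(prefixes):
            return provider
    return None
```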
See [Providers](providers.html) to add API keys and customise routing.