guides/cerebras.md
# Cerebras
Ultra-fast inference with an OpenAI-compatible API and Cerebras-specific optimizations.
## Configuration
```bash
CEREBRAS_API_KEY=csk_...
```
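With the key in the environment, requests go through ReqLLM's standard entry points. Below is a minimal sketch; the model id `llama-3.3-70b` and the `ReqLLM.text/1` response helper are assumptions for illustration, so substitute whatever model and helpers your ReqLLM version exposes.
```elixir
# Minimal generation call against Cerebras. The model id is an example;
# use any model enabled for your Cerebras account.
{:ok, response} =
  ReqLLM.generate_text(
    "cerebras:llama-3.3-70b",
    "In one sentence, what is wafer-scale inference?",
    temperature: 0.2,
    max_tokens: 120
  )

# ReqLLM.text/1 is assumed here as the response-text helper.
IO.puts(ReqLLM.text(response))
```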
## Provider Options
There are no custom provider options; the provider uses OpenAI-compatible defaults with Cerebras-specific handling.
Standard ReqLLM options are passed through (see the sketch after this list):
- `temperature`, `max_tokens`, `top_p`
- `tools` (with automatic `strict: true` for non-Qwen models)
- `tool_choice` (`"auto"` or `"none"` only)
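For example, a call that combines sampling options with tool calling might look like the sketch below. The `ReqLLM.tool/1` constructor and its keyword schema are assumptions based on ReqLLM's tool-calling interface; consult the library docs for the exact shape.
```elixir
# Hypothetical tool definition; the tool/1 keyword format is an assumption.
weather =
  ReqLLM.tool(
    name: "get_weather",
    description: "Get the current weather for a city",
    parameter_schema: [
      city: [type: :string, required: true, doc: "City name"]
    ],
    callback: fn %{city: city} -> {:ok, "Sunny in #{city}"} end
  )

{:ok, response} =
  ReqLLM.generate_text(
    "cerebras:llama-3.3-70b",
    "What's the weather in Toronto?",
    temperature: 0.3,
    max_tokens: 300,
    top_p: 0.9,
    tools: [weather],
    # Cerebras only accepts "auto" or "none" here.
    tool_choice: "auto"
  )
```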
## Implementation Notes
### System Messages
System messages have a stronger influence on Cerebras models than with OpenAI's implementation.
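Because of this, it is worth keeping system prompts short and specific. A minimal sketch follows, assuming ReqLLM's `Context` helpers (`Context.new/1`, `system/1`, `user/1`); the exact helper names are an assumption, so check the library's context documentation.
```elixir
alias ReqLLM.Context

# Keep the system message terse; Cerebras models follow it more strictly
# than OpenAI models typically do. Helper names are assumed for illustration.
context =
  Context.new([
    Context.system("Answer in one short paragraph, no bullet points."),
    Context.user("Explain speculative decoding.")
  ])

{:ok, response} = ReqLLM.generate_text("cerebras:llama-3.3-70b", context)
```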
### Tool Calling
- Requires `strict: true` in tool schemas (added automatically; see the payload sketch after this list)
- Qwen models do NOT support `strict: true` (automatically excluded)
- Only supports `tool_choice: "auto"` or `"none"` (not function-specific)
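For reference, this is roughly what a serialized tool looks like on the wire for a non-Qwen model, following the OpenAI function-calling format with `strict: true` injected. The schema contents here are illustrative, not a fixed ReqLLM output.
```elixir
# Example wire-format tool entry for a non-Qwen Cerebras model.
# ReqLLM adds "strict" => true automatically; for Qwen models it is omitted.
%{
  "type" => "function",
  "function" => %{
    "name" => "get_weather",
    "description" => "Get the current weather for a city",
    "strict" => true,
    "parameters" => %{
      "type" => "object",
      "properties" => %{
        "city" => %{"type" => "string", "description" => "City name"}
      },
      "required" => ["city"],
      # Strict schemas typically also disallow extra properties.
      "additionalProperties" => false
    }
  }
}
```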
### Streaming Limitations
Streaming is not supported with (see the fallback sketch after this list):
- Reasoning models in JSON mode
- Tool calling scenarios
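Because of this, callers need to fall back to a non-streaming call when tools are involved. A rough sketch, assuming ReqLLM exposes `stream_text/3` alongside `generate_text/3`; verify the function names against the current docs.
```elixir
defmodule MyApp.CerebrasChat do
  # Illustrative helper: pick streaming vs. non-streaming based on
  # Cerebras' limitation that tool calls cannot be streamed.
  def ask(model, prompt, tools \\ []) do
    if tools == [] do
      # No tools involved: streaming works.
      ReqLLM.stream_text(model, prompt)
    else
      # Tool calling: fall back to a single non-streaming request.
      ReqLLM.generate_text(model, prompt, tools: tools)
    end
  end
end
```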
### Unsupported OpenAI Features
These fields will result in a 400 error:
- `frequency_penalty`
- `logit_bias`
- `presence_penalty`
- `parallel_tool_calls`
- `service_tier`
ReqLLM handles all of these restrictions automatically.
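The snippet below is not ReqLLM's implementation, just an illustration of the constraint: any request built for Cerebras by hand has to drop these fields, or the API responds with a 400.
```elixir
defmodule MyApp.CerebrasRequest do
  # Illustrative only: the fields Cerebras rejects with a 400 error.
  # ReqLLM strips these automatically; a hand-rolled request must do the same.
  @unsupported [
    :frequency_penalty,
    :logit_bias,
    :presence_penalty,
    :parallel_tool_calls,
    :service_tier
  ]

  def sanitize(opts) when is_list(opts), do: Keyword.drop(opts, @unsupported)
end
```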
## Resources
- [Cerebras Documentation](https://docs.cerebras.ai/)