# Smart Router
The Router provides intelligent multi-provider orchestration with automatic failover, load balancing, and task-based routing.
## Overview
The Router selects the best LLM provider for each request based on configurable strategies, handles failures gracefully, and tracks provider health.
## Creating a Router
```elixir
alias Rag.Router
# With specific providers
{:ok, router} = Router.new(providers: [:gemini, :claude, :codex])
# Auto-detect available providers
{:ok, router} = Router.new(auto_detect: true)
# With specific strategy
{:ok, router} = Router.new(
  providers: [:gemini, :claude],
  strategy: :fallback
)
```
### Auto-Strategy Selection
When `strategy` is `:auto` or not specified, the Router picks a strategy from the number of providers:
| Provider Count | Strategy | Reason |
|----------------|----------|--------|
| 3+ providers | `:specialist` | Task-based routing |
| 2 providers | `:fallback` | Reliability via retry |
| 1 provider | `:fallback` | Passthrough |
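
The table can be read as a single decision on the provider count. The following is a minimal sketch of that rule, not the Router's actual code; the module name `AutoStrategySketch` and the function `pick/1` are hypothetical.

```elixir
defmodule AutoStrategySketch do
  # Hypothetical helper (not part of Rag.Router) that mirrors the table:
  # three or more providers -> :specialist, one or two -> :fallback.
  def pick(providers) when is_list(providers) and providers != [] do
    if length(providers) >= 3, do: :specialist, else: :fallback
  end
end

AutoStrategySketch.pick([:gemini, :claude, :codex])
AutoStrategySketch.pick([:gemini, :claude])
```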
## Core API
### Execute Request
```elixir
# Text generation
{:ok, response, router} = Router.execute(router, :text, "Hello", [])
# With options
{:ok, response, router} = Router.execute(router, :text, "Hello",
  system_prompt: "You are helpful.",
  temperature: 0.7
)
# Embeddings
{:ok, embeddings, router} = Router.execute(router, :embeddings, ["text1", "text2"], [])
# Streaming
{:ok, stream, router} = Router.execute(router, :text, "Count to 10", stream: true)
Enum.each(stream, &IO.write/1)
```
### Route Only (No Execution)
```elixir
# Get selected provider without executing
{:ok, provider, router} = Router.route(router, :text, "Hello", [])
# provider is :gemini, :claude, or :codex
```
### Query Providers
```elixir
# Available providers
Router.available_providers(router) # [:gemini, :claude]
# Get provider capabilities
{:ok, caps} = Router.get_provider(router, :gemini)
# %{embeddings: true, streaming: true, max_context: 1_000_000, ...}
```
## Routing Strategies
### Fallback Strategy
Tries providers in order until one succeeds.
```elixir
{:ok, router} = Router.new(
  providers: [:gemini, :claude, :codex],
  strategy: :fallback,
  fallback_order: [:claude, :gemini, :codex],  # Custom order
  max_failures: 3,                             # Skip after 3 failures
  failure_decay_ms: 60_000                     # Reset after 60s
)
```
**Behavior:**
1. Try first provider in order
2. On failure, mark and try next
3. Skip providers with >= `max_failures` recent failures
4. Automatically recover after `failure_decay_ms`
5. Success resets failure count
**Configuration:**
- `fallback_order` - Custom provider order (default: providers list order)
- `max_failures` - Failures before skipping (default: 3)
- `failure_decay_ms` - Time to reset failures (default: 60,000ms)
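
To make the skip-and-recover behavior concrete, here is a minimal sketch of how failure counts with a decay window could be tracked. It is an illustration under assumptions, not the Router's implementation: the module `FallbackSketch`, its state shape (`%{provider => {count, last_failure_ms}}`), and the explicit `now_ms` argument are all hypothetical.

```elixir
defmodule FallbackSketch do
  # Defaults taken from the configuration list above.
  @max_failures 3
  @failure_decay_ms 60_000

  # Record a failure; a failure arriving after the decay window starts a new count.
  def record_failure(failures, provider, now_ms) do
    Map.update(failures, provider, {1, now_ms}, fn {count, last_ms} ->
      if now_ms - last_ms >= @failure_decay_ms, do: {1, now_ms}, else: {count + 1, now_ms}
    end)
  end

  # A success clears the provider's failure count.
  def record_success(failures, provider), do: Map.delete(failures, provider)

  # Return the first provider in `order` that is not currently skipped.
  def select(order, failures, now_ms) do
    case Enum.find(order, &available?(failures, &1, now_ms)) do
      nil -> {:error, :all_providers_failed}
      provider -> {:ok, provider}
    end
  end

  defp available?(failures, provider, now_ms) do
    case Map.get(failures, provider) do
      nil -> true
      {count, last_ms} -> count < @max_failures or now_ms - last_ms >= @failure_decay_ms
    end
  end
end

order = [:claude, :gemini, :codex]

# Three failures for :claude at t = 1s, 2s, 3s.
failures =
  Enum.reduce(1..3, %{}, fn i, acc ->
    FallbackSketch.record_failure(acc, :claude, i * 1_000)
  end)

FallbackSketch.select(order, failures, 4_000)   # {:ok, :gemini}, :claude is skipped
FallbackSketch.select(order, failures, 70_000)  # {:ok, :claude}, the decay window has passed
```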
### Round-Robin Strategy
Distributes load evenly across providers.
```elixir
{:ok, router} = Router.new(
  providers: [:gemini, :claude, :codex],
  strategy: :round_robin,
  weights: %{gemini: 3, codex: 2, claude: 1},  # Optional weighted
  max_consecutive_failures: 3,
  recovery_cooldown_ms: 30_000
)
```
**Behavior:**
1. Rotate through providers
2. With weights: `%{gemini: 2, codex: 1}` produces gemini, gemini, codex, gemini, gemini, codex...
3. Skip unavailable providers
4. Mark unavailable after consecutive failures
5. Recover after cooldown period
**Configuration:**
- `weights` - Provider weights (default: equal)
- `max_consecutive_failures` - Before marking unavailable (default: 3)
- `recovery_cooldown_ms` - Recovery time (default: 30,000ms)
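
One simple way to get the weighted rotation described above is to expand the weights into a single cycle and step through it with a counter. This sketch assumes that approach; the Router may implement weighting differently, and `WeightedRoundRobinSketch` with its functions is hypothetical.

```elixir
defmodule WeightedRoundRobinSketch do
  # Expand weights into one rotation cycle, keeping the given provider order.
  # Providers without an explicit weight default to 1 (equal weighting).
  def cycle(providers, weights) do
    Enum.flat_map(providers, fn provider ->
      List.duplicate(provider, Map.get(weights, provider, 1))
    end)
  end

  # Return the provider for this turn together with the next counter value.
  def next(cycle, counter) do
    {Enum.at(cycle, rem(counter, length(cycle))), counter + 1}
  end
end

cycle = WeightedRoundRobinSketch.cycle([:gemini, :codex], %{gemini: 2, codex: 1})
# cycle == [:gemini, :gemini, :codex]

{picks, _counter} =
  Enum.map_reduce(1..6, 0, fn _turn, counter ->
    WeightedRoundRobinSketch.next(cycle, counter)
  end)
# picks == [:gemini, :gemini, :codex, :gemini, :gemini, :codex]
```

The counter plays the role of the state that the Router carries between calls, which is one reason the updated router must be passed to the next request.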
### Specialist Strategy
Routes each request to the provider best suited to its task type.
```elixir
{:ok, router} = Router.new(
  providers: [:gemini, :claude, :codex],
  strategy: :specialist,
  task_mappings: %{
    embeddings: :gemini,
    code_generation: :codex,
    analysis: :claude
  },
  fallback_provider: :gemini,
  max_failures: 3
)
```
**Default Task Mappings:**
| Task | Provider | Reason |
|------|----------|--------|
| `:embeddings` | Gemini | Embedding support |
| `:long_context` | Gemini | 1M token window |
| `:multimodal` | Gemini | Image/audio |
| `:cost` | Gemini | Cheapest |
| `:speed` | Gemini | Fastest |
| `:code_generation` | Codex | Code optimized |
| `:code_review` | Codex | Code understanding |
| `:structured_output` | Codex | JSON generation |
| `:analysis` | Claude | Deep reasoning |
| `:writing` | Claude | Best prose |
| `:agentic` | Claude | Multi-step |
| `:reasoning` | Claude | Complex logic |
| `:safety` | Claude | Safety focus |
**Task Inference:**
- Explicit: `Router.execute(router, :text, prompt, task: :code_generation)`
- Automatic: Infers from prompt keywords
  - Code keywords: "write", "implement", "function", "code", "class"
  - Analysis keywords: "analyze", "explain", "review", "compare"
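
The inference rules can be sketched as a keyword lookup in which an explicit `task:` option takes priority. This is an illustration only: the module `TaskInferenceSketch`, the whole-word matching, the choice to check code keywords before analysis keywords, and the `:general` default are assumptions rather than documented Router behavior.

```elixir
defmodule TaskInferenceSketch do
  # Keyword lists copied from the documentation above.
  @code_keywords ~w(write implement function code class)
  @analysis_keywords ~w(analyze explain review compare)

  # An explicit :task option wins; otherwise infer from the prompt text.
  def infer(prompt, opts \\ []) do
    case Keyword.fetch(opts, :task) do
      {:ok, task} -> task
      :error -> infer_from_keywords(prompt)
    end
  end

  defp infer_from_keywords(prompt) do
    # Lowercase the prompt and split it into alphabetic words.
    words = prompt |> String.downcase() |> String.split(~r/[^a-z]+/, trim: true)

    cond do
      Enum.any?(words, &(&1 in @code_keywords)) -> :code_generation
      Enum.any?(words, &(&1 in @analysis_keywords)) -> :analysis
      # Assumed default when no keyword matches.
      true -> :general
    end
  end
end

TaskInferenceSketch.infer("Write a fibonacci function")
TaskInferenceSketch.infer("Analyze this architecture decision")
TaskInferenceSketch.infer("Hello", task: :writing)
```

A prompt such as "Review this code" contains keywords from both lists, so the order of the checks decides the result. Passing `task:` explicitly avoids that ambiguity.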
**Configuration:**
- `task_mappings` - Task to provider mapping
- `fallback_provider` - Default if preferred unavailable
- `max_failures` - Before marking unavailable
## Error Handling
```elixir
case Router.execute(router, :text, "Hello", []) do
  {:ok, response, updated_router} ->
    # Success - use updated_router for subsequent calls
    IO.puts(response)

  {:error, :all_providers_failed} ->
    # All providers failed
    IO.puts("No providers available")

  {:error, reason} ->
    # Other error
    IO.puts("Error: #{inspect(reason)}")
end
```
### Manual Result Reporting
```elixir
# Report success/failure for custom execution
router = Router.report_result(router, :gemini, {:ok, "response"})
router = Router.report_result(router, :gemini, {:error, :timeout})
# Get next provider after failure
{:ok, next_provider, router} = Router.next_provider(router, :gemini)
```
## Execution Flow
```
User Request
     |
     v
Router.execute()
     |
     v
Strategy.select_provider()
     |
     v
Get/Create Provider Instance
     |
     v
Provider.generate_text() or .generate_embeddings()
     |
     +---> Success: return {:ok, result, router}
     |
     +---> Failure: report_result() -> next_provider() -> retry
                |
                +--> All exhausted: {:error, :all_providers_failed}
```
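
The retry branch of this flow amounts to a recursive walk over the candidate providers. The sketch below shows that shape in isolation, assuming a hypothetical `call` function that stands in for the provider request. It leaves out health tracking and router state, and `ExecutionFlowSketch` is not part of the library.

```elixir
defmodule ExecutionFlowSketch do
  # No providers left to try.
  def run([], _call), do: {:error, :all_providers_failed}

  # Try the first provider; on failure, move on to the next one.
  # `call` is a stand-in that takes a provider atom and returns
  # {:ok, result} or {:error, reason}.
  def run([provider | rest], call) do
    case call.(provider) do
      {:ok, result} -> {:ok, result, provider}
      {:error, _reason} -> run(rest, call)
    end
  end
end

# Simulated providers: :gemini times out, the others answer.
call = fn
  :gemini -> {:error, :timeout}
  provider -> {:ok, "answer from #{provider}"}
end

ExecutionFlowSketch.run([:gemini, :claude, :codex], call)
# {:ok, "answer from claude", :claude}

ExecutionFlowSketch.run([:gemini], call)
# {:error, :all_providers_failed}
```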
## Health Tracking
Each strategy tracks provider health differently:
### Fallback
- Counts consecutive failures per provider
- Skips provider when count >= `max_failures`
- Resets count after `failure_decay_ms` or on success
### Round-Robin
- Counts consecutive failures per provider
- Marks unavailable when count >= `max_consecutive_failures`
- Recovers after `recovery_cooldown_ms`
- Resets count on success
### Specialist
- Counts total failures per provider
- Marks unavailable when count >= `max_failures`
- Falls back to `fallback_provider`
- Resets count on success
## Provider Instance Caching
The router caches provider instances:
```elixir
# First use creates instance
{:ok, response, router} = Router.execute(router, :text, "Hello", [])
# Subsequent calls reuse cached instance
{:ok, response, router} = Router.execute(router, :text, "World", [])
```
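
Conceptually, this kind of cache is a map from provider name to instance that is filled on first use. The sketch below shows that pattern with a hypothetical `build` function. The Router's internal cache structure is not documented here, so treat `InstanceCacheSketch` as an illustration only.

```elixir
defmodule InstanceCacheSketch do
  # Return the cached instance, or build it once and store it.
  # `build` is a stand-in for whatever creates a provider instance.
  def fetch(cache, provider, build) do
    case Map.fetch(cache, provider) do
      {:ok, instance} ->
        {instance, cache}

      :error ->
        instance = build.(provider)
        {instance, Map.put(cache, provider, instance)}
    end
  end
end

build = fn provider -> %{name: provider, created_by: :build} end

# First use creates the instance and returns an updated cache.
{instance, cache} = InstanceCacheSketch.fetch(%{}, :gemini, build)

# Second use finds it in the cache; the builder is not called again.
{^instance, ^cache} = InstanceCacheSketch.fetch(cache, :gemini, fn _ -> raise "not called" end)
```

Because the cache lives in immutable state, the caller only benefits from it by passing the returned value along, which is the same reason each `Router.execute/4` call returns an updated router.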
## Best Practices
1. **Use fallback for reliability** - When uptime is critical
2. **Use round-robin for load balancing** - When distributing load matters
3. **Use specialist for optimization** - When matching task to provider matters
4. **Always use updated router** - Router state changes after each call
5. **Handle `:all_providers_failed`** - Have a fallback plan
6. **Configure timeouts** - Prevent hanging on slow providers
## Example: Complete Setup
```elixir
alias Rag.Router
# Configure the specialist strategy with explicit task mappings
{:ok, router} = Router.new(
  providers: [:gemini, :claude, :codex],
  strategy: :specialist,
  task_mappings: %{
    embeddings: :gemini,
    code_generation: :codex,
    analysis: :claude,
    general: :gemini
  },
  fallback_provider: :gemini,
  max_failures: 3
)
# Embeddings go to Gemini
{:ok, embeddings, router} = Router.execute(router, :embeddings, ["text"], [])
# Code tasks go to Codex
{:ok, code, router} = Router.execute(router, :text,
  "Write a fibonacci function",
  task: :code_generation
)
# Analysis goes to Claude
{:ok, analysis, router} = Router.execute(router, :text,
  "Analyze this architecture decision",
  task: :analysis
)
```
## Next Steps
- [LLM Providers](providers.md) - Learn about each provider
- [Embeddings](embeddings.md) - Embedding generation service