# Arcanum
Provider-agnostic AI inference library for Elixir.
## Overview
Arcanum provides a unified interface for chat completion, streaming, embeddings, tool use, and media generation across multiple AI providers. Model capabilities are declared upfront via profiles — no runtime detection or error-code fallbacks.
## Supported Providers
| Provider | API Format | Features |
|----------|-----------|----------|
| OpenAI | OpenAI | Chat, stream, tools, vision, image generation |
| Anthropic | Anthropic | Chat, stream, tools, vision |
| DeepSeek | OpenAI | Chat, stream, tools |
| GitHub Copilot | OpenAI | Chat, stream, tools, vision (OAuth device flow) |
| OpenRouter | OpenAI | Chat, stream, tools |
| ZAI / Zhipu | OpenAI | Chat, stream, tools |
| LM Studio | OpenAI | Chat, stream, tools (auto model loading) |
## Installation
```elixir
def deps do
[
{:arcanum, "~> 0.1.1"}
]
end
```
## Usage
All inference goes through `Arcanum.Gateway`. Callers never touch adapters directly.
### Provider Map
Every Gateway function takes a provider map describing the endpoint:
```elixir
provider = %{
base_url: "https://api.openai.com",
api_key: "sk-...",
kind: "openai",
api_format: :openai,
type: :cloud
}
```
| Key | Type | Description |
|-----|------|-------------|
| `base_url` | `String.t()` | Required. Provider API base URL. |
| `api_key` | `String.t() \| nil` | API key. Not needed for local providers. For Copilot, pass the OAuth access token obtained from the device flow (see GitHub Copilot Authentication). |
| `api_format` | `:openai \| :anthropic \| :custom` | Determines which adapter handles the request. |
| `kind` | `String.t()` | Provider ID (e.g. `"openai"`, `"anthropic"`, `"ollama"`, `"github-copilot"`). Used for profile resolution and provider-specific behavior. |
| `type` | `:cloud \| :local` | Used by `Arcanum.Probe` to skip TCP checks for cloud providers. |
| `extra_headers` | `[{String.t(), String.t()}] \| nil` | Additional HTTP headers (injected automatically for Copilot). |
### Chat Completion
```elixir
alias Arcanum.{Gateway, Intent}
intent = %Intent{
model: "gpt-4o",
messages: [
%{role: :system, content: Intent.text("You are a helpful assistant.")},
%{role: :user, content: Intent.text("What is Elixir?")}
],
temperature: 0.7,
max_tokens: 1024
}
{:ok, %Arcanum.Response{content: content}} = Gateway.chat(provider, intent)
```
### Streaming
```elixir
{:ok, stream} = Gateway.stream(provider, intent)
Enum.each(stream, fn
{:data, %Arcanum.Response{content: chunk}} -> IO.write(chunk || "")
:done -> IO.puts("\n--- done ---")
{:error, reason} -> IO.puts("Error: #{inspect(reason)}")
end)
```
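To collect the streamed text into a single string instead of printing it, the events can be folded with `Enum.reduce_while/3`. This is a minimal sketch, not part of Arcanum: `StreamSketch.collect/1` is a hypothetical helper, and the example feeds it a plain list with maps standing in for `%Arcanum.Response{}` so it runs without a provider.

```elixir
defmodule StreamSketch do
  @doc """
  Folds stream events into `{:ok, text}` or `{:error, reason}`.
  Stops at the first `:done` or `{:error, _}` event.
  """
  def collect(events) do
    result =
      Enum.reduce_while(events, [], fn
        # Chunks without text (e.g. tool-call deltas) are skipped.
        {:data, %{content: nil}}, acc -> {:cont, acc}
        # Build iodata instead of concatenating binaries on every chunk.
        {:data, %{content: chunk}}, acc -> {:cont, [acc, chunk]}
        :done, acc -> {:halt, {:done, acc}}
        {:error, reason}, _acc -> {:halt, {:error, reason}}
      end)

    case result do
      {:error, reason} -> {:error, reason}
      {:done, acc} -> {:ok, IO.iodata_to_binary(acc)}
      # The enumerable ended without an explicit :done event.
      acc -> {:ok, IO.iodata_to_binary(acc)}
    end
  end
end

# Stand-in events; a real stream would come from Gateway.stream/3.
events = [{:data, %{content: "Hel"}}, {:data, %{content: nil}}, {:data, %{content: "lo"}}, :done]
{:ok, text} = StreamSketch.collect(events)
IO.puts(text)
```

Because the pattern `%{content: chunk}` also matches structs, the same function should accept the events shown in the streaming example above.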
### Tool Use
Pass tools in the intent. Arcanum handles native, XML-text, and JSON-text tool call formats transparently based on the model profile.
```elixir
intent = %Intent{
model: "gpt-4o",
messages: [%{role: :user, content: Intent.text("What is the weather in Berlin?")}],
tools: [
%{
type: "function",
function: %{
name: "get_weather",
description: "Get current weather for a location",
parameters: %{
"type" => "object",
"properties" => %{
"location" => %{"type" => "string", "description" => "City name"}
},
"required" => ["location"]
}
}
}
]
}
{:ok, %Arcanum.Response{tool_calls: tool_calls}} = Gateway.chat(provider, intent)
# tool_calls is a list of:
# %{id: "call_abc", function: %{name: "get_weather", arguments: "{\"location\":\"Berlin\"}"}}
```
Models that don't support native tool calls (e.g. some Ollama models) automatically get XML-text or JSON-text extraction based on their profile's `tool_call_format`.
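To give a feel for what text-based extraction involves, here is a rough sketch. The `<tool_call name="...">` tag format and the `ToolTextSketch` module are assumptions made for illustration; the format Arcanum's Normalizer actually parses is not documented here. The output mirrors the `tool_calls` shape shown above.

```elixir
defmodule ToolTextSketch do
  # Hypothetical wire format, for illustration only:
  #   <tool_call name="get_weather">{"location":"Berlin"}</tool_call>
  defp tool_call_regex, do: ~r/<tool_call name="([^"]+)">(.*?)<\/tool_call>/s

  @doc "Returns `{remaining_text, tool_calls}` extracted from raw model output."
  def extract(text) do
    calls =
      tool_call_regex()
      |> Regex.scan(text, capture: :all_but_first)
      |> Enum.with_index(1)
      |> Enum.map(fn {[name, args], index} ->
        # Arguments stay a JSON string, as in the native tool_calls shape.
        %{id: "call_#{index}", function: %{name: name, arguments: String.trim(args)}}
      end)

    remaining = text |> String.replace(tool_call_regex(), "") |> String.trim()
    {remaining, calls}
  end
end
```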
### Vision (Multimodal)
```elixir
intent = %Intent{
model: "gpt-4o",
messages: [
%{role: :user, content: [
%{type: :text, text: "What's in this image?"},
%{type: :image_url, url: "https://example.com/photo.jpg"}
]}
]
}
# Alternatively, pass the image inline as a base64 content block in place of the :image_url block:
%{type: :image_base64, media_type: "image/png", data: "iVBOR..."}
```
### Embeddings
```elixir
{:ok, embeddings} = Gateway.embed(provider, "text-embedding-3-small", "Hello world")
# embeddings is a list of floats
```
Supported by OpenAI and Ollama adapters. Returns `{:error, :not_supported}` for adapters that don't override the default.
### Image Generation
```elixir
alias Arcanum.MediaIntent
media_intent = %MediaIntent{
model: "gpt-image-1",
prompt: "A cat wearing a wizard hat",
size: "1024x1024",
quality: "auto",
n: 1,
format: "png"
}
{:ok, %Arcanum.MediaResponse{items: items}} = Gateway.generate_image(provider, media_intent)
# Each item: %{data: binary(), url: nil, revised_prompt: "...", content_type: "image/png"}
```
### Video Generation
```elixir
{:ok, %Arcanum.MediaResponse{items: items}} = Gateway.generate_video(provider, media_intent)
```
Both `generate_image/3` and `generate_video/3` return `{:error, :not_supported}` for adapters that don't override the default implementation.
### List Models
```elixir
{:ok, models} = Gateway.list_models(provider)
# ["gpt-4o", "gpt-4o-mini", "gpt-4.1", ...]
```
### Probe Availability
```elixir
Arcanum.Probe.probe_provider(provider)
# :online | :offline
```
Cloud providers always return `:online`. Local providers get a TCP connect check (2s timeout).
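The behaviour described above can be approximated with `:gen_tcp`. This is a sketch under stated assumptions, not `Arcanum.Probe`'s actual implementation: `ProbeSketch` is a hypothetical module, and it assumes the host and port are taken from the provider's `base_url`.

```elixir
defmodule ProbeSketch do
  # Connect timeout in milliseconds, matching the 2s figure above.
  @timeout_ms 2_000

  @doc "Returns `:online` or `:offline` for a provider map."
  def probe(%{type: :cloud}), do: :online

  def probe(%{type: :local, base_url: base_url}) do
    with %URI{host: host, port: port} when is_binary(host) and is_integer(port) <-
           URI.parse(base_url),
         {:ok, socket} <-
           :gen_tcp.connect(String.to_charlist(host), port, [:binary, active: false], @timeout_ms) do
      # Only reachability matters, so the socket is closed right away.
      :gen_tcp.close(socket)
      :online
    else
      # Unparseable URL, refused connection, or timeout.
      _ -> :offline
    end
  end
end
```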
### Ensure Model Loaded (LM Studio)
```elixir
:ok = Arcanum.EnsureModel.ensure_loaded(provider, "qwen2.5-coder", context_length: 32_768)
```
Pre-loads a model on LM Studio with the specified context length. No-op for all other providers.
### GitHub Copilot Authentication
```elixir
alias Arcanum.Auth.Copilot
# 1. Start device flow
{:ok, flow} = Copilot.start_device_flow()
# flow.verification_uri -> "https://github.com/login/device"
# flow.user_code -> "ABCD-1234"
# 2. User visits URL and enters code, then:
{:ok, access_token} = Copilot.poll_for_token(flow)
# 3. Use the token as the provider's api_key
provider = %{
base_url: Copilot.base_url(),
api_key: access_token,
kind: "github-copilot",
api_format: :openai,
type: :cloud,
extra_headers: Copilot.copilot_headers(access_token)
}
```
For non-blocking flows, use `Copilot.poll_once/1` for single-attempt polling (e.g. from an Oban job).
## Configuration
### Application Config
```elixir
# Required for GitHub Copilot OAuth
config :arcanum, copilot_client_id: "your-github-oauth-client-id"
# Optional: override HTTP client (defaults to Req)
config :arcanum, http_client: MyCustomClient
```
### Model Profile System
Every model gets a `ModelProfile` that declares its capabilities upfront. Profiles drive serialization, normalization, and feature gating — the adapter never guesses.
```elixir
%Arcanum.ModelProfile{
supports_system_role: true, # can the model accept system messages?
supports_tools: true, # native tool call support?
supports_vision: false, # multimodal image input?
supports_image_generation: false, # image generation capability?
supports_video_generation: false, # video generation capability?
tool_call_format: :native, # :native | :xml_text (a JSON-text format is also handled; see Tool Use)
reasoning_field: nil, # atom — where the model puts thinking (e.g. :reasoning_content)
thinking_param: nil, # map sent to provider to enable thinking (e.g. %{type: "enabled"})
preserve_reasoning: false, # keep thinking content in response?
max_context: 131_072, # maximum context window
max_images_per_message: 4, # vision: max images per message
max_outputs_per_request: 4, # media generation: max outputs
supported_sizes: [], # media generation: allowed dimensions
supported_formats: [], # media generation: allowed formats
provider_routing: nil # provider-specific routing metadata
}
```
### Profile Resolution
Profiles are resolved automatically by `Gateway` via `Arcanum.ModelProfile.Resolver`. Resolution follows a strict priority chain:
```
1. User overrides (highest — caller-provided fields)
2. Overlay (provider/model-specific, from priv/overlays.json)
3. Registry (models.dev cache — single source of truth)
4. Provider default (fallback for local providers not in models.dev)
5. Global default (lowest — assumes weakest capabilities)
```
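One way to read the chain is as a left-to-right merge, lowest priority first, so that each higher layer only replaces the fields it sets. The sketch below assumes exactly that; `ResolverSketch`, the layer keys, and the placeholder global default values are hypothetical, and the real Resolver may combine layers differently.

```elixir
defmodule ResolverSketch do
  # Placeholder values standing in for the global default profile.
  @global_default %{
    supports_tools: false,
    supports_vision: false,
    tool_call_format: :xml_text,
    max_context: 4_096
  }

  # Layers in ascending priority; later layers win on conflicting keys.
  @order [:provider_default, :registry, :overlay, :overrides]

  @doc "Merges whichever layers are present on top of the global default."
  def resolve(layers) do
    @order
    |> Enum.map(fn key -> Map.get(layers, key) || %{} end)
    |> Enum.reduce(@global_default, fn layer, acc -> Map.merge(acc, layer) end)
  end
end

profile =
  ResolverSketch.resolve(%{
    registry: %{supports_tools: true, max_context: 131_072},
    overlay: %{supports_vision: true},
    overrides: %{max_context: 65_536}
  })

# max_context comes from the overrides layer, supports_vision from the overlay,
# supports_tools from the registry, tool_call_format from the global default.
IO.inspect(profile)
```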
#### Registry (models.dev)
The `Arcanum.ModelProfile.Registry` GenServer fetches model capabilities from [models.dev](https://models.dev) and caches them in ETS. It refreshes the cache hourly and falls back gracefully if a fetch fails.
Default providers fetched: `openai`, `anthropic`, `deepseek`, `openrouter`, `xai`, `zai`, `zhipuai`, `github-copilot`, `lmstudio`.
```elixir
# Lookup a cached profile (returns nil if not found)
Arcanum.ModelProfile.Registry.lookup("openai", "gpt-4o")
# List all cached provider IDs
Arcanum.ModelProfile.Registry.cached_providers()
```
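For readers unfamiliar with ETS-backed caches, the lookup side can be pictured roughly as follows. This is an illustrative sketch only: `RegistrySketch`, its table name, and the `{provider, model}` key layout are assumptions, and the hourly refresh from models.dev is left out.

```elixir
defmodule RegistrySketch do
  @table :registry_sketch_profiles

  @doc "Creates the named table if it does not exist yet."
  def init do
    if :ets.whereis(@table) == :undefined do
      :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    end

    :ok
  end

  @doc "Stores a profile under a `{provider, model}` key."
  def put(provider, model, profile) do
    true = :ets.insert(@table, {{provider, model}, profile})
    :ok
  end

  @doc "Returns the cached profile, or nil if nothing is cached."
  def lookup(provider, model) do
    case :ets.lookup(@table, {provider, model}) do
      [{_key, profile}] -> profile
      [] -> nil
    end
  end

  @doc "Lists the distinct provider IDs present in the cache."
  def cached_providers do
    for {{provider, _model}, _profile} <- :ets.tab2list(@table), uniq: true, do: provider
  end
end
```

Reads go straight to the table rather than through a process mailbox, which is the usual reason for pairing a GenServer (for writes and refreshes) with ETS (for lookups).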
#### Overlays (`priv/overlays.json`)
Overlays patch capabilities that models.dev doesn't track (vision, image generation, reasoning params). They are compiled into the Resolver at build time.
```json
{
"overlays": {
"openai": {
"gpt-4o": { "supports_vision": true },
"gpt-image-1": {
"supports_image_generation": true,
"supported_sizes": ["1024x1024", "1024x1536", "1536x1024", "auto"],
"supported_formats": ["png", "webp", "jpeg"],
"max_outputs_per_request": 4
}
},
"deepseek": {
"deepseek-r1": { "preserve_reasoning": true }
}
},
"provider_defaults": {
"ollama": {
"supports_system_role": true,
"supports_tools": false,
"tool_call_format": "xml_text",
"max_context": 32768
}
}
}
```
#### Provider Defaults
For local providers not in models.dev (Ollama, LM Studio, vLLM), provider defaults from `priv/overlays.json` are used as the base profile. These assume conservative capabilities.
#### Profile Overrides
Callers can override any profile field at call time via the `:profile_overrides` option. Overrides take the highest priority in the resolution chain.
```elixir
# Force a model to use XML text tool calls
Gateway.chat(provider, intent, profile_overrides: %{tool_call_format: :xml_text})
# Override context window for a specific call
Gateway.chat(provider, intent, profile_overrides: %{max_context: 65_536})
# Enable vision for a model not in the registry
Gateway.chat(provider, intent, profile_overrides: %{supports_vision: true})
# Multiple overrides
Gateway.chat(provider, intent,
profile_overrides: %{
supports_tools: false,
tool_call_format: :xml_text,
max_context: 16_384
}
)
```
Any field from `ModelProfile` can be overridden. The override map is merged on top of the resolved profile, so you only need to specify the fields you want to change.
### Gateway Options
`Gateway.chat/3` and `Gateway.stream/3` both accept a keyword list of options as their third argument:
| Option | Type | Description |
|--------|------|-------------|
| `:profile_overrides` | `map()` | Override any `ModelProfile` fields for this call. |
| `:adapter` | `module()` | Override the adapter module (useful for testing). |
## Architecture
```
Gateway (single public entry point)
-> Auth resolution (API key, Copilot OAuth headers)
-> Profile resolution (Resolver: overrides > overlay > registry > provider default > global default)
-> Adapter dispatch (OpenAI, Anthropic, Ollama)
-> Response normalization (Normalizer: content fallback, think-tag stripping, tool-call extraction)
```
### Core Modules
| Module | Purpose |
|--------|---------|
| `Arcanum.Gateway` | Single entry point for all inference calls. |
| `Arcanum.Intent` | Canonical request struct. Content is always `[content_block()]`. |
| `Arcanum.Response` | Canonical response struct (content, thinking, tool_calls, usage). |
| `Arcanum.MediaIntent` | Request struct for image/video generation. |
| `Arcanum.MediaResponse` | Response struct for generated media (items with data/url). |
| `Arcanum.ModelProfile` | Declares model capabilities (tools, vision, reasoning, context). |
| `Arcanum.ModelProfile.Resolver` | Multi-layer profile resolution with override support. |
| `Arcanum.ModelProfile.Registry` | ETS cache backed by models.dev, refreshed hourly. |
| `Arcanum.Response.Normalizer` | Profile-driven post-processing (XML/JSON tool extraction, think tags). |
| `Arcanum.Provider` | Behaviour + macro (`use Arcanum.Provider`) with defoverridable defaults. |
| `Arcanum.Probe` | TCP availability check for local providers. |
| `Arcanum.EnsureModel` | Pre-loads models on LM Studio before inference. |
| `Arcanum.Auth.Copilot` | GitHub Copilot OAuth device code flow (RFC 8628). |
### Adapters
| Adapter | Behaviour Callbacks |
|---------|-------------------|
| `Arcanum.Adapters.OpenAI` | `chat`, `stream`, `list_models`, `embed`, `generate_image` |
| `Arcanum.Adapters.Anthropic` | `chat`, `stream`, `list_models` |
| `Arcanum.Adapters.Ollama` | `chat`, `stream`, `list_models`, `embed` |
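
The `{:error, :not_supported}` default for unimplemented callbacks follows from the `defoverridable` pattern mentioned under Core Modules. A reduced sketch of that pattern, using a hypothetical `ProviderSketch` behaviour with only two callbacks (the real `Arcanum.Provider` callbacks and arities may differ):

```elixir
defmodule ProviderSketch do
  @callback chat(provider :: map(), intent :: map()) :: {:ok, term()} | {:error, term()}
  @callback embed(provider :: map(), model :: String.t(), input :: String.t()) ::
              {:ok, [float()]} | {:error, term()}

  defmacro __using__(_opts) do
    quote do
      @behaviour ProviderSketch

      # Defaults: every callback reports :not_supported until overridden.
      @impl ProviderSketch
      def chat(_provider, _intent), do: {:error, :not_supported}

      @impl ProviderSketch
      def embed(_provider, _model, _input), do: {:error, :not_supported}

      defoverridable chat: 2, embed: 3
    end
  end
end

defmodule ProviderSketch.EchoAdapter do
  use ProviderSketch

  # Only chat/2 is overridden; embed/3 keeps the injected default.
  @impl ProviderSketch
  def chat(_provider, intent), do: {:ok, intent}
end
```

With this shape, adding a new callback to the behaviour does not break existing adapters, since they inherit the default.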
### Error Handling
All Gateway functions return `{:ok, result}` or `{:error, reason}`. Error shapes:
| Error | Meaning |
|-------|---------|
| `{:error, {:api_error, status, body}}` | HTTP error from the provider. |
| `{:error, :context_overflow}` | Input exceeded the model's context window. |
| `{:error, :not_supported}` | Adapter doesn't implement the requested callback. |
| `{:error, :copilot_auth_required}` | Copilot provider needs OAuth authentication. |
| `{:error, term()}` | Network or other transient errors. |
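
On the calling side, these shapes lend themselves to pattern matching. The following sketch maps each documented shape to a short message; `ErrorSketch.describe/1` is a hypothetical helper and the wording of the messages is illustrative.

```elixir
defmodule ErrorSketch do
  @doc "Turns a Gateway result into a short human-readable message."
  def describe({:ok, _result}), do: "ok"

  def describe({:error, {:api_error, status, _body}}),
    do: "provider returned HTTP #{status}"

  def describe({:error, :context_overflow}),
    do: "input exceeds the model's context window"

  def describe({:error, :not_supported}),
    do: "adapter does not implement this call"

  def describe({:error, :copilot_auth_required}),
    do: "Copilot OAuth authentication is required"

  # Catch-all clause last, so the specific shapes above match first.
  def describe({:error, other}),
    do: "network or other error: #{inspect(other)}"
end
```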
Transient HTTP errors (429, 502, 503, 529) are retried automatically up to 3 times by the adapters.
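
A bounded retry of this kind can be sketched as a small recursive function. The module below is hypothetical and makes two assumptions: "up to 3 times" is read as three retries after the initial attempt, and no delay is applied between attempts (a real adapter would likely back off).

```elixir
defmodule RetrySketch do
  # Status codes treated as transient, taken from the list above.
  @retryable [429, 502, 503, 529]
  # Assumption: three retries in addition to the first attempt.
  @max_retries 3

  @doc "Calls `fun` and retries it while it returns a retryable API error."
  def with_retry(fun, retries_left \\ @max_retries) do
    case fun.() do
      {:error, {:api_error, status, _body}}
      when status in @retryable and retries_left > 0 ->
        with_retry(fun, retries_left - 1)

      # Success, a non-retryable error, or the retry budget is exhausted.
      other ->
        other
    end
  end
end
```

Non-retryable statuses such as 400 pass through on the first attempt, and the counter guarantees the loop terminates.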
## Design Principles
- **Profile-driven.** Model capabilities are declared upfront, never discovered via error codes.
- **Everything has a limit.** Retries, timeouts, model counts, poll attempts — all bounded.
- **Callers never touch adapters directly.** Gateway is the only public interface.
- **Two-layer separation.** Adapters handle wire protocol faithfully. Normalizer handles model-specific post-processing.
## License
MIT