# Multi-Model Workflows

`Baton.MultiModel` helps when different steps should run on different LLMs, or
when you want to run the *same* step across several models and combine the
results. It builds on the normal workflow API — see the
[building a workflow guide](building_a_workflow.md) for the basics.

## Per-step models

Hardcoding model strings inside each worker makes them awkward to change and
test. Instead, declare a model map on the workflow with `configure/2` and let
each worker read its model from the job's args.

`configure/2` records the map on the workflow. Each later `Baton.add/4` call
then injects the matching model into that step's args under the
`"workflow_model"` key:

```elixir
Baton.new(workflow_name: "analysis")
|> Baton.MultiModel.configure(%{
  parse:  "claude-sonnet-4-20250514",
  assess: "claude-opus-4-20250514",
  report: "claude-sonnet-4-20250514"
})
|> Baton.add(:parse,  ParseDoc.new(%{text: text}))
|> Baton.add(:assess, Assess.new(%{}),  deps: [:parse])
|> Baton.add(:report, Report.new(%{}),  deps: [:assess])
|> Baton.insert!()
```

In a worker, read the model with `model_for/2`, which falls back to a default
when no model was configured for that step:

```elixir
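def perform_workflow(%Oban.Job{} = job) do
  # build_messages/1 is a placeholder for however this worker builds its prompt.
  messages = build_messages(job.args)
  model = Baton.MultiModel.model_for(job, "claude-sonnet-4-20250514")
  Baton.Debug.call_llm(job, messages, model: model)
end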
```

> #### Steps not in the map are untouched {: .info}
>
> Only steps whose name appears in the model map get `workflow_model` injected;
> everything else is added exactly as a normal `Baton.add/4` call.
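
To make the injection rule concrete, here is a minimal sketch in plain Elixir. `ModelInjectionSketch` and both of its functions are hypothetical stand-ins written for this guide, not Baton's implementation; they only mirror the behaviour described above (inject `"workflow_model"` when the step is in the map, fall back to a default when reading).

```elixir
defmodule ModelInjectionSketch do
  # Hypothetical helper, not part of Baton: returns the args a step would end
  # up with, given the workflow's model map.
  def inject_model(args, step_name, model_map) do
    case Map.fetch(model_map, step_name) do
      {:ok, model} -> Map.put(args, "workflow_model", model)
      :error -> args
    end
  end

  # Hypothetical counterpart to model_for/2, working on bare args: use the
  # injected model when present, otherwise the caller's default.
  def read_model(args, default) do
    Map.get(args, "workflow_model", default)
  end
end

models = %{parse: "claude-sonnet-4-20250514"}

# :parse is in the map, so its args gain a "workflow_model" entry.
parse_args = ModelInjectionSketch.inject_model(%{"text" => "hi"}, :parse, models)

# :cleanup is not in the map, so its args come back unchanged.
cleanup_args = ModelInjectionSketch.inject_model(%{"text" => "hi"}, :cleanup, models)

ModelInjectionSketch.read_model(parse_args, "fallback-model")
ModelInjectionSketch.read_model(cleanup_args, "fallback-model")
```

The first `read_model/2` call returns the injected model and the second returns `"fallback-model"`, which is the same split a worker sees through `model_for/2`.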

## Fan-out across models, then synthesize

To run one analysis across several models in parallel and merge the outputs, use
`fan_out/4`. It generates one step per model plus an optional synthesis step
that depends on all of them:

```elixir
Baton.new(workflow_name: "multi-model-quality")
|> Baton.add(:parse, ParseDoc.new(%{text: text}))
|> Baton.MultiModel.fan_out(:assess, Assess,
     models: [
       "claude-sonnet-4-20250514",
       "claude-opus-4-20250514",
       "gpt-4o"
     ],
     args: %{doc_id: "doc-123"},
     deps: [:parse],
     synthesize_with: SynthesizeAssessments,
     synthesize_model: "claude-opus-4-20250514"
   )
|> Baton.add(:report, Report.new(%{}), deps: [:assess_synthesis])
|> Baton.insert!()
```

This produces the following graph, in which each of the three fan-out steps
depends on `:parse`:

```
:assess_sonnet_4   ─┐
:assess_opus_4     ─┼─ :assess_synthesis ── :report
:assess_gpt_4o     ─┘
```

- Step names are derived from the model string: `{base}_{short}` (e.g.
  `assess_sonnet_4`), and the synthesis step is `{base}_synthesis`.
- Each fan-out step runs `Assess` with its model injected and the shared `args`.
- The synthesis step runs `SynthesizeAssessments` once all models have finished.
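
If you need to predict the generated step names (for example, to list them in `deps:`), the naming scheme can be sketched as below. The shortening rule here is an assumption inferred from the three names shown above; `StepNameSketch` is hypothetical, and the library's actual rule may differ for other model strings.

```elixir
defmodule StepNameSketch do
  # Assumed shortening rule: drop a leading "claude-" and a trailing
  # "-YYYYMMDD" date stamp, then turn any remaining run of non-alphanumeric
  # characters into a single "_".
  def short_name(model) do
    model
    |> String.replace_prefix("claude-", "")
    |> String.replace(~r/-\d{8}$/, "")
    |> String.replace(~r/[^a-z0-9]+/i, "_")
  end

  # {base}_{short}. This creates atoms dynamically, which is fine for a small,
  # fixed list of models.
  def fan_step(base, model), do: :"#{base}_#{short_name(model)}"

  # {base}_synthesis
  def synthesis_step(base), do: :"#{base}_synthesis"
end

StepNameSketch.fan_step(:assess, "claude-sonnet-4-20250514")
# => :assess_sonnet_4

StepNameSketch.fan_step(:assess, "gpt-4o")
# => :assess_gpt_4o

StepNameSketch.synthesis_step(:assess)
# => :assess_synthesis
```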

### Options

| Option | Required | Purpose |
|--------|----------|---------|
| `:models` | yes | model strings to fan out across |
| `:args` | no | base args passed to each model's worker (`%{}` default) |
| `:deps` | no | upstream dependencies shared by all fan-out steps |
| `:synthesize_with` | no | worker module for the synthesis step; omit to skip it |
| `:synthesize_model` | no | model for the synthesis step (default: first model) |
| `:synthesize_args` | no | extra args merged into the synthesis step |

If you omit `:synthesize_with`, no synthesis step is added — add your own
downstream step that depends on the generated fan-out step names instead.

### The synthesis worker

In the synthesis step, `collect_fan_results/1` returns each model's result keyed
by its model string, so you can compare or vote across them:

```elixir
defmodule SynthesizeAssessments do
  use Baton.Worker, queue: :default

  @impl true
  def perform_workflow(%Oban.Job{} = job) do
    by_model = Baton.MultiModel.collect_fan_results(job)
    # => %{
    #   "claude-sonnet-4-20250514" => %{"score" => 7, ...},
    #   "claude-opus-4-20250514"   => %{"score" => 8, ...},
    #   "gpt-4o"                    => %{"score" => 6, ...}
    # }

    {:ok, %{"consensus" => MyApp.Vote.median(by_model)}}
  end
end
```
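
`MyApp.Vote.median/1` above is application code, not something Baton provides. One possible implementation, assuming every model's result carries a numeric `"score"` key as in the example output:

```elixir
defmodule MyApp.Vote do
  # Hypothetical helper: takes a map of model string => result map and returns
  # the median of the "score" values. Raises if a result has no "score".
  def median(by_model) when map_size(by_model) > 0 do
    scores =
      by_model
      |> Map.values()
      |> Enum.map(&Map.fetch!(&1, "score"))
      |> Enum.sort()

    count = length(scores)
    mid = div(count, 2)

    if rem(count, 2) == 1 do
      # Odd number of models: take the middle score.
      Enum.at(scores, mid)
    else
      # Even number of models: average the two middle scores.
      (Enum.at(scores, mid - 1) + Enum.at(scores, mid)) / 2
    end
  end
end

MyApp.Vote.median(%{
  "claude-sonnet-4-20250514" => %{"score" => 7},
  "claude-opus-4-20250514" => %{"score" => 8},
  "gpt-4o" => %{"score" => 6}
})
# => 7
```

A median is less sensitive than a mean to one model returning an outlier score, which is why it suits a small panel of models.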

## Tracking cost

Fanning out multiplies your LLM spend, so cost visibility matters. If a worker's
result map includes an `"llm_usage"` key, `Baton.Worker` strips it from the
result and automatically records per-step token counts and cost to
`workflow_step_stats`. Query it with `Baton.Stats`:

```elixir
Baton.Stats.workflow_totals(workflow_id)
# => %{input_tokens: ..., output_tokens: ..., cost_usd: #Decimal<...>, ...}

Baton.Stats.cost_by_model(from_dt, to_dt)
```

Cost is computed via the configured pricing module
(`config :baton, pricing: MyApp.Pricing`) — see `Baton.Pricing`. Provide your
own and keep it current; the built-in `Baton.Pricing.Default` is a starting
point only.
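
As a rough illustration of what a pricing module has to compute, the sketch below maps a model and its token counts to a cost. Everything in it is an assumption: the module name, the `cost/3` function and its return shape are hypothetical (check `Baton.Pricing` for the callbacks it actually expects), and the rates are invented placeholders, not real prices.

```elixir
defmodule MyApp.PricingSketch do
  # Placeholder rates in micro-dollars per 1,000 tokens. These numbers are
  # made up for the example; replace them with your provider's current prices.
  @rates %{
    "example-small" => %{input: 500, output: 1_500},
    "example-large" => %{input: 5_000, output: 15_000}
  }

  # Returns {:ok, micro_dollars}, or :error for a model with no known rate.
  # Integer arithmetic avoids floating-point rounding in money amounts.
  def cost(model, input_tokens, output_tokens) do
    with {:ok, rate} <- Map.fetch(@rates, model) do
      {:ok, div(input_tokens * rate.input + output_tokens * rate.output, 1_000)}
    end
  end
end

MyApp.PricingSketch.cost("example-small", 2_000, 1_000)
# => {:ok, 2500}

MyApp.PricingSketch.cost("unknown-model", 2_000, 1_000)
# => :error
```

Returning `:error` for unknown models, rather than a cost of zero, makes a missing rate visible instead of silently under-reporting spend.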

See `Baton.MultiModel` for the full API.