# Braintrust
> ⚠️ **Work in Progress** - This package is under active development. The README below describes the target API design and may not reflect current functionality.
An unofficial Elixir client for the [Braintrust](https://braintrust.dev) AI evaluation and observability platform.
Braintrust is an end-to-end platform for evaluating, monitoring, and improving AI applications. This Hex package provides Elixir/Phoenix applications with access to Braintrust's REST API for managing projects, experiments, datasets, logs, and prompts.
## Installation
Add `braintrust` to your list of dependencies in `mix.exs`:
```elixir
def deps do
[
{:braintrust, "~> 0.0.1"}
]
end
```
## Configuration
Set your API key via environment variable:
```bash
export BRAINTRUST_API_KEY="sk-your-api-key"
```
Or configure in your application:
```elixir
# config/config.exs
config :braintrust, api_key: System.get_env("BRAINTRUST_API_KEY")
# Or at runtime
Braintrust.configure(api_key: "sk-xxx")
```
API keys can be created at [braintrust.dev/app/settings](https://www.braintrust.dev/app/settings?subroute=api-keys).
## Usage
### Projects
```elixir
# List all projects
{:ok, projects} = Braintrust.Project.list()
# Create a project
{:ok, project} = Braintrust.Project.create(%{name: "my-project"})
# Get a project by ID
{:ok, project} = Braintrust.Project.get(project_id)
# Delete a project
:ok = Braintrust.Project.delete(project_id)
```
### Logging Traces
Log production traces for observability:
```elixir
{:ok, _} = Braintrust.Log.insert(project_id, %{
events: [
%{
input: %{messages: [%{role: "user", content: "Hello"}]},
output: "Hi there!",
scores: %{quality: 0.9},
metadata: %{model: "gpt-4"}
}
]
})
```
### Experiments
Run evaluations and track results:
```elixir
# Create an experiment
{:ok, experiment} = Braintrust.Experiment.create(project_id, %{name: "baseline-v1"})
# Insert experiment events
{:ok, _} = Braintrust.Experiment.insert(experiment_id, %{
events: [
%{
input: %{question: "What is 2+2?"},
output: "4",
scores: %{accuracy: 1.0}
}
]
})
# Get experiment summary with scores
{:ok, summary} = Braintrust.Experiment.summarize(experiment_id)
```
### Datasets
Manage test data for evaluations:
```elixir
# Create a dataset
{:ok, dataset} = Braintrust.Dataset.create(project_id, %{name: "test-cases"})
# Insert test cases
{:ok, _} = Braintrust.Dataset.insert(dataset_id, %{
events: [
%{input: %{question: "What is 2+2?"}, expected: "4"},
%{input: %{question: "What is 3+3?"}, expected: "6"}
]
})
# Fetch dataset records
{:ok, records} = Braintrust.Dataset.fetch(dataset_id)
```
### Prompts
Version-controlled prompt management:
```elixir
# List prompts
{:ok, prompts} = Braintrust.Prompt.list(project_name: "my-project")
# Get a prompt by ID
{:ok, prompt} = Braintrust.Prompt.get(prompt_id)
```
### Pagination
Results are automatically paginated. Use streams for lazy iteration:
```elixir
Braintrust.Project.list()
|> Braintrust.Pagination.stream()
|> Stream.take(100)
|> Enum.to_list()
```
### Error Handling
All API functions return `{:ok, result}` or `{:error, %Braintrust.Error{}}`:
```elixir
case Braintrust.Project.get(project_id) do
{:ok, project} ->
handle_project(project)
{:error, %Braintrust.Error{type: :not_found}} ->
handle_not_found()
{:error, %Braintrust.Error{type: :rate_limit, retry_after: ms}} ->
Process.sleep(ms)
retry()
{:error, %Braintrust.Error{type: :authentication}} ->
handle_auth_error()
{:error, %Braintrust.Error{} = error} ->
Logger.error("API error: #{error.message}")
handle_error(error)
end
```
## Features
- **Projects** - Manage AI projects containing experiments, datasets, and logs
- **Experiments** - Run evaluations and compare results across runs
- **Datasets** - Version-controlled test data with support for pinning evaluations to specific versions
- **Logging/Tracing** - Production observability with span-based tracing
- **Prompts** - Version-controlled prompt management with caching
- **Functions** - Access to tools, scorers, and callable functions
- **Automatic Retry** - Exponential backoff for rate limits and transient errors
- **Pagination Streams** - Lazy iteration over paginated results
## API Coverage
| Resource | Endpoint | Status |
|----------|----------|--------|
| Projects | `/v1/project` | 🚧 Planned |
| Experiments | `/v1/experiment` | 🚧 Planned |
| Datasets | `/v1/dataset` | 🚧 Planned |
| Logs | `/v1/project_logs` | 🚧 Planned |
| Prompts | `/v1/prompt` | 🚧 Planned |
| Functions | `/v1/function` | 🚧 Planned |
| BTQL | `/btql` | 🚧 Planned |
## Resources
- [Braintrust Documentation](https://www.braintrust.dev/docs)
- [API Reference](https://www.braintrust.dev/docs/api-reference/introduction)
- [OpenAPI Specification](https://github.com/braintrustdata/braintrust-openapi)
## License
MIT