# QuackDB
[](https://hex.pm/packages/quackdb)
[](https://hexdocs.pm/quackdb)
DuckDB for Elixir applications, over DuckDB's experimental Quack protocol.
QuackDB gives Elixir applications an OTP-supervised DuckDB process, a DBConnection client, an Ecto adapter and query DSL for analytical DuckDB workflows, native append APIs, Explorer dataframe writes, Geo/WKB spatial integration, Table.Reader results, telemetry, and a managed DuckDB binary installer.
> [!WARNING]
> QuackDB targets DuckDB's experimental Quack protocol and is not production-ready yet. Public APIs, result shapes, Ecto adapter behavior, and supported protocol coverage may still change as DuckDB and QuackDB evolve. Validate behavior against your DuckDB version before relying on it for critical workloads.
```elixir
defmodule MyApp.Analytics do
use QuackDB.Ecto
alias QuackDB.Source
def category_latency do
source = Source.parquet("s3://bucket/events/*.parquet", hive_partitioning: true)
from event in source,
group_by: event.category,
select: %{
category: event.category,
p95: quantile_cont(event.duration_ms, 0.95),
median: median(event.duration_ms),
events: count()
}
end
end
```
## Why QuackDB?
DuckDB is already excellent at analytical SQL. QuackDB focuses on the Elixir side:
- run DuckDB as a supervised process during development, tests, notebooks, examples, or local apps;
- use DBConnection semantics for pooled sessions, transactions, streams, and query lifecycle;
- compose DuckDB analytical queries with Ecto instead of assembling SQL strings;
- use Elixir-native values such as `Duration`, `%Geo.*{}`, `Date.Range`, maps, lists, structs, and Explorer dataframes where possible;
- append rows, columns, and Explorer dataframes through DuckDB's native append path;
- expose results to Livebook and dataframe tooling through `Table.Reader`;
- keep raw SQL available when DuckDB-specific syntax is clearer or not represented by Ecto.
## Elixir integrations
| Elixir layer | QuackDB integration |
| --- | --- |
| OTP | supervised local DuckDB server, managed binary, restartable child specs |
| DBConnection | pooled Quack sessions, queries, streams, transactions |
| Ecto | adapter, query DSL, analytical helpers, migrations, writes |
| Explorer | dataframe append and dataframe-friendly results |
| Geo | `%Geo.*{}` params and WKB/GeoJSON workflows |
| Table.Reader | Livebook/dataframe-friendly result consumption |
| Telemetry | query, append, and fetch spans |
| Mix | `quackdb.install` task for managed DuckDB binaries |
## Installation
Add QuackDB to your dependencies:
```elixir
def deps do
[
{:quackdb, "~> 0.3.0"}
]
end
```
Optional integrations are enabled when their packages are present:
```elixir
def deps do
[
{:quackdb, "~> 0.3.0"},
{:ecto_sql, "~> 3.13"},
{:explorer, "~> 0.11"},
{:geo, "~> 4.1"}
]
end
```
DuckDB's Quack protocol is experimental. For local development, use DuckDB 1.5.3 or newer with the `quack` extension.
## Supervised DuckDB
For development, tests, examples, and notebooks, QuackDB can supervise DuckDB's CLI process and start a matching client pool. `child_specs/1` generates one random token and injects it into both children.
```elixir
children =
QuackDB.Server.child_specs(
server: [
name: MyApp.DuckDB,
duckdb: :managed,
endpoint: "quack:localhost:9494"
],
client: [
name: MyApp.QuackDB,
pool_size: System.schedulers_online()
]
)
```
`duckdb: :managed` downloads DuckDB's official CLI binary on first use, verifies known checksums for QuackDB's pinned DuckDB version, and caches it. QuackDB never downloads DuckDB during dependency compilation. Use `QUACKDB_BINARY_PATH`, `QUACKDB_BINARY_CACHE_DIR`, `duckdb: "/path/to/duckdb"`, or run the `quackdb.install` Mix task when you want explicit control. See the [managed DuckDB guide](guides/managed-duckdb.md).
You can also start DuckDB manually:
```sh
duckdb -interactive -init /dev/null \
-cmd "LOAD quack; CALL quack_serve('quack:localhost', token='super_secret');"
```
`quack:localhost` often binds on IPv6 localhost, so examples use `http://[::1]:9494`.
## DBConnection client
QuackDB can be used directly through its DBConnection-backed client.
```elixir
{:ok, conn} =
QuackDB.start_link(
uri: "http://[::1]:9494",
token: "super_secret"
)
{:ok, result} = QuackDB.query(conn, "SELECT ? AS name, ? AS n", ["duck", 42])
result.columns
#=> ["name", "n"]
result.rows
#=> [["duck", 42]]
```
Use `QuackDB.stream/4` for large result sets, or `QuackDB.columns/4` when a column-oriented shape is more convenient for analytics tooling.
## DuckDB workflows as Ecto queries
QuackDB exposes common DuckDB analytical workflows as Ecto-compatible helpers so they compose with normal queries.
### Analytical aggregates
```elixir
defmodule MyApp.Analytics do
use QuackDB.Ecto
def category_scores do
from event in "events",
group_by: event.category,
select: %{
category: event.category,
p95: quantile_cont(event.duration_ms, 0.95),
median: median(event.duration_ms),
precise_sum: fsum(event.duration_ms),
mode: mode(event.duration_ms),
weighted_average: weighted_avg(event.duration_ms, event.weight),
values: list(event.duration_ms, order_by: [desc_nulls_last: event.duration_ms]),
slow_events: filter(count(event.id), event.duration_ms > 1_000),
distinct_users: count(event.user_id, :distinct),
average_duration: coalesce(avg(event.duration_ms), 0),
events: count()
}
end
end
```
### Date and timestamp series
```elixir
use QuackDB.Ecto
from day in series(Date.range(~D[2024-01-01], ~D[2024-01-31])),
left_join: event in "events",
on: event.occurred_on == day.value,
group_by: day.value,
order_by: day.value,
select: %{
day: day.value,
events: count(event.id)
}
```
Timestamp series use `Duration` steps:
```elixir
from bucket in series(
~N[2024-01-01 00:00:00],
~N[2024-01-02 00:00:00],
step: Duration.new!(hour: 1)
),
select: bucket.value
```
### Source scans
DuckDB can query data where it already lives. QuackDB source helpers can be used directly as Ecto sources.
```elixir
use QuackDB.Ecto
alias QuackDB.Source
source = Source.parquet("s3://bucket/events/*.parquet", hive_partitioning: true)
from event in source,
group_by: event.category,
select: %{
category: event.category,
events: count(),
avg_score: avg(event.score)
}
```
QuackDB does not upload local files for you. The DuckDB server must be able to see the path, URL, object store, or lakehouse catalog. See the [sources guide](guides/sources.md).
### CTAS and full-text search
External data can be materialized with `CREATE TABLE AS`, indexed with DuckDB FTS, and queried with BM25 from Ecto.
```elixir
use QuackDB.Ecto
alias QuackDB.{DDL, FTS, Source}
query =
from doc in Source.parquet("s3://bucket/docs/*.parquet"),
select: %{
id: doc.id,
title: doc.title,
body: doc.body
}
MyApp.AnalyticsRepo.query!(DDL.create_table("docs", as: query, temporary: true))
MyApp.AnalyticsRepo.query!(FTS.create_index("docs", :id, [:title, :body], overwrite: true))
schema = FTS.schema_name("main.docs")
search = "duckdb analytics"
from doc in "docs",
where: bm25(^schema, doc.id, ^search) > 0,
order_by: [desc: bm25(^schema, doc.id, ^search)],
limit: 10,
select: %{
id: doc.id,
title: doc.title,
score: bm25(^schema, doc.id, ^search)
}
```
See the [full-text search guide](guides/full-text-search.md).
### Text and regex predicates
DuckDB text and RE2 regular-expression helpers compose with Ecto filters and aggregate `FILTER` clauses. Shared `contains/2` dispatches obvious string calls to DuckDB `contains` and spatial helper expressions to `ST_Contains`; ambiguous calls raise so `contains_text/2` and `st_contains/2` are available when you want to be explicit.
```elixir
use QuackDB.Ecto
from event in "events",
where: contains(event.name, "duck") and regexp_matches(event.name, ~r/^duck/i),
select: %{
slug: regexp_replace(event.name, ~r/\s+/, "-", "g"),
parts: string_split(event.tags, ",")
}
```
DuckDB regexes use RE2, so `~r` literals are intended for the syntax subset shared with Elixir regexes.
### Spatial queries
DuckDB Spatial works with Ecto queries and `%Geo.*{}` structs when the optional `:geo` package is installed.
```elixir
use QuackDB.Ecto
import QuackDB.Ecto.Spatial
alias QuackDB.Spatial
MyApp.AnalyticsRepo.query!(Spatial.load())
point = %Geo.Point{coordinates: {13.405, 52.52}, srid: nil}
from place in "places",
where: intersects(place.geom, ^point) and distance(place.geom, ^point) < 1_000,
select: %{
id: place.id,
name: place.name,
wkt: as_text(place.geom)
}
```
`GEOMETRY` values decode as WKB-compatible bytes for tested DuckDB Spatial values. `QuackDB.Geometry` can convert to/from `%Geo.*{}` structs when the optional `:geo` package is installed. See the [spatial guide](guides/spatial.md) and the [Spatial WMS example](https://github.com/elixir-vibe/quackdb/tree/master/examples/spatial_wms).
## Writes and dataframes
QuackDB can write through DuckDB's native append protocol instead of generating huge `INSERT VALUES` statements.
```elixir
QuackDB.insert_rows!(conn, "events", [
[id: 1, name: "duck", tags: ["bird", "wetland"]],
[id: 2, name: "goose", tags: ["bird", "loud"]]
])
QuackDB.insert_columns!(conn, "measurements", [
id: [1, 2, 3],
temperature: [12.5, 13.0, 12.8]
])
```
Explicit MAP columns accept ordinary Elixir maps while plain map inference stays STRUCT-shaped:
```elixir
QuackDB.insert_rows!(conn, "events", [[id: 1, labels: %{env: "prod", region: "eu"}]],
columns: [id: :integer, labels: {:map, :varchar, :varchar}]
)
```
When Explorer is installed, dataframes can be appended directly:
```elixir
alias Explorer.DataFrame
alias QuackDB.Explorer, as: QuackExplorer
frame = DataFrame.new(id: [1, 2], name: ["duck", "goose"])
QuackExplorer.insert_dataframe!(conn, "events", frame)
```
Enumerable rows can be streamed into native append batches:
```elixir
File.stream!("events.ndjson")
|> Stream.map(&Jason.decode!/1)
|> QuackDB.insert_stream!(conn, "events", chunk_every: 10_000)
```
Any `Table.Reader`-compatible data can be appended through the same column append path:
```elixir
QuackDB.insert_table!(conn, "events", %{id: [1, 2], name: ["duck", "goose"]})
```
Append supports explicit types, batching, scalar DuckDB values, and nested `LIST`, `STRUCT`, `ARRAY`, and `MAP` values. See the [type support guide](guides/type-support.md) and the [Explorer guide](guides/explorer.md).
## Results, Livebook, and telemetry
`QuackDB.Result` and `QuackDB.Columns` implement `Table.Reader`, so they can be consumed by Livebook and other Table-aware tooling. When Explorer is installed, query results can be turned into dataframes:
```elixir
result = QuackDB.query!(conn, "SELECT * FROM events")
Explorer.DataFrame.new(result)
```
QuackDB emits telemetry spans for query, append, and fetch operations:
- `[:quackdb, :query, :start | :stop]`
- `[:quackdb, :append, :start | :stop]`
- `[:quackdb, :fetch, :start | :stop]`
Metadata includes connection/session information, command details, append batch counts, and client query IDs. Params are not included unless you opt in with `telemetry_params: true`. See the [telemetry guide](guides/telemetry.md).
## Ecto coverage
QuackDB includes an optional Ecto SQL adapter for applications that want Ecto query composition, schema-based reads/writes, migrations, and raw SQL through `Repo.query/3`.
```elixir
defmodule MyApp.AnalyticsRepo do
use Ecto.Repo,
otp_app: :my_app,
adapter: Ecto.Adapters.QuackDB
end
```
The adapter currently covers:
- raw SQL via `Repo.query/3`;
- schema-backed full selects and `Repo.get!/2`;
- analytical reads with joins, filters, grouping, windows, CTEs, combinations, locks, fragments, and QuackDB helper macros;
- `Repo.insert/2`, `Repo.insert_all/3`, `RETURNING`, `ON CONFLICT DO NOTHING`, and common `DO UPDATE` upserts;
- explicit native append fast path via `insert_method: :append`;
- `update_all`, `delete_all`, schema `update/delete`, and transaction usage;
- `Ecto.Adapters.SQL.explain/4`;
- basic migration DDL through Ecto migrator: create/drop/alter tables, columns, references, indexes, primary keys, check constraints, and renames.
DuckDB-specific SQL that Ecto cannot model cleanly should still use `Repo.query/3`. See the [Ecto coverage matrix](docs/ecto-analytical-coverage.md).
## Examples
The repository includes runnable scripts, a Livebook notebook, and a small WMS app:
- [`examples/ecto_analytics.exs`](https://github.com/elixir-vibe/quackdb/blob/master/examples/ecto_analytics.exs) — Ecto analytical aggregates, text predicates, and DuckDB `SUMMARIZE` profiling.
- [`examples/source_sampling.exs`](https://github.com/elixir-vibe/quackdb/blob/master/examples/source_sampling.exs) — JSON source scanning, `USING SAMPLE`, Ecto composition, and sampled source profiling.
- [`examples/dataframe_analytics.exs`](https://github.com/elixir-vibe/quackdb/blob/master/examples/dataframe_analytics.exs) — derive DDL from an Ecto schema, append an Explorer dataframe, query with Ecto DSL, and return a dataframe.
- [`examples/full_text_search.exs`](https://github.com/elixir-vibe/quackdb/blob/master/examples/full_text_search.exs) — materialize a source scan, build a DuckDB FTS index, and query BM25 search through direct helpers and Ecto.
- [`examples/spatial_wms/`](https://github.com/elixir-vibe/quackdb/tree/master/examples/spatial_wms) — an Ash + Ecto + Plug/Bandit app serving DuckDB Spatial rows through a WMS-like GeoJSON endpoint.
- [`examples/query_observability.exs`](https://github.com/elixir-vibe/quackdb/blob/master/examples/query_observability.exs) — attach telemetry handlers and print query, append, and fetch timings.
- [`examples/append_benchmark.exs`](https://github.com/elixir-vibe/quackdb/blob/master/examples/append_benchmark.exs) — compares SQL inserts, native row/column append, Explorer append, and Ecto insert paths.
- [`examples/livebook_analytics.livemd`](https://github.com/elixir-vibe/quackdb/blob/master/examples/livebook_analytics.livemd) — interactive analytics with DuckDB SQL, Explorer, Table.Reader, VegaLite, and telemetry.
Run scripts from outside the Mix project so `Mix.install/2` can load the local package:
```sh
cd /tmp
elixir /path/to/quackdb/examples/dataframe_analytics.exs
```
## Current boundaries
QuackDB is intentionally focused on DuckDB analytics over Quack:
- the Quack wire protocol is experimental and may change;
- unsupported vector/logical types raise explicit protocol errors;
- Ecto coverage focuses on analytical workflows and common write/setup paths, not every adapter edge case;
- QuackDB does not stage/upload local files to a remote server;
- Arrow IPC / zero-copy columnar handoff is research for now;
- managed DuckDB binary downloads currently cover Linux/macOS targets, not Windows.
## Documentation
- [Getting started](guides/getting-started.md)
- [Type support](guides/type-support.md)
- [Examples](guides/examples.md)
- [Managed DuckDB](guides/managed-duckdb.md)
- [Explorer](guides/explorer.md)
- [Sources](guides/sources.md)
- [Spatial](guides/spatial.md)
- [Full-text search](guides/full-text-search.md)
- [Telemetry](guides/telemetry.md)
- [Protocol coverage](docs/protocol/coverage.md)
- [Ecto coverage](docs/ecto-analytical-coverage.md)
## Development
```sh
mix deps.get
mix ci
```
See [CONTRIBUTING.md](CONTRIBUTING.md) for local checks, example smoke tests, package audit steps, and release dry-run notes.
Integration tests are skipped by default. To run them against a Quack server:
```sh
QUACKDB_TEST_URI='http://[::1]:9494' \
QUACKDB_TEST_TOKEN=super_secret \
mix test --include integration
```