Skip to main content

README.md

# ExSQL

A SQLite implementation in pure Elixir — no NIFs, no ports, no C.

ExSQL follows the architecture of SQLite's C source (tokenizer → parser →
execution → storage) but reshapes each stage for the BEAM: the engine is a
pure functional core over immutable data, with an optional GenServer
connection for stateful, `sqlite3`-style use.

## Usage

```elixir
{:ok, conn} = ExSQL.open()

ExSQL.execute!(conn, """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER);
INSERT INTO users (name, age) VALUES ('alice', 34), ('bob', 29), ('carol', 41);
""")

result = ExSQL.query!(conn, "SELECT name, age FROM users WHERE age > 30 ORDER BY age DESC")
result.columns #=> ["name", "age"]
result.rows    #=> [["carol", 41], ["alice", 34]]
```

Or skip the process entirely and thread the database value yourself:

```elixir
db = ExSQL.Database.new()
{:ok, _, db} = ExSQL.Executor.run(db, "CREATE TABLE t (x INTEGER)")
{:ok, _, db} = ExSQL.Executor.run(db, "INSERT INTO t VALUES (1), (2), (3)")
{:ok, [result], _db} = ExSQL.Executor.run(db, "SELECT sum(x) FROM t")
result.rows #=> [[6]]
```

## Using with Ecto

ExSQL ships an Ecto adapter, `Ecto.Adapters.ExSQL`, so it can back an Ecto repo
as a drop-in SQL adapter — application code stays standard Ecto, and a database
can be in-memory or persisted to a real SQLite file.

```elixir
# config/config.exs
config :my_app, MyApp.Repo,
  adapter: Ecto.Adapters.ExSQL,
  database: "priv/my_app.db", # or :memory for an in-memory database
  pool_size: 1                # required — see below

# lib/my_app/repo.ex
defmodule MyApp.Repo do
  use Ecto.Repo, otp_app: :my_app, adapter: Ecto.Adapters.ExSQL
end
```

With a schema, ordinary Ecto queries work:

```elixir
import Ecto.Query

MyApp.Repo.insert!(%MyApp.User{name: "alice", age: 34})

MyApp.Repo.all(from u in MyApp.User, where: u.age > 30, order_by: [desc: u.age])
```

> **`pool_size: 1` is required.** Each connection holds its own immutable
> database value, so a second writer connection would clobber the first's
> writes. The intended model is a single writer with lock-free snapshot reads
> (each read runs against a consistent, immutable point-in-time value).

## Compared with Exqlite

Exqlite is the mature, production-oriented choice when you want the real
SQLite engine from Elixir. It binds to `sqlite3` through NIFs, so it inherits
SQLite's full feature set, planner, file locking, and performance profile.
ExSQL has a different goal: it is a SQLite-compatible engine written in
Elixir, with ordinary BEAM data structures and no native boundary. That makes
some tradeoffs better for BEAM applications, while other SQLite/exqlite limits
remain or are not solved yet.

| Exqlite limitation / behavior | ExSQL status |
|-------------------------------|--------------|
| Prepared statements are not cached | **Not solved yet.** ExSQL's DBConnection query object is an immutable Elixir struct, but the engine still parses and executes the SQL text for each call. A parser/plan cache is a future optimization. |
| Prepared statements are mutable handles and must not be manipulated concurrently | **Improved by design.** ExSQL has no mutable `sqlite3_stmt*` handle; query structs and parsed data are normal immutable Elixir values. The stateful connection still serializes execution, but there is no shared native statement object to corrupt. |
| Simultaneous writes are not supported | **Same practical limit.** The Ecto adapter requires `pool_size: 1`. ExSQL can provide immutable snapshot reads, but there is still one authoritative writer for a database path. |
| Native calls run through the Dirty NIF scheduler | **Avoided.** ExSQL is pure Elixir: no NIFs, no ports, and no C calls. Long queries consume BEAM reductions like other Elixir code instead of occupying Dirty NIF scheduler work. |
| Datetimes are stored without offsets | **Improved for Ecto values.** The Ecto adapter encodes `DateTime` values with `DateTime.to_iso8601/1` and decodes `:utc_datetime` values with `DateTime.from_iso8601/1`, so offset-bearing ISO-8601 text round-trips through ExSQL. SQLite-compatible SQL date/time functions still follow SQLite semantics and normalize timezone offsets during calculation. |
| BLOB values require `{:blob, binary}` or they are treated as strings | **Same explicit representation.** ExSQL also represents TEXT as plain Elixir binaries and BLOB as `{:blob, binary}`. Use SQL blob literals such as `x'CAFE'` or the tagged tuple when binding/returning BLOB values. |

In short: ExSQL can improve on Exqlite's native-boundary and mutable-handle
constraints, and it can use immutable snapshots as a BEAM-native read model.
It does not yet match SQLite/exqlite for maturity, full SQL coverage, query
planning, or raw engine performance.

## Architecture

The module layout mirrors SQLite's pipeline:

| stage | SQLite (C) | ExSQL | approach |
|-------|------------|-------|----------|
| lexing | `tokenize.c` (char-class table + keyword hash) | `ExSQL.Tokenizer` | binary pattern matching |
| parsing | `parse.y` (Lemon LALR) | `ExSQL.Parser` | recursive descent, precedence climbing |
| values | `vdbemem.c` (`Mem` cells) | `ExSQL.Value` | storage classes as native terms |
| execution | codegen + VDBE bytecode VM (`vdbe.c`) | `ExSQL.Executor` | tree-walking interpreter |
| storage | `btree.c` + `pager.c` | `ExSQL.Table` / `ExSQL.Database` | immutable in-memory maps keyed by rowid |
| connection | `sqlite3*` handle + mutex | `ExSQL.Connection` | GenServer serializing statements |

SQLite semantics implemented so far:

- **Type affinity** (`INTEGER`/`TEXT`/`REAL`/`NUMERIC`/`BLOB`) with SQLite's
  declared-type derivation rules — `VARCHAR(40)` is TEXT affinity, `DECIMAL`
  is NUMERIC, untyped columns are BLOB — and §4.2 **comparison affinity**:
  a numeric-affinity operand coerces the other side to NUMERIC, a TEXT one
  coerces a no-affinity side to TEXT, literals compared to literals stay
  untouched (so `'500' = 500` is false but `t = 500` matches a TEXT column);
  applied across `=`/`<`/`IS`/`IN`/`BETWEEN`/`CASE`/`USING`, with `IN
  (SELECT ...)` using the subquery column's affinity
- **Storage classes** mapped to native terms: `nil`, integers, floats,
  binaries (TEXT), `{:blob, binary}` — with SQLite's cross-class comparison
  ordering (NULL < numeric < TEXT < BLOB)
- **Three-valued logic**: `NULL AND 0` is `0`, `NULL OR 0` is `NULL`,
  `x = NULL` never matches, `IS [NOT]` is the NULL-safe comparison
- **`INTEGER PRIMARY KEY` is the rowid**: inserting `NULL` auto-assigns,
  explicit values re-seed the counter, `rowid`/`oid`/`_rowid_` resolve to it
- **Constraints**: `NOT NULL`, `UNIQUE`, `PRIMARY KEY`, `DEFAULT`, with
  statement-level atomicity on violation (ABORT semantics)
- **Statements**: `CREATE TABLE [IF NOT EXISTS]`, `DROP TABLE [IF EXISTS]`,
  `INSERT` (multi-row `VALUES`, `INSERT INTO ... SELECT`, `DEFAULT VALUES`,
  explicit `rowid` targets), `REPLACE INTO` / `INSERT OR REPLACE/IGNORE` /
  `UPDATE OR ...` conflict handling, bare `VALUES (...)` selects, `SELECT`
  (`WHERE`, `GROUP BY`/`HAVING` — including alias and position terms,
  `ORDER BY` — column / alias / position / aggregate expression,
  `LIMIT`/`OFFSET` including the `LIMIT x, y` form, `DISTINCT`), `UPDATE`,
  `DELETE`
- **Transactions**: `BEGIN`/`COMMIT`/`ROLLBACK` and
  `SAVEPOINT`/`RELEASE`/`ROLLBACK TO` — a snapshot stack on the immutable
  database value, with SQLite's error messages (`cannot start a transaction
  within a transaction`, …)
- **Compound selects**: `UNION [ALL]`, `INTERSECT`, `EXCEPT`, with SQLite's
  rules — ORDER BY/LIMIT only after the last component, terms resolved
  against output columns (by position, name, or any component's column
  name), distinct ops emerging sorted as from the temp B-tree
- **Joins**: comma, `[INNER | CROSS] JOIN`, `NATURAL`, `LEFT [OUTER] JOIN`,
  `ON`/`USING`, table aliases, subqueries in `FROM`, qualified `t.*` — with
  SQLite's column-hiding rules for `NATURAL`/`USING` in `*` expansion, its
  join-type validation (`unknown join type: INNER OUTER`), and ambiguous
  column detection
- **Subqueries**: scalar `(SELECT ...)`, `EXISTS`, `IN (SELECT ...)`, all
  correlated — free columns resolve against the enclosing row, including in
  `UPDATE`/`DELETE` `WHERE` clauses
- **Expressions**: full operator precedence, `LIKE`/`GLOB`, `IN`, `BETWEEN`,
  `CASE`, `CAST` (§4.1 forced-conversion rules, longest-numeric-prefix text
  parsing), bitwise `&`/`|`/`<<`/`>>`/`~` (int64 wrap, negative shifts
  reverse), string/blob/hex literals, comments
- **Functions**: aggregates (`count(*)`, `count`, `sum`, `total`, `avg`,
  `min`, `max`, `group_concat`/`string_agg`) and scalars (`abs`, `length`,
  `octet_length`, `lower`, `upper`, `coalesce`, `ifnull`, `nullif`,
  `substr`/`substring`, `round`, `typeof`, multi-arg `min`/`max`, `replace`,
  `trim`/`ltrim`/`rtrim`, `instr`, `hex`, `quote`, `char`, `unicode`,
  `sign`, `iif`, `zeroblob`, `concat`/`concat_ws`, function-form
  `like()`/`glob()`), with SQLite's "no such function" vs "wrong number of
  arguments" errors

## Tests

`test/sqlite/` holds ExUnit translations of SQLite's own TCL test suite
(`sqlite/test/*.test`), one module per source file with the original test
ids preserved — currently `select3.test` (aggregates, GROUP BY/HAVING),
`select4.test` (compound selects), `join.test`, `in.test`, `subquery.test`,
`cast.test`, `expr.test` (operators), `func.test`, `insert.test`,
`trans.test`, `savepoint.test`, and `types2.test` (comparison affinity).
`ExSQL.SQLiteCase` provides
`execsql`/`execsql2`/`catchsql` helpers mirroring `tester.tcl`, so a TCL
case translates nearly word for word. As features from the roadmap land,
the matching test file gets translated alongside.

## Benchmarks

[BENCHMARKS.md](BENCHMARKS.md) compares ExSQL against the C `sqlite3` engine
over the sqllogictest corpus, and measures the file-backed read path, with full
methodology and commands to reproduce. The scripts live in [`bench/`](bench/).

## Roadmap

The near-term order of attack:

1. **ALTER TABLE, CHECK constraints, composite PK/UNIQUE**
2. **Views**, then **indexes + planner**
3. **Bytecode VM** — compile the AST to instructions à la VDBE
   (`vdbe.c` has 190 opcodes; a register machine maps cleanly onto a
   recursive interpreter over immutable register maps)
4. **File format** — read/write the on-disk format (`SQLite format 3\0`
   header, B-tree pages, varint record encoding from `btreeInt.h`), which
   would make ExSQL databases interchangeable with real SQLite

## Development

```sh
mix test       # run the suite
mix precommit  # format + compile with warnings-as-errors + test
```

## License

BSD 3-Clause. See [LICENSE](LICENSE).