# Marmot - Type-Safe SQL for SQLite in Gleam
[Hex package](https://hex.pm/packages/marmot) · [Documentation](https://hexdocs.pm/marmot/)
Marmot is a code generator for Gleam that turns plain `.sql` files into type-safe
functions for SQLite. Write your queries in SQL, point Marmot at your database,
and it generates the Gleam functions, row types, and decoders you need. No ORM,
no query builder, no manual decoder boilerplate. Heavily inspired by
[Squirrel](https://github.com/giacomocavalieri/squirrel) (which does the same
for Postgres).
> If you are an LLM, see [llms.txt](https://raw.githubusercontent.com/pairshaped/marmot/refs/heads/master/llms.txt) for a condensed context document.
## Why Marmot?
If you need to talk to a SQLite database in Gleam, you'll have to write something
like this:
```gleam
import gleam/dynamic/decode
import sqlight

pub type FindUserRow {
  FindUserRow(name: String, email: String)
}

pub fn find_user(db: sqlight.Connection, username: String) {
  let decoder = {
    use name <- decode.field(0, decode.string)
    use email <- decode.field(1, decode.string)
    decode.success(FindUserRow(name:, email:))
  }

  sqlight.query(
    "select name, email from users where username = ?",
    on: db,
    with: [sqlight.text(username)],
    expecting: decoder,
  )
}
```
This is fine for a small project but gets hairy when you have a lot of queries:
- Embedding SQL in strings means no syntax highlighting, no formatting, and
no way to run queries on their own with external tools.
- You have to keep the decoder in sync with the query output manually.
Instead of hiding SQL behind an ORM, Marmot _embraces it._ Write your queries
in plain `*.sql` files and Marmot generates the corresponding functions.
Instead of the hand-written example above, write the following query:
```sql
-- we're in file `src/users/sql/find_user.sql`
select name, email
from users
where username = ?
```
Run `gleam run -m marmot`. You now have a type-safe
function `find_user` you can use as expected:
```gleam
import sqlight
import generated/sql/users_sql

pub fn main() {
  use db <- sqlight.with_connection("app.sqlite")
  let assert Ok([user]) = users_sql.find_user(db: db, username: "alice")
  // user.name, user.email are fully typed
}
```
Generated functions use labelled arguments
(`users_sql.find_user(db: db, username: "alice")`) so call sites are
self-documenting.
- Marmot keeps encoders and decoders in sync with the query output for you.
- Type safety is preserved: Marmot connects to your SQLite database and uses
`PRAGMA table_info` and `EXPLAIN` to understand query types.
- Each query is a standalone `*.sql` file, so you can `explain` or lint it
independently.
- No external tools required. No `sqlc` binary, no `sqlite3` CLI. Marmot uses
`sqlight` directly.
## Usage
Add Marmot as a dev dependency:
```sh
gleam add marmot --dev
```
Add `sqlight` as a runtime dependency (the generated code calls it):
```sh
gleam add sqlight
```
Generate code by running the `marmot` module:
```sh
gleam run -m marmot
```
As long as you follow a couple of conventions, Marmot will work out of the box:
- Marmot looks for all `*.sql` files in any `sql` directory under your
project's `src` directory.
- Each `sql` directory becomes a single Gleam module containing a function
for each `*.sql` file inside it. Generated modules are placed in
`src/generated/sql/` by default, with a `_sql` suffix to avoid import
aliasing conflicts.
- Each `*.sql` file _must contain a single SQL query._ The filename becomes
the generated function name.
> Imagine you have a Gleam project that looks like this
>
> ```txt
> src
> ├── users
> │   └── sql
> │       ├── find_user.sql
> │       └── list_users.sql
> ├── users.gleam
> ├── generated
> │   └── sql
> │       └── users_sql.gleam -- generated by marmot
> └── app.gleam
> ```
>
> Running `gleam run -m marmot` generates `src/generated/sql/users_sql.gleam`
> with two functions `find_user` and `list_users` you can then import and use:
>
> ```gleam
> import generated/sql/users_sql
>
> users_sql.find_user(db: db, id: 1)
> ```
### Talking to the database
In order to understand the types of your queries, Marmot needs to open the
SQLite database file where your schema is defined. Marmot reads the database
path with the following precedence:
1. `DATABASE_URL` environment variable
2. `--database` CLI flag
3. `database` field in `[tools.marmot]` section of `gleam.toml`
```sh
# Environment variable
DATABASE_URL=dev.sqlite gleam run -m marmot

# CLI flag
gleam run -m marmot -- --database dev.sqlite
```

```toml
# gleam.toml
[tools.marmot]
database = "dev.sqlite"
```
If no database is configured, Marmot shows an error message listing all three
options.
### Configuring the output directory
By default, generated modules are placed in `src/generated/sql/`. If you
prefer a different location, set it in `gleam.toml` or via CLI flag:
```toml
[tools.marmot]
output = "src/server/generated/sql"
```
```sh
gleam run -m marmot -- --output src/server/generated/sql
```
The output directory must be under `src/` (Gleam compiles modules from there).
### Custom SQL directory (`sql_dir`)
By default, Marmot finds `.sql` files by walking `src/` for directories named `sql`.
If your project uses a different convention, point `sql_dir` at a specific directory
and Marmot will find all directories under it that contain `.sql` files:
```toml
[tools.marmot]
sql_dir = "src/queries"
```
With this config, Marmot recursively finds all directories under `src/queries/`
that contain `.sql` files. This is useful when your SQL files live outside `sql/`
directories or when you want to limit generation to a specific subtree.
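For example, with that setting a layout like this (paths hypothetical) would generate one module per directory that contains `.sql` files:

```txt
src/queries
├── users
│   ├── find_user.sql
│   └── list_users.sql
└── orders
    └── list_orders.sql
```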
### Custom query wrapper (`query_function`)
By default the generated code calls `sqlight.query` directly. If you want to
add logging, timing, or other instrumentation without forking Marmot, you can
point `query_function` at your own wrapper:
```toml
[tools.marmot]
database = "dev.sqlite"
query_function = "app/db.query"
```
With that config, the generated code imports your module and calls your
function instead of `sqlight.query`:
```gleam
// Without query_function (default):
import sqlight

pub fn find_user(db db: sqlight.Connection, username username: String) {
  sqlight.query(
    "select name, email from users where username = ?",
    on: db,
    with: [sqlight.text(username)],
    expecting: decoder,
  )
}

// With query_function = "app/db.query":
import app/db
import sqlight

pub fn find_user(db db: sqlight.Connection, username username: String) {
  db.query(
    "select name, email from users where username = ?",
    on: db,
    with: [sqlight.text(username)],
    expecting: decoder,
  )
}
```
Your wrapper must match `sqlight.query`'s labelled signature exactly:
```gleam
import gleam/dynamic/decode
import sqlight

pub fn query(
  sql: String,
  on connection: sqlight.Connection,
  with parameters: List(sqlight.Value),
  expecting decoder: decode.Decoder(a),
) -> Result(List(a), sqlight.Error) {
  // Your logging / timing / instrumentation here, then delegate:
  sqlight.query(sql, on: connection, with: parameters, expecting: decoder)
}
```
## Shared return types
When multiple queries in the same `sql/` directory return the same shape, you
can annotate them to share a single Row type and decoder:
```sql
-- src/app/sql/get_org.sql
-- returns: OrgRow
SELECT id, name FROM orgs WHERE id = @id
```
```sql
-- src/app/sql/list_orgs.sql
-- returns: OrgRow
SELECT id, name FROM orgs
```
Both queries now share one `pub type OrgRow` and one decoder function in the
generated module. This eliminates the need for adapter boilerplate when multiple
queries return the same columns.
**Rules:**
- Annotation must appear before the first SQL statement
- Type name must end in `Row` and be valid PascalCase (`OrgRow`, `UserProfileRow`)
- All queries with the same annotation must return the exact same columns (names,
types, nullability, order). Mismatch is a generation-time error.
- Scope is per-directory: `OrgRow` in `admin/sql/` is distinct from `OrgRow` in
`public/sql/`
- Unannotated queries keep their per-query Row type (backwards compatible)
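Conceptually, the generated module for the two queries above might contain one shared type and decoder along these lines (a sketch, not Marmot's exact output; only `get_org` is shown, and `list_orgs` would reuse the same pieces):

```gleam
import gleam/dynamic/decode
import sqlight

// One `OrgRow` shared by every query annotated `-- returns: OrgRow`.
pub type OrgRow {
  OrgRow(id: Int, name: String)
}

// One shared decoder instead of one per query.
fn org_row_decoder() -> decode.Decoder(OrgRow) {
  use id <- decode.field(0, decode.int)
  use name <- decode.field(1, decode.string)
  decode.success(OrgRow(id:, name:))
}

pub fn get_org(
  db db: sqlight.Connection,
  id id: Int,
) -> Result(List(OrgRow), sqlight.Error) {
  sqlight.query(
    "SELECT id, name FROM orgs WHERE id = @id",
    on: db,
    with: [sqlight.int(id)],
    expecting: org_row_decoder(),
  )
}
```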
## Supported types
Marmot maps SQLite column types to Gleam types. The types that are currently
supported are:
| SQLite declared type | Gleam type | Notes |
| -------------------------- | ----------------------------------------------------------------------------------------------------------------- | --------------------------- |
| `INTEGER`, `INT` | `Int` | |
| `REAL`, `FLOAT`, `DOUBLE` | `Float` | |
| `TEXT`, `VARCHAR`, `CHAR` | `String` | |
| `BLOB` | `BitArray` | |
| `BOOLEAN`, `BOOL` | `Bool` | Stored as `0`/`1` |
| `TIMESTAMP`, `DATETIME` | [`timestamp.Timestamp`](https://hexdocs.pm/gleam_time/gleam/time/timestamp.html#Timestamp) | Stored as Unix seconds |
| `DATE` | [`calendar.Date`](https://hexdocs.pm/gleam_time/gleam/time/calendar.html#Date) | Stored as ISO 8601 text |
| nullable column | `Option(T)` | |
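As a concrete illustration of the mapping, a table declared like this (a hypothetical schema, not from this project) would produce the Gleam types shown in the comments:

```sql
create table events (
  id integer primary key,          -- Int
  title text not null,             -- String
  payload blob,                    -- Option(BitArray), nullable
  is_public boolean not null,      -- Bool, stored as 0/1
  happened_at timestamp not null,  -- timestamp.Timestamp, unix seconds
  happened_on date                 -- Option(calendar.Date), ISO 8601 text
);
```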
### Parameter nullability
Marmot treats read parameters and write parameters differently.
| SQL pattern | Meaning | Generated parameter |
| ----------- | ------- | ------------------- |
| `WHERE account_id = @account_id` against a nullable column | Read filter. Callers pass a concrete value. | `account_id: Int` |
| `WHERE season IS @season` against a nullable column | Read filter using SQL's NULL-aware comparison. Marmot still follows the read rule. | `season: String` |
| `WHERE @from_date IS NULL OR created_at >= @from_date` | Optional-filter idiom in SQL. Marmot does not infer optional read parameters from this shape. | Required parameter, with the type Marmot can infer from the predicate |
| `INSERT INTO tasks (deleted_at) VALUES (@deleted_at)` where `deleted_at` is nullable | Write position. SQLite accepts NULL for this column. | `deleted_at: Option(Int)` |
| `UPDATE tasks SET deleted_at = @deleted_at` where `deleted_at` is nullable | Write position. SQLite accepts NULL for this column. | `deleted_at: Option(Int)` |
This mirrors Squirrel's surface for reads: a parameter in a read predicate is
required unless Marmot adds an explicit override syntax in a future release.
For writes, Marmot follows the schema because the database really accepts or
rejects NULL at that position. `INTEGER PRIMARY KEY` rowid aliases are nullable
on INSERT writes because binding NULL asks SQLite to assign the rowid.
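At the call site this means NULL writes are explicit `Option` values. A sketch, assuming a generated `tasks_sql.set_deleted_at` function for the `UPDATE` row in the table above:

```gleam
import gleam/option.{None, Some}
import generated/sql/tasks_sql
import sqlight

pub fn soft_delete(db: sqlight.Connection, at at: Int) {
  // Soft-delete: write a concrete timestamp.
  tasks_sql.set_deleted_at(db: db, deleted_at: Some(at))
}

pub fn restore(db: sqlight.Connection) {
  // Un-delete: explicitly write NULL back.
  tasks_sql.set_deleted_at(db: db, deleted_at: None)
}
```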
## Known Limitations
### Expression inference gaps
Marmot's type inference for expressions is incremental. Some shapes resolve
to `StringType` when Marmot can't trace the underlying column.
- **Complex SELECT expressions and CTE result columns.** Subqueries in the
SELECT list, deeply nested `COALESCE`/`CASE`, and columns coming out of
CTEs may resolve to `StringType` rather than the underlying column's type.
Use `CAST(... AS TYPE)` in the SQL or alias with `!`/`?` for nullability.
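For example, when a `COALESCE` in the SELECT list would otherwise fall back to `StringType`, an explicit cast pins the type (table and column names hypothetical):

```sql
-- Without the cast, the computed column may be inferred as text;
-- with it, Marmot sees an INTEGER result column.
select cast(coalesce(t.priority, 0) as integer) as priority
from tasks as t
```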
### Limited scope of alias resolution
Alias-aware column resolution applies to the current parsed statement scope.
- **Subquery scoping.** Parameters inside subqueries use a fallback
inference path that doesn't model nested scopes. Bare references inside
a subquery may need explicit qualification or `CAST` to get the right
type. Correlated columns referencing an outer scope are not handled.
- **`USING` joins.** `JOIN x USING (col)` is supported, but bare `col`
references in `WHERE` may be reported as ambiguous. Workaround: use
`ON` syntax or qualify the column.
### Design decisions, not bugs
These behaviors are deliberate. We could change them, but doing so would be a design shift, not a bug fix.
- **`TIMESTAMP` and `DATETIME` columns are stored as Unix seconds (integer).**
Sub-second precision (nanoseconds) is not preserved. We picked integer
seconds for simplicity and interop with `strftime('%s', ...)`. Preserving
nanos would need a new storage format or type mapping.
### Hard limits
Not Marmot's to fix.
- **Repeated anonymous `?` placeholders** that refer to the same value
generate a separate function argument for each occurrence
(`WHERE org_id = ? AND ... WHERE org_id = ?` produces `org_id` and
`org_id_2`). This is a SQLite protocol limitation: anonymous `?` are
always distinct bind slots. **Use named parameters (`@name` or `:name`)
instead.** SQLite deduplicates them natively, so `WHERE org_id = @org_id
AND ... WHERE org_id = @org_id` generates a single `org_id` argument.
Named parameters are also self-documenting and generally preferable.
- **`WHERE id IN (?)` with a dynamic list is not supported.** SQLite has no
native array parameter type, so there is no way to bind a list to a single
`?`. Workarounds:
- Write separate queries for known list sizes
- Use `json_each(?)` with a JSON array string:
`WHERE id IN (SELECT value FROM json_each(?))`
- Build the query string dynamically in your application code (outside Marmot)
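The `json_each` option pairs well with a small application-side helper. A sketch, assuming the `gleam_json` package is available:

```gleam
import gleam/json

// Serialise a list of ids into a JSON array string that can be bound
// to the single `?` in:
//   WHERE id IN (SELECT value FROM json_each(?))
pub fn ids_to_json(ids: List(Int)) -> String {
  json.array(ids, json.int)
  |> json.to_string
}
```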
## Differences from Squirrel
Marmot mirrors [Squirrel's](https://github.com/giacomocavalieri/squirrel)
ergonomics but targets SQLite instead of Postgres. The surface conventions
(SQL file layout, code generation, formatting) are deliberately similar so
switching between the two feels seamless. The type-inference engine underneath
had to be built from scratch because Postgres and SQLite expose type
information in fundamentally incompatible ways.
**Where Marmot diverges on the surface (might change over time):**
- **Output directory**: Marmot defaults to `src/generated/sql/` with a `_sql`
filename suffix (e.g., `users_sql.gleam`). Squirrel places `sql.gleam` as a
sibling of the `sql/` directory.
- **Named parameters**: Marmot supports `@name`, `:name`, and `$name`
placeholders natively. Squirrel uses Postgres `$1` positional parameters and
generates `arg_1`, `arg_2` names.
- **Return type signatures**: Generated functions include explicit
`-> Result(List(RowType), sqlight.Error)` return types.
- **Formatting**: Generated code is run through `gleam format` automatically,
so it never causes diffs when users run the formatter.
- **Shared return types**: Queries in the same `sql/` directory can share a
single Row type and decoder via a `-- returns: EntityRow` annotation.
Eliminates adapter boilerplate when multiple queries return the same shape.
**Where it diverges under the hood:**
- Postgres' wire protocol returns full type info for parameters and result
columns when a statement is prepared. SQLite doesn't.
- SQLite columns have *affinity*, not types. A `TEXT` column can store an
`INTEGER`, and the engine never commits to a type for a result slot.
- Result columns have to be traced through `EXPLAIN` opcodes
(`OpenRead` / `Column` / `Rowid` / `ResultRow`) back to physical table columns,
then joined against `PRAGMA table_info` for the declared type.
- Nullability has to be derived from joins, subqueries, and `COALESCE` usage
instead of reported by the engine. Marmot tracks nullable-cursor sets
through the opcode trace.
- Computed columns (`COUNT`, `CAST`, `COALESCE`, `CASE`, `SUM`, window
functions) carry no type information and need a shape-based inference
fallback that pattern-matches on expression text.
- Parameter types are reverse-engineered from usage sites (`WHERE col = ?`,
`LIMIT ?`, `CAST(? AS TEXT)`) instead of reported.
- Repeated `@name` parameters need a dedup/unification pass to pick a single
inferred type across occurrences.
- SQLite-specific quirks (`WITHOUT ROWID`, `STRICT` tables, `INTEGER PRIMARY
KEY` aliasing `rowid`, affinity rules) have no Postgres analogue and
required their own handling.
## License
[MIT](https://github.com/pairshaped/marmot/blob/master/LICENSE)