README.md

Select File:
# Marmot - Type-Safe SQL for SQLite in Gleam

[![Package Version](https://img.shields.io/hexpm/v/marmot)](https://hex.pm/packages/marmot)
[![Hex Docs](https://img.shields.io/badge/hex-docs-ffaff3)](https://hexdocs.pm/marmot/)

Marmot is a code generator for Gleam that turns plain `.sql` files into type-safe
functions for SQLite. Write your queries in SQL, point Marmot at your database,
and it generates the Gleam functions, row types, and decoders you need. No ORM,
no query builder, no manual decoder boilerplate. Heavily inspired by
[Squirrel](https://github.com/giacomocavalieri/squirrel) (which does the same
for Postgres).

> If you are an LLM, see [llms.txt](https://raw.githubusercontent.com/pairshaped/marmot/refs/heads/master/llms.txt) for a condensed context document.

## Why Marmot?

If you need to talk to a SQLite database in Gleam, you'll have to write something
like this:

```gleam
import gleam/dynamic/decode
import sqlight

pub type FindUserRow {
  FindUserRow(name: String, email: String)
}

pub fn find_user(db: sqlight.Connection, username: String) {
  let decoder = {
    use name <- decode.field(0, decode.string)
    use email <- decode.field(1, decode.string)
    decode.success(FindUserRow(name:, email:))
  }

  sqlight.query(
    "select name, email from users where username = ?",
    on: db,
    with: [sqlight.text(username)],
    expecting: decoder,
  )
}
```

This is fine for a small project but gets hairy when you have a lot of queries:

- Embedding SQL in strings means no syntax highlighting, no formatting, and
  no way to run queries on their own with external tools.
- You have to keep the decoder in sync with the query output manually.

Instead of hiding SQL behind an ORM, Marmot _embraces it._ Write your queries
in plain `*.sql` files and Marmot generates the corresponding functions.

Instead of the hand-written example above, write the following query:

```sql
-- we're in file `src/users/sql/find_user.sql`
select name, email
from users
where username = ?
```

Run `gleam run -m marmot`. You now have a type-safe
function `find_user` you can use as expected:

```gleam
import sqlight
import generated/sql/users_sql

pub fn main() {
  use db <- sqlight.with_connection("app.sqlite")
  let assert Ok([user]) = users_sql.find_user(db: db, username: "alice")
  // user.name, user.email are fully typed
}
```

Generated functions use labelled arguments
(`users_sql.find_user(db: db, username: "alice")`) so call sites are
self-documenting.

- Marmot keeps encoders and decoders in sync with the query output for you.
- Type safety is preserved: Marmot connects to your SQLite database and uses
  `PRAGMA table_info` and `EXPLAIN` to understand query types.
- Each query is a standalone `*.sql` file, so you can `explain` or lint it
  independently.
- No external tools required. No `sqlc` binary, no `sqlite3` CLI. Marmot uses
  `sqlight` directly.

## Usage

Add Marmot as a dev dependency:

```sh
gleam add marmot --dev
```

Add `sqlight` as a runtime dependency (the generated code calls it):

```sh
gleam add sqlight
```

Generate code by running the `marmot` module:

```sh
gleam run -m marmot
```

As long as you follow a couple of conventions, Marmot will work out of the box:

- Marmot looks for all `*.sql` files in any `sql` directory under your
  project's `src` directory.
- Each `sql` directory becomes a single Gleam module containing a function
  for each `*.sql` file inside it. Generated modules are placed in
  `src/generated/sql/` by default, with a `_sql` suffix to avoid import
  aliasing conflicts.
- Each `*.sql` file _must contain a single SQL query._ The filename becomes
  the generated function name.

> Imagine you have a Gleam project that looks like this
>
> ```txt
> src
> ├── users
> │   └── sql
> │       ├── find_user.sql
> │       └── list_users.sql
> ├── users.gleam
> ├── generated
> │   └── sql
> │       └── users_sql.gleam   -- generated by marmot
> └── app.gleam
> ```
>
> Running `gleam run -m marmot` generates `src/generated/sql/users_sql.gleam`
> with two functions `find_user` and `list_users` you can then import and use:
>
> ```gleam
> import generated/sql/users_sql
>
> users_sql.find_user(db: db, id: 1)
> ```

### Talking to the database

In order to understand the types of your queries, Marmot needs to open the
SQLite database file where your schema is defined. Marmot reads the database
path with the following precedence:

1. `DATABASE_URL` environment variable
2. `--database` CLI flag
3. `database` field in `[tools.marmot]` section of `gleam.toml`

```sh
# Environment variable
DATABASE_URL=dev.sqlite gleam run -m marmot

# CLI flag
gleam run -m marmot -- --database dev.sqlite

# gleam.toml
# [tools.marmot]
# database = "dev.sqlite"
```

If no database is configured, Marmot shows an error message listing all three
options.

### Configuring the output directory

By default, generated modules are placed in `src/generated/sql/`. If you
prefer a different location, set it in `gleam.toml` or via CLI flag:

```toml
[tools.marmot]
output = "src/server/generated/sql"
```

```sh
gleam run -m marmot -- --output src/server/generated/sql
```

The output directory must be under `src/` (Gleam compiles modules from there).

### Custom SQL directory (`sql_dir`)

By default, Marmot finds `.sql` files by walking `src/` for directories named `sql`.
If your project uses a different convention, point `sql_dir` at a specific directory
and Marmot will find all directories under it that contain `.sql` files:

```toml
[tools.marmot]
sql_dir = "src/queries"
```

With this config, Marmot recursively finds all directories under `src/queries/`
that contain `.sql` files. This is useful when your SQL files live outside `sql/`
directories or when you want to limit generation to a specific subtree.

### Custom query wrapper (`query_function`)

By default the generated code calls `sqlight.query` directly. If you want to
add logging, timing, or other instrumentation without forking Marmot, you can
point `query_function` at your own wrapper:

```toml
[tools.marmot]
database = "dev.sqlite"
query_function = "app/db.query"
```

With that config, the generated code imports your module and calls your
function instead of `sqlight.query`:

```gleam
// Without query_function (default):
import sqlight

pub fn find_user(db db: sqlight.Connection, username username: String) {
  sqlight.query(
    "select name, email from users where username = ?",
    on: db,
    with: [sqlight.text(username)],
    expecting: decoder,
  )
}

// With query_function = "app/db.query":
import app/db
import sqlight

pub fn find_user(db db: sqlight.Connection, username username: String) {
  db.query(
    "select name, email from users where username = ?",
    on: db,
    with: [sqlight.text(username)],
    expecting: decoder,
  )
}
```

Your wrapper must match `sqlight.query`'s labelled signature exactly:

```gleam
import gleam/dynamic/decode
import sqlight

pub fn query(
  sql: String,
  on connection: sqlight.Connection,
  with parameters: List(sqlight.Value),
  expecting decoder: decode.Decoder(a),
) -> Result(List(a), sqlight.Error) {
  // Your logging / timing / instrumentation here, then delegate:
  sqlight.query(sql, on: connection, with: parameters, expecting: decoder)
}
```

## Shared return types

When multiple queries in the same `sql/` directory return the same shape, you
can annotate them to share a single Row type and decoder:

```sql
-- src/app/sql/get_org.sql
-- returns: OrgRow
SELECT id, name FROM orgs WHERE id = @id
```

```sql
-- src/app/sql/list_orgs.sql
-- returns: OrgRow
SELECT id, name FROM orgs
```

Both queries now share one `pub type OrgRow` and one decoder function in the
generated module. This eliminates the need for adapter boilerplate when multiple
queries return the same columns.

**Rules:**
- Annotation must appear before the first SQL statement
- Type name must end in `Row` and be valid PascalCase (`OrgRow`, `UserProfileRow`)
- All queries with the same annotation must return the exact same columns (names,
  types, nullability, order). Mismatch is a generation-time error.
- Scope is per-directory: `OrgRow` in `admin/sql/` is distinct from `OrgRow` in
  `public/sql/`
- Unannotated queries keep their per-query Row type (backwards compatible)

## Supported types

Marmot maps SQLite column types to Gleam types. The types that are currently
supported are:

| SQLite declared type       | Gleam type                                                                                                        | Notes                       |
| -------------------------- | ----------------------------------------------------------------------------------------------------------------- | --------------------------- |
| `INTEGER`, `INT`           | `Int`                                                                                                             |                             |
| `REAL`, `FLOAT`, `DOUBLE`  | `Float`                                                                                                           |                             |
| `TEXT`, `VARCHAR`, `CHAR`   | `String`                                                                                                          |                             |
| `BLOB`                     | `BitArray`                                                                                                        |                             |
| `BOOLEAN`, `BOOL`          | `Bool`                                                                                                            | Stored as `0`/`1`           |
| `TIMESTAMP`, `DATETIME`    | [`timestamp.Timestamp`](https://hexdocs.pm/gleam_time/gleam/time/timestamp.html#Timestamp)                        | Stored as Unix seconds      |
| `DATE`                     | [`calendar.Date`](https://hexdocs.pm/gleam_time/gleam/time/calendar.html#Date)                                    | Stored as ISO 8601 text     |
| nullable column            | `Option(T)`                                                                                                       |                             |

## Known Limitations

### Fixable, worth doing eventually

These are limitations of Marmot's current implementation, not fundamental
constraints. Contributions welcome.

- **Table names containing SQL keywords** (`RETURNING`, `INTO`) may confuse
  the string-based parser. Use simple table names for now. A proper SQL
  tokenizer would fix this and unlocks the next two limitations too.
- **`INSERT INTO t VALUES (?, ?)` without an explicit column list** will not
  infer parameter names or types correctly. Always specify columns:
  `INSERT INTO t (col1, col2) VALUES (?, ?)`. Marmot could look up the table
  schema and match positionally.
- **Complex expressions** (subqueries, CTEs, `COALESCE`) may not have their
  types inferred. Use `CAST(... AS TYPE)` to help Marmot. Incremental
  improvements are possible per expression kind.

### Design decisions, not bugs

These work by choice. We could change them, but it would be a design shift.

- **`TIMESTAMP` and `DATETIME` columns are stored as Unix seconds (integer).**
  Sub-second precision (nanoseconds) is not preserved. We picked integer
  seconds for simplicity and interop with `strftime('%s', ...)`. Preserving
  nanos would need a new storage format or type mapping.

### Hard limits

Not Marmot's to fix.

- **Repeated anonymous `?` placeholders** that refer to the same value
  generate a separate function argument for each occurrence
  (`WHERE org_id = ? AND ... WHERE org_id = ?` produces `org_id` and
  `org_id_2`). This is a SQLite protocol limitation: anonymous `?` are
  always distinct bind slots. **Use named parameters (`@name` or `:name`)
  instead.** SQLite deduplicates them natively, so `WHERE org_id = @org_id
  AND ... WHERE org_id = @org_id` generates a single `org_id` argument.
  Named parameters are also self-documenting and generally preferable.
- **`WHERE id IN (?)` with a dynamic list is not supported.** SQLite has no
  native array parameter type, so there is no way to bind a list to a single
  `?`. Workarounds:
  - Write separate queries for known list sizes
  - Use `json_each(?)` with a JSON array string:
    `WHERE id IN (SELECT value FROM json_each(?))`
  - Build the query string dynamically in your application code (outside Marmot)

## Differences from Squirrel

Marmot mirrors [Squirrel's](https://github.com/giacomocavalieri/squirrel)
ergonomics but targets SQLite instead of Postgres. The surface conventions
(SQL file layout, code generation, formatting) are deliberately similar so
switching between the two feels seamless. The type-inference engine underneath
had to be built from scratch because Postgres and SQLite expose type
information in fundamentally incompatible ways.

**Where Marmot diverges on the surface (might change over time):**

- **Output directory**: Marmot defaults to `src/generated/sql/` with a `_sql`
  filename suffix (e.g., `users_sql.gleam`). Squirrel places `sql.gleam` as a
  sibling of the `sql/` directory.
- **Named parameters**: Marmot supports `@name`, `:name`, and `$name`
  placeholders natively. Squirrel uses Postgres `$1` positional parameters and
  generates `arg_1`, `arg_2` names.
- **Return type signatures**: Generated functions include explicit
  `-> Result(List(RowType), sqlight.Error)` return types.
- **Formatting**: Generated code is run through `gleam format` automatically,
  so it never causes diffs when users run the formatter.
- **Shared return types**: Queries in the same `sql/` directory can share a
  single Row type and decoder via a `-- returns: EntityRow` annotation.
  Eliminates adapter boilerplate when multiple queries return the same shape.

**Where it diverges under the hood:**

- Postgres' wire protocol returns full type info for parameters and result
  columns when a statement is prepared. SQLite doesn't.
- SQLite columns have *affinity*, not types. A `TEXT` column can store an
  `INTEGER`, and the engine never commits to a type for a result slot.
- Result columns have to be traced through `EXPLAIN` opcodes
  (`OpenRead` / `Column` / `Rowid` / `ResultRow`) back to physical table columns,
  then joined against `PRAGMA table_info` for the declared type.
- Nullability has to be derived from joins, subqueries, and `COALESCE` usage
  instead of reported by the engine. Marmot tracks nullable-cursor sets
  through the opcode trace.
- Computed columns (`COUNT`, `CAST`, `COALESCE`, `CASE`, `SUM`, window
  functions) carry no type information and need a shape-based inference
  fallback that pattern-matches on expression text.
- Parameter types are reverse-engineered from usage sites (`WHERE col = ?`,
  `LIMIT ?`, `CAST(? AS TEXT)`) instead of reported.
- Repeated `@name` parameters need a dedup/unification pass to pick a single
  inferred type across occurrences.
- SQLite-specific quirks (`WITHOUT ROWID`, `STRICT` tables, `INTEGER PRIMARY
  KEY` aliasing `rowid`, affinity rules) have no Postgres analogue and
  required their own handling.

## License

[MIT](https://github.com/pairshaped/marmot/blob/master/LICENSE)