README.md

# Marmot - Type-Safe SQL for SQLite in Gleam

[![Package Version](https://img.shields.io/hexpm/v/marmot)](https://hex.pm/packages/marmot)
[![Hex Docs](https://img.shields.io/badge/hex-docs-ffaff3)](https://hexdocs.pm/marmot/)

Marmot is a code generator for Gleam that turns plain `.sql` files into type-safe
functions for SQLite. Write your queries in SQL, point Marmot at your database,
and it generates the Gleam functions, row types, and decoders you need. No ORM,
no query builder, no manual decoder boilerplate. Inspired by
[Squirrel](https://github.com/giacomocavalieri/squirrel) (which does the same
for Postgres).

> If you are an LLM, see [llms.txt](https://raw.githubusercontent.com/pairshaped/marmot/refs/heads/master/llms.txt) for a condensed context document.

## What's Marmot?

If you need to talk to a SQLite database in Gleam you'll have to write something
like this:

```gleam
import gleam/dynamic/decode
import sqlight

pub type FindUserRow {
  FindUserRow(name: String, email: String)
}

pub fn find_user(db: sqlight.Connection, username: String) {
  let decoder = {
    use name <- decode.field(0, decode.string)
    use email <- decode.field(1, decode.string)
    decode.success(FindUserRow(name:, email:))
  }

  sqlight.query(
    "select name, email from users where username = ?",
    on: db,
    with: [sqlight.text(username)],
    expecting: decoder,
  )
}
```

This is probably fine if you have a few small queries but it can become quite
the burden when you have a lot of queries:

- The SQL query you write is just a plain string, you do not get syntax
  highlighting, auto formatting, suggestions... all the little niceties you
  would otherwise get if you were writing a plain `*.sql` file.
- This also means you lose the ability to run these queries on their own with
  other external tools, inspect them and so on.
- You have to manually keep in sync the decoder with the query's output.

One might be tempted to hide all of this by reaching for something like an ORM.
Marmot proposes a different approach: instead of trying to hide the SQL it
_embraces it and leaves you in control._
You write the SQL queries in plain old `*.sql` files and Marmot will take care
of generating all the corresponding functions.

A code snippet is worth a thousand words, so let's have a look at an example.
Instead of the hand written example shown earlier you can instead just write the
following query:

```sql
-- we're in file `src/my_app/sql/find_user.sql`
select name, email
from users
where username = ?
```

And run `gleam run -m marmot`. Just like magic you'll now have a type-safe
function `find_user` you can use just as you'd expect:

```gleam
import sqlight
import my_app/sql

pub fn main() {
  use db <- sqlight.with_connection("my_app.sqlite")
  let assert Ok([user]) = sql.find_user(db: db, username: "alice")
  // user.name, user.email are fully typed
}
```

Behind the scenes Marmot generates the decoders and functions you need. Generated functions use labelled
arguments (`sql.find_user(db: db, username: "alice")`) so call sites are
self-documenting.
So now you get the best of both worlds:

- You don't have to take care of keeping encoders and decoders in sync, Marmot
  does that for you.
- You're not compromising on type safety either: Marmot connects to your
  SQLite database and uses `PRAGMA table_info` and `EXPLAIN` to understand the
  types of your queries.
- You can stick to writing plain SQL in `*.sql` files.
- You can run each query on its own: need to `explain` a query?
  No big deal, it's just a plain old `*.sql` file.
- No external tools required. No `sqlc` binary, no `sqlite3` CLI. Marmot uses
  `sqlight` directly.

## Usage

First you'll need to add Marmot to your project as a dev dependency:

```sh
gleam add marmot --dev
```

You'll also need `sqlight` as a runtime dependency (the generated code calls it):

```sh
gleam add sqlight
```

Then you can ask it to generate code running the `marmot` module:

```sh
gleam run -m marmot
```

And that's it! As long as you follow a couple of conventions Marmot will just
work:

- Marmot will look for all `*.sql` files in any `sql` directory under your
  project's `src` directory.
- Each `sql` directory will be turned into a single Gleam module containing a
  function for each `*.sql` file inside it. Generated modules are placed in
  `src/generated/sql/` by default, with a `_sql` suffix to avoid import
  aliasing conflicts.
- Each `*.sql` file _must contain a single SQL query._ And the name of the file
  is going to be the name of the corresponding Gleam function to run that query.

> Let's make an example. Imagine you have a Gleam project that looks like this
>
> ```txt
> ├── src
> │   ├── my_app
> │   │   └── sql
> │   │       ├── find_user.sql
> │   │       └── list_users.sql
> │   ├── generated
> │   │   └── sql
> │   │       └── my_app_sql.gleam   -- generated by marmot
> │   └── my_app.gleam
> └── test
>     └── my_app_test.gleam
> ```
>
> Running `gleam run -m marmot` generates `src/generated/sql/my_app_sql.gleam`
> with two functions `find_user` and `list_users` you can then import and use:
>
> ```gleam
> import generated/sql/my_app_sql
>
> my_app_sql.find_user(db, id: 1)
> ```

### Talking to the database

In order to understand the types of your queries, Marmot needs to open the
SQLite database file where your schema is defined. Marmot reads the database
path with the following precedence:

1. `DATABASE_URL` environment variable
2. `--database` CLI flag
3. `database` field in `[marmot]` section of `gleam.toml`

```sh
# Environment variable
DATABASE_URL=dev.sqlite gleam run -m marmot

# CLI flag
gleam run -m marmot -- --database dev.sqlite

# gleam.toml
# [marmot]
# database = "dev.sqlite"
```

If no database is configured, Marmot will show a helpful error message listing
all three options.

### Configuring the output directory

By default, generated modules are placed in `src/generated/sql/`. If you
prefer a different location, set it in `gleam.toml` or via CLI flag:

```toml
[marmot]
output = "src/server/generated/sql"
```

```sh
gleam run -m marmot -- --output src/server/generated/sql
```

The output directory must be under `src/` (Gleam compiles modules from there).

### Custom query wrapper (`query_function`)

By default the generated code calls `sqlight.query` directly. If you want to
add logging, timing, or other instrumentation without forking Marmot, you can
point `query_function` at your own wrapper:

```toml
[marmot]
database = "dev.sqlite"
query_function = "app/db.query"
```

With that config, the generated code imports your module and calls your
function instead of `sqlight.query`:

```gleam
// Without query_function (default):
import sqlight

pub fn find_user(db db: sqlight.Connection, username username: String) {
  sqlight.query(
    "select name, email from users where username = ?",
    on: db,
    with: [sqlight.text(username)],
    expecting: decoder,
  )
}

// With query_function = "app/db.query":
import app/db
import sqlight

pub fn find_user(db db: sqlight.Connection, username username: String) {
  db.query(
    "select name, email from users where username = ?",
    on: db,
    with: [sqlight.text(username)],
    expecting: decoder,
  )
}
```

Your wrapper must match `sqlight.query`'s labelled signature exactly:

```gleam
import gleam/dynamic/decode
import sqlight

pub fn query(
  sql: String,
  on connection: sqlight.Connection,
  with parameters: List(sqlight.Value),
  expecting decoder: decode.Decoder(a),
) -> Result(List(a), sqlight.Error) {
  // Your logging / timing / instrumentation here, then delegate:
  sqlight.query(sql, on: connection, with: parameters, expecting: decoder)
}
```

## Supported types

Marmot maps SQLite column types to Gleam types. The types that are currently
supported are:

| SQLite declared type       | Gleam type                                                                                                        | Notes                       |
| -------------------------- | ----------------------------------------------------------------------------------------------------------------- | --------------------------- |
| `INTEGER`, `INT`           | `Int`                                                                                                             |                             |
| `REAL`, `FLOAT`, `DOUBLE`  | `Float`                                                                                                           |                             |
| `TEXT`, `VARCHAR`, `CHAR`   | `String`                                                                                                          |                             |
| `BLOB`                     | `BitArray`                                                                                                        |                             |
| `BOOLEAN`, `BOOL`          | `Bool`                                                                                                            | Stored as `0`/`1`           |
| `TIMESTAMP`, `DATETIME`    | [`timestamp.Timestamp`](https://hexdocs.pm/gleam_time/gleam/time/timestamp.html#Timestamp)                        | Stored as Unix seconds      |
| `DATE`                     | [`calendar.Date`](https://hexdocs.pm/gleam_time/gleam/time/calendar.html#Date)                                    | Stored as ISO 8601 text     |
| nullable column            | `Option(T)`                                                                                                       |                             |

## Shared return types

When multiple queries in the same `sql/` directory return the same shape, you
can annotate them to share a single Row type and decoder:

```sql
-- src/app/sql/get_org.sql
-- returns: OrgRow
SELECT id, name FROM orgs WHERE id = @id
```

```sql
-- src/app/sql/list_orgs.sql
-- returns: OrgRow
SELECT id, name FROM orgs
```

Both queries now share one `pub type OrgRow` and one decoder function in the
generated module. This eliminates the need for adapter boilerplate when multiple
queries return the same columns.

**Rules:**
- Annotation must appear before the first SQL statement
- Type name must end in `Row` and be valid PascalCase (`OrgRow`, `UserProfileRow`)
- All queries with the same annotation must return the exact same columns (names,
  types, nullability, order). Mismatch is a generation-time error.
- Scope is per-directory: `OrgRow` in `admin/sql/` is distinct from `OrgRow` in
  `public/sql/`
- Unannotated queries keep their per-query Row type (backwards compatible)

## FAQ

### What flavour of SQL does Marmot support?

Marmot only supports SQLite.

### Why not use Squirrel?

[Squirrel](https://github.com/giacomocavalieri/squirrel) is excellent and you
should use it if you use Postgres. Marmot exists because Squirrel doesn't
support SQLite, and SQLite introspection works fundamentally differently
(PRAGMAs and EXPLAIN instead of the Postgres wire protocol).

### How does Marmot infer types?

Marmot uses `PRAGMA table_info` for column types and nullability, and SQLite's
`EXPLAIN` to trace query result columns and parameters back to their source
table columns. This is a heuristic approach that works well for straightforward
queries.

### Why isn't Marmot configurable in any way?

Following Squirrel's lead, Marmot leans heavily on convention over
configuration. Small projects work with zero config. The two opt-in knobs --
`output` and `query_function` -- exist for real needs that convention can't
cover: centralizing generated code somewhere other than the default, and
wrapping queries with logging or timing. They don't change the shape of the
generated code, just where it lands and which function it calls.

## Known Limitations

### Fixable, worth doing eventually

These are limitations of Marmot's current implementation, not fundamental
constraints. Contributions welcome.

- **Table names containing SQL keywords** (`RETURNING`, `INTO`) may confuse
  the string-based parser. Use simple table names for now. A proper SQL
  tokenizer would fix this and unlocks the next two limitations too.
- **`INSERT INTO t VALUES (?, ?)` without an explicit column list** will not
  infer parameter names or types correctly. Always specify columns:
  `INSERT INTO t (col1, col2) VALUES (?, ?)`. Marmot could look up the table
  schema and match positionally.
- **Complex expressions** (subqueries, CTEs, `COALESCE`) may not have their
  types inferred. Use `CAST(... AS TYPE)` to help Marmot. Incremental
  improvements are possible per expression kind.

### Design decisions, not bugs

These work by choice. We could change them, but it would be a design shift.

- **`TIMESTAMP` and `DATETIME` columns are stored as Unix seconds (integer).**
  Sub-second precision (nanoseconds) is not preserved. We picked integer
  seconds for simplicity and interop with `strftime('%s', ...)`. Preserving
  nanos would need a new storage format or type mapping.

### Hard limits

Not Marmot's to fix.

- **Repeated anonymous `?` placeholders** that refer to the same value
  generate a separate function argument for each occurrence
  (`WHERE org_id = ? AND ... WHERE org_id = ?` produces `org_id` and
  `org_id_2`). This is a SQLite protocol limitation: anonymous `?` are
  always distinct bind slots. **Use named parameters (`@name` or `:name`)
  instead** -- SQLite deduplicates them natively, so `WHERE org_id = @org_id
  AND ... WHERE org_id = @org_id` generates a single `org_id` argument.
  Named parameters are also self-documenting and generally preferable.
- **`WHERE id IN (?)` with a dynamic list is not supported.** SQLite has no
  native array parameter type, so there is no way to bind a list to a single
  `?`. Workarounds:
  - Write separate queries for known list sizes
  - Use `json_each(?)` with a JSON array string:
    `WHERE id IN (SELECT value FROM json_each(?))`
  - Build the query string dynamically in your application code (outside Marmot)

## Differences from Squirrel

Marmot is heavily inspired by [Squirrel](https://github.com/giacomocavalieri/squirrel)
but has diverged in several ways:

- **Target**: SQLite instead of Postgres.
- **Output directory**: Marmot defaults to `src/generated/sql/` with a `_sql`
  filename suffix (e.g., `users_sql.gleam`). Squirrel places `sql.gleam` as a
  sibling of the `sql/` directory.
- **Named parameters**: Marmot supports `@name`, `:name`, and `$name`
  placeholders natively. Squirrel uses Postgres `$1` positional parameters and
  generates `arg_1`, `arg_2` names.
- **Return type signatures**: Generated functions include explicit
  `-> Result(List(RowType), sqlight.Error)` return types.
- **Formatting**: Generated code is run through `gleam format` automatically,
  so it never causes diffs when users run the formatter.
- **Shared return types**: Queries in the same `sql/` directory can share a
  single Row type and decoder via a `-- returns: EntityRow` annotation.
  Eliminates adapter boilerplate when multiple queries return the same shape.

## Why not just fork Squirrel?

Marmot deliberately mirrors Squirrel's ergonomics, so this question comes up.
The surface layers (SQL file conventions, code generation, formatting,
configuration) could have been forked. The engine underneath couldn't.
Postgres and SQLite expose type information in fundamentally incompatible
ways, so the "figure out the types" part had to be built from scratch.

Specifically:

- Postgres' wire protocol returns full type info for parameters and result
  columns when a statement is prepared. SQLite doesn't.
- SQLite columns have *affinity*, not types. A `TEXT` column can store an
  `INTEGER`, and the engine never commits to a type for a result slot.
- Result columns have to be traced through `EXPLAIN` opcodes
  (`OpenRead` / `Column` / `Rowid` / `ResultRow`) back to physical table columns,
  then joined against `PRAGMA table_info` for the declared type.
- Nullability has to be derived from joins, subqueries, and `COALESCE` usage
  instead of reported by the engine. Marmot tracks nullable-cursor sets
  through the opcode trace.
- Computed columns (`COUNT`, `CAST`, `COALESCE`, `CASE`, `SUM`, window
  functions) carry no type information and need a shape-based inference
  fallback that pattern-matches on expression text.
- Parameter types are reverse-engineered from usage sites (`WHERE col = ?`,
  `LIMIT ?`, `CAST(? AS TEXT)`) instead of reported.
- Repeated `@name` parameters need a dedup/unification pass to pick a single
  inferred type across occurrences.
- SQLite-specific quirks (`WITHOUT ROWID`, `STRICT` tables, `INTEGER PRIMARY
  KEY` aliasing `rowid`, affinity rules) have no Postgres analogue and
  required their own handling.

Squirrel's output shape is the right one for SQLite too, and we've kept it.
The path from SQL to "here are the types" is a different problem in SQLite,
so Marmot's type-inference pipeline is its own.

## Credits

Marmot's design, conventions, and approach are directly inspired by
[Squirrel](https://github.com/giacomocavalieri/squirrel) by
[Giacomo Cavalieri](https://github.com/giacomocavalieri).
Squirrel targets Postgres with beautiful ergonomics. Marmot brings that same
experience to SQLite. In fact, Marmot aims to be a near 1:1 mirror of Squirrel's
syntax and conventions, so switching between the two should feel seamless.

If you use Postgres, use Squirrel. If you use SQLite, use Marmot.

## Contributing

If you think there's any way to improve this package, or if you spot a bug don't
be afraid to open PRs, issues or requests of any kind! Any contribution is
welcome.

## License

[MIT](https://github.com/pairshaped/marmot/blob/master/LICENSE)