# Postgres and ParadeDB
Postgres is Exograph's built-in backend. Exograph uses Ecto migrations, schemas,
`Repo` operations, and transactions for source files, fragments, comments,
definitions, references, package metadata, and call graph facts.
ParadeDB's `pg_search` extension is optional and accelerates BM25 text/code-fact
retrieval when available.
## Indexing with Postgres
```elixir
{:ok, index} =
Exograph.index("lib",
repo: MyApp.Repo,
migrate?: true,
bm25?: true
)
```
`backend: :postgres` is accepted explicitly, but Postgres is the only built-in backend.
## Tables
The canonical migration creates normalized Ecto-backed tables under the configured
prefix (default: `exograph`):
- `exograph_packages`
- `exograph_package_versions`
- `exograph_files`
- `exograph_fragments`
- `exograph_comments`
- `exograph_definitions`
- `exograph_references`
- `exograph_graph_nodes`
- `exograph_call_edges`
- `exograph_tree_nodes`
Source text is stored once in `exograph_files`. Fragments carry package, version,
and file foreign keys.
## Indexes
### Structural search — `(kind, name, arity)` btree
A btree index on `(kind, name, arity)` on the fragments table lets structural
queries that know the kind or name skip the GIN term index entirely and go
straight to a btree range scan. This is the hot path for patterns like
`def handle_call(_, _, _) do ... end` where kind=`def` and name=`handle_call`
are extracted at query planning time.
### Term index — GIN
The inverted index on fragment terms (`exograph_terms`) uses a GIN index. Terms
are extracted by ExAST at indexing time and stored as normalized strings.
Candidate retrieval scans the GIN index; ExAST verification follows.
### Text search — `pg_trgm` GIN
`pg_trgm` GIN indexes on `files.source` and `files.comments_text` enable fast
`ILIKE` and `~*` regex without a full table scan:
```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX CONCURRENTLY ON exograph_files USING gin (source gin_trgm_ops);
CREATE INDEX CONCURRENTLY ON exograph_files USING gin (comments_text gin_trgm_ops);
```
Migrations create these automatically. Without `pg_trgm`, text/regex search
falls back to sequential scans.
## ParadeDB / `pg_search`
When ParadeDB's `pg_search` extension is installed, `migrate?: true` creates BM25
indexes over source files, comments, definitions, and references. Source files use
ParadeDB's `pdb.source_code` tokenizer; symbol names use `pdb.ngram` for
prefix/partial matching.
Raw SQL is limited to areas Ecto cannot express directly:
- `CREATE EXTENSION`
- ParadeDB `USING bm25`
- ParadeDB tokenizer casts such as `source::pdb.source_code`
- ParadeDB operators (`|||`, `&&&`)
- ParadeDB scoring (`pdb.score(...)`)
## Recommended Postgres settings
Default Postgres settings are conservative. On a 13.8M fragment index, these
settings reduce structural search time from ~600ms to ~78ms and enable parallel
BM25 scans.
```sql
-- Parallel workers for ParadeDB BM25 scans
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;
ALTER SYSTEM SET max_parallel_workers = 16;
ALTER SYSTEM SET max_worker_processes = 24;
-- Memory — adjust to available RAM
ALTER SYSTEM SET shared_buffers = '4GB'; -- ~25% of RAM
ALTER SYSTEM SET effective_cache_size = '12GB'; -- ~75% of RAM
ALTER SYSTEM SET work_mem = '256MB'; -- per-operation sort/hash
ALTER SYSTEM SET maintenance_work_mem = '1GB'; -- for CREATE INDEX / VACUUM
SELECT pg_reload_conf();
-- shared_buffers requires a Postgres restart
```
After restarting, prewarm BM25 indexes into the buffer cache:
```sql
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('hex_files_bm25_idx');
SELECT pg_prewarm('hex_fragments_bm25_idx');
SELECT pg_prewarm('hex_definitions_bm25_idx');
```
For write-heavy indexing runs, increase parallelism for index creation:
```sql
SET max_parallel_maintenance_workers = 8;
SET maintenance_work_mem = '2GB';
```
## Fallback behavior
Without ParadeDB, text/regex search uses Postgres `ILIKE` (accelerated by
`pg_trgm` GIN indexes) and `~*`. Without `pg_trgm`, text search falls back to a
sequential scan — usable for small indexes but slow at scale.
## Testing
```bash
EXOGRAPH_DATABASE_URL=postgres://postgres:postgres@localhost:5432/exograph_test \
mix test
```