CHANGELOG.md

# v0.6.0

## New 🔥

**BM25 Full-Text Search** is now available via the new `Torus.bm25/5` macro!

[BM25](https://en.wikipedia.org/wiki/Okapi_BM25) is a modern ranking algorithm that generally provides superior relevance scoring compared to traditional TF-IDF (used by `full_text/5`). This integration uses the [pg_textsearch](https://github.com/timescale/pg_textsearch) extension by Timescale.

**See it in action [on the demo page](https://torus.dimamik.com/?method=bm25)**

Key features:

- State-of-the-art BM25 ranking with configurable index parameters (k1, b)
- Blazingly fast top-k queries via Block-Max WAND optimization (`Torus.bm25/5` + `limit`)
- Simple syntax: `Post |> Torus.bm25([p], p.body, "search term") |> limit(10)` (see the fuller sketch after this list)
- Score selection with `:score_key` and post-filtering with `:score_threshold`
- Language/stemming configured at index creation via `text_config`
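
Putting these options together, a minimal sketch (the `Post` schema, `body` column, `MyApp.Repo`, search term, and threshold value are placeholders for illustration):

```elixir
import Ecto.Query
import Torus

# Top-10 posts ranked by BM25. The score is selected into each result under
# :score, and low-scoring rows are filtered out by the (illustrative) threshold.
Post
|> Torus.bm25([p], p.body, "how to configure ecto",
  score_key: :score,
  score_threshold: 1.0
)
|> limit(10)
|> MyApp.Repo.all()
```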

Requirements:

- PostgreSQL 17+
- pg_textsearch extension installed
- BM25 index on the search column (with `text_config` for language)

See the [BM25 Search Guide](https://dimamik.com/posts/bm25_search) for detailed setup instructions and examples.
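
The guide is the source of truth for the index setup; purely as a hypothetical orientation, an Ecto migration for such an index might look like the sketch below. The extension name in `CREATE EXTENSION`, the `USING bm25` access method, and the `text_config` storage parameter are assumptions inferred from the notes above, not verified pg_textsearch DDL.

```elixir
defmodule MyApp.Repo.Migrations.AddPostsBodyBm25Index do
  use Ecto.Migration

  def up do
    # Hypothetical DDL: check the BM25 Search Guide / pg_textsearch docs for
    # the exact extension name, index access method, and options.
    execute("CREATE EXTENSION IF NOT EXISTS pg_textsearch")

    execute("""
    CREATE INDEX posts_body_bm25_index ON posts
    USING bm25 (body)
    WITH (text_config = 'english')
    """)
  end

  def down do
    execute("DROP INDEX IF EXISTS posts_body_bm25_index")
    execute("DROP EXTENSION IF EXISTS pg_textsearch")
  end
end
```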

**When to use BM25 vs full_text:**

- Use `bm25/5` for fast single-column search with modern relevance ranking
- Use `full_text/5` for multi-column search with weights or when using stored tsvector columns (see the side-by-side sketch below)
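
For orientation, a hedged side-by-side sketch of the two calls (schemas, fields, and the search term are placeholders; weights and stored `tsvector` columns for `full_text/5` are configured through its options, documented on the macro):

```elixir
import Ecto.Query
import Torus

# Fast single-column top-k search with BM25 ranking.
Post
|> Torus.bm25([p], p.body, "elixir full text search")
|> limit(10)
|> MyApp.Repo.all()

# Multi-column search with full_text/5 (TF-IDF-style ranking; weights and
# stored tsvector columns are available via its options).
Post
|> Torus.full_text([p], [p.title, p.body], "elixir full text search")
|> MyApp.Repo.all()
```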

# v0.5.3

## Fixes

- `Torus.Embeddings.OpenAI` now correctly parses the updated response structure from the OpenAI API.
- The `exla` Nx backend is now correctly supported in `Torus.Embeddings.LocalNxServing`.

# v0.5.2

## New 🔥

- New [demo page](https://torus.dimamik.com) where you can explore different search types and their options. It also includes semantic search, so if you're hesitant - go check it out!
- Other documentation improvements

## Fixes

- Correctly handles `order: :none` in `Torus.semantic/5` search.
- Updates `Torus.Embeddings.HuggingFace` to point to the updated feature extraction endpoint.
- Suppresses warnings for missing `ecto_sql` dependency by adding it to the required dependencies. Most of us already had it, but now it'll be explicit.
- Correctly parses an array of integers in `Torus.QueryInspector.substituted_sql/3` and `Torus.QueryInspector.tap_substituted_sql/3`. Now we should be able to handle all possible query variations.

# v0.5.1

- Adds `Torus.Embeddings.Gemini` to support Gemini embeddings.
- Extends semantic search docs with guidance on how to stack embedders.
- Adds a `:distance_key` option to `Torus.semantic/5` that selects the distance into the result map under the given key (see the sketch below). Later on we'll rely on this to support hybrid search.
- Correctly swaps `>` and `<` operators for pre-filtering when changing order in `Torus.semantic/5` search.
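
A hedged sketch of the new option (the schema, `embedding` column, repo, and search term are placeholders; `Torus.to_vector/1` is assumed here to produce the query embedding via the configured embedder, as described in the semantic search guide):

```elixir
import Ecto.Query
import Torus

# Turn the search term into an embedding using the configured embedder.
search_vector = Torus.to_vector("wizarding school")

# Each result map now carries its distance under :distance.
Post
|> Torus.semantic([p], p.embedding, search_vector, distance_key: :distance)
|> limit(5)
|> MyApp.Repo.all()
```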

# v0.5.0

- Similarity search type now defaults to `:word_similarity` instead of `:similarity`.
- Possible `Torus.similarity/5` search types are updated to be prefixed with `similarity` to replicate 1-to-1 those in the `pg_trgm` extension.
- Extended the optimization section in the docs.

# v0.4.1

Minor doc updates

# v0.4.0

## Breaking changes:

- `Torus.full_text/5` - now returns all results (instead of none) when the search term contains a stop word or is empty.

## Improvements:

- `Torus.full_text/5` - now supports an `:empty_return` option that controls whether the query returns all results when the search term contains a stop word or is empty (see the sketch after this list).
- `Torus.QueryInspector.tap_explain_analyze/3` - now correctly returns the query plan.
- Docs were grouped together by the search type.
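
A hedged sketch of the new option (the boolean value, schema, and fields are assumptions for illustration; the option itself is documented on `Torus.full_text/5`):

```elixir
import Torus

# With empty_return disabled, a search term that contains a stop word or is
# empty yields no results instead of all of them.
Post
|> Torus.full_text([p], [p.title, p.body], "the", empty_return: false)
|> MyApp.Repo.all()
```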

## New 🔥

**Semantic search** is finally here! Read more about it in the [Semantic search with Torus](/guides/semantic_search.md) guide.
In short, it allows you to generate embeddings using configurable adapters and compare them against the ones stored in your database.

Supported adapters (for now):

- `Torus.Embeddings.OpenAI` - uses OpenAI's API to generate embeddings.

- `Torus.Embeddings.HuggingFace` - uses HuggingFace's API to generate embeddings.

- `Torus.Embeddings.LocalNxServing` - generates embeddings on your local machine using a variety of models available on Hugging Face.

- `Torus.Embeddings.PostgresML` - uses the PostgreSQL [PostgresML extension](https://PostgresML.org/docs) to generate embeddings.

- `Torus.Embeddings.Batcher` - a long‑running **GenServer** that collects individual embedding calls, groups them into a single batch, and forwards the batch to the configured `embedding_module` (any from the above or your custom one).

- `Torus.Embeddings.NebulexCache` - a wrapper around [Nebulex](https://hexdocs.pm/nebulex/readme.html) cache, allowing you to cache the embedding calls in memory, so you save the resources/cost of calling the embedding module multiple times for the same input.

And you can easily create your own adapter by implementing the `Torus.Embedding` behaviour.
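
As described above, `Torus.Embeddings.Batcher` and `Torus.Embeddings.NebulexCache` forward work to another configured `embedding_module`, so adapters can be stacked. A hypothetical config sketch of such stacking (the exact config keys are defined in the semantic search guide; the ones below are assumptions):

```elixir
# config/config.exs (hypothetical keys; consult the guide for the real ones)
import Config

# Callers hit the cache first...
config :torus, embedding_module: Torus.Embeddings.NebulexCache

# ...cache misses are grouped into batches...
config :torus, Torus.Embeddings.NebulexCache,
  embedding_module: Torus.Embeddings.Batcher

# ...and each batch is embedded via OpenAI.
config :torus, Torus.Embeddings.Batcher,
  embedding_module: Torus.Embeddings.OpenAI
```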

# v0.3.0

## Breaking changes:

- `full_text_dynamic/5` is renamed to `full_text/5` and now supports stored columns.
- `similarity/5` - `limit` option is removed, use Ecto's `limit/2` instead.
- `full_text/5` - `:concat` option is renamed to `:coalesce`.

## Improvements:

- `full_text/5` now supports stored `tsvector` columns.
- `Torus.QueryInspector.substituted_sql/3` now correctly handles arrays substitutions.
- Docs are extended to guide you through performance and relevance.

And other minor performance and clarity improvements.

# v0.2.2

- `full_text_dynamic/5`: Replaced `:nullable_columns` with `:concat` option
- `similarity/5`: Fixed a bug where you weren't able to pass variable as a term
- `Torus.QueryInspector`: is no longer tied to `Torus.Testing` and serves as a standalone module.

And other minor performance and clarity improvements.

# v0.2.1

`similarity/5` search is now fully tested and customizable. `full_text_dynamic/5` is up next.

# v0.2.0

Torus now supports full-text search, `ilike`, and similarity search.