# Bombadil
![Lord of the rings, Tom Bombadil](/img/ring.png)
## You know, for (PostgreSQL) search
Bombadil is a wrapper around some PostgreSQL search capabilities.
It supports exact match through PostgreSQL tsvector(s) and fuzzy search inside
jsonb fields.
**This is a working proof of concept ;) I plan to iterate and improve it. If you want to contribute you are welcome!**
# Installation
The package can be installed by adding carry to your list of dependencies in mix.exs:
```elixir
def deps do
[
{:bombadil, "~> 0.1"}
]
end
```
Documentation is at https://hexdocs.pm/bombadil
# Preparation
```elixir
# Create and migrate the "search_index" table
mix do ecto.create, ecto.migrate
```
# Basic Usage
## (Almost) Exact match of an indexed word or sequence of words
```elixir
# Full string provided
alias Bombadil.Ecto.Schema.SearchIndex
iex> Bombadil.index(SearchIndex, payload: %{"book" => "Lord of the Rings"})
# Raw SQL: INSERT INTO "search_index" ("payload") VALUES ('{"book": "Lord of the Rings"}')
:ok
iex> Bombadil.search("Lord of the Rings")
# Raw SQL: SELECT s0."id", s0."payload" FROM "search_index" AS s0 WHERE (to_tsvector('simple', payload::text) @@ plainto_tsquery('simple', 'Lord of the Rings'))
[
%Bombadil.Ecto.Schema.SearchIndex{
__meta__: #Ecto.Schema.Metadata<:loaded, "search_index">,
payload: %{"book" => "Lord of the Rings"},
id: 1
}
]
# One word provided (treated as case-insensitive)
iex> Bombadil.search("lord")
# Raw SQL: SELECT s0."id", s0."payload" FROM "search_index" AS s0 WHERE (to_tsvector('simple', payload::text) @@ plainto_tsquery('simple', 'lord'))
[
%Bombadil.Ecto.Schema.SearchIndex{
__meta__: #Ecto.Schema.Metadata<:loaded, "search_index">,
payload: %{"book" => "Lord of the Rings"},
id: 1
}
]
# No results
iex> Bombadil.search("lordz")
# Raw SQL: SELECT s0."id", s0."payload" FROM "search_index" AS s0 WHERE (to_tsvector('simple', payload::text) @@ plainto_tsquery('simple', 'lordz'))
[]
```
## Fuzzy match
```elixir
iex> Bombadil.fuzzy_search(SearchIndex, "lord of the ringz")
# Raw SQL: SELECT s0."id", s0."payload" FROM "search_index" AS s0 WHERE (payload::text %> 'lord of the ringz')
[
%Bombadil.Ecto.Schema.SearchIndex{
__meta__: #Ecto.Schema.Metadata<:loaded, "search_index">,
payload: %{"book" => "Lord of the Rings"},
id: 1
}
]
# No results
iex> Bombadil.fuzzy_search(SearchIndex, "lard of the ringz asdf")
# Raw SQL: SELECT s0."id", s0."payload" FROM "search_index" AS s0 WHERE (payload::text %> 'lord of the ringz asdf')
[]
```
# Match a specific field inside the indexed jsonb field
## (Almost) Exact match
```elixir
iex> Bombadil.index(SearchIndex, payload: %{"character" => "Tom Bombadil"})
:ok
iex> Bombadil.search([%{"book" => "rings"}])
# Raw SQL: SELECT s0."id", s0."payload" FROM "search_index" AS s0 WHERE (FALSE OR to_tsvector('simple', (payload->'book')::text) @@ plainto_tsquery('simple', 'rings'))
[
%Bombadil.Ecto.Schema.SearchIndex{
__meta__: #Ecto.Schema.Metadata<:loaded, "search_index">,
payload: %{"book" => "Lord of the Rings"},
id: 1
}
]
iex> Bombadil.search([%{"character" => "bombadil"}])
# Raw SQL: SELECT s0."id", s0."payload" FROM "search_index" AS s0 WHERE (FALSE OR to_tsvector('simple', (payload->'character')::text) @@ plainto_tsquery('simple', 'bombadil'))
[
%Bombadil.Ecto.Schema.SearchIndex{
__meta__: #Ecto.Schema.Metadata<:loaded, "search_index">,
payload: %{"character" => "Tom Bombadil"},
id: 3
}
]
```
## Fuzzy match
```elixir
iex> Bombadil.fuzzy_search(SearchIndex, [%{"book" => "lard"}])
# Raw SQL: SELECT s0."id", s0."payload" FROM "search_index" AS s0 WHERE (FALSE OR (payload->'book')::text %> 'lard')
[
%Bombadil.Ecto.Schema.SearchIndex{
__meta__: #Ecto.Schema.Metadata<:loaded, "search_index">,
payload: %{"book" => "Lord of the Rings"},
id: 1
}
]
iex> Bombadil.fuzzy_search(SearchIndex, [%{"character" => "tom bomba"}])
# Raw SQL: SELECT s0."id", s0."payload" FROM "search_index" AS s0 WHERE (FALSE OR (payload->'character')::text %> 'tom bomba')
[
%Bombadil.Ecto.Schema.SearchIndex{
__meta__: #Ecto.Schema.Metadata<:loaded, "search_index">,
payload: %{"character" => "Tom Bombadil"},
id: 3
}
]
iex> Bombadil.fuzzy_search(SearchIndex, [%{"book" => "tom"}])
# Raw SQL: SELECT s0."id", s0."payload" FROM "search_index" AS s0 WHERE (FALSE OR (payload->'book')::text %> 'tom bomba')
[]
```
# Matching search results with an additional context
Given a `search_index` that has also an `item_id` field as an additional context
for filtering, let's suppose you index also the `item_id` and you might have the
same `item_id` for another entry with different `payload` field like in this snippet:
```elixir
iex> Bombadil.index(SearchIndex, item_id: 42, payload: %{"ask" => "lord of the rings"})
iex> Bombadil.index(SearchIndex, item_id: 42, payload: %{"ask" => "I am hiding with the same id, don't find me!"})
```
You can apply a search and a context to filter a specific id, in this case `item_id` is `42`
and the payload that is "hiding" is filtered out.
```elixir
iex> Bombadil.fuzzy_search(SearchIndex, "lord of the ringz", context: %{item_id: 42})
[
%Bombadil.Ecto.Schema.SearchIndex{
__meta__: #Ecto.Schema.Metadata<:loaded, "search_index">,
payload: %{"ask" => "lord of the rings"},
id: 6,
item_id: 42
}
]
```
# Encoding to JSON
```elixir
iex> Bombadil.search("rings") |> Jason.encode!()
"[{\"payload\":{\"book\":\"Lord of the Rings\"}}]"
```
# Integration with your application
## Manual integration
### Define your schema
```elixir
defmodule YourApp.Ecto.Schema.SearchIndex do
use Ecto.Schema
@fields [:field1, :field2]
@derive {Jason.Encoder, only: @fields}
schema "your_table" do
field(:payload, :map)
field(:field1, :string)
field(:field2, :string)
end
end
```
### (optional) Define your schema with a different column name
If you want to use a different name for the `payload` field, you can define your schema in this way:
```elixir
defmodule YourApp.Ecto.Schema.SearchIndex do
use Ecto.Schema
schema "search_index" do
field(:data, :map, source: :payload)
field(:item_id, :integer)
end
end
```
Notice how the:
```elixir
field(:data, :map, source: :payload)
```
specifies via the `field` macro another name for the `:payload` column
### Generate an Ecto migration
```bash
mix ecto.gen.migration create_search_index # For example
```
### Implement the migration
```elixir
defmodule YourApp.Repo.Migrations.CreateSearchIndex do
use Ecto.Migration
def change do
create table("search_index") do
add(:payload, :jsonb)
add(:field1, :string)
add(:field2, :string)
end
end
end
```
## Through configuration (via macros)
```elixir
config :bombadil, Bombadil.Repo,
database: "your_database",
username: "postgres",
password: "postgres",
hostname: "localhost"
config :bombadil,
table_name: "search_index",
additional_fields: [
{:your_special_id, :string},
[...]
]
```
### Generate an Ecto migration
```bash
mix ecto.gen.migration create_search_index # For example
```
### Wire the DynamicMigration based on your configuration
```elixir
defmodule YourApp.Repo.Migrations.CreateSearchIndex do
use Ecto.Migration
require Bombadil.Ecto.DynamicMigration
def change do
Bombadil.Ecto.DynamicMigration.run()
end
end
```
### Run the migration
```elixir
mix ecto.migrate
```
And implement indexing and search for your use case by using the
`Bombadil.fuzzy_search/2` and `Bombadil.index/2` functions
# TODO
[ ] Port user schema to `Bombadil.search`