# PartitionedEts
[](https://github.com/twinn/partitioned_ets/actions/workflows/ci.yml)
[](https://hex.pm/packages/partitioned_ets)
[](https://hexdocs.pm/partitioned_ets)
A distributed, partitioned ETS table for Elixir.
`PartitionedEts` exposes the full `:ets` API, but routes every operation to a
single `{node, partition}` shard selected by
[rendezvous hashing (HRW)](https://en.wikipedia.org/wiki/Rendezvous_hashing)
over the cluster. On membership changes it transfers the affected entries
between shards so that data and routing remain in sync.
## Installation
Add `:partitioned_ets` to your list of dependencies in `mix.exs`:
```elixir
def deps do
[
{:partitioned_ets, "~> 0.1.0"}
]
end
```
Requires Elixir `~> 1.15` and OTP 26+.
## Usage
### Module form
Every function takes the table name as its first argument, mirroring `:ets`:
```elixir
# In a supervision tree
children = [
{PartitionedEts,
name: :my_cache, table_opts: [:named_table, :public], partitions: 16}
]
# Operations
PartitionedEts.insert(:my_cache, {:key, :value})
PartitionedEts.lookup(:my_cache, :key)
#=> [{:key, :value}]
PartitionedEts.delete(:my_cache, :key)
PartitionedEts.match(:my_cache, {:"$1", :"$2"})
PartitionedEts.foldl(:my_cache, fn {_k, v}, acc -> v + acc end, 0)
```
### Macro form
Defines one module per table with wrapper functions that omit the table argument:
```elixir
defmodule MyApp.Cache do
use PartitionedEts, table_opts: [:named_table, :public], partitions: 16
end
# In a supervision tree
children = [MyApp.Cache]
# Operations
MyApp.Cache.insert({:key, :value})
MyApp.Cache.lookup(:key)
```
Both forms share the same GenServer and partition layout. The macro generates
thin delegation functions.
## How it works
### Shards
A shard is a `{node, partition_table_atom}` pair. On each node hosting the
table, `PartitionedEts.start_link/1` creates `N` named ETS tables. With
`partitions: 1` (the default) the single ETS table is named after the table
itself; with `partitions: N > 1` they are named `:"<table>_p0"` through
`:"<table>_p<N-1>"`. The cluster-wide shard set is the cross product of all
nodes and all partition tables.
### Routing
Single-key operations use rendezvous hashing to select a shard. Local
operations go directly to ETS; remote operations are dispatched via
`:erpc.call`. Fan-out operations (`match/2`, `select/2`, `tab2list/1`,
`foldl/3`, etc.) iterate every shard and merge results.
Adding or removing a node remaps only approximately `1/total_shards` of keys.
### Handoff
On graceful join, every existing node iterates its local entries, identifies
keys whose new HRW shard is on the joining node, and ships them via `:erpc`.
On graceful leave, the leaving node ships every local entry to its new owner
using a two-pass strategy that avoids read misses during the transfer.
### Consistency model
PartitionedEts is **eventually consistent** with **last-writer-wins**
semantics during topology changes. In steady state (no nodes joining or
leaving), every operation is immediately consistent.
During a join or leave handoff, there is a brief window where concurrent
writes to the same key on different nodes can conflict. Conflicts are resolved
by a monotonic clock: the write with the higher clock value wins. There are no
locks or blocking during handoff — reads and writes continue to be served
throughout.
## Configuration
| Option | Type | Default | Description |
|---|---|---|---|
| `:name` | `atom` | required | Logical table name. Used as the ETS table name (when `partitions: 1`) or prefix, the PgRegistry group key, and the `:via` registration name. |
| `:table_opts` | `list` | required | Passed to `:ets.new/2`. Must include `:named_table` and `:public`. |
| `:partitions` | `pos_integer` | `1` | Number of ETS tables per node. Higher values reduce write contention at the cost of increased fan-out. |
| `:callbacks` | `module \| nil` | `nil` | Optional module exporting any subset of `hash/2`, `node_added/1`, `node_removed/1`. |
| `:distributed` | `boolean` | `true` | When `false`, the table runs on a single node with no cluster registration or handoff. Useful for write-contention relief without distribution overhead. |
## Limitations
- **Hard crashes lose data.** If a node terminates without running `terminate/2`,
the entries it owned are lost. Replication is not implemented.
- **Handoff blocks the owner GenServer.** For large tables, handoff may exceed
the default supervisor shutdown timeout. Increase the `shutdown:` value in
the child spec if needed.
- **Ordered-set semantics are not preserved across shards.** `first/1`,
`last/1`, `next/2`, and `prev/2` iterate shards in a deterministic but not
globally ordered sequence.
- **Pagination is in-memory.** The first call to `match/3` or `select/3`
materializes the full cluster-wide result set before returning chunks.
## Running the tests
```bash
epmd -daemon # required for the :peer cluster tests
mix test
```
## License
MIT -- see [LICENSE](LICENSE).