# FeistelCipher
Encrypted integer IDs using Feistel cipher
> **Database Support**: PostgreSQL only (uses PostgreSQL triggers and functions)
## Why?
**Problem**: Sequential IDs (1, 2, 3...) leak business information:
- Competitors can estimate your growth rate
- Users can enumerate resources (`/posts/1`, `/posts/2`...)
- Total record counts are exposed
**Common Solutions & Issues**:
- **UUIDs**: Strong uniqueness, but values differ across seed runs and are often too long for URLs (`/posts/550e8400-e29b-41d4-a716-446655440000`)
- **Random integers**: Shorter than UUIDs, but introduce collision risk and extra generation complexity
**This Library's Approach**:
- Store sequential integers internally
- Expose encrypted integers externally (non-sequential, unpredictable)
- Deterministic cipher core: the same `seq` value always maps to the same encrypted data component
- Automatic encryption via database trigger
- Adjustable bit size per column
- **Time-based prefix** for PostgreSQL incremental backup optimization
> If you need fully stable IDs across seed runs/environments, use `time_bits: 0` so IDs are generated from the ciphered data component only.
## Installation
> **Using Ash Framework?**
>
> If you're using [Ash Framework](https://ash-hq.org/), use [`ash_feistel_cipher`](https://github.com/devall-org/ash_feistel_cipher) instead! It provides a declarative DSL to configure Feistel cipher encryption directly in your Ash resources.
>
> For plain Ecto users, continue below.
### Using igniter (Recommended)
```bash
mix igniter.install feistel_cipher
```
### Manual Installation
```elixir
# mix.exs
def deps do
  [{:feistel_cipher, "~> 1.0"}]
end
```
Then run:
```bash
mix deps.get
mix feistel_cipher.install
```
> ⚠️ `mix feistel_cipher.install` is provided by Igniter. If your project does not use Igniter, create a migration manually and call `FeistelCipher.up_v1_functions/1` in `up` and `FeistelCipher.down_v1_functions/1` in `down`.
### Installation Options
Both methods support the following options:
* `--repo` or `-r`: Specify an Ecto repo (optional if auto-detection finds one)
* `--functions-prefix` or `-p`: PostgreSQL schema prefix (default: `public`)
* `--functions-salt` or `-s`: Cipher salt constant, max 2^31-1 (default: randomly generated)
> ⚠️ **Security Note**: A cryptographically random salt is generated by default for each project. This ensures that encryption patterns cannot be analyzed across different projects. Never use the same salt across multiple production projects.
> **Fun Fact**: Notice the timestamp `19730501000000` in the migration file generated during installation? That's May 1, 1973 - the day [Horst Feistel published his groundbreaking paper](https://en.wikipedia.org/wiki/Feistel_cipher#History) at IBM, introducing the cipher structure that powers this library. We thought it deserved a permanent timestamp in your database history!
## Upgrading from v0.x
See [UPGRADE.md](UPGRADE.md) for the migration guide.
## Usage Example
### 1. Create Migration
```elixir
defmodule MyApp.Repo.Migrations.CreatePosts do
  use Ecto.Migration

  def up do
    create table(:posts) do
      add :seq, :bigserial
      add :title, :string
    end

    # 1 day buckets
    execute FeistelCipher.up_for_v1_trigger("public", "posts", "seq", "id",
      time_bucket: 86400
    )
  end

  def down do
    execute FeistelCipher.down_for_v1_trigger("public", "posts", "seq", "id")
    drop table(:posts)
  end
end
```
### 2. Define Schema
```elixir
defmodule MyApp.Post do
  use Ecto.Schema

  # Hide seq in API responses
  @derive {Jason.Encoder, except: [:seq]}

  schema "posts" do
    field :seq, :id, read_after_writes: true
    field :title, :string
  end
end
```
The `read_after_writes: true` option tells Ecto to fetch the `seq` value after INSERT (since it's generated by the database).
Now when you insert a record, `seq` auto-increments and the trigger automatically sets `id = [time_prefix | feistel_cipher_v1(seq)]`:
```elixir
%Post{title: "Hello"} |> Repo.insert!()
# => %Post{id: 8234567890123, seq: 1, title: "Hello"}
# In API responses, only id is exposed (seq is hidden)
```
**Security Note**: Keep `seq` internal. Only expose `id` in APIs to prevent enumeration attacks.
## ID Structure
The generated ID has the structure `[time_bits | data_bits]`:
```
┌─────────────────┬──────────────────────────────────────────┐
│ time_bits       │ data_bits                                │
│ (15 bits)       │ (38 bits)                                │
│ time prefix     │ feistel_cipher_v1(seq)                   │
└─────────────────┴──────────────────────────────────────────┘
```
- **time_bits** (upper): Derived from current time. Rows created in the same time bucket share the same prefix, clustering them on nearby PostgreSQL pages.
- **data_bits** (lower): The sequential value encrypted with Feistel cipher.
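
As a rough illustration of this layout, the sketch below packs and unpacks the two fields with bit shifts. The module and function names (`IdLayoutSketch`, `compose/3`, `split/2`) are hypothetical and not part of the library. The only assumption taken from the description above is that the time prefix occupies the upper bits and the ciphered value the lower `data_bits`.

```elixir
defmodule IdLayoutSketch do
  import Bitwise

  # Hypothetical helper: place the time prefix above the ciphered data component.
  def compose(time_prefix, data_component, data_bits) do
    (time_prefix <<< data_bits) ||| data_component
  end

  # Hypothetical helper: recover {time_prefix, data_component} from an ID.
  def split(id, data_bits) do
    {id >>> data_bits, id &&& ((1 <<< data_bits) - 1)}
  end
end

# Round trip: splitting a composed ID gives back both fields.
id = IdLayoutSketch.compose(5, 123_456, 38)
{5, 123_456} = IdLayoutSketch.split(id, 38)
```

Because the prefix sits in the high bits, all IDs from one time bucket fall in one contiguous integer range, which is what gives the clustering effect described above.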
### Why Time Prefix?
PostgreSQL incremental backups (e.g., pg_basebackup with WAL, pgBackRest) back up entire **pages** (8KB blocks). Without a time prefix, the Feistel cipher distributes IDs uniformly across all pages, meaning each new row touches a different page and incremental backups become as large as full backups.
With a time prefix, rows from the same time bucket land on nearby pages, so incremental backups only need to capture the recently-modified pages.
### When to Use Time Prefix (`time_bits > 0`)
Use a time prefix when you want write locality and smaller incremental backups on large/high-write tables.
- Example: `events`, `logs`, `orders`, `messages` tables that receive continuous inserts.
- Typical config: `time_bits: 15`, `time_bucket: 86400` (daily, default) or `3600` (hourly for tighter locality windows).
- With `time_bits: 15`, `time_bucket: 86400`, and `encrypt_time: false`, the time prefix wraps after about 89 years 9 months.
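
The wrap period quoted above follows from multiplying the number of distinct prefixes by the bucket length. As a worked check, taking a year as 365.25 days:

$$
\begin{aligned}
2^{15} \times 86400\ \text{s} &= 32768\ \text{days} \\
\frac{32768}{365.25} &\approx 89.7\ \text{years}
\end{aligned}
$$

With `time_bucket: 3600` and the same `time_bits`, the period shrinks to 32768 hours, or roughly 3.7 years.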
### When NOT to Use Time Prefix (`time_bits: 0`)
Disable time prefix when you only need opaque IDs and don't need backup/page-locality optimization.
- Example: small reference tables (`countries`, `roles`, `currencies`) or low-write admin/config tables.
- Also useful when you want the simplest mode: `id = feistel_cipher_v1(seq)` with no time component.
## Trigger Options
`up_for_v1_trigger/5` takes 4 positional arguments and an options keyword list:
- Positional arguments: `prefix`, `table`, `from`, `to`
- Options:
> ⚠️ **Important**: Parameter changes should be handled as explicit migrations. Some options (like `time_bits`/`time_bucket`/`encrypt_time`) can be changed technically, but old/new IDs will use different semantics. Core cipher options (`data_bits`/`key`/`rounds`) should be treated as immutable in-place.
- `time_bits`: Time prefix bits (default: 15). Set to 0 for no time prefix
- `time_bucket`: Time bucket size in seconds (default: `86400`)
  - Example: `86400` for 1 day (default), `3600` for 1 hour
  - Rows inserted within the same bucket share the same time prefix
- `time_offset`: Time offset in seconds applied before bucket calculation (default: `0`)
  - Formula: `time_value = floor((epoch + time_offset) / time_bucket)`
  - Sign convention: positive values move the bucket boundary earlier in the day; negative values move it later
  - Example: `time_bucket: 86400`, `time_offset: 21600` shifts the daily boundary from `00:00 UTC` to `18:00 UTC` (`03:00 KST`)
  - Use this when business day boundaries differ from UTC midnight, or when multiple countries need a stable operational cutover time
- `encrypt_time`: Whether to encrypt the time prefix with Feistel cipher (default: `false`)
  - `false`: Time prefix may reflect recent bucket progression, but it is **not** a globally orderable timestamp
  - `true`: Time prefix is encrypted (hides time patterns, but same-bucket rows still share a prefix). `time_bits` must be even
- `data_bits`: Data cipher bits (default: 38, must be even)
  - **Choose different sizes per column**: Unlike UUIDs (fixed 16 bytes), tailor each column's ID length
  - Example: User ID = 32 bits (~4B values), Post ID = 40 bits (~1T values)
  - Input values in `from` must fit this range (`0..2^data_bits-1`), or INSERT/UPDATE fails with a database error
- `rounds`: Number of Feistel rounds (default: 16, min: 1, max: 32)
  - **Default 16** provides a good security/performance balance
  - **Note**: Diagrams and proofs in this README use 2 rounds for simplicity
  - More rounds increase security but make each encryption slower
  - Odd rounds (1, 3, 5...) and even rounds (2, 4, 6...) are both supported
- `key`: Encryption key (auto-generated if not specified)
- `functions_prefix`: Schema where cipher functions reside (default: `public`)
**Constraints**:
- `time_bits + data_bits` must be ≤ 63 when `encrypt_time: false`, and ≤ 62 when `encrypt_time: true`
- `time_bits` must be even when `encrypt_time: true`
- `data_bits` must be even
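
To make these constraints concrete, here is a minimal sketch of a pre-flight check you could run in your own code before writing a migration. It is not the library's validation logic: `TriggerOptionsCheck.validate/1` and its error atoms are hypothetical, and the defaults (`time_bits: 15`, `data_bits: 38`, `encrypt_time: false`) are copied from the option list above.

```elixir
defmodule TriggerOptionsCheck do
  # Hypothetical helper mirroring the documented constraints; not part of FeistelCipher.
  def validate(opts) do
    time_bits = Keyword.get(opts, :time_bits, 15)
    data_bits = Keyword.get(opts, :data_bits, 38)
    encrypt_time = Keyword.get(opts, :encrypt_time, false)

    # The total bit budget is one bit smaller when the time prefix is encrypted.
    max_total = if encrypt_time, do: 62, else: 63

    cond do
      rem(data_bits, 2) != 0 -> {:error, :data_bits_must_be_even}
      encrypt_time and rem(time_bits, 2) != 0 -> {:error, :time_bits_must_be_even}
      time_bits + data_bits > max_total -> {:error, :too_many_bits}
      true -> :ok
    end
  end
end

# The defaults pass (15 + 38 = 53 bits).
:ok = TriggerOptionsCheck.validate([])

# The default time_bits of 15 is odd, so it conflicts with encrypt_time: true.
{:error, :time_bits_must_be_even} = TriggerOptionsCheck.validate(encrypt_time: true)
```

The second call highlights a practical consequence of the rules: enabling `encrypt_time` means also choosing an even `time_bits` such as 14 or 16.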
> ⚠️ You cannot reliably compare IDs by `time_bits` alone to determine temporal order. Because `time_value = floor(now / time_bucket) mod 2^time_bits`, the prefix wraps after `time_bucket * 2^time_bits` seconds. This feature is intended to improve PostgreSQL incremental backup locality, not to provide UUIDv7-style global time ordering.
### Why `time_offset` Exists
`time_bucket` alone uses UTC-based boundaries. For daily buckets, that means bucket changes at UTC midnight, which may split a local business day at awkward local times (for example, evening in the Americas or early morning in Europe).
`time_offset` lets you align bucket boundaries to your operational day (for example, 03:00 local cutover) without changing `time_bucket` size. This improves practical continuity for time-prefix clustering, especially when `encrypt_time: true` is enabled and the prefix itself is not human-readable.
In this library, `time_offset` is added to epoch before bucketing. That is why `+21600` (not `-21600`) gives a 03:00 KST boundary for daily buckets.
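
The effect of the offset can be checked with a small sketch of the documented formula, `floor((epoch + time_offset) / time_bucket)` reduced modulo `2^time_bits`. `TimePrefixSketch` is a hypothetical module written for this README; the library itself computes the value inside PostgreSQL, and the unencrypted case (`encrypt_time: false`) is assumed.

```elixir
defmodule TimePrefixSketch do
  import Bitwise

  # Hypothetical re-implementation of the documented formula, for illustration only.
  def time_value(epoch, time_bucket, time_offset, time_bits) do
    bucket = Integer.floor_div(epoch + time_offset, time_bucket)
    # Keep only the lowest `time_bits` bits, i.e. reduce modulo 2^time_bits.
    bucket &&& ((1 <<< time_bits) - 1)
  end
end

before_cutover = DateTime.to_unix(~U[2024-01-01 17:59:59Z])
after_cutover = DateTime.to_unix(~U[2024-01-01 18:00:00Z])

# With time_offset: 21600, the daily boundary falls at 18:00 UTC (03:00 KST),
# so these two instants land in different buckets.
true =
  TimePrefixSketch.time_value(before_cutover, 86_400, 21_600, 15) !=
    TimePrefixSketch.time_value(after_cutover, 86_400, 21_600, 15)

# Without an offset the boundary stays at 00:00 UTC and both share one bucket.
true =
  TimePrefixSketch.time_value(before_cutover, 86_400, 0, 15) ==
    TimePrefixSketch.time_value(after_cutover, 86_400, 0, 15)
```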
Example with custom options:
```elixir
execute FeistelCipher.up_for_v1_trigger(
  "public", "posts", "seq", "id",
  time_bits: 8,
  time_bucket: 86400,
  time_offset: 21600,
  data_bits: 32,
  key: 123456789,
  rounds: 8,
  functions_prefix: "crypto"
)
```
Example without time prefix:
```elixir
execute FeistelCipher.up_for_v1_trigger(
  "public", "posts", "seq", "id",
  time_bits: 0
)
```
## Advanced Usage
### Column Rename
When renaming columns that have triggers, drop and recreate the trigger:
```elixir
defmodule MyApp.Repo.Migrations.RenamePostsColumns do
  use Ecto.Migration

  def change do
    # 1. Drop the old trigger
    execute FeistelCipher.down_for_v1_trigger("public", "posts", "seq", "id")

    # 2. Rename columns
    rename table(:posts), :seq, to: :sequence
    rename table(:posts), :id, to: :external_id

    # 3. Recreate trigger with SAME encryption parameters
    # IMPORTANT: Generate key using OLD column names (seq, id)
    old_key = FeistelCipher.generate_key("public", "posts", "seq", "id")

    execute FeistelCipher.up_for_v1_trigger("public", "posts", "sequence", "external_id",
      time_bits: 15,             # Must match original
      time_bucket: 86400,        # Must match original
      data_bits: 38,             # Must match original
      key: old_key,              # Key from OLD column names
      rounds: 16,                # Must match original
      functions_prefix: "public" # Must match original
    )
  end
end
```
**⚠️ Critical**: When recreating triggers, ALL encryption parameters (`time_bits`, `time_bucket`, `data_bits`, `key`, `rounds`, `functions_prefix`) MUST match the original values. Otherwise:
- Updates will fail with exceptions
- 1:1 mapping breaks (new inserts may produce duplicate encrypted values)
> **⚠️ Warning**: Dropping a trigger removes encryption for that column pair. Only use this when intentionally removing or recreating the trigger.
## Alternative: Display-Only IDs
If you prefer to keep your sequential `id` as the primary key, you can use Feistel cipher for display-only columns. This approach is similar to using [Hashids](https://hashids.org/) or other ID obfuscation libraries, but with database-native encryption.
```elixir
# Migration
create table(:posts) do
  add :disp_id, :bigint # Encrypted, for public APIs
  add :title, :string
end

create unique_index(:posts, [:disp_id])

execute FeistelCipher.up_for_v1_trigger("public", "posts", "id", "disp_id",
  time_bucket: 86400
)

# Schema
defmodule MyApp.Post do
  use Ecto.Schema

  # Hide internal id in API responses
  @derive {Jason.Encoder, except: [:id]}

  schema "posts" do
    field :disp_id, :id, read_after_writes: true
    field :title, :string
  end
end
```
Then only expose `disp_id` in your APIs while keeping `id` internal.
**Advantages over Hashids:** Database-native. The trigger computes `disp_id` on write, so the application needs no encoding/decoding step.
## Performance
Encrypting 100,000 sequential values:
| Rounds | Total Time | Per Encryption |
|--------|------------|----------------|
| 1 | 180 ms | ~1.8 μs |
| 2 | 285 ms | ~2.8 μs |
| 4 | 475 ms | ~4.7 μs |
| 8 | 824 ms | ~8.2 μs |
| **16** | **1709 ms** | **~17.1 μs** |
| 32 | 3171 ms | ~31.7 μs |
**Default is 16 rounds** - provides good security/performance balance with cryptographic HMAC-SHA256. The overhead per INSERT/UPDATE is negligible for most applications.
### Benchmark Environment
- **CPU**: Apple M1 Pro (10 cores)
- **Database**: PostgreSQL (local)
- **OS**: macOS
- **Elixir**: 1.19.4 / OTP 28
### Running Benchmarks
```bash
MIX_ENV=test mix run benchmark/rounds_benchmark.exs
```
Prerequisites:
- Local PostgreSQL reachable at the `config/test.exs` settings (`username: postgres`, `password: postgres`, `database: feistel_cipher_test`)
- Database/user created before running the benchmark command
The benchmark encrypts 100,000 sequential values (1 to 100,000) using a SQL batch function to minimize overhead and measure pure encryption performance.
## How It Works
The Feistel cipher is a symmetric structure used in the construction of block ciphers. This library implements a configurable Feistel network that transforms sequential integers into non-sequential encrypted integers with one-to-one mapping.
<p align="center">
<img src="assets/feistel-diagram.png" alt="Feistel Cipher Diagram" width="66%">
</p>
> **Note**: The diagram above illustrates a 2-round Feistel cipher for simplicity. By default, this library uses **16 rounds** for better security. The number of rounds is configurable (see [Trigger Options](#trigger-options)).
### Self-Inverse Property
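The Feistel construction used here is **self-inverse**: because every round applies the same round function $F$ and the two halves are swapped at the output, applying the function twice returns the original value. This means encryption and decryption use the exact same algorithm.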
**Mathematical Proof:**
Let's denote the input as $(L_1, R_1)$ and the round function as $F(x)$.
**First application (Encryption):**
$$
\begin{aligned}
L_2 &= R_1, & R_2 &= L_1 \oplus F(R_1) \\
L_3 &= R_2, & R_3 &= L_2 \oplus F(R_2) \\
\text{Output} &= (R_3, L_3)
\end{aligned}
$$
**Second application (Decryption) - Starting with $(R_3, L_3)$:**
$$
\begin{aligned}
L_2' &= L_3, & R_2' &= R_3 \oplus F(L_3) \\
&= L_3, & &= R_3 \oplus F(R_2) \\
&= L_3, & &= (L_2 \oplus F(R_2)) \oplus F(R_2) \\
&= L_3, & &= L_2 = R_1 \quad \text{(XOR cancellation)} \\
\\
L_3' &= R_2' = R_1, & R_3' &= L_2' \oplus F(R_2') \\
&= R_1, & &= L_3 \oplus F(R_1) \\
&= R_1, & &= R_2 \oplus F(R_1) \\
&= R_1, & &= (L_1 \oplus F(R_1)) \oplus F(R_1) \\
&= R_1, & &= L_1 \quad \text{(XOR cancellation)} \\
\\
\text{Output} &= (R_3', L_3') = (L_1, R_1) \quad \checkmark
\end{aligned}
$$
**Key Insight:** The XOR operation's property $a \oplus b \oplus b = a$ ensures that each transformation is reversed when applied twice.
**Database Implementation:**
In the database trigger implementation, this means:
```sql
-- Encryption: seq → data part of id
data_component = feistel_cipher_v1(seq, data_bits, key, rounds)
-- Decryption: data part of id → seq (using the same function!)
seq = feistel_cipher_v1(data_component, data_bits, key, rounds)
```
### Key Properties
- **Deterministic**: Same input always produces same output
- **Non-sequential**: Sequential inputs produce seemingly random outputs
- **Collision-free**: One-to-one mapping within the bit range
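
These properties can be tried out with a toy Feistel network in plain Elixir. This is a minimal sketch, not the library's implementation: `FeistelSketch` is a hypothetical module, and its round function uses `:erlang.phash2/2` as a stand-in for the HMAC-SHA256-based function that runs inside PostgreSQL, so its outputs will not match real IDs.

```elixir
defmodule FeistelSketch do
  import Bitwise

  # Encrypts (or decrypts) `value` inside a `bits`-wide space; `bits` must be even.
  def cipher(value, bits, key, rounds)
      when rem(bits, 2) == 0 and rounds >= 1 and value >= 0 and value < (1 <<< bits) do
    half_bits = div(bits, 2)
    mask = (1 <<< half_bits) - 1
    left = value >>> half_bits
    right = value &&& mask

    # Each round maps (L, R) to (R, L xor F(R)) with the same round function F.
    {l, r} =
      Enum.reduce(1..rounds, {left, right}, fn _round, {l, r} ->
        {r, bxor(l, round_fn(r, key, half_bits))}
      end)

    # Final swap: the output is (R, L), which makes the whole function self-inverse.
    (r <<< half_bits) ||| l
  end

  # Stand-in round function (assumption): any deterministic function of the half
  # and the key that stays within `half_bits` bits is enough for this sketch.
  defp round_fn(half, key, half_bits) do
    :erlang.phash2({half, key}, 1 <<< half_bits)
  end
end

# Deterministic and self-inverse: applying the function twice restores the input.
encrypted = FeistelSketch.cipher(1, 38, 123_456_789, 16)
1 = FeistelSketch.cipher(encrypted, 38, 123_456_789, 16)

# Collision-free: all 256 inputs of an 8-bit space map to 256 distinct outputs.
256 = 0..255 |> Enum.map(&FeistelSketch.cipher(&1, 8, 42, 4)) |> Enum.uniq() |> length()
```

The one-to-one mapping does not depend on the quality of the round function; it follows from the XOR structure alone. The round function only determines how unpredictable the outputs look.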
## License
MIT