README.md

# arcterex_data

## About

A collection of specialized data structures for Elixir, designed for performance and functionality beyond what's available in the standard library.

## Contents

- [Contents](#contents)
- [About](#about)
- [Getting Started](#getting-started)
- [Contributing](#contributing)
- [Core Team](#core-team)

### Splay Tree

A self-optimizing binary search tree that moves frequently accessed elements closer to the root through a process called "splaying". Perfect for workloads with locality of reference where certain keys are accessed more frequently than others.

**Key Benefits:**
- **Self-optimizing**: Frequently accessed keys automatically move toward the root
- **Ordered operations**: Native support for predecessor/successor operations
- **Pure functional**: Immutable data structure following Elixir conventions
- **Comprehensive API**: All standard operations plus ordered traversal

**When to Use:**
- Workloads with 80/20 access patterns (hot keys)
- Need for ordered operations (min/max, predecessor/successor)
- Scenarios where recently accessed items are likely to be accessed again
- Applications requiring both fast access and ordered traversal

**When NOT to Use:**
- Uniform random access patterns (use `Map` instead)
- Simple key-value storage without ordering (use `Map` instead)
- Strictly balanced tree requirements (use `:gb_trees` instead)

## Getting Started

The package can be installed by adding `arcterex_data` to your 
list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:arcterex_data, "~> 0.1.0"}
  ]
end
```

## Usage

### Splay Tree Quick Start

```elixir
alias Arcterex.Data.SplayTree

# Create a new splay tree
tree = SplayTree.new()

# Insert key-value pairs
{:ok, tree} = SplayTree.insert(tree, :user_123, %{name: "Alice"})
{:ok, tree} = SplayTree.insert(tree, :user_456, %{name: "Bob"})

# Or use put/3 which always succeeds (insert or update)
tree = SplayTree.put(tree, :user_789, %{name: "Charlie"})

# Create from a list
tree = SplayTree.new([{:a, 1}, {:b, 2}, {:c, 3}])

# Create with custom comparator (reverse order)
reverse_cmp = fn a, b ->
  cond do
    a > b -> :lt
    a < b -> :gt
    true -> :eq
  end
end

tree = SplayTree.new([{1, :a}, {2, :b}, {3, :c}], reverse_cmp)
SplayTree.keys(tree)  # => [3, 2, 1]
```

### Basic Operations

```elixir
# Get a value (does not splay)
SplayTree.get(tree, :user_123)  # => %{name: "Alice"}
SplayTree.get(tree, :missing, :default)  # => :default

# Access with splaying (returns {value, new_tree})
{value, tree} = SplayTree.access(tree, :user_123)
# Repeated accesses to same key are faster due to splaying

# Fetch with explicit error handling
SplayTree.fetch(tree, :user_123)  # => {:ok, %{name: "Alice"}}
SplayTree.fetch(tree, :missing)   # => :error

# Check key existence
SplayTree.has_key?(tree, :user_123)  # => true

# Delete a key
tree = SplayTree.delete(tree, :user_123)

# Update or insert
tree = SplayTree.put(tree, :user_123, %{name: "Alice Updated"})

# Bulk insert
tree = SplayTree.put_many(tree, [{:d, 4}, {:e, 5}, {:f, 6}])
```

### Ordered Operations

```elixir
# Get all keys in sorted order
SplayTree.keys(tree)  # => [:a, :b, :c]

# Get all values in key-sorted order
SplayTree.values(tree)  # => [1, 2, 3]

# Convert to list
SplayTree.to_list(tree)  # => [a: 1, b: 2, c: 3]

# Access by position
SplayTree.at(tree, 0)  # => {:ok, {:a, 1}}
SplayTree.at(tree, 10) # => :error
```

### Min/Max and Extrema

```elixir
# Get minimum/maximum keys
SplayTree.min_key(tree)    # => :a
SplayTree.max_key(tree)    # => :z

# Get minimum/maximum entries
SplayTree.min_entry(tree)  # => {:a, 1}
SplayTree.max_entry(tree)  # => {:z, 26}
```

### Predecessor and Successor

These functions find the predecessor (largest key less than a given key) or successor (smallest key greater than a given key). They work for any key value, whether or not it exists in the tree.

```elixir
tree = SplayTree.new([{:a, 1}, {:c, 3}, {:e, 5}])

# Find predecessor (largest key less than given key)
SplayTree.predecessor(tree, :e)  # => :c (key :e exists in tree)
SplayTree.predecessor(tree, :d)  # => :c (key :d not in tree, returns :c)
SplayTree.predecessor(tree, :a)  # => nil (no predecessor exists)

# Find successor (smallest key greater than given key)
SplayTree.successor(tree, :a)  # => :c (key :a exists in tree)
SplayTree.successor(tree, :b)  # => :c (key :b not in tree, returns :c)
SplayTree.successor(tree, :e)  # => nil (no successor exists)
```

### Tree Information

```elixir
# Check if empty
SplayTree.empty?(tree)  # => false

# Get size
SplayTree.size(tree)  # => 10

# Get height (for debugging/analysis)
SplayTree.height(tree)  # => 5

# Clear all elements (preserves comparator)
tree = SplayTree.clear(tree)
```

### Performance Considerations

**Use `get/3` when:**
- Performing one-time lookups
- Don't need tree rebalancing
- Want minimal overhead

**Use `access/2` when:**
- Keys will be accessed multiple times
- Want to benefit from automatic rebalancing
- Working with hot keys (80/20 access patterns)

```elixir
# Example: Caching frequently accessed users
defmodule UserCache do
  def get_user(tree, user_id) do
    case SplayTree.access(tree, user_id) do
      {nil, tree} ->
        # User not in cache, fetch from database
        user = fetch_from_db(user_id)
        tree = SplayTree.put(tree, user_id, user)
        {user, tree}
      
      {user, tree} ->
        # User in cache, tree now has user_id closer to root
        {user, tree}
    end
  end
end
```

## Performance Characteristics

| Operation | Average | Worst Case | Notes |
|-----------|---------|------------|-------|
| `insert/3`, `put/3` | O(log n) | O(n) | Amortized O(log n) |
| `get/3` | O(log n) | O(n) | No splaying |
| `access/2` | O(log n) | O(n) | With splaying |
| `delete/2` | O(log n) | O(n) | Amortized O(log n) |
| `keys/1`, `values/1`, `to_list/1` | O(n) | O(n) | In-order traversal |
| `min_key/1`, `max_key/1` | O(log n) | O(n) | With splaying |
| `predecessor/2`, `successor/2` | O(log n) | O(n) | Unique to splay trees |
| `size/1`, `empty?/1` | O(1) | O(1) | Cached |

**Benchmark Results** (compared to `:gb_trees`):
- Sequential access: **1.04x** :gb_trees ✓
- Mixed operations: **1.21x** :gb_trees ✓
- Build time (10K): **1.74x** :gb_trees ✓
- Hot key access with `access/2`: **33x faster** than `get/3`

## Comparison with Other Structures

### vs Map
- **SplayTree**: Ordered operations, self-optimizing, O(log n) access
- **Map**: Unordered, O(1) average access, simpler for basic key-value storage

### vs :gb_trees
- **SplayTree**: Self-optimizing, native predecessor/successor, better for hot keys
- **:gb_trees**: Strictly balanced, slightly faster build time, no self-optimization

### vs Ordered Lists
- **SplayTree**: O(log n) access and modification
- **Lists**: O(n) access, O(1) prepend, better for small datasets

## Testing

```bash
# Run all tests
mix test

# Run with coverage
mix test --cover

# Run performance benchmarks
elixir benchmarks/manual_perf_test.exs
```

## License

Copyright 2026 Esri

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

> [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.