README.md

# AB

**Automatically compare two implementations of the same problem with property-based testing and performance benchmarks.**

AB is an Elixir library that makes it effortless to verify that two implementations of the same function behave identically, while also comparing their performance characteristics. Perfect for refactoring, algorithm optimization, and A/B testing different approaches.

## Why AB?

When you have two implementations of the same function:
- **Refactoring** - Ensure your optimized version produces identical results
- **Algorithm comparison** - Compare different algorithms solving the same problem
- **Migration** - Verify new code matches legacy behavior exactly
- **Learning** - Understand tradeoffs between different approaches

AB automatically generates property tests from your typespecs and runs comprehensive comparisons.

## Features

✅ **Automatic property test generation** from function typespecs  
✅ **Side-by-side comparison** of two implementations  
✅ **Performance benchmarking** with detailed statistics  
✅ **Invalid input testing** to verify error handling  
✅ **Type consistency validation** between specs and implementations  
✅ **Zero boilerplate** - just add macros to your tests

## Installation

Add `ab` to your `mix.exs` dependencies:

```elixir
def deps do
  [
    {:ab, "~> 0.1.0"}
  ]
end
```

## Quick Start

### 1. Define two implementations with identical typespecs

```elixir
defmodule Math do
  # Implementation A: iterative
  @spec factorial_iterative(non_neg_integer()) :: pos_integer()
  def factorial_iterative(n), do: factorial_iter(n, 1)
  
  defp factorial_iter(0, acc), do: acc
  defp factorial_iter(n, acc), do: factorial_iter(n - 1, n * acc)

  # Implementation B: recursive
  @spec factorial_recursive(non_neg_integer()) :: pos_integer()
  def factorial_recursive(0), do: 1
  def factorial_recursive(n), do: n * factorial_recursive(n - 1)
end
```

### 2. Compare them automatically

```elixir
defmodule MathTest do
  use ExUnit.Case
  use ExUnitProperties
  import PropertyGenerator

  # Automatically test both implementations produce identical results
  compare_test {Math, :factorial_iterative}, {Math, :factorial_recursive}

  # Benchmark performance differences
  benchmark_test {Math, :factorial_iterative}, {Math, :factorial_recursive}

  # Test each implementation matches its typespec
  property_test Math, :factorial_iterative
  property_test Math, :factorial_recursive
end
```

That's it! AB will:
- Generate random test data matching your typespec
- Verify both functions produce identical outputs
- Compare performance with detailed statistics
- Validate outputs match the declared return type

## Core Macros

### `compare_test/2` - Verify Identical Behavior

Generates property tests proving two implementations produce identical results:

```elixir
# Basic comparison
compare_test {ModuleA, :function}, {ModuleB, :function}

# With verbose logging
compare_test {ModuleA, :function}, {ModuleB, :function}, verbose: true
```

The macro will:
1. Extract and compare typespecs (must be identical)
2. Generate test data matching the input types
3. Run both functions on the same inputs
4. Assert outputs are identical
5. Validate outputs match the return type

**Example output:**
```
property factorial_iterative and factorial_recursive produce identical results
  ✓ 100 successful comparison runs
✓ factorial_iterative and factorial_recursive produce identical results (1.2ms)
```

### `benchmark_test/2` - Compare Performance

Generates benchmarks comparing two implementations:

```elixir
# Basic benchmark
benchmark_test {ModuleA, :function}, {ModuleB, :function}

# Custom timing
benchmark_test {ModuleA, :function}, {ModuleB, :function},
  time: 5,           # 5 seconds of benchmarking
  memory_time: 2     # 2 seconds of memory profiling
```

**Example output:**
```
=== Benchmarking Math.factorial_iterative vs Math.factorial_recursive ===

Name                           ips        average  deviation         median         99th %
Math.factorial_iterative    1.23 M        0.81 μs   ±612.45%        0.75 μs        1.12 μs
Math.factorial_recursive    0.98 M        1.02 μs   ±587.32%        0.96 μs        1.35 μs

Comparison:
Math.factorial_iterative    1.23 M
Math.factorial_recursive    0.98 M - 1.26x slower +0.21 μs
```

### `property_test/2` - Validate Against Typespec

Automatically generates property tests from function typespecs:

```elixir
# Basic property test
property_test MyModule, :my_function

# With verbose logging
property_test MyModule, :my_function, verbose: true
```

The macro will:
1. Parse the function's `@spec` declaration
2. Generate appropriate test data for all input types
3. Call the function with generated inputs
4. Validate outputs match the declared return type
5. Test type consistency between `@type` and `@spec`

**Supported types:**
- Basic: `integer()`, `float()`, `boolean()`, `atom()`, `binary()`, `bitstring()`, `String.t()`, `charlist()`, `nil`
- Collections: `list(type)`, `tuple({type1, type2})`, `map()`, keyword lists
- Ranges: `0..100`, `pos_integer()`, `non_neg_integer()`, `neg_integer()`
- Structs: Custom struct types with `@type t :: %__MODULE__{...}`
- Union types: `integer() | String.t()`
- Literals: Specific atom or integer values (e.g., `:ok`, `42`)
- Generic: `any()`, `term()`
- Complex: Nested structures, remote types

### `robust_test/2` - Verify Error Handling

Tests that functions properly reject invalid inputs:

```elixir
# Test invalid input handling
robust_test MyModule, :my_function

# With verbose logging
robust_test MyModule, :my_function, verbose: true
```

This generates inputs that **don't** match the typespec and verifies the function either:
- Raises an appropriate exception
- Has guards that prevent type mismatches

Great for ensuring functions fail gracefully rather than producing garbage output.

## Complete Example

```elixir
defmodule Sum do
  # Implementation A: Enum.sum
  @spec sum_builtin([integer()]) :: integer()
  def sum_builtin(list), do: Enum.sum(list)

  # Implementation B: manual recursion
  @spec sum_recursive([integer()]) :: integer()
  def sum_recursive([]), do: 0
  def sum_recursive([head | tail]), do: head + sum_recursive(tail)
end

defmodule SumTest do
  use ExUnit.Case
  use ExUnitProperties
  import PropertyGenerator

  describe "Sum implementations" do
    # Verify both produce identical results
    compare_test {Sum, :sum_builtin}, {Sum, :sum_recursive}

    # Compare performance
    benchmark_test {Sum, :sum_builtin}, {Sum, :sum_recursive}

    # Validate each against typespec
    property_test Sum, :sum_builtin
    property_test Sum, :sum_recursive

    # Test error handling
    robust_test Sum, :sum_builtin
    robust_test Sum, :sum_recursive
  end
end
```

**Output:**
```
SumTest
  Sum implementations
    property sum_builtin and sum_recursive produce identical results
      ✓ 100 successful comparison runs
    ✓ sum_builtin and sum_recursive produce identical results (1.8ms)
    
    property sum_builtin satisfies its typespec
      ✓ 100 successful property test runs
    ✓ sum_builtin satisfies its typespec (2.1ms)
    ✓ sum_builtin type consistency validation (0.1ms)
    
    property sum_recursive satisfies its typespec
      ✓ 100 successful property test runs
    ✓ sum_recursive satisfies its typespec (2.4ms)
    ✓ sum_recursive type consistency validation (0.1ms)
    
    property sum_builtin properly rejects invalid input
      ✓ 100 successful invalid input test runs
    ✓ sum_builtin properly rejects invalid input (124.3ms)
    
    property sum_recursive properly rejects invalid input
      ✓ 100 successful invalid input test runs
    ✓ sum_recursive properly rejects invalid input (127.8ms)
    
    test benchmark sum_builtin vs sum_recursive
    === Benchmarking Sum.sum_builtin vs Sum.sum_recursive ===
    Name                   ips        average  deviation
    Sum.sum_builtin     1.45 M        0.69 μs   ±652.34%
    Sum.sum_recursive   0.87 M        1.15 μs   ±723.12%
    
    Comparison:
    Sum.sum_builtin     1.45 M
    Sum.sum_recursive   0.87 M - 1.67x slower +0.46 μs
    ✓ benchmark sum_builtin vs sum_recursive (7503.5ms)

Finished in 7.9 seconds
8 properties, 1 test, 0 failures
```

## API Functions

For manual testing and custom scenarios:

### `PropertyGenerator.get_function_spec/2`

Extract typespec information:

```elixir
{:ok, {input_types, output_type}} = 
  PropertyGenerator.get_function_spec(MyModule, :my_function)
```

### `PropertyGenerator.types_equivalent?/2`

Compare two type specifications:

```elixir
PropertyGenerator.types_equivalent?(type1, type2)
# => true | false
```

### `PropertyGenerator.infer_result_type/1`

Get human-readable type name from a value:

```elixir
PropertyGenerator.infer_result_type([1, 2, 3])
# => "list"
```

## Advanced Usage

### Custom Test Data

While AB generates test data automatically, you can combine it with custom generators:

```elixir
property "custom test scenario" do
  check all my_data <- my_custom_generator() do
    result1 = ModuleA.function(my_data)
    result2 = ModuleB.function(my_data)
    assert result1 == result2
  end
end
```

### Struct Type Validation

Validate consistency between `@type` definitions and `@spec`:

```elixir
defmodule User do
  @type t :: %__MODULE__{
    name: String.t(),
    age: integer()
  }

  defstruct [:name, :age]

  @spec new(String.t(), integer()) :: t()
  def new(name, age), do: %__MODULE__{name: name, age: age}
end

# In test
validate_struct_consistency User
```

### Conditional Comparison

Compare implementations only when certain conditions are met:

```elixir
if System.get_env("RUN_SLOW_TESTS") do
  compare_test {SlowImpl, :process}, {FastImpl, :process}
end
```

## Configuration

In `config/test.exs`:

```elixir
config :stream_data,
  max_runs: 100,           # Default number of test cases
  max_shrinking_steps: 50  # Shrinking iterations for failures
```

In your test file:

```elixir
# Configure ExUnit
ExUnit.configure(
  exclude: [:slow, :benchmark],
  trace: true,
  seed: 0  # Deterministic test runs
)
```

## Best Practices

### 1. Use Precise Typespecs

```elixir
# Good - precise types
@spec divide(integer(), pos_integer()) :: float()

# Less precise
@spec divide(number(), number()) :: number()
```

### 2. Test Edge Cases

The generated tests cover random cases, but add explicit tests for edge cases:

```elixir
test "handles empty lists" do
  assert MyModule.sort([]) == []
end
```

### 3. Tag Slow Tests

```elixir
@tag :slow
benchmark_test {Impl1, :heavy_function}, {Impl2, :heavy_function}
```

Then run with `mix test --exclude slow` for fast feedback.

### 4. Document Differences

When implementations have different performance characteristics, document why:

```elixir
# merge_sort is faster for large lists (O(n log n))
# but has overhead for small lists
benchmark_test {Sort, :merge_sort}, {Sort, :quick_sort}
```

## Real-World Examples

### Refactoring for Performance

```elixir
# Compare old vs new implementation
compare_test {Parser, :parse_legacy}, {Parser, :parse_optimized}
benchmark_test {Parser, :parse_legacy}, {Parser, :parse_optimized}
```

### Algorithm Comparison

```elixir
# Test different search algorithms
compare_test {Search, :binary_search}, {Search, :interpolation_search}
```

### Data Encoding Comparison

```elixir
# Compare JSON encoding libraries
compare_test {Encoder, :encode_with_jason}, {Encoder, :encode_with_poison}
```

## Dependencies

- **stream_data** - Property-based testing and data generation
- **benchee** - Performance benchmarking
- **ex_unit** - Elixir's built-in test framework

## Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Submit a pull request

## License

MIT License - see LICENSE file for details

## Credits

Built with ❤️ using:
- [StreamData](https://github.com/whatyouhide/stream_data) by Andrea Leopardi
- [Benchee](https://github.com/bencheeorg/benchee) by Tobias Pfeiffer
- Inspired by QuickCheck and property-based testing

---

**Start comparing your implementations today!** 🚀