TESTING_ARCHITECTURE.md

Select File:
TESTING_ARCHITECTURE.md

# Testing Architecture & Development Guide

## Overview

This document outlines the comprehensive testing strategy for the pipeline orchestration system, providing elegant mock/live mode management and robust test coverage for all components.

## Core Testing Principles

### 1. Clear Separation of Concerns
- **Business Logic**: Tested with mocks for external dependencies
- **Integration Points**: Tested with both mocks and live services
- **End-to-End Flows**: Tested in controlled environments

### 2. Environment-Based Testing
- **Unit Tests**: Always use mocks (fast, reliable, no external deps)
- **Integration Tests**: Can use mocks or live services based on configuration
- **E2E Tests**: Use live services in staging environments

### 3. Dependency Injection Pattern
- All external service interactions go through provider interfaces
- Providers can be swapped between mock and live implementations
- Configuration determines which provider to use

## Testing Architecture

### Provider Interface Pattern

```elixir
# All external services implement a provider behavior
defmodule Pipeline.Providers.AIProvider do
  @callback query(prompt :: String.t(), options :: map()) :: 
    {:ok, response :: map()} | {:error, reason :: String.t()}
end

# Live implementation
defmodule Pipeline.Providers.ClaudeProvider do
  @behaviour Pipeline.Providers.AIProvider
  
  def query(prompt, options) do
    # Real Claude SDK calls
  end
end

# Mock implementation  
defmodule Pipeline.Test.Mocks.ClaudeProvider do
  @behaviour Pipeline.Providers.AIProvider
  
  def query(prompt, options) do
    # Deterministic mock responses
  end
end
```

### Configuration-Driven Provider Selection

```elixir
# config/test.exs
config :pipeline, :providers, %{
  ai_provider: Pipeline.Test.Mocks.ClaudeProvider,
  gemini_provider: Pipeline.Test.Mocks.GeminiProvider
}

# config/dev.exs
config :pipeline, :providers, %{
  ai_provider: Pipeline.Providers.ClaudeProvider,
  gemini_provider: Pipeline.Providers.GeminiProvider
}
```

### Test Mode Management

```elixir
# Environment variable controls test mode
# TEST_MODE=mock mix test          # Uses mocks
# TEST_MODE=live mix test          # Uses live services  
# TEST_MODE=mixed mix test         # Uses mocks for unit, live for integration

defmodule Pipeline.TestMode do
  def provider_for(service) do
    case get_test_mode() do
      :mock -> mock_provider(service)
      :live -> live_provider(service)
      :mixed -> mixed_provider(service)
    end
  end
  
  defp get_test_mode do
    System.get_env("TEST_MODE", "mock") |> String.to_atom()
  end
end
```

## Mock Implementation Strategy

### 1. Deterministic Responses
```elixir
defmodule Pipeline.Test.Mocks.ClaudeProvider do
  @behaviour Pipeline.Providers.AIProvider
  
  # Predictable responses based on input patterns
  def query("simple test", _opts) do
    {:ok, %{
      text: "Mock response for simple test",
      success: true,
      cost: 0.001
    }}
  end
  
  def query("error test", _opts) do
    {:error, "Mock error for testing"}
  end
  
  # Pattern matching for common scenarios
  def query(prompt, _opts) when is_binary(prompt) do
    {:ok, %{
      text: "Mock response for: #{String.slice(prompt, 0, 50)}...",
      success: true,
      cost: 0.001
    }}
  end
end
```

### 2. Stateful Mocks for Complex Scenarios
```elixir
defmodule Pipeline.Test.Mocks.StatefulClaudeProvider do
  use GenServer
  @behaviour Pipeline.Providers.AIProvider
  
  # Track conversation state, turn counts, etc.
  def query(prompt, options) do
    GenServer.call(__MODULE__, {:query, prompt, options})
  end
  
  def handle_call({:query, prompt, options}, _from, state) do
    # Simulate turn limits, state transitions, etc.
    {response, new_state} = generate_response(prompt, options, state)
    {:reply, response, new_state}
  end
end
```

### 3. Scenario-Based Testing
```elixir
defmodule Pipeline.Test.Scenarios do
  def setup_successful_workflow do
    # Configure mocks for a complete successful workflow
    Pipeline.Test.Mocks.ClaudeProvider.set_responses([
      "Create a Python hello world program",
      "Add error handling to the program", 
      "Write unit tests for the program"
    ])
  end
  
  def setup_failure_scenario do
    # Configure mocks to simulate various failure modes
    Pipeline.Test.Mocks.ClaudeProvider.set_error_on_turn(2, "API rate limit exceeded")
  end
end
```

## Test Organization

### Directory Structure
```
test/
├── unit/                    # Fast, isolated tests with mocks
│   ├── pipeline/
│   │   ├── executor_test.exs
│   │   ├── step/
│   │   │   ├── claude_test.exs
│   │   │   └── gemini_test.exs
│   │   └── workflow_loader_test.exs
│   └── support/
├── integration/             # Cross-component tests
│   ├── end_to_end_test.exs
│   ├── workflow_execution_test.exs
│   └── provider_integration_test.exs
├── fixtures/                # Test data and configurations
│   ├── workflows/
│   │   ├── simple_workflow.yaml
│   │   └── complex_workflow.yaml
│   └── responses/
│       ├── claude_responses.json
│       └── gemini_responses.json
└── support/                 # Test helpers and utilities
    ├── test_case.exs
    ├── factory.ex
    └── mocks/
        ├── claude_provider.ex
        └── gemini_provider.ex
```

### Test Categories

#### Unit Tests (Always Mocked)
```elixir
defmodule Pipeline.ExecutorTest do
  use Pipeline.Test.Case, mode: :mock
  
  test "executes workflow steps in sequence" do
    workflow = build(:simple_workflow)
    
    assert {:ok, results} = Pipeline.Executor.execute(workflow)
    assert length(results) == 3
  end
end
```

#### Integration Tests (Configurable)
```elixir
defmodule Pipeline.WorkflowExecutionTest do
  use Pipeline.Test.Case, mode: :configurable
  
  @tag :integration
  test "complete workflow execution" do
    workflow = load_fixture("workflows/simple_workflow.yaml")
    
    assert {:ok, results} = Pipeline.Executor.execute(workflow)
    assert results["final_step"]["success"] == true
  end
end
```

#### Live Tests (Live Services Only)
```elixir
defmodule Pipeline.LiveIntegrationTest do
  use Pipeline.Test.Case, mode: :live
  
  @tag :live
  @tag timeout: 30_000
  test "actual Claude API integration" do
    # Only runs with TEST_MODE=live or TEST_MODE=mixed
    prompt = "Write a simple hello world in Python"
    
    assert {:ok, response} = Pipeline.Providers.ClaudeProvider.query(prompt, %{})
    assert response.success == true
    assert is_binary(response.text)
  end
end
```

## Mock Data Management

### Response Fixtures
```elixir
# test/fixtures/responses/claude_responses.json
{
  "simple_python_program": {
    "text": "print('Hello, World!')\n\n# This is a simple Python program...",
    "success": true,
    "cost": 0.0023
  },
  "code_review": {
    "text": "Code review feedback:\n1. Consider adding type hints...",
    "success": true, 
    "cost": 0.0156
  }
}
```

### Factory Pattern
```elixir
defmodule Pipeline.Test.Factory do
  def build(:workflow) do
    %{
      "workflow" => %{
        "name" => "test_workflow",
        "steps" => [
          build(:claude_step),
          build(:gemini_step)
        ]
      }
    }
  end
  
  def build(:claude_step) do
    %{
      "name" => "claude_task",
      "type" => "claude",
      "prompt" => [%{"type" => "static", "content" => "Test prompt"}]
    }
  end
end
```

## Running Tests

### Command Examples
```bash
# Unit tests only (fast, always mocked)
mix test test/unit/

# Integration tests with mocks
TEST_MODE=mock mix test test/integration/

# Integration tests with live services
TEST_MODE=live mix test test/integration/ --include live

# All tests with mixed mode
TEST_MODE=mixed mix test

# Specific test scenarios
mix test --only integration
mix test --only live
mix test --exclude live  # Skip live tests
```

### Continuous Integration
```bash
# CI pipeline uses mocks for speed and reliability
TEST_MODE=mock mix test --coverage

# Nightly build runs live tests
TEST_MODE=live mix test --include live --timeout 300000
```

## Development Workflow

### 1. Writing New Features
1. Start with unit tests using mocks
2. Implement the feature with provider interfaces
3. Add integration tests that work with mocks
4. Test with live services locally
5. Update mock responses based on live behavior

### 2. Debugging Issues  
1. Reproduce with mocks first (faster iteration)
2. Compare mock vs live behavior
3. Update mocks to match live service behavior
4. Fix implementation based on findings

### 3. Adding New External Services
1. Define provider behavior/interface
2. Create mock implementation
3. Implement live provider
4. Add configuration switching
5. Write comprehensive tests for both

## Quality Assurance

### Test Coverage Requirements
- **Unit Tests**: 95% line coverage minimum
- **Integration Tests**: All critical paths covered
- **Mock Accuracy**: Regular validation against live services

### Performance Benchmarks
```elixir
defmodule Pipeline.Test.Performance do
  use ExUnit.Case
  
  @tag :benchmark
  test "workflow execution performance" do
    workflow = build(:complex_workflow)
    
    {time, _result} = :timer.tc(fn ->
      Pipeline.Executor.execute(workflow)
    end)
    
    # Workflow should complete within 5 seconds with mocks
    assert time < 5_000_000  # microseconds
  end
end
```

### Mock Validation
```elixir
# Periodically validate mocks against live services
defmodule Pipeline.Test.MockValidation do
  @tag :validation
  test "mock responses match live service behavior" do
    test_cases = load_validation_cases()
    
    for test_case <- test_cases do
      mock_response = MockProvider.query(test_case.prompt, test_case.options)
      live_response = LiveProvider.query(test_case.prompt, test_case.options)
      
      assert_responses_equivalent(mock_response, live_response)
    end
  end
end
```

## Best Practices

### 1. Mock Design
- **Deterministic**: Same input always produces same output
- **Realistic**: Mirror real service behavior patterns
- **Fast**: No network calls, minimal computation
- **Comprehensive**: Cover error cases and edge conditions

### 2. Test Data
- **Version Controlled**: All fixtures in git
- **Realistic**: Based on actual service responses
- **Minimal**: Only include necessary data
- **Documented**: Clear comments explaining test scenarios

### 3. Configuration
- **Environment Driven**: Use env vars for mode selection
- **Default Safe**: Default to mocks for safety
- **Override Capable**: Easy to switch modes for debugging
- **CI Friendly**: Reliable in automated environments

This testing architecture provides a robust foundation for developing and maintaining the pipeline orchestration system with confidence in both mocked and live environments.