docs/20251018/phase2/sandbox-hooks.md

# Configurable Sandbox Hooks – Design (2025-10-17)

## ✅ Implementation Status: COMPLETE

## Overview
Introduce a pluggable approval layer that lets SDK consumers route sandbox/tool approval requests to external systems (e.g., Slack, Jira, custom REST). Today `Codex.Approvals.StaticPolicy` only handles allow/deny; the goal is to expose behaviour-based hooks with async support and structured metadata.

## Goals
✅ Allow callers to register approval modules implementing new behaviour callbacks.
✅ Support synchronous allow/deny as well as async queueing (reply later).
✅ Preserve existing options shape (`Codex.Thread.Options`) while adding hook configuration.
✅ Emit telemetry for request lifecycle (submitted, approved, denied, timeout).

## Non-Goals
- Building the external transport (Slack/Jira) adapters.
- UI for managing approval queues.
- Persisting approval state across BEAM restarts.

## Architecture
1. Define `Codex.Approvals.Hook` behaviour:
   - `prepare/2` (called before invocation, may mutate metadata).
   - `review_tool/3`, `review_command/3`, `review_file/3`.
   - Optional `await/2` for async channels (returns `{:ok, decision}`).
2. Extend `Codex.Approvals` dispatcher:
   - If hook returns `{:async, ref}`, store ref in ETS and await via `await/2` with timeout from thread options.
   - Maintain backwards-compatible path for `StaticPolicy`.
3. Thread options:
   - Add `approval_hook: module()` and `approval_timeout_ms`.
4. Telemetry:
   - `[:codex, :approval, :requested]`, `:approved`, `:denied`, `:timeout`.
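
The behaviour described in step 1 could be sketched as below. This is a minimal illustration, not the contents of `lib/codex/approvals/hook.ex`: the `Sketch.*` module names, the `:allow` decision atom, and the argument types are assumptions. Only the callback names and arities, the `{:async, ref}` tuple, and the `{:deny, reason}` shape come from this document.

```elixir
defmodule Sketch.Approvals.Hook do
  @moduledoc "Illustrative approval-hook behaviour; names and types are assumptions."

  # Assumed decision shape: :allow, or {:deny, reason} as used for timeouts.
  @type decision :: :allow | {:deny, String.t()}
  @type review_result :: decision() | {:async, reference()}

  # Called before any review; may return an updated context.
  @callback prepare(event :: map(), context :: map()) :: {:ok, map()}

  @callback review_tool(event :: map(), context :: map(), opts :: keyword()) :: review_result()
  @callback review_command(event :: map(), context :: map(), opts :: keyword()) :: review_result()
  @callback review_file(event :: map(), context :: map(), opts :: keyword()) :: review_result()

  # Only needed by hooks that return {:async, ref}.
  @callback await(ref :: reference(), timeout :: non_neg_integer()) ::
              {:ok, decision()} | {:error, :timeout}

  @optional_callbacks await: 2
end

defmodule Sketch.AllowAll do
  @moduledoc "Smallest synchronous hook: approves everything and omits await/2."
  @behaviour Sketch.Approvals.Hook

  @impl true
  def prepare(_event, context), do: {:ok, context}

  @impl true
  def review_tool(_event, _context, _opts), do: :allow

  @impl true
  def review_command(_event, _context, _opts), do: :allow

  @impl true
  def review_file(_event, _context, _opts), do: :allow
end
```

Declaring `await/2` under `@optional_callbacks` lets purely synchronous hooks such as `Sketch.AllowAll` compile without warnings while async hooks opt in.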

## Data Flow
1. `Codex.Thread.run_auto/3` receives `tool.call.required`.
2. `Codex.Approvals.review_tool/3` delegates to configured hook.
3. For async, hook returns `{:async, ref, payload}`; `Codex.Approvals` emits telemetry and waits.
4. The hook's side channel (user code) calls the planned helper `Codex.Approvals.reply/2`.
5. Decision resumes auto-run loop.
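
The flow above can be approximated with a small dispatcher that resolves async results by calling the hook's `await/2` and turns a timeout into a denial. This is a sketch under stated assumptions: `Sketch.Dispatcher` and `Sketch.DelayedHook` are hypothetical modules, the `:allow` atom is an assumed decision value, and the side channel is simulated with a plain `send/2` standing in for the planned `Codex.Approvals.reply/2`.

```elixir
defmodule Sketch.Dispatcher do
  # Runs the hook's review_tool/3 and resolves async results via await/2.
  def review_tool(hook, event, context, timeout_ms) do
    case hook.review_tool(event, context, []) do
      {:async, ref} -> await(hook, ref, timeout_ms)
      decision -> decision
    end
  end

  defp await(hook, ref, timeout_ms) do
    # A hook without await/2 cannot be resolved here; hand the tuple back.
    if function_exported?(hook, :await, 2) do
      case hook.await(ref, timeout_ms) do
        {:ok, decision} -> decision
        {:error, :timeout} -> {:deny, "approval timeout"}
      end
    else
      {:async, ref}
    end
  end
end

defmodule Sketch.DelayedHook do
  # Fake async hook: a spawned process stands in for the external reviewer.
  def review_tool(%{delay_ms: delay}, _context, _opts) do
    ref = make_ref()
    caller = self()

    spawn(fn ->
      Process.sleep(delay)
      send(caller, {:decision, ref, :allow})
    end)

    {:async, ref}
  end

  # Waits in the calling process for the decision tagged with `ref`.
  def await(ref, timeout_ms) do
    receive do
      {:decision, ^ref, decision} -> {:ok, decision}
    after
      timeout_ms -> {:error, :timeout}
    end
  end
end
```

Because `await/2` runs in the process that called `review_tool/3`, the reviewer must send its decision to that same process; a real hook would need its own routing if replies arrive elsewhere.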

## API Changes
- `Codex.Thread.Options` gains `:approval_hook` and `:approval_timeout_ms`.
- New `Codex.Approvals.Hook` module with behaviour & default implementation.
- Public `Codex.Approvals.reply/2`.

## Risks
- Async wait could leak ETS entries; enforce timeouts & cleanup.
- Clients that never reply could leak memory; add a dead-letter fallback.
- Ensure concurrency safety when multiple hooks share refs.
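
One way to address the leak and concurrency risks is an ETS table keyed by ref that stores an expiry time, removes entries atomically on reply, and is swept periodically. The sketch below is illustrative only and is not the contents of `lib/codex/approvals/registry.ex`: the module name, table name, message shape, and the `sweep/1` helper are all assumptions.

```elixir
defmodule Sketch.Registry do
  @table :sketch_pending_approvals

  # Creates the table; the calling process becomes its owner.
  def init do
    :ets.new(@table, [:named_table, :public, :set])
    :ok
  end

  # Records who is waiting on `ref` and when the entry expires.
  def register(ref, waiter, timeout_ms) do
    expires_at = System.monotonic_time(:millisecond) + timeout_ms
    true = :ets.insert(@table, {ref, waiter, expires_at})
    :ok
  end

  # Delivers a decision at most once; :ets.take/2 removes the entry atomically,
  # so concurrent replies for the same ref cannot both succeed.
  def reply(ref, decision) do
    case :ets.take(@table, ref) do
      [{^ref, waiter, _expires_at}] ->
        send(waiter, {:approval_decision, ref, decision})
        :ok

      [] ->
        {:error, :unknown_ref}
    end
  end

  # Deletes entries whose expiry is at or before `now`; returns the count removed.
  def sweep(now \\ System.monotonic_time(:millisecond)) do
    :ets.select_delete(@table, [{{:_, :_, :"$1"}, [{:"=<", :"$1", now}], [true]}])
  end
end
```

Calling `sweep/1` from a timer bounds how long an unanswered request can occupy the table, and a late reply for a swept ref simply returns `{:error, :unknown_ref}`.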

## Implementation Plan
1. Behaviour & dispatcher refactor.
2. ETS registry (keyed by ref) plus timeout supervision.
3. Telemetry emission.
4. Docs/examples for writing custom hook.

## Verification
✅ Unit tests for dispatcher (sync + async).
✅ Integration test using fake async hook (simulate delayed decision).
✅ Property: awaiting after timeout returns `{:error, :timeout}`.
✅ Telemetry capture tests assert event payloads.

## Implementation Details

### Files Created/Modified
- ✅ `lib/codex/approvals/hook.ex` - Behaviour definition
- ✅ `lib/codex/approvals/registry.ex` - ETS registry for async tracking (created but not used in MVP)
- ✅ `lib/codex/approvals.ex` - Updated dispatcher with hook support
- ✅ `lib/codex/thread/options.ex` - Added `approval_hook` and `approval_timeout_ms`
- ✅ `lib/codex/thread.ex` - Updated to pass timeout and prefer approval_hook
- ✅ `test/codex/approvals_test.exs` - Comprehensive test coverage
- ✅ `examples/approval_hook_example.exs` - Usage examples

### Key Design Decisions
1. **Auto-await**: When a hook returns `{:async, ref}` and implements `await/2`, the dispatcher automatically calls `await` rather than returning the async tuple. This simplifies the integration.
2. **Backwards compatibility**: `approval_policy` (StaticPolicy) is still supported. `approval_hook` takes precedence if both are set.
3. **Telemetry**: All approval lifecycle events emit telemetry for observability.
4. **Timeout handling**: Async hooks that time out are converted to `{:deny, "approval timeout"}` automatically.

### Usage Example
```elixir
defmodule MyApprovalHook do
  @behaviour Codex.Approvals.Hook

  @impl true
  def prepare(_event, context), do: {:ok, context}

  @impl true
  def review_tool(event, _context, _opts) do
    # Post to external system and return async ref
    ref = post_to_slack(event)
    {:async, ref}
  end

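  @impl true
  def await(ref, timeout) do
    # Block until the external decision is sent to this process, or give up
    # after `timeout` milliseconds.
    receive do
      {:slack_decision, ^ref, decision} -> {:ok, decision}
    after
      timeout -> {:error, :timeout}
    end
  end

  # Hypothetical placeholder: a real hook would post the request to Slack and
  # return a reference that the reply handler echoes back in :slack_decision.
  defp post_to_slack(_event), do: make_ref()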
end

# Configure thread with hook
{:ok, opts} = Codex.Thread.Options.new(%{
  approval_hook: MyApprovalHook,
  approval_timeout_ms: 60_000
})
```

## Open Questions
- ✅ Should hooks be supervised processes? For MVP assume caller supervises. **Decision: Callers manage their own supervision**
- ✅ Should we allow per-request overrides on events? Option for later. **Decision: Not in MVP, can add later if needed**

## Follow-up Work
- Build example Slack/Discord integration adapters
- Consider adding `review_command/3` and `review_file/3` support
- Explore persistent approval queues for long-running workflows