# ProgramFacts

ProgramFacts generates Elixir programs with known structural facts for analyzer testing.

It is designed for tools that need source code plus ground truth: call edges, call paths, data-flow facts, effects, branch structures, source locations, architecture violations, and project layouts.

The first implementation slice supports deterministic generation of:

- single calls
- linear call chains
- branching call graphs
- module dependency chains
- module cycles
- straight-line data-flow programs
- assignment-chain data-flow programs
- helper-call data-flow programs
- pipeline, branch, and return data-flow programs
- if/else branch programs
- case clause programs
- `cond`, `with`, nested-branch, anonymous-function-branch, and multi-clause-function programs
- pure/io/send/raise/read/write effect programs
- mixed-effect boundary programs
- architecture policy fixtures and `.reach.exs` files
- plain, umbrella, and package-style project layouts
- AST-based metamorphic transforms
- feedback-directed feature search
- temporary Mix projects and replayable corpus entries

## Installation

```elixir
def deps do
  [
    {:program_facts, "~> 0.1", only: [:dev, :test]}
  ]
end
```

## Usage

```elixir
program =
  ProgramFacts.generate!(
    policy: :linear_call_chain,
    seed: 123,
    depth: 4
  )

program.files
program.facts.call_edges
program.facts.call_paths
program.facts.locations

ProgramFacts.to_map(program)
ProgramFacts.to_json!(program)
# JSON includes schema_version and program_facts_version.

umbrella_program =
  ProgramFacts.generate!(
    policy: :linear_call_chain,
    seed: 123,
    depth: 4,
    layout: :umbrella
  )
```
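One way the call-edge facts might be used is as ground truth when checking an analyzer's output. The sketch below assumes edges are `{caller, callee}` pairs of `{module, function, arity}` tuples; the actual shape of `program.facts.call_edges` may differ. `EdgeDiff` and the `reported` list are hypothetical stand-ins, not part of ProgramFacts.

```elixir
defmodule EdgeDiff do
  @moduledoc "Hypothetical helper: compares expected call edges with edges an analyzer reported."

  # Returns the edges the analyzer missed and the edges it reported without ground-truth support.
  def diff(expected, reported) do
    expected = MapSet.new(expected)
    reported = MapSet.new(reported)

    %{
      missing: expected |> MapSet.difference(reported) |> Enum.sort(),
      spurious: reported |> MapSet.difference(expected) |> Enum.sort()
    }
  end
end

# Assumed edge shape: {caller_mfa, callee_mfa}.
expected = [
  {{A, :run, 1}, {B, :run, 1}},
  {{B, :run, 1}, {C, :run, 1}}
]

# Stand-in for whatever the analyzer under test returns.
reported = [
  {{A, :run, 1}, {B, :run, 1}},
  {{A, :run, 1}, {C, :run, 1}}
]

result = EdgeDiff.diff(expected, reported)
```

Comparing sets rather than lists keeps the check independent of the order in which either side emits edges.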

Write a generated Mix project to a temporary directory. The project includes a `program_facts.json` manifest with the generated facts:

```elixir
{:ok, dir, program} =
  ProgramFacts.Project.write_tmp!(
    policy: :straight_line_data_flow,
    seed: 42
  )
```

`ProgramFacts.Project.write!/3` refuses to overwrite a non-empty directory unless `force: true` is passed.

Seeds are bounded to `0..10_000` because generated module names are atoms, and the BEAM never garbage-collects atoms.
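The seed bound can be illustrated with a small sketch. It assumes module names are derived from the seed, so a finite seed range keeps the set of generated atoms finite. `SeedNaming`, its naming scheme, and its error tuple are hypothetical and are not ProgramFacts' implementation.

```elixir
defmodule SeedNaming do
  @moduledoc "Hypothetical sketch of seed-derived module names; not ProgramFacts' implementation."

  @max_seed 10_000

  # Module names are atoms, and atoms are never garbage-collected,
  # so only seeds inside a fixed range are turned into names.
  def module_name(seed, index)
      when is_integer(seed) and seed >= 0 and seed <= @max_seed do
    {:ok, Module.concat(["Generated", "Seed#{seed}", "M#{index}"])}
  end

  # Anything else is rejected before an atom is created.
  def module_name(seed, _index), do: {:error, {:invalid_seed, seed}}
end

{:ok, name} = SeedNaming.module_name(42, 0)
rejected = SeedNaming.module_name(10_001, 0)
```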

Apply fact-aware transformations:

```elixir
variant =
  program
  |> ProgramFacts.Transform.apply!([:rename_variables, :add_dead_pure_statement])

ProgramFacts.transforms()
variant.metadata.transforms
```
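To show what "fact-aware" can mean in practice, here is a minimal sketch of a variable-renaming transform on the quoted AST, with a check that the set of remote calls is unchanged. `RenameSketch` is a hypothetical illustration built only on `Code.string_to_quoted!/1` and `Macro.prewalk`; it is not how `ProgramFacts.Transform` is implemented.

```elixir
defmodule RenameSketch do
  @moduledoc "Hypothetical sketch of a fact-preserving transform; not ProgramFacts' implementation."

  # Appends a suffix to every variable node ({name, meta, context} with an atom context).
  def rename_variables(ast, suffix) do
    Macro.prewalk(ast, fn
      {name, meta, context} when is_atom(name) and is_atom(context) ->
        {:"#{name}#{suffix}", meta, context}

      other ->
        other
    end)
  end

  # Collects remote calls such as `B.run(x)` as sorted {module, function, arity} tuples.
  def remote_calls(ast) do
    {_ast, calls} =
      Macro.prewalk(ast, [], fn
        {{:., _, [{:__aliases__, _, parts}, fun]}, _, args} = node, acc when is_list(args) ->
          {node, [{Module.concat(parts), fun, length(args)} | acc]}

        node, acc ->
          {node, acc}
      end)

    Enum.sort(calls)
  end
end

source = """
defmodule A do
  def run(input) do
    value = B.run(input)
    C.run(value)
  end
end
"""

original = Code.string_to_quoted!(source)
renamed = RenameSketch.rename_variables(original, "_renamed")
```

The source text changes, but `RenameSketch.remote_calls/1` returns the same result for `original` and `renamed`, which is the kind of invariant a metamorphic test relies on.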

Use test helpers:

```elixir
ProgramFacts.ExUnit.assert_compiles(program)
ProgramFacts.ExUnit.assert_manifest_round_trip(program)

ProgramFacts.ExUnit.with_tmp_project(program, fn dir, _program ->
  assert File.exists?(Path.join(dir, "mix.exs"))
end)
```

Save a replayable corpus entry:

```elixir
program = ProgramFacts.generate!(policy: :case_clauses, seed: 43)
dir = ProgramFacts.Corpus.save!(program, "corpus/reach")
manifest = ProgramFacts.Corpus.load_manifest!(dir)
ProgramFacts.Corpus.manifests("corpus/reach")
ProgramFacts.Corpus.load_manifests!("corpus/reach")
```

Run feedback-directed generation over feature coverage:

```elixir
result = ProgramFacts.Search.run(iterations: 50, seed: 100)
result.programs
result.coverage
```
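Feedback-directed generation can be pictured as a loop that keeps a candidate only when it covers a feature not seen so far. The sketch below illustrates that idea under stated assumptions: `SearchSketch`, its feature list, and the hash-based `candidate/1` generator are all hypothetical, and the real `ProgramFacts.Search` may work differently.

```elixir
defmodule SearchSketch do
  @moduledoc "Hypothetical coverage-guided loop; not ProgramFacts.Search's implementation."

  @features [:call_chain, :branch, :data_flow, :effect, :cycle]

  # Stand-in generator: deterministically derives one or two features from a seed.
  def candidate(seed) do
    count = 1 + :erlang.phash2({seed, :count}, 2)

    features =
      for i <- 1..count, into: MapSet.new() do
        Enum.at(@features, :erlang.phash2({seed, i}, length(@features)))
      end

    %{seed: seed, features: features}
  end

  # Walks consecutive seeds and keeps a candidate only if it adds an uncovered feature.
  def run(opts) do
    seed = Keyword.fetch!(opts, :seed)
    iterations = Keyword.fetch!(opts, :iterations)
    initial = %{programs: [], coverage: MapSet.new()}

    Enum.reduce(seed..(seed + iterations - 1)//1, initial, fn current, acc ->
      candidate = candidate(current)

      if MapSet.subset?(candidate.features, acc.coverage) do
        acc
      else
        %{
          programs: acc.programs ++ [candidate],
          coverage: MapSet.union(acc.coverage, candidate.features)
        }
      end
    end)
  end
end

result = SearchSketch.run(iterations: 50, seed: 100)
```

Because every kept candidate contributes at least one new feature, the number of kept programs never exceeds the number of covered features.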

Project a program into its semantic summary model:

```elixir
model = ProgramFacts.model(program)
model.relationships.call_edges
```

## Policies

```elixir
ProgramFacts.policies()
#=> [
#=>   :single_call,
#=>   :linear_call_chain,
#=>   :branching_call_graph,
#=>   :module_dependency_chain,
#=>   :module_cycle,
#=>   :straight_line_data_flow,
#=>   :assignment_chain,
#=>   :branch_data_flow,
#=>   :helper_call_data_flow,
#=>   :pipeline_data_flow,
#=>   :return_data_flow,
#=>   :if_else,
#=>   :case_clauses,
#=>   :cond_branches,
#=>   :with_chain,
#=>   :anonymous_fn_branch,
#=>   :multi_clause_function,
#=>   :nested_branches,
#=>   :pure,
#=>   :io_effect,
#=>   :send_effect,
#=>   :raise_effect,
#=>   :read_effect,
#=>   :write_effect,
#=>   :mixed_effect_boundary,
#=>   :layered_valid,
#=>   :forbidden_dependency,
#=>   :layer_cycle,
#=>   :public_api_boundary_violation,
#=>   :internal_boundary_violation,
#=>   :allowed_effect_violation
#=> ]
```

## Why not random Elixir strings?

ProgramFacts generates from semantic templates and returns facts from the same model. The goal is not to produce arbitrary syntax; the goal is to produce programs whose expected structural properties are known before an analyzer sees them.
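The idea of deriving source and facts from one model can be sketched in a few lines. In this hypothetical `ChainSketch` module (not ProgramFacts' implementation), the model is an ordered list of module names; the source text and the call edges are both computed from that list, so the edges hold by construction. The edge shape and the `run/1` function name are assumptions made for the example.

```elixir
defmodule ChainSketch do
  @moduledoc "Hypothetical sketch: one model yields both source and facts."

  # The model is an ordered list of module names; each module calls the next one.
  def model(depth) when is_integer(depth) and depth >= 2 do
    for index <- 1..depth, do: Module.concat(["ChainSketch", "Generated", "M#{index}"])
  end

  # Renders the model as Elixir source. The last module returns its input unchanged.
  def source(modules) do
    modules
    |> Enum.zip(tl(modules) ++ [nil])
    |> Enum.map_join("\n", fn
      {mod, nil} ->
        "defmodule #{inspect(mod)} do\n  def run(x), do: x\nend\n"

      {mod, next} ->
        "defmodule #{inspect(mod)} do\n  def run(x), do: #{inspect(next)}.run(x)\nend\n"
    end)
  end

  # Derives the expected call edges from the same model.
  def call_edges(modules) do
    modules
    |> Enum.chunk_every(2, 1, :discard)
    |> Enum.map(fn [caller, callee] -> {{caller, :run, 1}, {callee, :run, 1}} end)
  end
end

modules = ChainSketch.model(3)
source = ChainSketch.source(modules)
edges = ChainSketch.call_edges(modules)
```

A random string generator would have to parse and analyze its own output to learn these edges; here they are known before any analyzer runs.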

See `ROADMAP.md` for the long-term plan.