# CodingAgent

A small Elixir library for building a coding-agent harness on top of
[OpenRouter](https://openrouter.ai) (via [`req_llm`](https://hex.pm/packages/req_llm)).
Each agent conversation runs as its own supervised `GenServer`, and agents
can discover and invoke **Claude-Code-style skills**: directories
containing a `SKILL.md` with YAML frontmatter that the model can choose to
load mid-conversation.

The agent's file tools operate entirely on an **in-memory virtual
filesystem** (a `path => content` map): there is no `bash`/shell tool, and
the tools never touch the real disk. The caller seeds the initial files,
runs the conversation, and gets the resulting file map back to use however
it likes (write to disk, diff, ship elsewhere, discard).
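To make the `path => content` idea concrete, here is a minimal sketch of read, write, and edit operations over a plain map. The `VfsSketch` module, its function signatures, and its error atoms are hypothetical and are not taken from `CodingAgent.Tools`; the real tools may behave differently.

```elixir
# Hypothetical module: an in-memory "filesystem" is just a map of path => content.
defmodule VfsSketch do
  # Read a file; a missing path yields an error tuple instead of raising.
  def read(files, path) do
    case Map.fetch(files, path) do
      {:ok, content} -> {:ok, content}
      :error -> {:error, :enoent}
    end
  end

  # Create or overwrite a file and return the updated map.
  def write(files, path, content), do: Map.put(files, path, content)

  # Replace the first occurrence of `old` with `new` in an existing file.
  def edit(files, path, old, new) do
    with {:ok, content} <- read(files, path),
         true <- String.contains?(content, old) do
      {:ok, Map.put(files, path, String.replace(content, old, new, global: false))}
    else
      false -> {:error, :no_match}
      {:error, reason} -> {:error, reason}
    end
  end
end

files = %{"lib/foo.ex" => "def add(a, b), do: a - b\n"}
{:ok, files} = VfsSketch.edit(files, "lib/foo.ex", "a - b", "a + b")
files = VfsSketch.write(files, "NOTES.md", "fixed add/2\n")
```

Because every operation returns a new map, the caller can keep the seeded map around and compare it with the result afterwards.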

## Installation

```elixir
def deps do
  [
    {:coding_agent, "~> 0.1.0"}
  ]
end
```

## Architecture

- `CodingAgent.Application` — starts a `Registry` and `DynamicSupervisor`
  so many independent agent sessions can run concurrently.
- `CodingAgent.Session` — one `GenServer` per conversation. Owns a
  `ReqLLM.Context`, drives the agentic loop (call the model → detect
  `finish_reason: :tool_calls` → execute tools via
  `ReqLLM.Context.execute_and_append_tools/3` → repeat until a final answer
  or `:max_turns` is hit), and exposes a synchronous `send_message/3` that
  returns `{:ok, reply, files}`.
- `CodingAgent.Tools` — built-in `ReqLLM.Tool`s: `read`, `write`, `edit`
  (all operating on the in-memory virtual filesystem) and `skill` (invokes
  a discovered skill by name).
- `CodingAgent.Skill` / `CodingAgent.Skills` — parses and discovers
  `SKILL.md` files (`name` + `description` frontmatter, Markdown body),
  the same layout Claude Code uses for its own skills.
- `CodingAgent.OpenRouter` — configures the OpenRouter API key in
  `req_llm`'s key store and builds `"openrouter:<model>"` model ids.
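
The loop that `CodingAgent.Session` drives can be sketched without any network access by stubbing out the model. This is only an illustration of the control flow (call the model, run tools, repeat until a final answer or the turn budget runs out). `LoopSketch`, the message tuples, and the `{:error, :max_turns, files}` return value are assumptions made for this sketch, not the library's actual API.

```elixir
defmodule LoopSketch do
  # model_fun: takes the message list, returns {:final, text} or {:tool_calls, calls}.
  # tool_fun: takes one call and the file map, returns {result, updated_files}.
  # Both are hypothetical stand-ins for the real model request and tool execution.

  # Turn budget exhausted before the model produced a final answer.
  def run(_model_fun, _tool_fun, _messages, files, 0), do: {:error, :max_turns, files}

  def run(model_fun, tool_fun, messages, files, turns_left) do
    case model_fun.(messages) do
      {:final, text} ->
        {:ok, text, files}

      {:tool_calls, calls} ->
        # Run each tool in order, threading the file map through.
        {results, files} =
          Enum.map_reduce(calls, files, fn call, acc -> tool_fun.(call, acc) end)

        tool_messages = Enum.map(results, fn result -> {:tool, result} end)
        run(model_fun, tool_fun, messages ++ tool_messages, files, turns_left - 1)
    end
  end
end

# Stub model: asks for one write, then answers once it sees a tool result.
model = fn messages ->
  if Enum.any?(messages, &match?({:tool, _}, &1)) do
    {:final, "done"}
  else
    {:tool_calls, [{:write, "a.txt", "hello"}]}
  end
end

# Stub tool: only understands {:write, path, content}.
tool = fn {:write, path, content}, files ->
  {"wrote #{path}", Map.put(files, path, content)}
end

{:ok, reply, files} = LoopSketch.run(model, tool, [{:user, "create a.txt"}], %{}, 10)
```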

## Usage

```elixir
CodingAgent.OpenRouter.configure!()  # reads OPENROUTER_API_KEY, or pass a key directly

{:ok, pid} =
  CodingAgent.start_session(
    model: CodingAgent.OpenRouter.model("anthropic/claude-sonnet-4.5"),
    skills_dirs: ["skills"],
    max_turns: 10,
    files: %{"lib/foo.ex" => File.read!("lib/foo.ex")}
  )

{:ok, reply, files} = CodingAgent.send_message(pid, "Fix the failing test in lib/foo.ex")
IO.puts(reply)

# the agent never touched disk -- you decide what happens to its edits:
Enum.each(files, fn {path, content} -> File.write!(path, content) end)
```
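
If you would rather review the agent's edits before writing anything, one option is to compare the seeded map with the returned one. The `FileMapDiff` helper below is a hypothetical sketch and not part of the library; it reports only added and modified paths, on the assumption that the built-in `read`, `write`, and `edit` tools never delete files.

```elixir
# Hypothetical helper: list which paths the agent added or modified.
defmodule FileMapDiff do
  def changes(seeded, returned) do
    # Paths present in the result but not in the seed.
    added = for {path, _content} <- returned, not Map.has_key?(seeded, path), do: path

    # Paths present in both maps whose content differs.
    modified =
      for {path, content} <- returned,
          Map.has_key?(seeded, path),
          Map.fetch!(seeded, path) != content,
          do: path

    %{added: Enum.sort(added), modified: Enum.sort(modified)}
  end
end

seeded = %{"lib/foo.ex" => "old\n", "README.md" => "docs\n"}

returned = %{
  "lib/foo.ex" => "new\n",
  "README.md" => "docs\n",
  "test/foo_test.exs" => "test\n"
}

changes = FileMapDiff.changes(seeded, returned)
```

From there you could write only `changes.added ++ changes.modified` to disk instead of the whole map.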

To run a session under supervision and reach it by id later:

```elixir
{:ok, _pid} = CodingAgent.Session.start_session(:my_session, skills_dirs: ["skills"])
CodingAgent.Session.send_message(CodingAgent.Session.via(:my_session), "...")
```
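
For readers unfamiliar with the pattern, the sketch below shows how a `Registry` plus a `DynamicSupervisor` lets a process be started under supervision and then addressed by id through a `:via` tuple. It uses a toy counter rather than a real session, and the `ViaSketch.*` names are hypothetical; `CodingAgent.Session.via/1` presumably builds a similar tuple, but that is an assumption.

```elixir
defmodule ViaSketch.Counter do
  use GenServer

  # A :via tuple tells GenServer to resolve the name through the Registry.
  def via(id), do: {:via, Registry, {ViaSketch.Registry, id}}

  def start_link(id), do: GenServer.start_link(__MODULE__, 0, name: via(id))

  def bump(id), do: GenServer.call(via(id), :bump)

  @impl true
  def init(count), do: {:ok, count}

  @impl true
  def handle_call(:bump, _from, count), do: {:reply, count + 1, count + 1}
end

# Names here are hypothetical; the library starts its own Registry and supervisor.
{:ok, _} = Registry.start_link(keys: :unique, name: ViaSketch.Registry)
{:ok, _} = DynamicSupervisor.start_link(strategy: :one_for_one, name: ViaSketch.Supervisor)

# Start the child under supervision, then reach it by id rather than by pid.
{:ok, _pid} =
  DynamicSupervisor.start_child(ViaSketch.Supervisor, {ViaSketch.Counter, :my_session})

first = ViaSketch.Counter.bump(:my_session)
```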

## Streaming

`stream_message/4` runs the same agent loop but calls `on_chunk` with each
text chunk as the model produces it (across every turn, including turns
that precede a tool call), then returns the same `{:ok, reply, files}`
shape once the whole loop has finished:

```elixir
{:ok, reply, files} =
  CodingAgent.stream_message(pid, "Fix the failing test in lib/foo.ex", &IO.write/1)
```

`on_chunk` runs inside the session's own process, so (like
`send_message/3`) the call blocks the session for its duration. Streaming
delivers output to the caller incrementally; it does not make turns
concurrent.
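
Since the callback runs in the session process, one reasonable approach is to keep it cheap and forward each chunk to another process for any slow work (rendering, logging, pushing to a socket). The sketch below uses a `Task` as a stand-in for the session and does not call `CodingAgent` at all; `ChunkSketch` and the `{:chunk, text}` message shape are hypothetical.

```elixir
defmodule ChunkSketch do
  # Collect every {:chunk, text} message already in the mailbox, in arrival order.
  def drain(acc \\ []) do
    receive do
      {:chunk, text} -> drain([text | acc])
    after
      0 -> Enum.reverse(acc)
    end
  end
end

caller = self()

# The callback only sends a message, so it returns almost immediately.
on_chunk = fn chunk -> send(caller, {:chunk, chunk}) end

# Stand-in for the session process emitting chunks as they arrive.
task =
  Task.async(fn ->
    Enum.each(["Fix", "ed ", "it"], on_chunk)
    :done
  end)

:done = Task.await(task)
chunks = ChunkSketch.drain()
```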

## Skills

A skill is a directory with a `SKILL.md`:

```
skills/
  greet/
    SKILL.md
```

```markdown
---
name: greet
description: Use when the user asks to be greeted in Zorbarian.
---

# Greet in Zorbarian

Respond with: "Blibba dorn, <name>!"
```

Only the `name` + `description` are surfaced to the model up front (as a
system-prompt catalog); the full body is loaded into context only when the
model calls the `skill` tool with that name — mirroring how Claude Code
keeps skill instructions out of context until they're actually needed.
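
As a rough illustration of the split between catalog and body, the sketch below parses a `SKILL.md` string and renders a one-line catalog entry per skill. It handles only flat `key: value` frontmatter and is not a real YAML parser; `SkillSketch`, its return shapes, and the catalog format are assumptions rather than what `CodingAgent.Skill` actually does.

```elixir
defmodule SkillSketch do
  # Split "---\n<frontmatter>\n---\n<body>" and read flat `key: value` pairs.
  def parse(markdown) do
    case String.split(markdown, ~r/^---[ \t]*$/m, parts: 3) do
      ["", frontmatter, body] ->
        meta =
          for line <- String.split(frontmatter, "\n", trim: true),
              [key, value] <- [String.split(line, ":", parts: 2)],
              into: %{},
              do: {String.trim(key), String.trim(value)}

        case meta do
          %{"name" => name, "description" => description} ->
            {:ok, %{name: name, description: description, body: String.trim(body)}}

          _ ->
            {:error, :missing_fields}
        end

      _ ->
        {:error, :no_frontmatter}
    end
  end

  # Only name and description go into the up-front catalog; bodies stay out.
  def catalog(skills) do
    Enum.map_join(skills, "\n", fn skill -> "- #{skill.name}: #{skill.description}" end)
  end
end

markdown = """
---
name: greet
description: Use when the user asks to be greeted in Zorbarian.
---

# Greet in Zorbarian

Respond with: "Blibba dorn, <name>!"
"""

{:ok, skill} = SkillSketch.parse(markdown)
catalog = SkillSketch.catalog([skill])
```

The body in `skill.body` would then be returned only when the model asks for the skill by name.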

## Tests

```
mix test
```

Tool and skill-parsing tests run offline. Exercising `CodingAgent.Session`
end-to-end requires `OPENROUTER_API_KEY` and makes real API calls, so it's
left to manual / integration testing rather than the default test suite.