CHANGELOG.md

Select File
# Changelog

## 0.1.7 - 2026-06-29

### Added

- Added Agent per-call `model_opts:` overrides for `invoke/3` and
  `stream_events/3`, so provider request options such as `prompt_cache_key`,
  `x_grok_conv_id`, and `reasoning_effort` can be passed through Agent calls
  without leaking internal runtime keys.
- Added provider-aware Agent prompt caching via `prompt_caching true`,
  `prompt_caching scope: ..., version: ...`, and
  `BeamWeaver.Agent.Middleware.PromptCaching`, with generated cache keys for
  OpenAI, xAI/Grok, Moonshot/Kimi, and Anthropic static-system-prompt cache
  control.
- Added `BeamWeaver.PromptCache` for stable cache-key construction across
  examples and application code.

### Changed

- Prompt caching middleware now fills only missing provider cache options;
  explicit model-level or per-call options keep precedence.
- The prompt-caching example now exercises the Agent path instead of direct
  `ChatModel.invoke/3`.

### Fixed

- xAI Responses Agent calls now receive both `prompt_cache_key` and
  `x_grok_conv_id` when prompt caching is enabled, while xAI Chat Completions
  receives the `x-grok-conv-id` routing header.
- Runtime-built agents, module agents, streams, and subgraph invocations now
  preserve per-call model options through the graph runtime.

## 0.1.6 - 2026-06-29

### Changed

- Cleaned user-facing docs to reduce duplicate coverage, clarify provider and
  adapter responsibilities, and keep development-only implementation notes out
  of public guides.

### Fixed

- Z.ai structured-output requests now inject the JSON Schema contract into the
  prompt when BeamWeaver has a schema but the provider request can only enforce
  JSON object mode.
- Structured-output retries now include safe validation details such as missing
  required keys in the retry prompt, making provider retries more likely to
  repair invalid JSON shapes.
- Structured-output result handling now preserves parser error details when
  converting validation failures into BeamWeaver errors or tool feedback.

## 0.1.5 - 2026-06-28

### Added

- Added a dedicated prompt-caching guide and runnable example covering stable
  cache-key construction, provider-specific cache controls, and cache-hit
  inspection across OpenAI, xAI/Grok, Anthropic, Moonshot/Kimi, Gemini, and
  Z.ai.
- Added first-class OpenAI Responses and Chat Completions `prompt_cache_key` and
  `prompt_cache_retention` model options, while keeping per-call overrides.
- Added xAI/Grok cache routing support: Responses accepts `prompt_cache_key`,
  and Chat Completions accepts `x_grok_conv_id`, which BeamWeaver sends as the
  `x-grok-conv-id` header.
- Added prompt-cache example helpers, including `GOOGLE_CACHED_CONTENT` /
  `GEMINI_CACHED_CONTENT` support for externally managed Gemini cached-content
  resources.

### Changed

- Model traces now preserve provider-specific invocation parameters from the
  checked-in profile registry, including cache keys and cache-routing options,
  while still omitting client configuration and secrets.
- Prompt-cache documentation now lives in the main model documentation and
  dedicated guide instead of the custom-middleware page; the prebuilt middleware
  guide points Anthropic static-system-prompt caching to
  `BeamWeaver.Agent.Middleware.PromptCaching`.
- Redaction now keeps token counts and generation limits such as
  `max_output_tokens`, `budget_tokens`, and `*_token_count` fields so cache and
  usage telemetry remain inspectable.

### Fixed

- OpenAI Responses `prompt_cache_key` now respects first-class model options
  ahead of `model_kwargs`, while per-call options continue to take precedence.
- xAI requests now merge per-call headers with default headers and let per-call
  `x_grok_conv_id` override the model-level conversation/cache routing header.

## 0.1.4 - 2026-06-25

### Added

- Added OpenAI Responses `store: false` replay sanitization for cached
  assistant messages. Replay now drops provider-only item IDs, skips
  non-replayable reasoning and empty image-generation blocks, and preserves
  encrypted reasoning content that can safely round-trip.
- Added OpenAI `apply_patch` built-in tool rendering and replay parsing for
  `apply_patch_call` and `apply_patch_call_output` provider items.
- Added an offline OpenAI `apply_patch` example that shows request rendering and
  replayable assistant patch history without requiring live credentials.
- Added replay-backed provider conformance fixtures for OpenAI replay
  sanitization and xAI reasoning request-shape handling.
- Added WeaveScope exporter queue telemetry coverage for retry, flush,
  rejection, and dead-letter paths.
- Added `tool_call_streaming` to model capability profiles and exposed it through
  profile compilation, compatibility checks, and the profile matrix task.
- Added pre-projection message stream transforms, including a PII stream
  redaction helper that edits typed token/message envelopes before projection.
- Added Agent Protocol client hardening for encoded task paths,
  JSON-string/map body normalization, non-2xx error payloads, and async-subagent
  native trace metadata.
- Added a native sandbox provider registry with validated provider specs,
  builtin local provider construction, lifecycle capability checks, and redacted
  provider metadata.
- Added explicit interpreter adapter contracts and a supervised interpreter
  session boundary for adapter-owned eval, snapshot, restore, timeout, cancel,
  and close behavior without adding a default unsafe interpreter runtime.

### Changed

- Anthropic tool-call IDs are now normalized deterministically at the Anthropic
  provider boundary while preserving BeamWeaver's native message and tool-call
  structs.
- xAI reasoning profiles omit unsupported `stop` request parameters while
  non-reasoning xAI chat-completions models continue to send supported stop
  sequences.
- Deep Agents offload and model-request metadata now use BeamWeaver-native keys
  such as `:offloaded_to` and `:source` instead of Python ecosystem labels.
- WeaveScope trace payload tests now assert native BeamWeaver/WeaveScope fields
  for run envelopes, model generation details, tool payloads, usage, lifecycle
  status, event versions, tags, and metadata.
- OpenAI and xAI stream handling now preserves empty initial chunks,
  incremental tool-call arguments, reconstructed final assistant tool calls, and
  detailed usage metadata.
- Summarization triggers now support explicit AND/OR composition with
  `{:all, triggers}` and `{:any, triggers}`.
- Human-in-the-loop middleware now supports predicate-gated review configs and
  `interrupt_mode: :first` for reviewing only the first matching tool call.
- Sandbox and filesystem command execution results now carry additive native
  metadata such as provider ID, sandbox ID, command ID, snapshot ID, reconnect
  count, timeout, exit status, retryability, and raw provider status when the
  backend supplies it.

### Fixed

- Graph task exits, timeouts, and cancellations now preserve the BEAM root cause
  in normalized graph errors and emit native graph telemetry for failure paths.
- Checkpoint serialization tests now guard that only known BeamWeaver tagged
  structs decode, while foreign constructor-shaped maps remain inert data.
- Transport and trace redaction now cover nested bearer tokens, provider keys,
  URL credentials, query-string secrets, env-style assignments, private key
  blocks, and secret headers without redacting token-count usage fields.
- `list_async_tasks` refreshes active async tasks while leaving terminal task
  records cached, so completed/cancelled/error tasks are not re-polled.
- Sandbox execution, remote-provider fakes, interpreter sessions, and shell
  commands now normalize timeout/crash metadata and emit native telemetry while
  redacting credential-shaped fields before tracing.

## 0.1.3 - 2026-06-23

### Added

- Added a first-class Z.ai provider for `zai:glm-5.2`, including runtime
  config from `ZAI_API_KEY`, optional Z.ai base URL overrides, provider/model
  registry entries, model profile metadata, and provider matrix support.
- Added native Z.ai chat-completions support for non-streaming and streaming
  responses, JSON mode, function tools, streamed tool-call argument merging,
  reasoning deltas, request metadata, raw usage, and estimated cost metadata.

## 0.1.2 - 2026-06-20

### Fixed

Security and transport:

- SSRF: Req no longer auto-follows redirects. Every redirect hop is validated
  against the URL policy, including on streaming paths.
- PII: overlapping detector spans (such as a URL containing an IP) no longer
  crash or garble output. Spans are pruned to stay disjoint.
- IPv4 reserved-range checks no longer over-block public `/16` ranges.
- IPv6 `fe80::/10` is treated as reserved, so it stays blocked when
  `allow_metadata?` is on.
- Per-call request options (including `:timeout`) override client transport
  defaults instead of being shadowed.

Providers:

- Anthropic: beta flags go in the `anthropic-beta` header, not the body. This
  covers invoke, stream, stream_events, and count_tokens.
- Anthropic: final streaming usage is a proper usage map merged into metadata.
- Anthropic: a non-list `:tools` option (such as `nil`) no longer raises; tools
  are omitted.
- Google: streaming tolerates chunks with no candidates, content, or parts.
- Google: `countTokens` sends exactly one of `contents` or
  `generateContentRequest`.
- OpenAI: a bare-string `:image_url` block is handled instead of crashing.
- OpenAI: model inference is limited to `o1`/`o3`/`o4`/`chatgpt` prefixes, so
  other models are not misrouted.
- OpenAI: parsing a chunk with no tool calls returns `[]`, not an error.
- xAI: `ChatCompletionsModel.new/1` no longer crashes on `streaming: true`.
- Moonshot: the tool-name regex accepts short and digit/hyphen-leading names.
- Provider runtime: stream-metadata returns `%{}` instead of crashing when an
  adapter has no metadata function.
- Cached models keep usage cost on a cache miss, for both stream and
  stream_events.

Agents and middleware:

- `interrupt_before`/`interrupt_after: :all` is preserved instead of becoming
  `[:all]`, which had disabled all human-in-the-loop interrupts.
- Structured final-response extraction handles string-keyed state instead of
  crashing.
- `list_async_tasks` refreshes live status before filtering, not stale status.
- A configured subagent-output response that is not a map or function is honored,
  not dropped.
- The tool emulator falls back to `"unknown_tool"` when a tool call has no name.

Graph:

- Validation-node task exits become error messages instead of crashing the node.
- Pending checkpoint-map merging tolerates a missing `configurable` key.
- Delta channels keep `nil`/`false` overwrite values instead of resetting to the
  initial value.
- `input_channels` hides private channels (such as `__node_outputs__`) when there
  is no `input_schema`.
- An empty-list edge condition is treated as membership, not a match-anything map.
- `add_messages` honors the last `remove_all` marker, not the first.
- OpenAI message formatting tags tool calls as `"tool_call"`, not `"tool_calls"`.
- `ServerInfo.User` Access no longer leaks struct-field keys into user metadata.

Memory, retrieval, and indexing:

- Ecto memory applies `default_ttl` on write. Refresh-on-read updates only
  `expires_at`, not `updated_at`.
- ETS chat history no longer drops messages under concurrent `add_messages`;
  writes are atomic.
- Indexing without a record manager deduplicates documents that share an id.
- ETS memory namespace listing ignores unknown match-condition types instead of
  crashing.
- Memory metadata filters match a plain-map value by equality instead of always
  failing.
- Vector-store SQL filters render `$and`/`$or`/`$not`.
- The policy retriever honors `:similarity_score` and `score_threshold`.
- The SQL `$like` filter keeps the user pattern verbatim, matching ETS `like`.
- ETS vector store stringifies ids in `delete`/`get_by_ids`, so non-string ids
  match.
- File-search snippets slice with correct offsets, fixing garbled multibyte text.
- `add_start_index` reports a character index, not a byte offset.

Rate limiting:

- The rate-limited wrapper streams through `ChatModel.stream/3` instead of
  degrading to invoke when the provider module is not loaded.
- Token-bucket `acquire` rejects negative or non-integer timeouts and invalid
  modes up front.
- Retry delays are re-clamped to `max_delay` after jitter, including the
  zero-backoff path.

Serialization and schema:

- Maps with colliding atom/string keys raise instead of silently dropping one.
- The pretty JSON encoder escapes all control characters.
- Structured-output `oneOf` variants get distinct spec names.
- A nullable typeless tool field becomes `anyOf[…, null]`, not null-only.
- Checkpoint config normalization returns `%{}` when `configurable` is `nil`.
- `MapAccess.first` returns a `false`/`nil` value instead of the default.
- Empty-string schema defaults are emitted, not dropped.
- Strict tool-schema rendering keeps user keys with `nil` values.
- Schema fields given as 2-tuples normalize instead of crashing.
- Tracing `custom_fields` skips non-pair list entries instead of crashing.

Core messages and text:

- XML output parsing slices on byte offsets, fixing multibyte corruption.
- Tool-result truncation backs off to a valid UTF-8 boundary.
- Trimming with `:last` keeps the last words, not the first.
- Usage subtraction clamps right-only token counts to zero, including nested maps.
- The `drop_oldest` mux drops the newest item instead of erroring when the buffer
  is full.
- Prompt partials and the simple template renderer no longer mishandle supplied
  or brace values.
- Partial JSON repair is no longer cubic on long input.
- Fixed assorted dead and edge-case clauses: agent decision normalization,
  transient-error and MFA retry predicates, and the HTML header splitter.

Sandbox and shell:

- Docker `edit/5` starts the container once and threads it through
  read/write/execute. Edits previously hit throwaway containers and leaked
  containers.
- Docker execute/read truncation respects UTF-8 codepoint boundaries.
- The Docker sandbox prints the entry type before the path, so `ls`/`glob` handle
  paths containing `|`.
- File-formatting long-line chunking splits on character boundaries, not raw
  bytes.
- A host shell command no longer crashes when the policy timeout is `nil`.
- Shell host-executor and session temp files are cleaned up even on timeout or
  kill.

### Changed

- Default operation timeout raised from 5 seconds to 5 minutes for runnable
  `batch`/`map`/`parallel`/fallback and the agent `GenServer.call`. An explicit
  `:timeout` still overrides it.

### Internal

- Removed dead clauses and other compiler-warning sources. `mix compile` and
  `mix test` now run clean.

## 0.1.1 - 2026-06-20

### Added

- Added Moonshot/Kimi profiles for `kimi-k2.7-code`,
  `kimi-k2.7-code-highspeed`, `kimi-k2.6`, and `kimi-k2.5`.
- Added request validation for Kimi thinking, fixed sampling, and tool-choice
  constraints.
- Added an Apache License 2.0 `LICENSE` file.

### Changed

- Updated Moonshot/Kimi docs and install snippets for the `0.1.1` release.