# Changelog
All notable changes to this project are documented here. The format is based on
[Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and the project aims
to follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.1.0] - 2026-06-15
First release: a pure-Elixir PDF parsing and lossless surgery engine.
### Parsing & extraction
- Lazy dual-AST parser supporting classic xref tables, PDF 1.5+ xref streams,
object streams, and Flate + PNG-predictor decoding.
- `PdfEx.open/1`, `page_count/1`, `pages/1`, `extract_text/1,2`.
- Text extraction with positions, fonts, real `/Widths` metrics, and
ToUnicode/encoding decoding.
### Editing
- Structural page ops (`PdfEx.Editor`): insert / delete (lossless free) /
reorder, with inherited-attribute materialization on reorder-flatten.
- Run-level text editing (`PdfEx.ContentEdit`): `replace_text/3`,
`delete_glyph/2`, `run_text/2`, implemented as token-span patches with width
compensation. Supports single-byte fonts and Type0 / Identity-H composite fonts.
- Stable per-glyph UIDs and visual position mutation
(`PdfEx.Convert.apply_visual_mutation/3`).
### Projection
- Visual and semantic HTML (`PdfEx.Convert.to_html/2`) with `data-uid`
back-references; reverse mapping of edited semantic blocks into per-run text
ops (`semantic_ops/3`, `apply_semantic_mutation/3`).
### Collaboration
- Supervised per-document editing sessions (`PdfEx.Session`) with a
crash-surviving snapshot cache, plain-struct operations (`PdfEx.Op`), and
operational transformation (`PdfEx.OT`) for intention-preserving concurrent
edits.
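
For readers new to operational transformation, the idea behind the last bullet can be sketched with two concurrent text inserts. This is only an illustration under assumed names: `OTSketch` and the `{:insert, position, text}` tuple are hypothetical and are not the `PdfEx.OT` or `PdfEx.Op` API.

```elixir
defmodule OTSketch do
  # Hypothetical op shape for illustration: {:insert, position, text}.
  # This is NOT the PdfEx.Op struct.

  # Applies an insert to a plain string.
  def apply_op({:insert, pos, text}, string) do
    {head, tail} = String.split_at(string, pos)
    head <> text <> tail
  end

  # Rewrites `op` so it can be applied after `other` has already been applied.
  # `side` (:left or :right) breaks ties when both inserts target the same
  # position, so that both replicas end up with the same ordering.
  def transform({:insert, pos, text}, {:insert, other_pos, other_text}, side) do
    if other_pos < pos or (other_pos == pos and side == :right) do
      {:insert, pos + String.length(other_text), text}
    else
      {:insert, pos, text}
    end
  end
end
```

Applying `a` and then the transformed `b`, or `b` and then the transformed `a`, should produce the same string, which is the convergence property such a transform is meant to provide.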
### Serialization
- Incremental-first serializer (`PdfEx.Serializer`): byte-exact round-trip on
unmodified documents, xref style matched to the source; opt-in full
re-serialization (`mode: :full`, a single clean revision, not byte-lossless).
### Fonts
- TrueType glyph-retaining subset surgery (`PdfEx.Font.Surgery`) with
composite-glyph closure and recomputed checksums.
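
As background on the recomputed checksums: a TrueType table checksum is the sum of the table's big-endian 32-bit words modulo 2^32, with the table zero-padded to a multiple of four bytes. A minimal sketch of that rule follows; the module name `ChecksumSketch` is hypothetical and this is not the `PdfEx.Font.Surgery` implementation.

```elixir
defmodule ChecksumSketch do
  import Bitwise

  # Sums big-endian uint32 words modulo 2^32. The table is zero-padded
  # to a multiple of four bytes before summing.
  def table_checksum(table) when is_binary(table) do
    pad = rem(4 - rem(byte_size(table), 4), 4)
    sum_words(table <> :binary.copy(<<0>>, pad), 0)
  end

  # Consume one 32-bit word at a time, masking to keep the sum in 32 bits.
  defp sum_words(<<word::unsigned-big-32, rest::binary>>, acc),
    do: sum_words(rest, (acc + word) &&& 0xFFFFFFFF)

  defp sum_words(<<>>, acc), do: acc
end
```

Any edit that changes a table's bytes changes this sum, which is why a subsetter has to recompute it for every table it rewrites.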
### Robustness
- Hardened against hostile input: atom-table exhaustion, nesting-depth bombs,
circular xref/`/Length` chains, unbounded xref-stream ranges, and malformed
positioning operands. Also handles CR/LF escaping in re-serialized strings,
spec-legal real number forms, and huge-float serialization, and avoids refc
binary pinning.
- Real-PDF corpus harness and a deterministic fuzz suite.