# Dicom
[Hex.pm](https://hex.pm/packages/dicom) |
[HexDocs](https://hexdocs.pm/dicom) |
[CI](https://github.com/Balneario-de-Cofrentes/dicom/actions) |
[License: MIT](https://opensource.org/licenses/MIT)
Pure Elixir DICOM P10 parser and writer. Zero runtime dependencies.
Built on Elixir's binary pattern matching for fast, correct parsing of
[DICOM](https://www.dicomstandard.org/) medical imaging files.
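
As a rough illustration of what binary pattern matching buys here, the sketch below decodes one Explicit VR Little Endian data element header in a single function head. `ElementSketch` and its return shape are hypothetical and are not the package's internals; the long/short VR split follows PS3.5 Section 7.1.2, and undefined-length elements are deliberately left out.

```elixir
defmodule ElementSketch do
  @moduledoc false

  # VRs that use the long header form: two reserved bytes, then a 32-bit length.
  # Every other VR uses a 16-bit length right after the VR code.
  @long_vrs ~w(OB OD OF OL OV OW SQ SV UC UN UR UT UV)

  # Long form: tag, VR, 2 reserved bytes, 32-bit length, value.
  def decode(
        <<group::little-16, element::little-16, vr::binary-size(2), _reserved::16,
          len::little-32, value::binary-size(len), rest::binary>>
      )
      when vr in @long_vrs do
    {:ok, %{tag: {group, element}, vr: vr, value: value}, rest}
  end

  # Short form: tag, VR, 16-bit length, value.
  def decode(
        <<group::little-16, element::little-16, vr::binary-size(2), len::little-16,
          value::binary-size(len), rest::binary>>
      )
      when vr not in @long_vrs do
    {:ok, %{tag: {group, element}, vr: vr, value: value}, rest}
  end

  # Truncated input, or an undefined length (0xFFFFFFFF), which this sketch skips.
  def decode(_other), do: {:error, :unsupported_or_truncated}
end

# (0010,0010) PatientName, VR "PN", length 8, followed by one unrelated byte.
ElementSketch.decode(<<0x10, 0x00, 0x10, 0x00, "PN", 8, 0, "DOE^JOHN", 0xAA>>)
```

The value and the remaining bytes come back as sub-binaries of the input, so nothing is copied while walking the file.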
## Features
- **P10 file parsing** -- read DICOM Part 10 files into structured data sets
- **P10 file writing** -- serialize data sets back to conformant P10 files
- **Streaming parser** -- lazy, event-based parsing for large files and pipelines
- **Data dictionary** -- comprehensive PS3.6 tag registry (5,035 entries) with VR, VM, keyword lookup, and retired flags
- **DICOM JSON** -- encode/decode DataSets to/from the DICOM JSON model (PS3.18 Annex F.2) for DICOMweb
- **Pixel data frames** -- extract individual frames from native and encapsulated pixel data (PS3.5 Section A.4)
- **De-identification** -- anonymize data sets per PS3.15 Basic Profile with 10 option columns and consistent UID replacement
- **Character set support** -- decode text values per (0008,0005) SpecificCharacterSet (Latin-1 through Latin-5, Cyrillic, Arabic, Greek, Hebrew, JIS X 0201, UTF-8)
- **Value decoding** -- automatic VR-aware decoding (numeric, string, date, UID, etc.)
- **Transfer syntaxes** -- all 62 DICOM transfer syntaxes (49 active + 13 retired); strict rejection of unknown UIDs with opt-in lenient mode
- **Sequences** -- defined-length and undefined-length SQ with nested items
- **Encapsulated pixel data** -- fragments with Basic Offset Table
- **Validation** -- File Meta Information validation per PS3.10 Section 7.1
- **Zero dependencies** -- pure Elixir, no NIFs, no external tools
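
For the native (uncompressed) case, frame extraction comes down to slicing the pixel data into fixed-size chunks. The sketch below shows that arithmetic only. `FrameSketch.split_frames/5` is a hypothetical helper, not the package's API, and it assumes Bits Allocated is a multiple of 8 and that frames are stored back to back.

```elixir
defmodule FrameSketch do
  @moduledoc false

  # Splits native pixel data into frames of
  # rows * columns * samples_per_pixel * (bits_allocated / 8) bytes each.
  # Trailing bytes that do not fill a whole frame (e.g. a padding byte) are dropped.
  def split_frames(pixel_data, rows, columns, samples_per_pixel, bits_allocated)
      when is_binary(pixel_data) and rows > 0 and columns > 0 and samples_per_pixel > 0 and
             bits_allocated > 0 and rem(bits_allocated, 8) == 0 do
    frame_size = rows * columns * samples_per_pixel * div(bits_allocated, 8)

    # Binary comprehension: take `frame_size` bytes at a time.
    for <<frame::binary-size(frame_size) <- pixel_data>>, do: frame
  end
end

# Two 2x2 single-sample 8-bit frames.
FrameSketch.split_frames(<<1, 2, 3, 4, 5, 6, 7, 8>>, 2, 2, 1, 8)
```

Encapsulated pixel data needs a different approach (fragments and the Basic Offset Table), which this sketch does not cover.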
## Installation
Add `dicom` to your `mix.exs` dependencies:
```elixir
def deps do
  [
    {:dicom, "~> 0.3.0"}
  ]
end
```
## Quick Start
```elixir
# Parse a DICOM file
{:ok, data_set} = Dicom.parse_file("/path/to/image.dcm")

# Access attributes by tag
patient_name = Dicom.DataSet.get(data_set, Dicom.Tag.patient_name())
study_date = Dicom.DataSet.get(data_set, Dicom.Tag.study_date())
modality = Dicom.DataSet.get(data_set, Dicom.Tag.modality())

# Decode values with VR awareness
raw_element = Dicom.DataSet.get_element(data_set, Dicom.Tag.rows())
rows = Dicom.Value.decode(raw_element.value, raw_element.vr)

# Build a data set from scratch
ds =
  Dicom.DataSet.new()
  |> Dicom.DataSet.put({0x0002, 0x0002}, :UI, "1.2.840.10008.5.1.4.1.1.2")
  |> Dicom.DataSet.put({0x0002, 0x0003}, :UI, Dicom.UID.generate())
  |> Dicom.DataSet.put({0x0002, 0x0010}, :UI, Dicom.UID.explicit_vr_little_endian())
  |> Dicom.DataSet.put({0x0010, 0x0010}, :PN, "DOE^JOHN")
  |> Dicom.DataSet.put({0x0010, 0x0020}, :LO, "PAT001")

# Serialize to binary and write
{:ok, binary} = Dicom.write(ds)
:ok = Dicom.write_file(ds, "/path/to/output.dcm")

# Parse from binary
{:ok, parsed} = Dicom.parse(binary)
```
### Streaming
```elixir
# Stream events lazily from a file (constant memory)
events = Dicom.stream_parse_file("/path/to/large_image.dcm")

# Filter for specific tags without loading the entire file
patient_tags =
  events
  |> Stream.filter(&match?({:element, %{tag: {0x0010, _}}}, &1))
  |> Enum.map(fn {:element, elem} -> {elem.tag, elem.value} end)

# Or materialize back into a DataSet
# (`binary` is a P10 binary already in memory, e.g. from Dicom.write/1)
{:ok, data_set} =
  Dicom.stream_parse(binary)
  |> Dicom.P10.Stream.to_data_set()
```
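
Because the events form a lazy stream, a consumer can stop reading as soon as it has what it needs. The sketch below reuses the `{:element, map}` event shape from the snippet above but feeds it synthetic events, so it runs without a DICOM file. `StreamSketch` is hypothetical, and stopping at the first element past group 0010 assumes top-level elements arrive in ascending tag order and that other event types can be passed through. Both are assumptions about the event stream, not documented behavior.

```elixir
defmodule StreamSketch do
  @moduledoc false

  # Collects patient-group (0010,xxxx) elements and stops consuming the stream
  # once an element beyond group 0010 shows up.
  def patient_elements(events) do
    events
    |> Stream.take_while(fn
      {:element, %{tag: {group, _}}} -> group <= 0x0010
      _other -> true
    end)
    |> Stream.filter(&match?({:element, %{tag: {0x0010, _}}}, &1))
    |> Enum.map(fn {:element, elem} -> {elem.tag, elem.value} end)
  end
end

# Synthetic stand-in for the parser's event stream.
events =
  Stream.map(
    [
      {{0x0008, 0x0060}, "CT"},
      {{0x0010, 0x0010}, "DOE^JOHN"},
      {{0x0010, 0x0020}, "PAT001"},
      {{0x7FE0, 0x0010}, <<0, 1, 2, 3>>}
    ],
    fn {tag, value} -> {:element, %{tag: tag, value: value}} end
  )

StreamSketch.patient_elements(events)
```

With a real file, the same shape of pipeline would avoid reading the pixel data element at all, provided the assumptions above hold.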
## Architecture
```
lib/dicom/
  dicom.ex                -- Public API: parse, write, stream_parse, stream_parse_file
  data_set.ex             -- DataSet struct (elements + file meta)
  data_element.ex         -- DataElement struct (tag + VR + value + length)
  tag.ex                  -- Tag constants and utilities
  vr.ex                   -- Value Representation types and padding
  uid.ex                  -- UID constants, generation, and validation
  value.ex                -- VR-aware value encoding and decoding
  transfer_syntax.ex      -- Transfer syntax registry (62 TSes) and encoding dispatch
  character_set.ex        -- Specific Character Set decoding (0008,0005)
  character_set/
    tables.ex             -- ISO 8859-{2..9} and JIS X 0201 lookup tables
  json.ex                 -- DICOM JSON model encoder/decoder (PS3.18 Annex F.2)
  pixel_data.ex           -- Pixel data frame extraction (PS3.5 Section A.4)
  de_identification.ex    -- De-identification / anonymization (PS3.15 Table E.1-1)
  de_identification/
    profile.ex            -- Profile options struct (10 boolean columns)
  p10/
    reader.ex             -- P10 binary parser (preamble, file meta, data set)
    writer.ex             -- P10 binary serializer (iodata pipeline)
    file_meta.ex          -- Preamble validation and File Meta Information
    stream.ex             -- Streaming API: parse/1, parse_file/2, to_data_set/1
    stream/
      event.ex            -- Event type definitions
      source.ex           -- Data source abstraction (binary + file I/O)
      parser.ex           -- State machine: preamble -> file_meta -> data_set -> done
  dictionary/
    registry.ex           -- PS3.6 tag -> {name, VR, VM} lookup (5,035 entries)
```
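
The `preamble -> file_meta -> data_set` stages correspond to the physical layout of a P10 file: a 128-byte preamble, the `DICM` prefix, a File Meta group that starts with (0002,0000) File Meta Information Group Length, and then the data set. A minimal sketch of that split follows. `P10Sketch` is a hypothetical module written for illustration and says nothing about how `reader.ex` or `parser.ex` are actually implemented.

```elixir
defmodule P10Sketch do
  @moduledoc false

  # Matches: 128-byte preamble, "DICM", then the (0002,0000) element in
  # Explicit VR Little Endian (VR "UL", length 4) whose value is the byte
  # length of the rest of the File Meta group.
  def split(
        <<_preamble::binary-size(128), "DICM", 0x0002::little-16, 0x0000::little-16, "UL",
          4::little-16, meta_length::little-32, file_meta::binary-size(meta_length),
          data_set::binary>>
      ) do
    {:ok, file_meta, data_set}
  end

  # No preamble, no prefix, no group length, or truncated input.
  def split(_other), do: {:error, :not_a_p10_file}
end

# A synthetic file: zeroed preamble, 3 bytes of "file meta", then the "data set".
binary =
  :binary.copy(<<0>>, 128) <>
    "DICM" <> <<2, 0, 0, 0, "UL", 4, 0, 3, 0, 0, 0>> <> "abc" <> "rest"

P10Sketch.split(binary)
```

Each stage only needs the bytes that belong to it, which is what lets a state machine over this layout work incrementally.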
## DICOM Standard Coverage
| Part | Title | Coverage |
|------|-------|----------|
| PS3.5 | Data Structures and Encoding | VR types, 62 transfer syntaxes, data encoding, sequences, pixel data frame extraction |
| PS3.6 | Data Dictionary | Comprehensive tag registry (5,035 entries), keyword lookup, retired flags |
| PS3.10 | Media Storage and File Format | P10 read/write, File Meta Information, preamble |
| PS3.15 | Security and System Management | Basic Application Level Confidentiality Profile (de-identification) |
| PS3.18 | Web Services | DICOM JSON model encoding/decoding (Annex F.2) |
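
For orientation, the DICOM JSON model in PS3.18 Annex F keys each attribute by its eight-hex-digit tag and stores the VR plus a `Value` array. Person names become objects with an `Alphabetic` component. The sketch below builds that shape as a plain Elixir map. `JsonModelSketch` is hypothetical, handles only this simple case, and is not the package's encoder.

```elixir
defmodule JsonModelSketch do
  @moduledoc false

  # Builds the JSON-model map for a single attribute, e.g.
  # (0010,0010) PN -> %{"00100010" => %{"vr" => "PN", "Value" => [...]}}
  def to_json_map({group, element}, vr, values) when is_binary(vr) and is_list(values) do
    key = hex4(group) <> hex4(element)
    %{key => %{"vr" => vr, "Value" => Enum.map(values, &encode_value(vr, &1))}}
  end

  # Person names are wrapped in an object; other values pass through unchanged.
  defp encode_value("PN", name), do: %{"Alphabetic" => name}
  defp encode_value(_vr, value), do: value

  # Four uppercase hex digits, zero-padded.
  defp hex4(int), do: int |> Integer.to_string(16) |> String.pad_leading(4, "0")
end

JsonModelSketch.to_json_map({0x0010, 0x0010}, "PN", ["DOE^JOHN"])
```

Serializing the resulting map to a JSON string would need a JSON encoder, which is outside this sketch.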
### Transfer Syntaxes
| Transfer Syntax | Read | Write |
|----------------|------|-------|
| Implicit VR Little Endian (1.2.840.10008.1.2) | Yes | Yes |
| Explicit VR Little Endian (1.2.840.10008.1.2.1) | Yes | Yes |
| Deflated Explicit VR Little Endian (1.2.840.10008.1.2.1.99) | Yes | Yes |
| Explicit VR Big Endian (1.2.840.10008.1.2.2, retired) | Yes | Yes |
| JPEG, JPEG-LS, JPEG 2000, JPEG XL, RLE, MPEG, HEVC, HTJ2K, SMPTE (58 TSes) | Metadata only | Metadata only |
Unknown transfer syntaxes are rejected by default. Use `TransferSyntax.encoding(uid, lenient: true)`
to fall back to Explicit VR Little Endian for unrecognized UIDs.
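
The strict-by-default, lenient-on-request behavior can be pictured as a registry lookup with an optional fallback. The sketch below is a stand-in. `TransferSyntaxSketch`, its three-entry table, and the `{:ok, _}` / `{:error, _}` return shapes are assumptions made for illustration and may differ from what the package returns.

```elixir
defmodule TransferSyntaxSketch do
  @moduledoc false

  @explicit_le %{vr: :explicit, endianness: :little}

  # A tiny registry; the real one covers 62 transfer syntaxes.
  @known %{
    "1.2.840.10008.1.2" => %{vr: :implicit, endianness: :little},
    "1.2.840.10008.1.2.1" => @explicit_le,
    "1.2.840.10008.1.2.2" => %{vr: :explicit, endianness: :big}
  }

  # Strict by default; `lenient: true` falls back to Explicit VR Little Endian.
  def encoding(uid, opts \\ []) do
    lenient? = Keyword.get(opts, :lenient, false)

    case Map.fetch(@known, uid) do
      {:ok, encoding} -> {:ok, encoding}
      :error when lenient? -> {:ok, @explicit_le}
      :error -> {:error, :unknown_transfer_syntax}
    end
  end
end

TransferSyntaxSketch.encoding("1.2.3.4")
TransferSyntaxSketch.encoding("1.2.3.4", lenient: true)
```

Rejecting unknown UIDs by default means a misread Transfer Syntax UID fails loudly instead of being decoded with the wrong VR mode or byte order.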
## Performance
Benchmarked on Apple Silicon (Elixir 1.18, OTP 27):
| Operation | Time per operation |
|-----------|--------------------|
| Parse 50-element data set | ~10 µs |
| Parse 200-element data set | ~50 µs |
| Stream parse 50 elements | ~20 µs |
| Stream parse 200 elements | ~80 µs |
| Stream enumerate 200 elements | ~55 µs |
| Write 50-element data set | ~13 µs |
| Write 200-element data set | ~55 µs |
| Roundtrip 100 elements | ~37 µs |
| Parse 1 MB pixel data | ~1 µs |
Run benchmarks with `mix test test/dicom/benchmark_test.exs`.
## Testing
```bash
mix test # Run all tests (621 tests)
mix test --cover # Run with coverage report (91%+)
mix format --check-formatted
```
Property-based tests using [StreamData](https://hex.pm/packages/stream_data)
verify encode/decode roundtrips across all VR types and streaming parser equivalence.
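
To show what a roundtrip property checks, here is a dependency-free sketch for one VR, US (unsigned 16-bit, little endian). It uses `:rand` in place of StreamData generators, so it has no shrinking and is only an approximation of a real property test. `RoundtripSketch` and its functions are hypothetical and are not taken from the test suite.

```elixir
defmodule RoundtripSketch do
  @moduledoc false

  # Encode a list of integers (0..65535) as consecutive little-endian 16-bit words.
  def encode_us(values), do: for(v <- values, into: <<>>, do: <<v::little-unsigned-16>>)

  # Decode consecutive little-endian 16-bit words back into integers.
  def decode_us(binary), do: for(<<v::little-unsigned-16 <- binary>>, do: v)

  # Property: decoding an encoded list yields the original list.
  def roundtrips?(runs \\ 100) do
    Enum.all?(1..runs, fn _run ->
      # 1 to 8 random values, each in 0..65535
      values = for _ <- 1..:rand.uniform(8), do: :rand.uniform(0x10000) - 1
      values == values |> encode_us() |> decode_us()
    end)
  end
end

RoundtripSketch.roundtrips?()
```

A StreamData version would express the same property with generators and would shrink failing inputs to a minimal counterexample.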
## Comparison with Other BEAM DICOM Libraries
Five DICOM libraries exist for the BEAM. Only three are published to Hex.pm.
| Feature | **dicom** | dicom\_ex 0.3.0 | ex\_dicom 0.2.0 | DCMfx 0.43.0 | WolfPACS |
|---------|-----------|-----------------|-----------------|--------------|----------|
| **Language** | Elixir | Elixir | Elixir | Gleam + Rust | Erlang |
| **License** | MIT | Apache-2.0 | MIT | AGPL-3.0 | AGPL-3.0 |
| **On Hex.pm** | Yes | Yes | Yes | No (git only) | No (git only) |
| **Runtime deps** | 0 | 0 | 0 | 6 | 2 |
| **P10 parse** | Yes | Yes | Yes | Yes | Basic |
| **P10 write** | Yes | Yes | No | Yes | No |
| **Transfer syntaxes** | 62 (49 active + 13 retired) | 3 | 3 | 47 | 3 |
| **Sequences (SQ)** | Yes | Yes | Yes | Yes | Yes |
| **Tag dictionary** | 5,035 tags | 5,249 tags | None | 13,689 tags | None |
| **UID generation** | Yes | Yes | No | No | No |
| **UID validation** | Yes | No | No | No | No |
| **File Meta validation** | Yes | Partial | Partial | Yes | Yes |
| **Character sets** | ISO 8859-{1..9}, JIS X 0201, UTF-8 | No | No | Full (CJK, GB18030) | No |
| **Value decoding** | Yes (36 VRs) | Yes | Basic | Yes | Yes (25 VRs) |
| **Streaming parser** | Yes | No | No | Yes | No |
| **DIMSE networking** | No | C-ECHO/C-FIND/C-STORE | No | No | C-ECHO/C-STORE |
| **DICOM JSON** | Yes (PS3.18 F.2) | No | No | Yes | No |
| **Anonymization** | Yes (PS3.15 Basic Profile) | No | No | Yes | No |
| **Pixel data frames** | Yes (native + encapsulated) | No | No | Yes | No |
| **Test suite** | 621 tests, 91%+ cov | Unknown | 1 test file | 39 test files | 80+ tests |
| **CI** | Passing | None | None | Failing | Failing |
| **Docs** | HexDocs + @moduledoc | HexDocs | HexDocs | Dedicated site | Project site |
| **Production-ready** | Yes | Explicitly no | No | Yes (if AGPL ok) | Alpha |
| **Gleam toolchain** | Not required | Not required | Not required | Required | Not required |
Of the libraries compared, **dicom** is the most complete pure-Elixir option:
zero runtime dependencies, P10 reading, writing, and streaming, DICOM JSON,
anonymization, pixel data frame extraction, 62 transfer syntaxes, and an MIT
license. DCMfx has a larger tag dictionary and full CJK character set support,
but it requires the Gleam toolchain, is licensed under AGPL-3.0, and is not
published to Hex.pm. For DIMSE networking, which **dicom** does not implement,
`dicom_ex` provides C-ECHO/C-FIND/C-STORE SCP support.
## AI-Assisted Development
This project welcomes AI-assisted contributions. See [AGENTS.md](AGENTS.md)
for instructions that AI coding assistants can use to work with this codebase,
and [CONTRIBUTING.md](CONTRIBUTING.md) for our AI contribution policy.
## Contributing
Contributions are welcome. Please read our [Contributing Guide](CONTRIBUTING.md)
and [Code of Conduct](CODE_OF_CONDUCT.md) before opening a PR.
## License
MIT -- see [LICENSE](LICENSE) for details.