
# JsonlLd

**Streaming Linked Data for Agent Accessibility (Agent A11y)**

`jsonl_ld` is a stateful, streaming parser and encoder for Newline-Delimited
JSON-LD (JSONL-LD).

As the web transitions from human-only interfaces to hybrid human-agent
interfaces, autonomous AI agents and LLMs need a way to consume structured data
without relying on token-heavy, brittle DOM scraping. JSONL-LD combines the
streaming efficiency of NDJSON with the semantic power of JSON-LD, providing a
highly efficient, deterministic data feed for agents.

## Installation

Add `jsonl_ld` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:jsonl_ld, "~> 0.1.0"}
  ]
end
```

## What is JSONL-LD?

Standard JSON-LD relies on an `@context` entry to give data semantic meaning. If
you stream thousands of JSON-LD items via standard NDJSON, repeating the
`@context` on every single line sends the same bytes over and over and inflates
bandwidth.

JSONL-LD solves this via **stateful context resolution**. The stream emits a
lightweight "Context Line" to set the state, followed by pure data lines.

**How to structure your JSONL-LD:**
Instead of a giant, nested array or repeating the context, emit one object per
line:

```jsonld
{"@context": "https://schema.org"}
{"@type": "Product", "name": "Laptop", "price": "999.00"}
{"@type": "Product", "name": "Mouse", "price": "49.00"}
{"@context": "https://custom.vocab.org"}
{"@type": "Widget", "name": "Custom Gear"}
```
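The stateful resolution described above can be sketched in a few lines of plain Elixir. This is an illustration only, not the library's implementation: `ContextSketch` and `resolve/1` are hypothetical names, the input is assumed to be already-decoded maps (JSON decoding is left out), and treating a map whose only key is `@context` as a Context Line is an assumption about the format.

```elixir
defmodule ContextSketch do
  # Hypothetical helper, not part of jsonl_ld.
  # Takes an enumerable of decoded maps and returns a lazy stream of data
  # maps with the active context attached.
  def resolve(decoded_lines) do
    Stream.transform(decoded_lines, nil, fn
      # Assumption: a map whose only key is "@context" is a Context Line.
      # It updates the state and emits nothing.
      %{"@context" => ctx} = line, _active when map_size(line) == 1 ->
        {[], ctx}

      # No context has been set yet: pass the data line through unchanged.
      line, nil ->
        {[line], nil}

      # Attach the active context unless the line already carries its own.
      line, active ->
        {[Map.put_new(line, "@context", active)], active}
    end)
  end
end

lines = [
  %{"@context" => "https://schema.org"},
  %{"@type" => "Product", "name" => "Laptop"},
  %{"@context" => "https://custom.vocab.org"},
  %{"@type" => "Widget", "name" => "Custom Gear"}
]

resolved = lines |> ContextSketch.resolve() |> Enum.to_list()
```

Because the state is a single value threaded through `Stream.transform/3`, the sketch never holds more than one line in memory at a time.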

For the full technical details, see the [Specification](SPECIFICATION.md).

## Usage

This library is built on top of Elixir's native `Stream` module, so lines are
processed lazily, one at a time. Memory use stays roughly constant whether you
are parsing a 50GB file or an infinite live stream.

### Parsing a Stream (Agent Side)

Read a file or an HTTP response body line-by-line. The parser maintains the
active context and automatically injects it into the parsed objects so your
application receives fully qualified JSON-LD maps.

```elixir
File.stream!("products.jsonlld")
|> JsonlLd.parse_stream()
|> Stream.each(fn product ->
  # The parser automatically injected the correct @context!
  IO.puts("Importing #{product["name"]} using #{product["@context"]}")
end)
|> Stream.run()
```

### Encoding a Stream (Server Side)

When serving data from your database (e.g., via Ecto) to an AI agent, you can
feed fully qualified maps to the encoder. It will automatically deduplicate the
`@context` strings, emitting standalone "Context Lines" only when the context
changes, saving bandwidth over the wire.

```elixir
products = [
  %{"@context" => "https://schema.org", "@type" => "Product", "name" => "Shoe"},
  %{"@context" => "https://schema.org", "@type" => "Product", "name" => "Hat"},
  %{"@context" => "https://custom.org", "@type" => "Widget", "name" => "Gear"}
]

products
|> JsonlLd.encode_stream()
|> Enum.into(File.stream!("output.jsonlld"))

# The encoder emits a schema.org Context Line once, not once per product.
```
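
To make the deduplication step concrete, here is a minimal sketch of the idea over plain maps. It is not the library's encoder: `DedupSketch` and `compact/1` are hypothetical names, the output is left as maps instead of JSON lines, and the input is assumed to be fully qualified (every map carries its own `@context`).

```elixir
defmodule DedupSketch do
  # Hypothetical helper, not part of jsonl_ld.
  # Emits a standalone context map only when the context changes, followed
  # by the data map with its "@context" key removed.
  def compact(maps) do
    Stream.transform(maps, nil, fn map, last ->
      {ctx, data} = Map.pop(map, "@context")

      cond do
        # Assumption: inputs are fully qualified. A map without a context
        # is passed through untouched and the state is left alone.
        is_nil(ctx) -> {[data], last}
        # Same context as the previous item: emit only the data.
        ctx == last -> {[data], last}
        # Context changed: emit a Context Line first, then the data.
        true -> {[%{"@context" => ctx}, data], ctx}
      end
    end)
  end
end

compacted =
  [
    %{"@context" => "https://schema.org", "@type" => "Product", "name" => "Shoe"},
    %{"@context" => "https://schema.org", "@type" => "Product", "name" => "Hat"},
    %{"@context" => "https://custom.org", "@type" => "Widget", "name" => "Gear"}
  ]
  |> DedupSketch.compact()
  |> Enum.to_list()
```

For the three products above, the sketch yields five maps: a schema.org context map, the two products without their context, a custom.org context map, and the widget.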

## Documentation

Full documentation can be found at
[https://hexdocs.pm/jsonl_ld](https://hexdocs.pm/jsonl_ld).