# SchemaOrg

A strictly typed builder for generating SEO [Schema.org](https://schema.org)
JSON-LD in Elixir and Phoenix applications.

You should not have to memorise the Schema.org vocabulary. This library ships
**1000+ generated struct modules** (`SchemaOrg.Product`, `SchemaOrg.Offer`, …),
one per Schema.org Class. Build a graph with ordinary struct literals — your
editor auto-completes the valid fields and the compiler rejects the rest — then
serialise it with `to_json_ld/1`.

```elixir
%SchemaOrg.Product{
  name: "MacBook Pro",
  offers: %SchemaOrg.Offer{price: 1999.00, price_currency: "USD"}
}
|> SchemaOrg.to_json_ld()
```

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "MacBook Pro",
  "offers": { "@type": "Offer", "price": 1999.0, "priceCurrency": "USD" }
}
```
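
To make the struct-to-JSON-LD mapping above concrete, here is a minimal sketch of how such a recursive serialisation could work. It is not the library's implementation: the `Sketch.*` modules are hypothetical stand-ins for generated types, the result is a plain map rather than a Jason-encoded string, and dropping `nil` fields and camel-casing keys are assumptions inferred from the sample output.

```elixir
# Hypothetical, heavily trimmed stand-ins for two generated modules.
defmodule Sketch.Offer do
  defstruct [:price, :price_currency, :availability]
end

defmodule Sketch.Product do
  defstruct [:name, :description, :offers]
end

defmodule Sketch.Serialiser do
  @context "https://schema.org"

  # Entry point: serialise one struct and attach the top-level @context.
  def to_map(%_{} = node) do
    Map.put(node_to_map(node), "@context", @context)
  end

  # A struct becomes a map tagged with "@type" (the last module segment).
  # Fields left as nil are omitted (assumed behaviour).
  defp node_to_map(%module{} = node) do
    type = module |> Module.split() |> List.last()

    node
    |> Map.from_struct()
    |> Enum.reject(fn {_key, value} -> is_nil(value) end)
    |> Map.new(fn {key, value} -> {camelize(key), encode(value)} end)
    |> Map.put("@type", type)
  end

  # Nested structs and lists are walked recursively; scalars pass through.
  defp encode(%_{} = nested), do: node_to_map(nested)
  defp encode(list) when is_list(list), do: Enum.map(list, &encode/1)
  defp encode(scalar), do: scalar

  # :price_currency -> "priceCurrency"
  defp camelize(key) do
    [head | tail] = key |> Atom.to_string() |> String.split("_")
    Enum.join([head | Enum.map(tail, &String.capitalize/1)])
  end
end
```

Calling `Sketch.Serialiser.to_map/1` on a `Sketch.Product` that holds a nested `Sketch.Offer` yields a map with the same shape as the JSON above. The sketch treats every struct as a Schema.org node, so values such as `Date` would need their own clause.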

For a page that describes several independent things at once, pass a **list** of
structs — they are wrapped in a single top-level `@graph`:

```elixir
[%SchemaOrg.Organization{name: "Acme"}, %SchemaOrg.WebSite{name: "Acme"}]
|> SchemaOrg.to_json_ld()
#=> {"@context":"https://schema.org","@graph":[{...},{...}]}
```
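
The difference between the single-node and list forms can be sketched as follows. This is only an illustration of the output shape shown above; `Sketch.Graph` is a hypothetical module that works on already-serialised node maps, not part of the library.

```elixir
defmodule Sketch.Graph do
  @context "https://schema.org"

  # A list of nodes shares one @context and is nested under "@graph".
  def wrap(nodes) when is_list(nodes) do
    %{"@context" => @context, "@graph" => nodes}
  end

  # A single node carries the @context itself.
  def wrap(node) when is_map(node) do
    Map.put(node, "@context", @context)
  end
end
```

Keeping one shared `@context` at the top avoids repeating it on every node in the list.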

## Guides

Worked, copy-pasteable examples (each verified by `test/examples_test.exs`):

- [E-commerce product page](https://github.com/mike-kostov/schema_org/blob/main/guides/ecommerce-product.md)
- [Blog main page](https://github.com/mike-kostov/schema_org/blob/main/guides/blog-home.md)
- [Single article](https://github.com/mike-kostov/schema_org/blob/main/guides/blog-article.md)
- [Article with a video clip](https://github.com/mike-kostov/schema_org/blob/main/guides/article-with-video.md)
- [Article with an audio clip](https://github.com/mike-kostov/schema_org/blob/main/guides/article-with-audio.md)
- [Complex landing page (`@graph`)](https://github.com/mike-kostov/schema_org/blob/main/guides/landing-page.md)

## Embedding in a page

`to_script_tag/1` returns a complete, HTML-safe
`<script type="application/ld+json">…</script>` string (a value containing
`</script>` cannot break out). In Phoenix, the `SchemaOrg.HTML.json_ld/1`
function component wraps it — compiled only when `:phoenix_live_view` is in your
deps, so non-Phoenix apps never pull it in:

```heex
<SchemaOrg.HTML.json_ld data={@product} />
```
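
One common way to make a JSON payload safe inside a `<script>` element is to replace every `<` with its JSON unicode escape. The sketch below shows that technique under the assumption that the input is an already-encoded JSON string. `Sketch.ScriptTag` is hypothetical, and the library's `to_script_tag/1` may use a different escaping strategy.

```elixir
defmodule Sketch.ScriptTag do
  # Wraps an already-encoded JSON string in a JSON-LD script element.
  def wrap(json) when is_binary(json) do
    ~s(<script type="application/ld+json">) <> escape(json) <> "</script>"
  end

  # In valid JSON, "<" can only occur inside string values, so swapping it
  # for the equivalent \u003c escape leaves the decoded data unchanged while
  # ensuring "</script>" never appears in the element body.
  def escape(json), do: String.replace(json, "<", "\\u003c")
end
```

A JSON parser decodes `\u003c` back to `<`, so crawlers still see the original value.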

## Roadmap

- **`@id` cross-node linking** — reference a shared entity by id within a
  `@graph` instead of inlining it.
- **`rangeIncludes` validation** / Google required-field checks.

## Quick Start

```bash
mix deps.get          # fetch jason + ex_doc
mix compile           # compile the library and its generated types
mix test              # run the suite
```

## Commands

| Command | Description |
|---|---|
| `mix compile` | Compile the library and the generated type modules |
| `mix test` | Run the test suite |
| `mix test --failed` | Re-run only previously failing tests |
| `mix precommit` | Format check + compile (warnings as errors) + full test suite |
| `mix schema_org.build_types` | **Maintainer only** — regenerate `lib/schema_org/types/` from the vendored Schema.org graph |
| `mix docs` | Generate HTML documentation with ExDoc |

## Architecture

Two layers, one hand-written and one generated, plus the Mix task that produces the generated layer:

- **Runtime API** (`lib/schema_org.ex`, `lib/schema_org/thing.ex`) — hand-written.
  `SchemaOrg.to_json_ld/1` serialises any generated struct (recursively) into a
  `@context`/`@type`-annotated JSON-LD map and encodes it with Jason.
- **Generated types** (`lib/schema_org/types/`) — 1000+ files, one per Schema.org
  Class. Each is a plain struct (every valid property, direct and inherited, is a
  field) plus a `new/0` constructor. Build values with struct literals; a field
  is untyped, so it accepts Schema.org's loose value model directly (a scalar or
  a nested struct; a single value or a list). **Never edit these by hand** — they
  are produced by the code-generation task and overwritten on every run. (See
  [ADR-002](https://github.com/mike-kostov/schema_org/blob/main/docs/decisions/ADR-002-struct-literal-api-over-pipe-setters.md) for
  why building is struct-literal rather than pipe-setter based.)
- **Code generation** (`lib/mix/tasks/schema_org.build_types.ex`) — a
  maintainer-only Mix task that ingests the official Schema.org JSON-LD graph
  (`priv/schemaorg-current-https.jsonld`), maps Properties onto Classes via
  `domainIncludes` and `subClassOf` inheritance, and renders each module through
  an EEx template (`priv/templates/type.ex.eex`).

```
priv/schemaorg-current-https.jsonld ──▶ mix schema_org.build_types ──▶ lib/schema_org/types/*.ex
                                              (EEx template)
```
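
The rendering step can be pictured with a small EEx example. The template text below is a hypothetical, much-reduced stand-in for `priv/templates/type.ex.eex`, and `Sketch.Template` is not part of the library; it only shows how a class name and a field list might become module source.

```elixir
defmodule Sketch.Template do
  # Hypothetical template: one struct module with a new/0 constructor.
  @template """
  defmodule SchemaOrg.<%= @class %> do
    defstruct [<%= Enum.map_join(@fields, ", ", &inspect/1) %>]

    def new, do: %__MODULE__{}
  end
  """

  # Renders Elixir source for one class from its name and field atoms.
  def render(class, fields) do
    EEx.eval_string(@template, assigns: [class: class, fields: fields])
  end
end
```

`Sketch.Template.render("Offer", [:name, :price])` returns source text for a `SchemaOrg.Offer` module, which a generator would then write to a file under `lib/schema_org/types/`.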

## Documentation

| Path | Contents |
|---|---|
| `docs/specs/` | Feature specs — one file per capability, updated in place as scope evolves |
| `docs/plans/` | Implementation plans — one file per spec, task-by-task breakdown with acceptance criteria |
| `docs/decisions/` | Architecture Decision Records (ADRs) — immutable; record the *why* behind significant choices |
| `docs/ideas/` | Early problem framing, refined before a spec is written |

Current docs:

- [`docs/specs/01-type-generation.md`](https://github.com/mike-kostov/schema_org/blob/main/docs/specs/01-type-generation.md) — Code-generation pipeline: JSON-LD parsing, EEx template, type module layout — **Implemented**
- [`docs/plans/01-type-generation.md`](https://github.com/mike-kostov/schema_org/blob/main/docs/plans/01-type-generation.md) — Task-by-task breakdown for the above
- [`docs/decisions/ADR-001-build-time-codegen-committed-artifacts.md`](https://github.com/mike-kostov/schema_org/blob/main/docs/decisions/ADR-001-build-time-codegen-committed-artifacts.md) — Build-time generation, committed as artifacts
- [`docs/decisions/ADR-002-struct-literal-api-over-pipe-setters.md`](https://github.com/mike-kostov/schema_org/blob/main/docs/decisions/ADR-002-struct-literal-api-over-pipe-setters.md) — Struct-literal building API (performance)
- [`docs/decisions/ADR-003-multi-node-graph-serialisation.md`](https://github.com/mike-kostov/schema_org/blob/main/docs/decisions/ADR-003-multi-node-graph-serialisation.md) — Multi-node `@graph` output
- [`docs/decisions/ADR-004-html-embedding-optional-phoenix.md`](https://github.com/mike-kostov/schema_org/blob/main/docs/decisions/ADR-004-html-embedding-optional-phoenix.md) — `to_script_tag/1` + optional Phoenix component

## Schema.org in Plain Terms

Schema.org is a shared vocabulary that search engines (Google, Bing, Yandex)
understand. When you embed it as JSON-LD in a page, you are telling the crawler
"this page is about a *Product* called *MacBook Pro* that has an *Offer* of
*$1999*". That structured data is what powers rich results — star ratings,
price snippets, breadcrumbs, FAQ accordions.

The vocabulary is a **graph of Types (Classes) and Properties**:

- A **Type** is a thing you can describe — `Product`, `Person`, `Recipe`, `Event`.
- A **Property** is an attribute — `name`, `price`, `author`, `startDate`.
- Each property declares which types it is valid on (`domainIncludes`) and what
  kind of value it expects (`rangeIncludes`).
- Types form an inheritance tree (`subClassOf`): every type ultimately descends
  from `Thing`, so every type has `name`, `description`, `url`, and so on.
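
The interplay of `domainIncludes` and `subClassOf` can be sketched with a tiny hand-written excerpt of the vocabulary. The data and the `Sketch.Vocabulary` module are illustrative assumptions. The sketch also gives each class a single parent, whereas Schema.org allows several.

```elixir
defmodule Sketch.Vocabulary do
  # class => parent class (nil for the root). Hypothetical excerpt.
  @sub_class_of %{"Thing" => nil, "Product" => "Thing", "Offer" => "Thing"}

  # property => classes listed in its domainIncludes.
  @domain_includes %{
    "name" => ["Thing"],
    "url" => ["Thing"],
    "offers" => ["Product"],
    "price" => ["Offer"],
    "priceCurrency" => ["Offer"]
  }

  # The class itself followed by every ancestor up to the root.
  def lineage(nil), do: []
  def lineage(class), do: [class | lineage(Map.fetch!(@sub_class_of, class))]

  # Direct and inherited properties as sorted snake_case atoms.
  def fields(class) do
    ancestors = lineage(class)

    properties =
      for {property, domains} <- @domain_includes,
          Enum.any?(domains, &(&1 in ancestors)) do
        property |> Macro.underscore() |> String.to_atom()
      end

    Enum.sort(properties)
  end
end
```

Here `Sketch.Vocabulary.fields("Offer")` includes `:name` and `:url` even though neither lists `Offer` in its domain, because `Offer` descends from `Thing`.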

Writing this JSON by hand is error-prone: there are roughly 800 types and 1,500
properties, and putting a property on the wrong type produces silently invalid
markup. This library turns that graph into Elixir structs so the compiler and
your editor catch misplaced or misspelled properties before a crawler ever sees
them.
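
A struct literal with an unknown key fails at compile time, so it cannot be demonstrated in a runnable snippet. The sketch below shows the runtime equivalent using `struct!/2`, which is useful when attributes come from external data. `Sketch.Person` and `Sketch.Check` are hypothetical and not part of the library.

```elixir
defmodule Sketch.Person do
  # Hypothetical, heavily trimmed stand-in for a generated module.
  defstruct [:name, :job_title]
end

defmodule Sketch.Check do
  # Builds a struct from runtime data. struct!/2 raises KeyError for a key
  # that is not a field, which is reported here as {:error, key}.
  def build(module, attrs) do
    {:ok, struct!(module, attrs)}
  rescue
    error in KeyError -> {:error, error.key}
  end
end
```

With this, `Sketch.Check.build(Sketch.Person, name: "Ada", price: 10)` reports `:price` as the offending key rather than emitting a `Person` that carries an `Offer` property.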

---

## Abbreviations

| Abbreviation | Full name | What it means in this project |
|---|---|---|
| **JSON-LD** | JSON for Linking Data | The JSON-based serialisation of Schema.org that search engines read. The output format this library produces |
| **SEO** | Search Engine Optimisation | The reason structured data exists — richer, higher-ranking search results |
| **DX** | Developer Experience | The guiding goal of this package: typed, autocompletable struct APIs |
| **Class** | Schema.org Class | A describable type (`Product`, `Offer`). Becomes one generated Elixir module |
| **Property** | Schema.org Property | An attribute (`name`, `price`). Becomes one struct field |
| **domainIncludes** | — | The Schema.org relation declaring which Classes a Property is valid on. Drives field→module mapping |
| **rangeIncludes** | — | The Schema.org relation declaring what value types a Property accepts |
| **EEx** | Embedded Elixir | The templating language used to render the generated `.ex` files |