# JSONPathEx

[![Elixir](https://img.shields.io/badge/elixir-~%3E%201.15-purple.svg)](https://elixir-lang.org/)
[![Module Version](https://img.shields.io/hexpm/v/jsonpath_ex.svg)](https://hex.pm/packages/jsonpath_ex)
[![Hex Docs](https://img.shields.io/badge/hex-docs-lightgreen.svg)](https://hexdocs.pm/jsonpath_ex)

A fast Elixir library for parsing and evaluating [JSONPath](https://goessner.net/articles/JsonPath/) expressions, with [RFC 9535](https://www.rfc-editor.org/rfc/rfc9535)-aligned filter semantics.

---

## Features

- **Parses** the full common JSONPath surface: root, dot/bracket child, wildcards, recursive descent, slices, multi-index, filters, nested filters, grouping, arithmetic, and shorthand `[?expr]` filters.
- **Unicode** member names in dot notation (`$.屬性`).
- **Quoted dot-child** (`$."key with spaces"`, `$..'key'`).
- **Full RFC 9535 escape sequences** in quoted strings: `\n \t \r \b \f \/ \\ \' \" \uXXXX`. Unknown escapes are passed through unchanged, so regex patterns can be written inline.
- **Scientific-notation float literals** (`1.5e2`, `-1e-3`).
- **Built-in functions**: `length()`, `count()`, `min()`, `max()`, `sum()`, `avg()`, `concat()`, `match()`, `search()` — available as both postfix (`@.items.length()`) and prefix (`length(@.items)`).
- **RFC 9535 filter comparison semantics**: missing keys do not equal `null`, mixed-type ordering yields `false`, division/modulo by zero short-circuits to no match.
- **Stable performance**: slicing is linear-time (benchmarked on 100k-element lists), and single-index access in filters avoids tuple allocation.

---

## Installation

Add `jsonpath_ex` to the dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:jsonpath_ex, "~> 0.3.0"}
  ]
end
```

Then run:

```bash
mix deps.get
```

## Usage

### Evaluating expressions

```elixir
iex> json = %{
...>   "store" => %{
...>     "book" => [
...>       %{"category" => "reference", "author" => "Nigel Rees", "price" => 8.95},
...>       %{"category" => "fiction", "author" => "Evelyn Waugh", "price" => 12.99}
...>     ]
...>   }
...> }
iex> JSONPathEx.evaluate("$.store.book[*].author", json)
{:ok, ["Nigel Rees", "Evelyn Waugh"]}
```

### Filters with prefix functions and regex

```elixir
iex> data = [
...>   %{"name" => "alice", "score" => 90},
...>   %{"name" => "bob", "score" => 50},
...>   %{"name" => "alex", "score" => 75}
...> ]
iex> JSONPathEx.evaluate(~S{$[?(search(@.name, "^al") && @.score > 70)]}, data)
{:ok, [%{"name" => "alice", "score" => 90}, %{"name" => "alex", "score" => 75}]}
```
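
The feature list says functions are available in both postfix and prefix form. The sketch below, using made-up data, runs both spellings of `length()` in the same filter; the two results are expected to match, though that is inferred from the feature list rather than from captured output:

```elixir
# Hypothetical input: two maps with an "items" list of different lengths.
data = [
  %{"name" => "cart-a", "items" => [1, 2, 3]},
  %{"name" => "cart-b", "items" => [1]}
]

# Postfix form: the function is called on the path.
postfix = JSONPathEx.evaluate("$[?(@.items.length() > 2)]", data)

# Prefix form: the path is passed as an argument.
prefix = JSONPathEx.evaluate("$[?(length(@.items) > 2)]", data)

# Both spellings should select the same elements.
postfix == prefix
```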

### Parse separately for re-use

If you're evaluating the same path against many JSON documents, parse once:

```elixir
{:ok, ast} = JSONPathEx.Parser.parse("$.store.book[*].title")
JSONPathEx.Evaluator.evaluate(ast, json1)
JSONPathEx.Evaluator.evaluate(ast, json2)
```
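
As a rough sketch of that pattern, the parsed AST can be mapped over a list of documents. The `docs` list is invented for illustration, and because this README does not show the return shape of `JSONPathEx.Evaluator.evaluate/2`, the results are collected without pattern matching on them:

```elixir
# Hypothetical documents; only the "store" -> "book" -> "title" shape matters here.
docs = [
  %{"store" => %{"book" => [%{"title" => "Sayings of the Century"}]}},
  %{"store" => %{"book" => [%{"title" => "Sword of Honour"}, %{"title" => "Moby Dick"}]}}
]

# Parse once, outside the loop.
{:ok, ast} = JSONPathEx.Parser.parse("$.store.book[*].title")

# Evaluate the same AST against every document: one result per document.
results = Enum.map(docs, fn doc -> JSONPathEx.Evaluator.evaluate(ast, doc) end)
```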

### `evaluate!/2`

`evaluate!/2` returns the result list directly and raises `ArgumentError` when the path fails to parse:

```elixir
iex> JSONPathEx.evaluate!("$[*]", [1, 2, 3])
[1, 2, 3]

iex> JSONPathEx.evaluate!("invalid", %{})
** (ArgumentError) JSONPathEx: parse error at line 1: expected string "$"
```
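
For call sites that prefer a fallback value over an exception, a small wrapper around `JSONPathEx.evaluate/2` is one option. `PathHelper` and `values_or_default/3` are hypothetical names, and the catch-all branch assumes that any return other than `{:ok, values}` is an error value, whose exact shape this README does not show:

```elixir
defmodule PathHelper do
  @moduledoc "Hypothetical helper, not part of JSONPathEx."

  # Returns the matched values, or `default` when evaluation does not succeed.
  def values_or_default(path, data, default \\ []) do
    case JSONPathEx.evaluate(path, data) do
      {:ok, values} -> values
      # Assumption: every non-{:ok, _} result signals an error.
      _error -> default
    end
  end
end
```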

---

## Supported syntax

| Selector | Example | Notes |
|---|---|---|
| Root | `$` | |
| Current node | `@` | only inside filters |
| Dot child | `$.key` | unicode names supported |
| Quoted dot child | `$."key with spaces"`, `$.'k'` | |
| Bracket child | `$['key']`, `$["k"]` | escapes supported |
| Multi-key bracket | `$['a','b','c']` | |
| Wildcard | `$.*`, `$[*]` | |
| Array index | `$[0]`, `$[-1]` | negative indexing supported |
| Multi-index | `$[0,1,3]` | |
| Slice | `$[1:5]`, `$[::2]`, `$[::-1]` | RFC 9535 semantics |
| Recursive descent | `$..key`, `$..*` | |
| Filter | `$[?(@.price < 10)]` | also shorthand `$[?@.x]` |
| Nested filter | `$[?(@.tags[?(@.name == "x")])]` | |
| Grouping | `$[?((@.a \|\| @.b) && @.c)]` | |
| Comparisons | `==` `!=` `===` `<` `<=` `>` `>=` `in` `nin` | |
| Logical | `&&` `\|\|` `!` (prefix only) | |
| Arithmetic | `+ - * / %` | left-associative |
| Postfix functions | `@.items.length()` | |
| Prefix functions | `length(@)`, `match(@.s, "\\d+")` | |
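
To make "RFC 9535 semantics" for slices concrete, here is a minimal standalone sketch of the bounds rules described in the RFC: normalize negative indices, clamp to the list, then walk by `step`. `SliceSketch` is a hypothetical module written for illustration; it is not the library's implementation.

```elixir
defmodule SliceSketch do
  @moduledoc "Illustrative RFC 9535 slice selection over a plain list."

  # `nil` stands for an omitted start, end, or step, as in `$[::2]`.
  def slice(list, start \\ nil, stop \\ nil, step \\ nil) when is_list(list) do
    len = length(list)
    step = step || 1

    cond do
      # A step of zero selects nothing.
      step == 0 ->
        []

      step > 0 ->
        lower = clamp(normalize(start || 0, len), 0, len)
        upper = clamp(normalize(stop || len, len), 0, len)
        pick(list, lower, upper, step)

      true ->
        # Negative step: start defaults to the last index, end to just before index 0.
        upper = clamp(normalize(start || len - 1, len), -1, len - 1)
        lower = clamp(normalize(stop || -len - 1, len), -1, len - 1)
        pick(list, upper, lower, step)
    end
  end

  # Negative indices count from the end of the list.
  defp normalize(i, _len) when i >= 0, do: i
  defp normalize(i, len), do: len + i

  defp clamp(x, lo, hi), do: x |> max(lo) |> min(hi)

  # Walk from `first` toward `limit` (exclusive). The list is converted to a
  # tuple once so that each index lookup is constant time.
  defp pick(list, first, limit, step) do
    tuple = List.to_tuple(list)

    first
    |> Stream.iterate(&(&1 + step))
    |> Enum.take_while(fn i -> if step > 0, do: i < limit, else: i > limit end)
    |> Enum.map(&elem(tuple, &1))
  end
end
```

In this sketch `SliceSketch.slice(list, 1, 5)` corresponds to `$[1:5]`, `SliceSketch.slice(list, nil, nil, 2)` to `$[::2]`, and `SliceSketch.slice(list, nil, nil, -1)` to `$[::-1]`.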

---

## Filter semantics

`JSONPathEx` follows RFC 9535-style "Nothing" semantics inside filters:

| Expression | On `%{"v" => nil}` | On `%{}` (missing) |
|---|---|---|
| `@.v == null` | `true` | `false` |
| `@.v != null` | `false` | `true` |
| `@.missing == @.also_missing` | n/a | `true` |
| `@.v > 5` (with `@.v = "x"`) | `false` (mixed types) | `false` |
| `@.v / 0` | filter excludes (no match) | filter excludes |

This means:
- **Missing key ≠ explicit null.**
- **Ordering comparisons between incompatible types are false** (not BEAM term-order surprises).
- **Division/modulo by zero short-circuits** — no crash, the filter just excludes.
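
A short sketch of how these rules would show up through `JSONPathEx.evaluate/2`. The input list is invented for illustration, and the comments restate what the table above predicts rather than captured output:

```elixir
# Hypothetical input: explicit nil, missing key, a string, and a number.
items = [%{"v" => nil}, %{}, %{"v" => "x"}, %{"v" => 7}]

# Missing key is not null: only the map with an explicit nil should be selected.
nulls = JSONPathEx.evaluate("$[?(@.v == null)]", items)

# Mixed-type ordering is false: nil, the missing key, and "x" should all drop
# out, leaving only the map where "v" is the number 7.
large = JSONPathEx.evaluate("$[?(@.v > 5)]", items)

# Division by zero should exclude the element instead of raising.
div_zero = JSONPathEx.evaluate("$[?(@.v / 0 > 1)]", [%{"v" => 7}])
```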

---

## Benchmarks

The `bench/` directory contains [Benchee](https://github.com/bencheeorg/benchee) suites:

```bash
mix run bench/parsing.exs    # parser throughput
mix run bench/evaluation.exs # bookstore-document workloads
mix run bench/slicing.exs    # array slicing on 100/10k/100k lists
mix run bench/filters.exs    # filter throughput on 1k/10k items
mix run bench/recursion.exs  # deep-scan on deep/wide structures
```

Indicative results (Apple Silicon, single core):

| Workload | v0.3.0 |
|---|---|
| Parse `$.store.book[?(@.price > 10 && @.category == "fiction")]` | ~10 µs |
| `$[::-1]` on a 100,000-element list | ~5 ms |
| `$[?(@.id > 100)]` on 10,000 maps | ~10 ms |
| `$[?(@.value > $[0].value)]` on 10,000 maps | ~13 ms |

(v0.2.0 took ~9 s for `$[::2]` on 100k and ~340 ms for the root-reference filter — see `CHANGELOG.md`.)

---

## Contributing

```bash
mix test                         # fast unit tests
mix test --include performance   # include stress + performance tests
mix run bench/parsing.exs        # one of the benchmark suites
```

PRs welcome.

## License

MIT — see the [LICENSE](https://github.com/b-erdem/jsonpath_ex/blob/main/LICENSE) file for details.