Jiffy - JSON NIF for Erlang
============================
Jiffy is a JSON NIF library that focuses on correctness over performance. It’s
not the fastest JSON library for Erlang in standard benchmarks, but it
endeavors to be as fast as possible while affecting total system performance as
little as possible.

Usage
-----
Jiffy is a simple API. The only thing that might catch you off guard
is that the return type of `jiffy:encode/1` is an iolist even though
it returns a binary most of the time.
A quick note on Unicode. Jiffy only understands UTF-8 in binaries. End
of story.
Errors are raised as error exceptions.
Eshell V5.8.2 (abort with ^G)
1> jiffy:decode(<<"{\"foo\": \"bar\"}">>).
{[{<<"foo">>,<<"bar">>}]}
2> Doc = {[{foo, [<<"bing">>, 2.3, true]}]}.
{[{foo,[<<"bing">>,2.3,true]}]}
3> jiffy:encode(Doc).
<<"{\"foo\":[\"bing\",2.3,true]}">>
`jiffy:decode/1,2`
------------------
* `jiffy:decode(IoData)`
* `jiffy:decode(IoData, Options)`
The options for decode are:
* `return_maps` - Tell Jiffy to return objects using the maps data type
on VMs that support it. This raises an error on VMs that don't support
maps.
* `{null_term, Term}` - Returns the specified `Term` instead of `null`
when decoding JSON. This is for people that wish to use `undefined`
instead of `null`.
* `use_nil` - Returns the atom `nil` instead of `null` when decoding
JSON. This is a short hand for `{null_term, nil}`.
* `return_trailer` - If any non-whitespace is found after the first
JSON term is decoded the return value of decode/2 becomes
`{has_trailer, FirstTerm, RestData::iodata()}`. This is useful to
decode multiple terms in a single binary.
* `dedupe_keys` - If a key is repeated in a JSON object this flag
will ensure that the parsed object only contains a single entry
containing the last value seen. This mirrors the parsing behavior
of virtually every other JSON parser.
* `copy_strings` - Normally, when strings are decoded, they are
created as sub-binaries of the input data. With some workloads, this
leads to an undesirable bloating of memory: Strings in the decode
result keep a reference to the full JSON document alive. Setting
this option will instead allocate new binaries for each string, so
the original JSON document can be garbage collected even though
the decode result is still in use.
* `{bytes_per_red, N}` where N >= 0 - This controls the number of
bytes that Jiffy will process as an equivalent to a reduction. Each
20 reductions we consume 1% of our allocated time slice for the current
process. When the Erlang VM indicates we need to return from the NIF.
* `{bytes_per_iter, N}` where N >= 0 - Backwards compatible option
that is converted into the `bytes_per_red` value.
`jiffy:encode/1,2`
------------------
* `jiffy:encode(EJSON)`
* `jiffy:encode(EJSON, Options)`
where EJSON is a valid representation of JSON in Erlang according to
the table below.
The options for encode are:
* `uescape` - Escapes UTF-8 sequences to produce a 7-bit clean output
* `pretty` - Produce JSON using two-space indentation
* `force_utf8` - Force strings to encode as UTF-8 by fixing broken
surrogate pairs and/or using the replacement character to remove
broken UTF-8 sequences in data.
* `use_nil` - Encodes the atom `nil` as `null`.
* `escape_forward_slashes` - Escapes the `/` character which can be
useful when encoding URLs in some cases.
* `{bytes_per_red, N}` - Refer to the decode options
* `{bytes_per_iter, N}` - Refer to the decode options
Pre-encoded JSON
----------------
A `{json, IoData}` tuple can appear anywhere a JSON value is expected (except
as an object key). The `IoData` is spliced into the output as is. Jiffy does
not parse, validate, copy, or pretty-print it.
1> jiffy:encode([1, {json, <<"{\"cached\":true}">>}, 3]).
<<"[1,{\"cached\":true},3]">>
2> jiffy:encode({[{<<"a">>, {json, [<<"[1,">>, "2,3]"]}}]}).
<<"{\"a\":[1,2,3]}">>
The caller is responsible for ensuring it is well-formed JSON.
Data Format
-----------
Erlang JSON Erlang
==========================================================================
null -> null -> null
true -> true -> true
false -> false -> false
"hi" -> [104, 105] -> [104, 105]
<<"hi">> -> "hi" -> <<"hi">>
hi -> "hi" -> <<"hi">>
1 -> 1 -> 1
1.25 -> 1.25 -> 1.25
[] -> [] -> []
[true, 1.0] -> [true, 1.0] -> [true, 1.0]
{[]} -> {} -> {[]}
{[{foo, bar}]} -> {"foo": "bar"} -> {[{<<"foo">>, <<"bar">>}]}
{[{123, bar}]} -> {"123": "bar"} -> {[{<<"123">>, <<"bar">>}]}
{[{1.5, bar}]} -> {"1.5": "bar"} -> {[{<<"1.5">>, <<"bar">>}]}
{[{<<"foo">>, <<"bar">>}]} -> {"foo": "bar"} -> {[{<<"foo">>, <<"bar">>}]}
#{<<"foo">> => <<"bar">>} -> {"foo": "bar"} -> #{<<"foo">> => <<"bar">>}
#{123 => <<"bar">>} -> {"123": "bar"} -> #{<<"123">> => <<"bar">>}
#{1.5 => <<"bar">>} -> {"1.5": "bar"} -> #{<<"1.5">> => <<"bar">>}
N.B. The last three entries in this table are only valid for VM's that support
the `maps` data type (i.e., 17.0 and newer) and client code must pass
the `return_maps` option to `jiffy:decode/2`.
Scheduler Usage
----
Jiffy specifically avoids using shared resources like the dirty CPU schedulers,
since those are used for large heap garbage collection, crypto functions, large
binary matching, etc. Instead, it works with Erlang's regular VM schedulers and
yields appropriately after consuming a fraction of available reductions.
Yielding behavior can be explicitly controlled via the ` {bytes_per_red, N}`
option.
To get an idea of how this works, use the `bench_scheduling.sh` benchmark from
https://github.com/nickva/bench. It check concurrent encoding and decoding
scaled by the number of schedulers. An example run comparing against other JSON
libraries may look like:
```
./bench_scheduling.sh
...
scheduler responsiveness check
input: citm-catalog.json duration: 2000
schedulers: 12 online
impls: json, jiffy, simdjsone, jsone, jsx
[json]
1x encdec n=84 p50=135.0ms p95=182.9ms p99=191.9ms max=196.7ms
12x encdec n=86 p50=129.7ms p95=189.9ms p99=203.0ms max=206.2ms
24x encdec n=87 p50=263.0ms p95=461.2ms p99=506.1ms max=527.1ms
[jiffy]
1x encdec n=309 p50=38.3ms p95=51.9ms p99=57.4ms max=66.5ms
12x encdec n=300 p50=41.2ms p95=52.5ms p99=59.7ms max=66.2ms
24x encdec n=306 p50=80.2ms p95=111.8ms p99=118.8ms max=140.1ms
[simdjsone]
1x encdec n=20 p50=690.1ms p95=784.6ms p99=784.6ms max=784.8ms
12x encdec n=16 p50=790.9ms p95=887.5ms p99=887.5ms max=899.9ms
24x encdec n=24 p50=1448.4ms p95=1876.7ms p99=1879.5ms max=1882.7ms
[jsone]
1x encdec n=60 p50=213.1ms p95=261.8ms p99=263.9ms max=264.8ms
12x encdec n=60 p50=204.9ms p95=329.8ms p99=345.0ms max=350.9ms
24x encdec n=52 p50=440.1ms p95=700.3ms p99=773.3ms max=817.3ms
[jsx]
1x encdec n=24 p50=398.8ms p95=539.0ms p99=544.1ms max=548.3ms
12x encdec n=24 p50=391.5ms p95=684.9ms p99=687.0ms max=689.6ms
24x encdec n=24 p50=1181.3ms p95=1479.0ms p99=1558.1ms max=1654.7ms
```