# High Dynamic Range (HDR) Histogram for Elixir
Get percentile values from a stream of input. Each histogram has a configurable
min, max and precision value, which control both its accuracy and its memory usage.
Learn more about [HDR Histograms](http://hdrhistogram.org/).
Writes (which ought to substantially outnumber reads) run in constant time
regardless of the configuration. Querying a histogram for a specific percentile
or other metric (max value, min value) gets slightly slower as the range and/or
precision grow. However, the implementation relies on write-optimized ETS
tables, so operations are not serialized (beyond the locks used by ETS).
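To see why reads scale with the number of counters, here is a minimal sketch of a percentile lookup as a cumulative walk. It is an illustration, not Histogrex's actual code: the `QuantileSketch` module and the `{value, count}` list are hypothetical stand-ins for the histogram's internal counters.

```elixir
defmodule QuantileSketch do
  # `buckets` is a list of {value, count} pairs sorted by value (hypothetical layout).
  # `q` is a percentile between 0 and 100, e.g. 99.9.
  def value_at_quantile(buckets, q) when q >= 0 and q <= 100 do
    total = buckets |> Enum.map(fn {_value, count} -> count end) |> Enum.sum()

    # How many recorded values must sit at or below the answer.
    target = max(1, round(q / 100 * total))

    # Walk the buckets in order, stopping once enough values have been seen.
    # More buckets (wider range, higher precision) means a longer walk.
    Enum.reduce_while(buckets, 0, fn {value, count}, seen ->
      seen = seen + count
      if seen >= target, do: {:halt, value}, else: {:cont, seen}
    end)
  end
end
```

With 100 recorded values spread as `[{1, 50}, {10, 40}, {100, 9}, {1000, 1}]`, the 50th percentile stops at the first bucket while the 99.9th has to walk to the last one.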
A histogram with a range of 1..1_000_000 and a precision of 3 takes roughly
16000 bytes.
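For a feel of how `min`, `max` and `precision` drive the size, the sketch below counts the counters a histogram would need. It assumes the same sizing logic as the Go version the core is taken from; Histogrex's internals may differ, and `CountsSketch` is a hypothetical module. It only counts counters and does not try to reproduce the byte figure above, since the per-counter cost depends on how the values are stored.

```elixir
defmodule CountsSketch do
  import Bitwise

  # Number of counters needed for a given range and precision (significant figures).
  def counts_len(min_value, max_value, precision)
      when min_value >= 1 and max_value > min_value and precision in 1..5 do
    # Largest value that must still be tracked with single-unit resolution.
    largest_single_unit = 2 * :math.pow(10, precision)

    # Round up to a power of two so bucket indexes can be found with bit shifts.
    magnitude = largest_single_unit |> :math.log2() |> Float.ceil() |> trunc()
    sub_bucket_count = 1 <<< max(magnitude, 1)

    # Values below 2^unit_magnitude are not distinguished from each other.
    unit_magnitude = min_value |> :math.log2() |> Float.floor() |> trunc()

    bucket_count = buckets_needed(sub_bucket_count <<< unit_magnitude, max_value, 1)
    (bucket_count + 1) * div(sub_bucket_count, 2)
  end

  # Each extra bucket doubles the largest trackable value.
  defp buckets_needed(smallest_untrackable, max_value, n) when smallest_untrackable < max_value,
    do: buckets_needed(smallest_untrackable <<< 1, max_value, n + 1)

  defp buckets_needed(_smallest_untrackable, _max_value, n), do: n
end
```

Under these assumptions, `CountsSketch.counts_len(1, 1_000_000, 3)` gives 11264 counters and `CountsSketch.counts_len(1, 10_000, 2)` gives 1024, which shows how quickly an extra digit of precision adds up.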
## Usage
The first step involves creating a registry:

    defmodule MyApp.Stats do
      use Histogrex

      histogrex :load_user, min: 1, max: 10_000_000, precision: 3
      histogrex :db_save_settings, min: 1, max: 10_000, precision: 2
    end

Then add this module as a `worker` to your application's supervisor:

    worker(MyApp.Stats, [])

Values can then be recorded via the `record!` or `record` functions:

    alias MyApp.Stats
    Stats.record!(:load_user, 233)
    Stats.record!(:db_save_settings, 84)

`min`, `max`, `total_count` and `value_at_quantile` are used to query the
histogram:

    alias MyApp.Stats
    Stats.value_at_quantile(:load_user, 99.9)

It would be reasonable to have a GenServer dump these statistics to some log
ingestor every X seconds (10? 60?). This would be the only reader (though
concurrent reads are fully supported).
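A minimal sketch of such a periodic reporter follows. It is not part of Histogrex: the `ReporterSketch` module and its `:interval_ms`, `:collect` and `:sink` options are hypothetical names chosen for this example. The two functions are injected so the sketch stays independent of any particular registry or log ingestor.

```elixir
defmodule ReporterSketch do
  use GenServer

  # opts (all hypothetical):
  #   interval_ms: how often to report
  #   collect:     zero-arity function returning the metrics to ship
  #   sink:        one-arity function that sends them to the log ingestor
  def start_link(opts), do: GenServer.start_link(__MODULE__, Map.new(opts))

  @impl true
  def init(state) do
    schedule(state.interval_ms)
    {:ok, state}
  end

  @impl true
  def handle_info(:report, state) do
    # Read the metrics and hand them off, then arm the next tick.
    state.sink.(state.collect.())
    schedule(state.interval_ms)
    {:noreply, state}
  end

  defp schedule(interval_ms), do: Process.send_after(self(), :report, interval_ms)
end
```

In an application, `collect` could be something like `fn -> %{load_user_p999: Stats.value_at_quantile(:load_user, 99.9)} end`, and `sink` whatever writes to your logs.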
## Query Multiple Metrics at Once
If you're querying multiple metrics from a histogram at once, you can get a
slight performance boost by first getting an iterator and querying that.
There is no need to reset the iterator between calls:

    it = Stats.iterator(:load_user)
    Stats.value_at_quantile(it, 99.9)
    Stats.max(it)

## Dynamic Metrics
You won't always be able to define histograms upfront. In these cases, use templates:

    defmodule MyApp.Stats do
      use Histogrex

      template :http_ms, min: 1, max: 10_000, precision: 2
    end

The `template` macro takes the same arguments as the `histogrex` macro, and the
two can live side by side. However, all your metric names must be unique.
Overloaded functions take the template name and your dynamic metric name:

    alias MyApp.Stats
    Stats.record!(:http_ms, "about", 82)
    Stats.mean(:http_ms, "about")
    Stats.max(:http_ms, "about")
    Stats.total_count(:http_ms, "about")
    Stats.value_at_quantile(:http_ms, "about", 99.9)

The dynamic metric name can be a string or atom.
You can also get an iterator and use that for efficient multi-queries:

    it = Stats.iterator(:http_ms, "about")

## Collecting All Data
Use your registry's `reduce/2` function to collect the data:

    Stats.reduce(%{}, fn {name, iterator}, metrics ->
      collected = {Stats.max(iterator), Stats.value_at_quantile(iterator, 99.9)}
      Map.put(metrics, name, collected)
    end)

## Implementation
The core histogram implementation is taken from [the Go version](https://github.com/codahale/hdrhistogram).
In order to maintain high write throughput, data access is not serialized through
a single process. Instead, write-optimized (via the `write_concurrency: true` option)
ETS tables are used. Functions are fully executed by the calling process. A write consists of a single `:ets.update_counter` call. A read consists of a single `:ets.lookup`.
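The write and read paths described above can be sketched with plain ETS calls. This is an illustration of the technique, not Histogrex's code: the `EtsCounterSketch` module, the table name and the row layout (one tuple per histogram, key first, counters after) are assumptions made for the example.

```elixir
defmodule EtsCounterSketch do
  @table :ets_counter_sketch

  # Creates a public, write-optimized table holding one row of `slots` zeroed counters.
  def setup(name, slots) do
    :ets.new(@table, [:set, :public, :named_table, write_concurrency: true])
    :ets.insert(@table, List.to_tuple([name | List.duplicate(0, slots)]))
  end

  # A write is one atomic update_counter call made by the calling process.
  # Tuple positions are 1-based and position 1 holds the key,
  # so the 0-based counter `index` lives at position `index + 2`.
  def record(name, index), do: :ets.update_counter(@table, name, {index + 2, 1})

  # A read is one lookup; the caller then works on its own copy of the counters.
  def read(name) do
    [row] = :ets.lookup(@table, name)
    row |> Tuple.to_list() |> tl()
  end
end
```

Because `:ets.update_counter` is atomic, several processes can call `record/2` on the same row at once without losing increments and without going through a single owning process.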
It is possible to use multiple registries, though I suspect this will have no impact
unless thousands of metrics with high load are used. Benchmark it.
You can get the memory requirements of a registry by calling:
There's a bit of additional overhead for each histogram, but that will give you
the bulk of it.