# Statwise

Statwise is an Elixir statistics library that aims for idiomatic Elixir APIs
with results checked against well-known Python references.

This first milestone includes:

- Descriptive statistics for lists and one-dimensional Nx tensors.
- Normal and Student's t distribution helpers.
- One-sample, paired, Welch, and pooled t-tests.
- Average-rank utilities.
- Asymptotic and exact Mann-Whitney U tests.
- Dataframe-style column wrappers for running tests from maps or Explorer
  dataframes.
- Visualization builders for histograms, ECDFs, QQ plots, box plots, scatter
  plots, line plots, summary bars and points with intervals, count plots,
  strip plots, and heatmaps with Vega-Lite-compatible output.
- Committed JSONL fixtures generated from pinned Python references.

## Examples

```elixir
Statwise.Descriptive.mean([1, 2, 3])
#=> 2.0

Statwise.TTest.independent([1.2, 1.9, 2.4], [2.2, 3.0, 3.4],
  variance: :welch
)
#=> %Statwise.TestResult{}

Statwise.MannWhitney.test([1, 3, 5], [2, 4],
  alternative: :two_sided,
  method: :asymptotic
)
#=> %Statwise.TestResult{}

Statwise.Visualization.histogram([1, 2, 2, 3], bins: 10)
|> Statwise.Visualization.to_vega_lite()
#=> %{"$schema" => "https://vega.github.io/schema/vega-lite/v5.json", ...}

# In Livebook with :jason, :vega_lite, and :kino_vega_lite installed:
Statwise.Visualization.histogram([1, 2, 2, 3], bins: 10)
|> Statwise.Visualization.with_style(width: 420, color: "#2563eb")
|> Statwise.Visualization.show()
```

```elixir
rows = [
  %{site: :north, treatment: :control, time: 1, score: 1.2},
  %{site: :north, treatment: :control, time: 2, score: 1.8},
  %{site: :south, treatment: :treated, time: 1, score: 2.4},
  %{site: :south, treatment: :treated, time: 2, score: 2.9}
]

rows
|> Statwise.Visualization.plot(x: :time, y: :score, color: :treatment)
|> Statwise.Visualization.add(:point)
|> Statwise.Visualization.add(:line)
|> Statwise.Visualization.facet(column: :site)
|> Statwise.Visualization.show()

rows
|> Statwise.Visualization.box_plot(x: :treatment, y: :score)
|> Statwise.Visualization.with_test(:t_test, groups: {:control, :treated})
|> Statwise.Visualization.show()
```

## T-Tests

```elixir
Statwise.TTest.one_sample([2.5, 3.1, 3.6, 4.0], mean: 3.0)

Statwise.TTest.paired(
  [10.2, 11.5, 12.1, 13.8],
  [9.9, 10.8, 11.2, 12.6],
  alternative: :greater
)

Statwise.TTest.independent(
  [1.2, 1.9, 2.4, 2.9],
  [2.2, 3.0, 3.4, 4.1, 4.8],
  variance: :welch,
  alternative: :less,
  null_difference: 0.0,
  confidence_level: 0.95,
  effect_size: true
)
```

The test APIs can also pull samples from dataframe-like column data. Statwise
does not depend on Explorer, but if your application has Explorer loaded,
`Explorer.DataFrame` columns are accepted. Maps of columns work too:

```elixir
df = %{
  before: [10.2, 11.5, 12.1, 13.8],
  after: [9.9, 10.8, 11.2, 12.6],
  control: [1.2, 1.9, 2.4, 2.9],
  treatment: [2.2, 3.0, 3.4, 4.1]
}

Statwise.TTest.one_sample(df, columns: [:before, :after], mean: 10.0)
#=> %{before: %Statwise.TestResult{}, after: %Statwise.TestResult{}}

Statwise.TTest.paired(df, columns: [:before, :after])
#=> %Statwise.TestResult{}

Statwise.TTest.independent(df, columns: [:control, :treatment], variance: :welch)
#=> %Statwise.TestResult{}
```

Column extraction defaults to ordinary lists. Pass `input: :tensor` to extract
map or Explorer columns as one-dimensional `f64` tensors. With Explorer loaded,
Statwise uses `Explorer.Series.to_tensor/2` when it is available:

```elixir
Statwise.TTest.one_sample(df,
  columns: [:before, :after],
  mean: 10.0,
  input: :tensor,
  backend: :tensor
)
```

Use `pairs:` to run several two-sample tests in one call:

```elixir
Statwise.TTest.paired(df,
  pairs: [
    before: :after,
    control: :treatment
  ]
)
#=> %{{:before, :after} => %Statwise.TestResult{}, ...}
```
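
The tuple-keyed result shape above can be pictured with a small fan-out helper. This is only a sketch in plain Elixir, not Statwise's implementation: `PairsSketch` and its `run/3` function are hypothetical names, and the test itself is passed in as an ordinary two-argument function.

```elixir
defmodule PairsSketch do
  # Hypothetical helper for illustration; not part of Statwise.
  # `columns` is a map of column name => list of values.
  # `pairs` is a keyword list such as [before: :after].
  # `test_fun` is any two-argument function applied to each pair of columns.
  def run(columns, pairs, test_fun) do
    Map.new(pairs, fn {left, right} ->
      # Map.fetch!/2 raises a KeyError if a named column is missing.
      result = test_fun.(Map.fetch!(columns, left), Map.fetch!(columns, right))
      {{left, right}, result}
    end)
  end
end
```

Any two-sample function can be plugged in, for example `PairsSketch.run(df, [before: :after], fn a, b -> Enum.sum(a) - Enum.sum(b) end)`.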

Supported alternatives are `:two_sided`, `:greater`, and `:less`. Independent
t-tests support `variance: :welch` and `variance: :pooled`.
T-test results include confidence intervals by default. Pass `effect_size: true`
to include Cohen's d and Hedges' g.
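
To make the difference between the two variance options concrete, here is a minimal sketch of the textbook formulas in plain Elixir. It is not Statwise's code: `TTestSketch` and its functions are hypothetical, p-values and confidence intervals are left out, and Hedges' g uses the common approximate correction factor, which may differ from the one Statwise applies.

```elixir
defmodule TTestSketch do
  # Hypothetical helper module for illustration; not part of Statwise.

  def mean(xs), do: Enum.sum(xs) / length(xs)

  # Unbiased sample variance (divides by n - 1).
  def variance(xs) do
    m = mean(xs)
    Enum.sum(Enum.map(xs, fn x -> (x - m) * (x - m) end)) / (length(xs) - 1)
  end

  # Welch: each sample keeps its own variance; degrees of freedom come from
  # the Welch-Satterthwaite approximation.
  def welch(a, b) do
    {na, nb} = {length(a), length(b)}
    {va, vb} = {variance(a) / na, variance(b) / nb}
    t = (mean(a) - mean(b)) / :math.sqrt(va + vb)
    df = (va + vb) * (va + vb) / (va * va / (na - 1) + vb * vb / (nb - 1))
    %{statistic: t, df: df}
  end

  # Pooled: both samples share one variance estimate with na + nb - 2 df.
  def pooled(a, b) do
    {na, nb} = {length(a), length(b)}
    sp2 = pooled_variance(a, b)
    t = (mean(a) - mean(b)) / :math.sqrt(sp2 * (1 / na + 1 / nb))
    %{statistic: t, df: na + nb - 2}
  end

  # Cohen's d standardizes the mean difference by the pooled SD; Hedges' g
  # shrinks d with the approximate correction 1 - 3 / (4 * df - 1).
  def effect_sizes(a, b) do
    df = length(a) + length(b) - 2
    d = (mean(a) - mean(b)) / :math.sqrt(pooled_variance(a, b))
    %{cohens_d: d, hedges_g: d * (1 - 3 / (4 * df - 1))}
  end

  defp pooled_variance(a, b) do
    {na, nb} = {length(a), length(b)}
    ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
  end
end
```

For `[1, 2, 3]` against `[2, 4, 6]` the two statistics coincide because the sample sizes are equal, but the degrees of freedom differ: 50/17 (about 2.94) for Welch versus 4 for pooled.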

## Nonparametric Tests

```elixir
Statwise.Nonparametric.Rank.ranks([10, 20, 20, 30])
#=> [1.0, 2.5, 2.5, 4.0]

Statwise.MannWhitney.test(
  [1.0, 3.0, 5.0],
  [2.0, 4.0],
  alternative: :two_sided,
  method: :auto,
  continuity: true
)
```

Dataframe columns are supported with the same `columns:` and `pairs:` options:

```elixir
Statwise.MannWhitney.test(df, columns: [:control, :treatment], method: :auto)

Statwise.MannWhitney.test(df,
  pairs: [
    control: :treatment,
    before: :after
  ],
  method: :auto
)
```

Ranking currently supports SciPy-compatible average ranks for ties. Mann-Whitney
U supports `method: :asymptotic`, `method: :exact`, and `method: :auto`.
Like SciPy, explicit `method: :exact` does not apply a tie correction. `:auto`
uses exact p-values when there are no ties and the smaller sample has at most 8
observations; otherwise it uses the asymptotic normal approximation.
Mann-Whitney results include common-language and rank-biserial effect sizes.
`effect_size.cliffs_delta` is also provided as an alias of rank-biserial.
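
The ranking and `:auto` rules described above can be sketched in a few lines of plain Elixir. This is an illustration under stated assumptions rather than the library's code: `RankSketch` and its function names are hypothetical, the ranking is a simple quadratic-time version, and the effect sizes follow one common sign convention (computed from the first sample's U), which may not match Statwise's.

```elixir
defmodule RankSketch do
  # Hypothetical helper module for illustration; not part of Statwise.

  # Average ranks: tied values occupy positions less + 1 .. less + equal,
  # and each of them receives the mean of those positions.
  def average_ranks(values) do
    Enum.map(values, fn v ->
      less = Enum.count(values, &(&1 < v))
      equal = Enum.count(values, &(&1 == v))
      less + (equal + 1) / 2
    end)
  end

  # U for the first sample from its rank sum in the pooled data.
  def mann_whitney_u(a, b) do
    {n1, n2} = {length(a), length(b)}
    r1 = (a ++ b) |> average_ranks() |> Enum.take(n1) |> Enum.sum()
    u1 = r1 - n1 * (n1 + 1) / 2

    %{
      u1: u1,
      u2: n1 * n2 - u1,
      common_language: u1 / (n1 * n2),
      rank_biserial: 2 * u1 / (n1 * n2) - 1
    }
  end

  # The :auto rule as stated in the text: exact only when there are no ties
  # and the smaller sample has at most 8 observations.
  def auto_method(a, b) do
    ties? =
      (a ++ b)
      |> Enum.sort()
      |> Enum.chunk_every(2, 1, :discard)
      |> Enum.any?(fn [x, y] -> x == y end)

    if not ties? and min(length(a), length(b)) <= 8, do: :exact, else: :asymptotic
  end
end
```

`RankSketch.average_ranks([10, 20, 20, 30])` returns `[1.0, 2.5, 2.5, 4.0]`, matching the ranking example earlier in this section.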

Input handling in this first milestone is intentionally strict: raw samples must be finite numeric
lists or one-dimensional Nx tensors. Test APIs can also extract raw samples
from dataframe-style columns with `columns:` or `pairs:`. Tensor-native Nx
reductions are opt-in with `backend: :tensor`; the default path still favors
the fastest scalar implementation for the current Nx binary backend. NaN
behavior is controlled with
`nan_policy: :raise | :propagate | :omit`; see
[`docs/compatibility.md`](docs/compatibility.md).
Degenerate t-tests with zero standard error return explicit `:nan`,
`:infinity`, or `:neg_infinity` statistics according to the compatibility
contract.
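
One way to picture the zero-standard-error case is the sketch below, which returns the atoms named above instead of dividing by zero. The sign-based mapping mirrors IEEE 754 division by zero and is an assumption made for illustration; the authoritative rules are in `docs/compatibility.md`. `DegenerateSketch` is a hypothetical module, not part of Statwise.

```elixir
defmodule DegenerateSketch do
  # Hypothetical helper for illustration; not part of Statwise.
  # Assumed mapping when the standard error is zero:
  #   positive difference -> :infinity
  #   negative difference -> :neg_infinity
  #   zero difference     -> :nan
  def statistic(diff, se) when se == 0 do
    cond do
      diff > 0 -> :infinity
      diff < 0 -> :neg_infinity
      true -> :nan
    end
  end

  # Ordinary case: the statistic is the difference over its standard error.
  def statistic(diff, se), do: diff / se
end
```

Returning atoms keeps the degenerate outcome explicit, since Elixir's float arithmetic raises `ArithmeticError` on division by zero rather than producing an infinity or NaN value.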

## Python Compatibility

The Elixir tests use committed fixtures from:

- NumPy 2.3.0 for descriptive statistics.
- SciPy 1.16.0 for distributions and Mann-Whitney U.
- Statsmodels 0.14.6 for independent t-tests.

Python is not required for the normal test suite. To intentionally refresh
fixtures:

```bash
cd reference/python
uv sync
uv run python generate_fixtures.py
cd ../..
mix test
```

Review fixture diffs before committing refreshed values.

For randomized pre-release checks against Python references:

```bash
cd reference/python
uv sync
uv run python differential_check.py --cases 250 --seed 202607
```

See [`docs/release_checklist.md`](docs/release_checklist.md) for the release
readiness checklist.

For runnable tutorials, see
[`docs/statistical_tests_gallery.livemd`](docs/statistical_tests_gallery.livemd)
and [`docs/visualization_gallery.livemd`](docs/visualization_gallery.livemd).

## CI

The CI checks can be reproduced locally with:

```bash
mix format --check-formatted
mix compile --warnings-as-errors
mix test
```