# Visualization

`Statwise.Visualization` builds statistical plot descriptions from tidy rows,
one-dimensional `Nx.Tensor`s, map columns, and dataframe-like column data. The
direct constructors return `%Statwise.Visualization.Plot{}` structs. The
composition API returns `%Statwise.Visualization.Figure{}` structs.

Use `Statwise.Visualization.to_vega_lite/2` to export a plot as a plain
Vega-Lite-compatible map.

For a runnable Livebook with all chart types and styling options, see
[`docs/visualization_gallery.livemd`](visualization_gallery.livemd).

For the longer seaborn-inspired improvement plan, see
[`docs/visualization_roadmap.md`](visualization_roadmap.md).

In Livebook, install the optional display packages and call
`Statwise.Visualization.show/2`:

```elixir
Mix.install([
  # Path to a local Statwise checkout; adjust it for your machine.
  {:statwise, path: "/Users/catethos/workspace/statistic"},
  {:jason, "~> 1.4"},
  {:vega_lite, "~> 0.1"},
  {:kino_vega_lite, "~> 0.1"}
])
```

```elixir
Statwise.Visualization.box_plot(%{
  control: [1.2, 1.8, 2.1],
  treatment: [2.4, 2.8, 3.0]
})
|> Statwise.Visualization.show()
```

Style can be applied after the plot content is built:

```elixir
Statwise.Visualization.box_plot(%{
  control: [1.2, 1.8, 2.1],
  treatment: [2.4, 2.8, 3.0]
})
|> Statwise.Visualization.with_style(
  width: 480,
  height: 280,
  color: "#2563eb",
  opacity: 0.8,
  background: "#ffffff",
  config: [
    axis: [
      labelColor: "#374151",
      titleColor: "#111827"
    ],
    view: [
      stroke: nil
    ]
  ]
)
|> Statwise.Visualization.show()
```

## Histogram

```elixir
plot = Statwise.Visualization.histogram([1, 2, 2, 3, 4], bins: 10)

Statwise.Visualization.to_vega_lite(plot)
```

With dataframe-style column data:

```elixir
data = %{height: [62, 64, 66, nil, 70]}

Statwise.Visualization.histogram(data, x: :height)
```

Missing values are dropped by default. Pass `missing: :error` to reject them.
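
The two missing-value policies can be pictured with a minimal sketch in plain Elixir. `MissingSketch` and its `:drop` atom are hypothetical names used only for illustration; the document only confirms that dropping is the default and that `missing: :error` rejects missing values, so this is not Statwise's implementation.

```elixir
defmodule MissingSketch do
  # Hypothetical helper, not part of Statwise.
  # `:drop` is an assumed name for the default policy.
  def clean(values, missing \\ :drop)

  # Default policy: silently remove nil entries.
  def clean(values, :drop), do: Enum.reject(values, &is_nil/1)

  # Strict policy: refuse any input that contains nil.
  def clean(values, :error) do
    if Enum.any?(values, &is_nil/1) do
      raise ArgumentError, "missing values are not allowed"
    else
      values
    end
  end
end

heights = [62, 64, 66, nil, 70]

MissingSketch.clean(heights)
# => [62, 64, 66, 70]
```

Calling `MissingSketch.clean(heights, :error)` on the same list raises an `ArgumentError` instead of returning a shortened list.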

## Box Plot

```elixir
Statwise.Visualization.box_plot([1, 2, 3, 4])
```

Grouped data can be supplied as a map:

```elixir
Statwise.Visualization.box_plot(%{
  control: [1.2, 1.8, 2.1],
  treatment: [2.4, 2.8, 3.0]
})
```

Or selected from column data:

```elixir
data = %{control: [1.2, 1.8, 2.1], treatment: [2.4, 2.8, 3.0]}
Statwise.Visualization.box_plot(data, columns: [:control, :treatment])
```

Faceted box plots use tidy row data. Prefer semantic mappings:

```elixir
rows = [
  %{site: :north, treatment: :control, score: 1.2},
  %{site: :north, treatment: :control, score: 1.8},
  %{site: :north, treatment: :treated, score: 2.4},
  %{site: :north, treatment: :treated, score: 2.8},
  %{site: :south, treatment: :control, score: 1.0},
  %{site: :south, treatment: :control, score: 1.4},
  %{site: :south, treatment: :treated, score: 3.0},
  %{site: :south, treatment: :treated, score: 3.4}
]

Statwise.Visualization.box_plot(rows,
  x: :treatment,
  y: :score,
  color: :treatment,
  facet: :site,
  columns: 2,
  title: "Scores by Treatment and Site"
)
|> Statwise.Visualization.with_style(width: 180, height: 220, color: "#2563eb")
|> Statwise.Visualization.show()
```

## Relational Plots

```elixir
rows = [
  %{time: 1, treatment: :control, score: 1.2},
  %{time: 2, treatment: :control, score: 1.8},
  %{time: 1, treatment: :treated, score: 2.4},
  %{time: 2, treatment: :treated, score: 2.9}
]

Statwise.Visualization.scatter(rows, x: :time, y: :score, color: :treatment)
Statwise.Visualization.line(rows, x: :time, y: :score, color: :treatment)
```

## Categorical And Summary Plots

```elixir
Statwise.Visualization.strip_plot(rows, x: :treatment, y: :score)
Statwise.Visualization.count_plot(rows, x: :treatment)
Statwise.Visualization.bar_plot(rows, x: :treatment, y: :score, stat: :mean)
Statwise.Visualization.point_plot(rows, x: :treatment, y: :score, interval: :confidence)
```

Supported summary statistics are `:count`, `:mean`, `:median`, and `:sum`.
Supported intervals are `nil`, `:standard_error`, `:confidence`, and
`:percentile`. Standard-error and confidence intervals apply to `stat: :mean`;
percentile intervals can be used with any summary statistic.
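
As a rough illustration of what these summaries compute per group, here is a sketch in plain Elixir. `SummarySketch` is a hypothetical module, and the standard error assumes the usual sample (n - 1) variance; Statwise's own formulas may differ in detail.

```elixir
defmodule SummarySketch do
  # Hypothetical helper, not part of Statwise: reduces one group's values.
  def summarize(values, :count), do: length(values)
  def summarize(values, :sum), do: Enum.sum(values)
  def summarize(values, :mean), do: Enum.sum(values) / length(values)

  def summarize(values, :median) do
    sorted = Enum.sort(values)
    n = length(sorted)
    mid = div(n, 2)

    if rem(n, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end

  # Standard error of the mean; needs at least two values.
  # Assumes the sample variance with an n - 1 denominator.
  def standard_error(values) do
    n = length(values)
    mean = summarize(values, :mean)
    squared = Enum.map(values, fn v -> (v - mean) * (v - mean) end)
    :math.sqrt(Enum.sum(squared) / (n - 1) / n)
  end
end

rows = [
  %{treatment: :control, score: 1},
  %{treatment: :control, score: 2},
  %{treatment: :control, score: 3},
  %{treatment: :control, score: 4},
  %{treatment: :treated, score: 2},
  %{treatment: :treated, score: 4}
]

# Group the y values by the x field, then reduce each group.
means =
  rows
  |> Enum.group_by(& &1.treatment, & &1.score)
  |> Map.new(fn {group, scores} -> {group, SummarySketch.summarize(scores, :mean)} end)
# => %{control: 2.5, treated: 3.0}
```

The grouping step mirrors the `x:` and `y:` mappings: one summary value per distinct `x` category.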

## Heatmaps

```elixir
cells = [
  %{metric: :a, group: :control, value: 0.42},
  %{metric: :a, group: :treated, value: 0.61}
]

Statwise.Visualization.heatmap(cells, x: :group, y: :metric, color: :value)
```

## Faceting

Wrapped facets use `facet: :field`. Row and column facets use a facet spec:

```elixir
# Assumes each row has :time, :score, :site, and :treatment fields.
Statwise.Visualization.scatter(rows,
  x: :time,
  y: :score,
  facet: [row: :site, column: :treatment],
  share_y: false
)
```

`share_x: false` or `share_y: false` exports Vega-Lite independent scale
resolution for the corresponding axis.
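
In Vega-Lite, independent scales are expressed through the `resolve` property of the spec. The sketch below shows the shape of that fragment as a string-keyed map; `ResolveSketch` is a hypothetical helper, and the exact placement of the fragment in Statwise's exported spec is an assumption.

```elixir
defmodule ResolveSketch do
  # Hypothetical helper, not part of Statwise.
  # Builds the Vega-Lite "resolve" fragment for any axis that is not shared.
  def resolve(opts) do
    scale =
      for {opt, channel} <- [share_x: "x", share_y: "y"],
          Keyword.get(opts, opt, true) == false,
          into: %{},
          do: {channel, "independent"}

    if scale == %{}, do: %{}, else: %{"resolve" => %{"scale" => scale}}
  end
end

ResolveSketch.resolve(share_y: false)
# => %{"resolve" => %{"scale" => %{"y" => "independent"}}}
```

With both options left at their shared default, the helper returns an empty map and no `resolve` entry is needed.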

## ECDF

```elixir
Statwise.Visualization.ecdf([3, 1, 2, 2])
```

The ECDF builder sorts the sample and computes cumulative probabilities before
export, making the resulting plot independent of Vega-Lite transforms.
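
The sort-then-accumulate step can be sketched in a few lines of plain Elixir. `EcdfSketch` is a hypothetical module; whether Statwise emits one point per observation or one per distinct value is an assumption here (this sketch collapses ties into a single point).

```elixir
defmodule EcdfSketch do
  # Hypothetical helper, not part of Statwise.
  # Returns {value, cumulative_probability} pairs, one per distinct value,
  # where the probability is the share of observations <= value.
  def points(sample) do
    n = length(sample)

    sample
    |> Enum.sort()
    |> Enum.with_index(1)
    # For tied values the last (largest) index wins.
    |> Enum.reduce(%{}, fn {x, i}, acc -> Map.put(acc, x, i / n) end)
    |> Enum.sort()
  end
end

EcdfSketch.points([3, 1, 2, 2])
# => [{1, 0.25}, {2, 0.75}, {3, 1.0}]
```

Because the probabilities are plain numbers in the data, a renderer only has to draw a step line through them.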

## QQ Plot

```elixir
Statwise.Visualization.qq_plot([10.0, 11.5, 12.2, 13.1])
```

QQ plots currently support `distribution: :normal`. The builder uses standard
normal quantiles on the x-axis and includes a sample-scaled reference line by
default. This keeps the plot useful for testing normality even when the sample
is not already standard normal.

```elixir
values = [10.0, 11.5, 12.2, 13.1]
Statwise.Visualization.qq_plot(values, reference_line: false)
```

Use `reference_scale: :standard` to draw the unscaled standard-normal `y = x`
line instead:

```elixir
Statwise.Visualization.qq_plot(values, reference_scale: :standard)
```
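
To make the difference between the two reference lines concrete, here is a sketch of how QQ points and a sample-scaled line could be computed. Everything in `QqSketch` is hypothetical: the `(i - 0.5) / n` plotting positions, the bisection-based quantile function, and the `mean + sd * x` line are common choices assumed for illustration, not Statwise's confirmed method.

```elixir
defmodule QqSketch do
  # Hypothetical helper, not part of Statwise.

  # Standard normal CDF, built on Erlang's :math.erf/1.
  def normal_cdf(z), do: 0.5 * (1.0 + :math.erf(z / :math.sqrt(2.0)))

  # Standard normal quantile, found by bisecting the CDF on [-10, 10].
  def normal_quantile(p) when p > 0.0 and p < 1.0, do: bisect(p, -10.0, 10.0, 80)

  defp bisect(_p, lo, hi, 0), do: (lo + hi) / 2

  defp bisect(p, lo, hi, steps) do
    mid = (lo + hi) / 2

    if normal_cdf(mid) < p do
      bisect(p, mid, hi, steps - 1)
    else
      bisect(p, lo, mid, steps - 1)
    end
  end

  # {theoretical_quantile, sample_value} pairs.
  # Assumes plotting positions (i - 0.5) / n.
  def points(sample) do
    n = length(sample)

    sample
    |> Enum.sort()
    |> Enum.with_index(1)
    |> Enum.map(fn {y, i} -> {normal_quantile((i - 0.5) / n), y} end)
  end

  # Sample-scaled line y = mean + sd * x (assumed form; needs n >= 2).
  def reference_line(sample) do
    n = length(sample)
    mean = Enum.sum(sample) / n
    squared = Enum.map(sample, fn v -> (v - mean) * (v - mean) end)
    sd = :math.sqrt(Enum.sum(squared) / (n - 1))
    fn x -> mean + sd * x end
  end
end

sample = [10.0, 11.5, 12.2, 13.1]
qq_points = QqSketch.points(sample)
line = QqSketch.reference_line(sample)
```

Under these assumptions, points from a normal sample with any mean and spread fall near the scaled line, whereas they would only fall near `y = x` if the sample were already standard normal.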

QQ plots use separate point and reference-line layers. Style point size and the
reference line separately:

```elixir
Statwise.Visualization.qq_plot([10.0, 11.5, 12.2, 13.1, 13.8, 15.0])
|> Statwise.Visualization.with_style(
  point: [
    color: "#dc2626",
    size: 80
  ],
  reference: [
    color: "#6b7280",
    stroke_width: 1
  ]
)
|> Statwise.Visualization.show()
```

## Rank Plot

Rank plots show the average ranks assigned across two samples, which can be
useful when inspecting nonparametric comparisons.

```elixir
Statwise.Visualization.rank_plot([10, 30], [20], x_label: :control, y_label: :treatment)
```
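
For intuition, average ranks can be computed by pooling both samples, ranking the pooled values, and giving tied values the mean of the positions they occupy. `RankSketch` below is a hypothetical illustration of that idea, not Statwise's ranking code.

```elixir
defmodule RankSketch do
  # Hypothetical helper, not part of Statwise.
  # Returns {ranks_for_xs, ranks_for_ys} using pooled average ranks.
  def average_ranks(xs, ys) do
    rank_of =
      (xs ++ ys)
      |> Enum.sort()
      |> Enum.with_index(1)
      # Collect every 1-based position each value occupies in the pooled order.
      |> Enum.group_by(fn {value, _pos} -> value end, fn {_value, pos} -> pos end)
      # Tied values share the mean of their positions.
      |> Map.new(fn {value, positions} ->
        {value, Enum.sum(positions) / length(positions)}
      end)

    {Enum.map(xs, &Map.fetch!(rank_of, &1)), Enum.map(ys, &Map.fetch!(rank_of, &1))}
  end
end

RankSketch.average_ranks([10, 30], [20])
# => {[1.0, 3.0], [2.0]}

RankSketch.average_ranks([1, 2], [2, 3])
# => {[1.0, 2.5], [2.5, 4.0]}
```

The second call shows a tie: both `2`s occupy positions 2 and 3, so each receives rank 2.5.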

## Statistical Test Annotations

The highest-level workflow is to build a categorical plot from tidy rows, then
ask Statwise to compute the statistical comparison from the same plotted data.
This keeps the visualized samples and the tested samples aligned:

```elixir
control = [1.2, 1.9, 2.4]
treatment = [2.2, 3.0, 3.4]

rows =
  Enum.map(control, &%{group: :control, score: &1}) ++
    Enum.map(treatment, &%{group: :treated, score: &1})

rows
|> Statwise.Visualization.box_plot(x: :group, y: :score)
|> Statwise.Visualization.with_test(:t_test, groups: {:control, :treated})
|> Statwise.Visualization.show()
```

For categorical plots, Statwise exports comparison brackets and labels as
additional Vega-Lite layers. The default label includes the p-value and a
preferred effect size when the test result has one. Annotation layers also carry
tooltip fields for the test name, p-value, statistic, and rendered label.

You can use a nonparametric test the same way:

```elixir
rows
|> Statwise.Visualization.box_plot(x: :group, y: :score)
|> Statwise.Visualization.with_test(:mann_whitney, groups: {:control, :treated})
|> Statwise.Visualization.show()
```

For faceted plots, computed tests are run independently inside each facet:

```elixir
# Assumes each row also carries a :site field.
rows
|> Statwise.Visualization.box_plot(x: :group, y: :score, facet: :site)
|> Statwise.Visualization.with_test(:t_test, groups: {:control, :treated})
|> Statwise.Visualization.show()
```

When each panel has exactly two x-groups, `groups:` can be omitted. If a
requested group is missing or empty in a computed test annotation, Statwise
raises an explicit error instead of silently dropping the comparison. Multiple
test annotations on the same plot are stacked above the plotted values.
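
The group-selection rules above can be sketched as follows. `GroupSketch` is a hypothetical module written for illustration; the error messages and the way the two groups are inferred are assumptions, not Statwise's actual behavior.

```elixir
defmodule GroupSketch do
  # Hypothetical helper, not part of Statwise.
  # Pulls the two samples to compare out of tidy rows.
  def samples(rows, x, y, groups \\ nil) do
    grouped = Enum.group_by(rows, &Map.fetch!(&1, x), &Map.fetch!(&1, y))

    {a, b} =
      case {groups, Map.keys(grouped)} do
        # Explicit pair supplied by the caller.
        {{a, b}, _keys} -> {a, b}
        # No pair supplied, but exactly two x-groups exist.
        {nil, [a, b]} -> {a, b}
        {nil, keys} -> raise ArgumentError, "expected exactly two groups, got: #{inspect(keys)}"
      end

    {fetch!(grouped, a), fetch!(grouped, b)}
  end

  # Fail loudly rather than silently dropping the comparison.
  defp fetch!(grouped, name) do
    case Map.get(grouped, name, []) do
      [] -> raise ArgumentError, "group #{inspect(name)} is missing or empty"
      values -> values
    end
  end
end

rows = [
  %{group: :control, score: 1.2},
  %{group: :control, score: 1.9},
  %{group: :control, score: 2.4},
  %{group: :treated, score: 2.2},
  %{group: :treated, score: 3.0},
  %{group: :treated, score: 3.4}
]

GroupSketch.samples(rows, :group, :score, {:control, :treated})
# => {[1.2, 1.9, 2.4], [2.2, 3.0, 3.4]}
```

Because the samples come from the same rows that feed the plot, the drawn boxes and the tested values cannot drift apart.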

Precomputed results can also be attached when the test was run elsewhere:

```elixir
result = Statwise.TTest.independent(control, treatment, effect_size: true)

rows
|> Statwise.Visualization.box_plot(x: :group, y: :score)
|> Statwise.Visualization.with_test(result, groups: {:control, :treated})
|> Statwise.Visualization.show()
```

Use `show:` to control label content:

```elixir
Statwise.Visualization.with_test(plot, result, show: [:p_value, :statistic])
```

The existing result-specific plots are still available for direct inspection.

T-test results can be visualized as confidence-interval plots:

```elixir
result = Statwise.TTest.independent([1.2, 1.9, 2.4], [2.2, 3.0, 3.4])

Statwise.Visualization.t_test(result)
```

Any result with a confidence interval can use the generic helper:

```elixir
Statwise.Visualization.confidence_interval(result)
```

Mann-Whitney U results can be visualized as `U1` and `U2` statistic bars:

```elixir
result = Statwise.MannWhitney.test([1, 2, 3], [2, 4, 6])

Statwise.Visualization.mann_whitney(result)
```

## Vega-Lite Output

The Vega-Lite exporter returns ordinary maps with string keys:

```elixir
plot = Statwise.Visualization.histogram([1, 2, 2, 3], title: "Values")

spec = Statwise.Visualization.to_vega_lite(plot, width: 400, height: 240)
```

For applications that want a specific format directly:

```elixir
Statwise.Visualization.to_json(plot)
Statwise.Visualization.to_vega_lite_plot(plot)
Statwise.Visualization.to_kino(plot)
```

`to_vega_lite/2` remains dependency-free and returns a map. The other helpers
require `Jason`, `VegaLite`, or `Kino.VegaLite` to be available at runtime.

Style can also be passed as a one-off export option:

```elixir
Statwise.Visualization.to_vega_lite(plot,
  style: [
    width: 420,
    height: 240,
    color: "#16a34a",
    stroke_width: 3
  ]
)
```

## Themes, Palettes, And Escape Hatches

Themes are applied first, attached style overrides the theme, and export-time style overrides both:

```elixir
plot
|> Statwise.Visualization.with_theme(:minimal)
|> Statwise.Visualization.with_palette(:statwise)
|> Statwise.Visualization.with_style(
  encoding: [
    x: [
      axis: [
        grid: true
      ]
    ]
  ]
)
```

Available named themes are `:default`, `:minimal`, `:paper`, `:dark`, and
`:livebook`. Palettes accept `:category10`, `:statwise`, `:muted`, or a list of
hex colors.
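
The precedence order can be pictured as successive keyword merges, where later sources override earlier ones. The option values below are made up for illustration, and the use of `Keyword.merge/2` is an assumption about the general idea rather than a description of Statwise internals.

```elixir
# Illustrative values only; not Statwise's real theme defaults.
theme = [background: "#ffffff", color: "#111827", width: 400]
attached = [color: "#2563eb", width: 480]
export = [width: 420]

# Later merges win: theme < attached style < export-time style.
resolved =
  theme
  |> Keyword.merge(attached)
  |> Keyword.merge(export)

Keyword.get(resolved, :background)
# => "#ffffff" (only the theme sets it)

Keyword.get(resolved, :color)
# => "#2563eb" (attached style overrides the theme)

Keyword.get(resolved, :width)
# => 420 (export-time style overrides both)
```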

## Composition API

For layered charts, start with shared data and mappings, add layers, then label,
facet, theme, or style the figure:

```elixir
rows
|> Statwise.Visualization.plot(x: :time, y: :score, color: :treatment)
|> Statwise.Visualization.add(:point)
|> Statwise.Visualization.add(:line)
|> Statwise.Visualization.facet(column: :site)
|> Statwise.Visualization.label(title: "Scores by Site")
|> Statwise.Visualization.to_vega_lite()
```