# Visualization Roadmap
This roadmap sketches how `Statwise.Visualization` can grow toward a mature,
seaborn-inspired statistical visualization API while staying idiomatic Elixir
and keeping renderer dependencies optional.
## North Star
The long-term goal is a small statistical plotting grammar:
```elixir
rows
|> Statwise.Visualization.plot(x: :treatment, y: :score, color: :site)
|> Statwise.Visualization.add(:box_plot)
|> Statwise.Visualization.facet(column: :site)
|> Statwise.Visualization.with_theme(:minimal)
|> Statwise.Visualization.show()
```
Statwise should continue to support simple direct constructors:
```elixir
Statwise.Visualization.box_plot(rows, x: :treatment, y: :score, facet: :site)
```
The direct constructors should remain easy for common use, while the grammar
API can support composition and advanced charts.
## Design Principles
- Keep chart content separate from presentation.
- Keep runtime visualization dependencies optional.
- Prefer tidy row data and semantic mappings.
- Export plain Vega-Lite-compatible maps as the stable renderer contract.
- Test generated specs and statistical transformations, not screenshots.
- Preserve existing APIs through aliases or a deprecation period.
## Phase 1: Normalize Semantic Mappings
Make every chart accept consistent field mappings.
Current shape:
```elixir
Statwise.Visualization.box_plot(rows,
  value: :score,
  group: :treatment,
  facet: :site
)
```
Target shape:
```elixir
Statwise.Visualization.box_plot(rows,
  x: :treatment,
  y: :score,
  color: :treatment,
  facet: :site
)
```
Semantic channels to support:
- `:x`
- `:y`
- `:color`
- `:facet`
- `:row`
- `:column`
- `:size`
- `:shape`
- `:detail`
- `:tooltip`
Compatibility aliases:
- `value: :score` maps to `y: :score`
- `group: :treatment` maps to `x: :treatment`
- `facet: :site` maps to a column/wrapped facet
Deliverables:
- Add `Statwise.Visualization.Mapping`
- Normalize aliases into semantic channels
- Share row extraction across plot types
- Support atom and string map keys
- Preserve old API behavior
- Add tests for old and new option names
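
The alias rules above can be sketched as a small normalization pass. This is only an illustration: `MappingSketch`, `normalize/1`, and `fetch_field/2` are hypothetical names, not the actual `Statwise.Visualization.Mapping` API, and the rule that an explicit channel beats a legacy alias is an assumption.

```elixir
defmodule MappingSketch do
  @moduledoc "Hypothetical sketch of alias normalization; not the real Statwise module."

  # Legacy option names and the semantic channel each one maps to.
  @aliases %{value: :y, group: :x}

  @channels [:x, :y, :color, :facet, :row, :column, :size, :shape, :detail, :tooltip]

  @doc "Splits options into a channel map and the remaining non-channel options."
  def normalize(opts) when is_list(opts) do
    {mappings, rest} =
      Enum.reduce(opts, {%{}, []}, fn {key, field}, {mappings, rest} ->
        channel = Map.get(@aliases, key, key)

        cond do
          channel not in @channels ->
            {mappings, [{key, field} | rest]}

          # Assumption: an explicit channel always wins over a legacy alias.
          key == channel ->
            {Map.put(mappings, channel, field), rest}

          true ->
            {Map.put_new(mappings, channel, field), rest}
        end
      end)

    {mappings, Enum.reverse(rest)}
  end

  @doc "Reads a field from a row that may use atom or string keys."
  def fetch_field(row, field) when is_map(row) and is_atom(field) do
    case row do
      %{^field => value} -> value
      _ -> Map.get(row, Atom.to_string(field))
    end
  end
end

# Old option names end up in the same channel map as the new ones.
{mappings, rest} =
  MappingSketch.normalize(value: :score, group: :treatment, facet: :site, bins: 20)
```

Returning the leftover options separately keeps chart-specific settings such as `bins:` out of the channel map, which is one way to share the normalization step across plot types.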
## Phase 2: Make Row Data First-Class
Seaborn works best with tidy data. Statwise should make tidy rows the primary
shape while still supporting lists and maps of columns.
Supported inputs:
```elixir
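# List of row maps (tidy rows)
[%{group: :a, value: 1.2}]
# Map of columns
%{group: [:a, :b], value: [1.2, 2.4]}
# Explorer.DataFrame (optional dependency)
Explorer.DataFrame.new(group: ["a", "b"], value: [1.2, 2.4])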
```
Potential internal representation:
```elixir
%Statwise.Visualization.Dataset{
  rows: [%{}],
  fields: %{...},
  # Pseudocode: source is one of these atoms
  source: :rows | :columns | :explorer
}
```
Conversion APIs:
```elixir
Statwise.Visualization.Dataset.from_rows(rows)
Statwise.Visualization.Dataset.from_columns(columns)
Statwise.Visualization.Dataset.from_explorer(df)
```
Explorer should remain optional and be detected with `Code.ensure_loaded?/1`.
Deliverables:
- Direct Explorer support using `Explorer.DataFrame.to_rows/2`
- Map-of-columns support
- Row validation
- Shared missing-value policy
- Dataset tests
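
A minimal sketch of the conversion functions, assuming a simplified struct. `DatasetSketch` is a hypothetical stand-in for `Statwise.Visualization.Dataset`; here `fields` is a plain list of keys rather than the map shown above, and the `{:ok, _} | {:error, _}` return of `from_explorer/1` is an assumption.

```elixir
defmodule DatasetSketch do
  @moduledoc "Hypothetical sketch of dataset normalization; not the real Statwise module."

  defstruct rows: [], fields: [], source: :rows

  @doc "Wraps a list of row maps."
  def from_rows(rows) when is_list(rows) do
    fields = rows |> Enum.flat_map(&Map.keys/1) |> Enum.uniq()
    %__MODULE__{rows: rows, fields: fields, source: :rows}
  end

  @doc "Converts a map of equal-length columns into row maps."
  def from_columns(columns) when is_map(columns) do
    fields = Map.keys(columns)
    lengths = columns |> Map.values() |> Enum.map(&length/1) |> Enum.uniq()

    if length(lengths) > 1 do
      raise ArgumentError, "all columns must have the same length"
    end

    rows =
      columns
      |> Map.values()
      |> Enum.zip()
      |> Enum.map(fn tuple -> fields |> Enum.zip(Tuple.to_list(tuple)) |> Map.new() end)

    %__MODULE__{rows: rows, fields: fields, source: :columns}
  end

  @doc "Converts a data frame only when Explorer is available at runtime."
  def from_explorer(df) do
    if Code.ensure_loaded?(Explorer.DataFrame) do
      # apply/3 avoids a compile-time reference to the optional dependency.
      rows = apply(Explorer.DataFrame, :to_rows, [df, [atom_keys: true]])
      {:ok, %{from_rows(rows) | source: :explorer}}
    else
      {:error, :explorer_not_available}
    end
  end
end

dataset = DatasetSketch.from_columns(%{group: [:a, :b], value: [1.2, 2.4]})
```

Raising on ragged columns is one possible row-validation policy; the shared missing-value policy is left out of this sketch.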
## Phase 3: Expand Core Plot Types
Seaborn organizes plots into relational, distribution, categorical, regression,
and matrix families. Statwise should prioritize statistical usefulness.
Relational plots:
```elixir
Statwise.Visualization.scatter(data, x: :height, y: :weight)
Statwise.Visualization.line(data, x: :time, y: :value)
```
Distribution plots:
```elixir
Statwise.Visualization.histogram(data, x: :score)
Statwise.Visualization.ecdf(data, x: :score)
Statwise.Visualization.density(data, x: :score)
Statwise.Visualization.qq_plot(data, x: :score)
```
Categorical plots:
```elixir
Statwise.Visualization.box_plot(data, x: :group, y: :score)
Statwise.Visualization.violin_plot(data, x: :group, y: :score)
Statwise.Visualization.strip_plot(data, x: :group, y: :score)
Statwise.Visualization.swarm_plot(data, x: :group, y: :score)
Statwise.Visualization.bar_plot(data, x: :group, y: :score, stat: :mean)
Statwise.Visualization.point_plot(data, x: :group, y: :score, interval: :confidence)
Statwise.Visualization.count_plot(data, x: :category)
```
Matrix plots:
```elixir
Statwise.Visualization.heatmap(matrix)
Statwise.Visualization.correlation_heatmap(data, columns: [:a, :b, :c])
```
Recommended initial additions:
- `scatter/2`
- `line/2`
- `bar_plot/2`
- `count_plot/2`
- `strip_plot/2`
- `heatmap/2`
Defer until the statistical transformation story is clear:
- `density/2`
- `violin_plot/2`
- `swarm_plot/2`
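
As an illustration of how a simple constructor could combine a statistical transformation with a plain Vega-Lite-compatible map, here is a sketch of a count plot. `CountPlotSketch.count_plot_spec/2` is a hypothetical helper, and the output field names `"category"` and `"count"` are assumptions; the real `count_plot/2` may be shaped differently.

```elixir
defmodule CountPlotSketch do
  @moduledoc "Hypothetical sketch of a count plot spec; not the real Statwise API."

  @doc "Counts rows per category and returns a Vega-Lite-compatible bar chart map."
  def count_plot_spec(rows, opts) do
    field = Keyword.fetch!(opts, :x)

    values =
      rows
      |> Enum.frequencies_by(&Map.fetch!(&1, field))
      |> Enum.sort_by(fn {category, _count} -> to_string(category) end)
      |> Enum.map(fn {category, count} ->
        %{"category" => to_string(category), "count" => count}
      end)

    %{
      "$schema" => "https://vega.github.io/schema/vega-lite/v5.json",
      "data" => %{"values" => values},
      "mark" => "bar",
      "encoding" => %{
        "x" => %{"field" => "category", "type" => "nominal", "title" => to_string(field)},
        "y" => %{"field" => "count", "type" => "quantitative"}
      }
    }
  end
end

spec =
  CountPlotSketch.count_plot_spec(
    [%{category: :a}, %{category: :b}, %{category: :a}],
    x: :category
  )
```

Because the result is an ordinary map, a test can assert on `spec["data"]["values"]` directly without rendering anything.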
## Phase 4: Improve Faceting
Current support:
```elixir
facet: :site
facet_columns: 2
```
Target support:
```elixir
facet: :site
facet: [column: :site]
facet: [row: :sex, column: :site]
columns: 3
share_x: true
share_y: false
```
Potential internal representation:
```elixir
%{
  row: channel | nil,
  column: channel | nil,
  columns: integer | nil,
  share_x: boolean,
  share_y: boolean
}
```
Deliverables:
- Row facets
- Column facets
- Wrapped facets
- Shared-axis controls through Vega-Lite `resolve`
- Livebook examples
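
The option shapes above could be normalized into the internal map roughly as follows. `FacetSketch` is a hypothetical module; the defaults (both axes shared) and the treatment of `facet_columns:` as a legacy alias for `columns:` are assumptions. The `resolve/1` helper emits the Vega-Lite `resolve.scale` setting, where each axis is either `"shared"` or `"independent"`.

```elixir
defmodule FacetSketch do
  @moduledoc "Hypothetical sketch of facet option normalization; not the real Statwise API."

  # Assumed defaults: no facets, both axes shared.
  @defaults %{row: nil, column: nil, columns: nil, share_x: true, share_y: true}

  @doc "Normalizes the accepted facet option shapes into one map."
  def normalize(opts) when is_list(opts) do
    facet =
      case Keyword.get(opts, :facet) do
        nil -> %{}
        field when is_atom(field) -> %{column: field}
        keyword when is_list(keyword) -> Map.new(Keyword.take(keyword, [:row, :column]))
      end

    # Assumption: the current facet_columns option maps onto columns.
    legacy =
      case Keyword.fetch(opts, :facet_columns) do
        {:ok, count} -> %{columns: count}
        :error -> %{}
      end

    controls = opts |> Keyword.take([:columns, :share_x, :share_y]) |> Map.new()

    @defaults |> Map.merge(legacy) |> Map.merge(facet) |> Map.merge(controls)
  end

  @doc "Builds the Vega-Lite resolve fragment for the shared-axis controls."
  def resolve(%{share_x: share_x, share_y: share_y}) do
    %{"resolve" => %{"scale" => %{"x" => mode(share_x), "y" => mode(share_y)}}}
  end

  defp mode(true), do: "shared"
  defp mode(false), do: "independent"
end

facet = FacetSketch.normalize(facet: [row: :sex, column: :site], columns: 3, share_y: false)
resolve = FacetSketch.resolve(facet)
```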
## Phase 5: Style And Theme System
The current `with_style/2` supports friendly style keys. Extend it with theme
presets, palettes, and direct Vega-Lite pass-through.
Theme presets:
```elixir
Statwise.Visualization.with_theme(plot, :default)
Statwise.Visualization.with_theme(plot, :minimal)
Statwise.Visualization.with_theme(plot, :paper)
Statwise.Visualization.with_theme(plot, :dark)
Statwise.Visualization.with_theme(plot, :livebook)
```
Palette support:
```elixir
Statwise.Visualization.with_palette(plot, :category10)
Statwise.Visualization.with_palette(plot, ["#2563eb", "#dc2626", "#16a34a"])
```
Vega-Lite escape hatches:
```elixir
Statwise.Visualization.with_style(plot,
  vega_lite: [...],
  mark: [...],
  encoding: [...],
  facet: [...],
  spec: [...],
  config: [...]
)
```
Precedence rules should be explicit. From lowest to highest priority, with later
entries overriding earlier ones:
1. Plot defaults
2. Theme
3. Attached style
4. Export-time style
Deliverables:
- Add `Statwise.Visualization.Theme`
- Add `Statwise.Visualization.Palette`
- Add full Vega-Lite pass-through support
- Document merge precedence
- Add tests for faceted and layered style routing
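
One way to make merge precedence testable is a deep merge over the style sources in priority order. This sketch assumes that later sources override earlier ones (plot defaults lowest, export-time style highest) and that nested maps are merged key by key; `StyleMergeSketch` is a hypothetical module, and the style keys are illustrative.

```elixir
defmodule StyleMergeSketch do
  @moduledoc "Hypothetical sketch of style precedence; not the real Statwise API."

  @doc "Merges style maps given in ascending priority order."
  def resolve(sources) when is_list(sources) do
    Enum.reduce(sources, %{}, fn source, acc -> deep_merge(acc, source) end)
  end

  # Nested maps are merged recursively; any other value is replaced by the
  # higher-priority one.
  defp deep_merge(left, right) do
    Map.merge(left, right, fn
      _key, %{} = l, %{} = r -> deep_merge(l, r)
      _key, _l, r -> r
    end)
  end
end

plot_defaults = %{mark: %{color: "steelblue", opacity: 0.8}, config: %{font: "sans-serif"}}
theme = %{config: %{font: "serif", background: "white"}}
attached_style = %{mark: %{color: "#dc2626"}}
export_style = %{config: %{background: "transparent"}}

style = StyleMergeSketch.resolve([plot_defaults, theme, attached_style, export_style])
```

Here the attached style changes only the mark color, so the default opacity survives, while the export-time background replaces the one set by the theme.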
## Phase 6: Plot Object And Composition API
This is the layer inspired by the `seaborn.objects` interface.
Target API:
```elixir
Statwise.Visualization.plot(rows, x: :score, color: :group)
|> Statwise.Visualization.add(:histogram, bins: 20)
|> Statwise.Visualization.add(:rug)
|> Statwise.Visualization.facet(column: :site)
|> Statwise.Visualization.label(title: "Scores by Site")
|> Statwise.Visualization.show()
```
Potential structs:
```elixir
%Statwise.Visualization.Figure{
  data: dataset,
  mappings: %{x: :score, y: nil, color: :group},
  layers: [%Statwise.Visualization.Layer{}],
  facet: nil,
  labels: %{},
  theme: nil,
  style: %{}
}
```
Layer examples:
```elixir
add(:point)
add(:line)
add(:bar)
add(:box_plot)
add(:histogram)
add(:rule)
```
Deliverables:
- `plot/2`
- `add/3`
- `facet/2`
- `label/2`
- `show/1`
- Vega-Lite conversion for layered and faceted figures
This should happen after the direct constructors and semantic mappings are
stable.
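
The composition functions can be thought of as pure struct updates that only accumulate state until export. The sketch below assumes that shape; `FigureSketch` and its nested `Layer` are hypothetical stand-ins for the structs above, and rendering (`show/1`) is deliberately omitted.

```elixir
defmodule FigureSketch do
  @moduledoc "Hypothetical sketch of the composition API; not the real Statwise structs."

  defmodule Layer do
    defstruct [:kind, opts: []]
  end

  defstruct data: [], mappings: %{}, layers: [], facet: nil, labels: %{}

  @doc "Starts a figure from rows and semantic mappings."
  def plot(rows, mappings), do: %__MODULE__{data: rows, mappings: Map.new(mappings)}

  @doc "Appends a layer; layers keep the order in which they were added."
  def add(%__MODULE__{} = figure, kind, opts \\ []) do
    %{figure | layers: figure.layers ++ [%Layer{kind: kind, opts: opts}]}
  end

  @doc "Sets the facet definition."
  def facet(%__MODULE__{} = figure, opts), do: %{figure | facet: Map.new(opts)}

  @doc "Merges labels such as the title into the figure."
  def label(%__MODULE__{} = figure, labels) do
    %{figure | labels: Map.merge(figure.labels, Map.new(labels))}
  end
end

figure =
  [%{score: 1.2, group: :a, site: :north}]
  |> FigureSketch.plot(x: :score, color: :group)
  |> FigureSketch.add(:histogram, bins: 20)
  |> FigureSketch.add(:rug)
  |> FigureSketch.facet(column: :site)
  |> FigureSketch.label(title: "Scores by Site")
```

Keeping every step a plain data transformation means the figure can be inspected and tested before any Vega-Lite conversion happens.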
## Phase 7: Statistical Summaries And Intervals
Implemented. Seaborn-like estimate plots compute grouped summaries and intervals.
Examples:
```elixir
Statwise.Visualization.bar_plot(data, x: :group, y: :score, stat: :mean)
Statwise.Visualization.point_plot(data,
  x: :group,
  y: :score,
  stat: :mean,
  interval: :confidence,
  confidence_level: 0.95
)
```
Supported summaries:
- `:count`
- `:mean`
- `:median`
- `:sum`
Supported intervals:
- `nil`
- `:standard_error`
- `:confidence`
- `:percentile`
Deliverables:
- Add `Statwise.Visualization.Summary`
- Grouped summaries
- Confidence, standard-error, and percentile intervals
- `point_plot/2`
- Optional bootstrap intervals later
- Tests for summary correctness
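
To illustrate what a grouped summary with a `:standard_error` interval involves, here is a minimal sketch. `SummarySketch` is a hypothetical module, not the actual `Statwise.Visualization.Summary`; it assumes the standard error is the sample standard deviation (n - 1 denominator) divided by the square root of n, and returns `nil` for single-observation groups.

```elixir
defmodule SummarySketch do
  @moduledoc "Hypothetical sketch of grouped summaries; not the real Statwise module."

  @doc "Returns the count, mean, and standard error of the mean for each group."
  def mean_with_standard_error(rows, group_field, value_field) do
    rows
    |> Enum.group_by(&Map.fetch!(&1, group_field), &Map.fetch!(&1, value_field))
    |> Map.new(fn {group, values} -> {group, summarize(values)} end)
  end

  defp summarize(values) do
    n = length(values)
    mean = Enum.sum(values) / n

    standard_error =
      if n > 1 do
        squared_deviations = Enum.map(values, fn v -> (v - mean) * (v - mean) end)
        variance = Enum.sum(squared_deviations) / (n - 1)
        :math.sqrt(variance) / :math.sqrt(n)
      else
        # Undefined for a single observation.
        nil
      end

    %{n: n, mean: mean, standard_error: standard_error}
  end
end

rows = [
  %{group: :a, score: 1.0},
  %{group: :a, score: 2.0},
  %{group: :a, score: 3.0},
  %{group: :b, score: 4.0}
]

summary = SummarySketch.mean_with_standard_error(rows, :group, :score)
```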
## Phase 8: Statistical Result Annotations
Implemented. Result-specific plots remain available for direct inspection:
```elixir
Statwise.Visualization.t_test(result)
Statwise.Visualization.mann_whitney(result)
Statwise.Visualization.confidence_interval(result)
```
The primary workflow now shows statistical results directly on ordinary plots,
for example a box plot with a comparison bracket and p-value/effect-size
annotation:
```elixir
rows
|> Statwise.Visualization.box_plot(x: :group, y: :score)
|> Statwise.Visualization.with_test(result, groups: {:control, :treated})
```
Tests can also be computed from the plotted rows. When the plot is faceted, the
test is computed independently inside each facet panel:
```elixir
rows
|> Statwise.Visualization.box_plot(x: :group, y: :score, facet: :site)
|> Statwise.Visualization.with_test(:mann_whitney, groups: {:control, :treated})
```
Deliverables:
- Completed: test-result annotation data model
- Completed: comparison brackets for categorical plots
- Completed: p-value, statistic, and effect-size labels for t-test and
  Mann-Whitney results
- Completed: per-facet test computation when tests are computed from plotted
  rows
- Completed: facet-aware annotation placement
Future extensions:
- Optional confidence interval overlays from `%Statwise.TestResult{}`
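
Per-facet computation amounts to grouping the plotted rows by the facet field and running the test inside each panel. The sketch below shows only that grouping step; `FacetTestSketch` is hypothetical, and `mean_difference/2` is a placeholder statistic standing in for a real test such as Mann-Whitney. It assumes rows with `:group` and `:score` keys and that both groups are present in every panel.

```elixir
defmodule FacetTestSketch do
  @moduledoc "Hypothetical sketch of per-facet test computation; not the real Statwise API."

  @doc "Runs test_fun on the rows of each facet panel and returns results keyed by panel."
  def per_facet(rows, facet_field, test_fun) when is_function(test_fun, 1) do
    rows
    |> Enum.group_by(&Map.fetch!(&1, facet_field))
    |> Map.new(fn {panel, panel_rows} -> {panel, test_fun.(panel_rows)} end)
  end

  @doc "Placeholder statistic: mean of group_b minus mean of group_a."
  def mean_difference(rows, {group_a, group_b}) do
    mean = fn group ->
      values = for %{group: ^group, score: score} <- rows, do: score
      Enum.sum(values) / length(values)
    end

    mean.(group_b) - mean.(group_a)
  end
end

rows = [
  %{site: :north, group: :control, score: 1.0},
  %{site: :north, group: :control, score: 3.0},
  %{site: :north, group: :treated, score: 4.0},
  %{site: :north, group: :treated, score: 6.0},
  %{site: :south, group: :control, score: 2.0},
  %{site: :south, group: :treated, score: 2.5}
]

results =
  FacetTestSketch.per_facet(rows, :site, fn panel_rows ->
    FacetTestSketch.mean_difference(panel_rows, {:control, :treated})
  end)
```

Keying the results by panel value is one way to let annotation placement look up the right result for each facet.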
## Phase 9: Documentation And Gallery
The Livebook should become the canonical tutorial.
Recommended structure:
1. Quickstart
2. Data shapes
3. Semantic mappings
4. Distribution plots
5. Categorical plots
6. Relational plots
7. Faceting
8. Styling and themes
9. Statistical result plots
10. Exporting
11. Vega-Lite escape hatches
Also keep the README compact:
```elixir
df
|> Statwise.Visualization.box_plot(x: :treatment, y: :score, facet: :site)
|> Statwise.Visualization.show()
```
Deliverables:
- Keep `docs/visualization_gallery.livemd` current
- Update `docs/visualization.md`
- Add small README examples
- Add generated Vega-Lite examples in tests
## Phase 10: Compatibility And Stability
Before calling the visualization API mature:
- Require no visualization runtime dependency.
- Keep `VegaLite`, `Kino`, `Jason`, and `Explorer` optional.
- Keep old APIs through aliases or a deprecation period.
- Add changelog entries.
- Ensure chart constructors return plain `%Plot{}` or `%Figure{}` structs.
- Test generated Vega-Lite specs, not screenshots.
## Recommended Implementation Order
1. Semantic mappings: `x`, `y`, `color`, `facet`
2. Dataset normalization for rows, columns, and Explorer
3. Scatter, line, bar, count, and strip plots
4. Better faceting: row/column facets and shared axes
5. Vega-Lite escape hatches in `with_style/2`
6. Themes and palettes
7. Statistical summary plots
8. Composition API: `plot |> add |> facet`
9. Matrix and correlation heatmaps
10. Violin, density, and swarm plots once transformations are solid
The highest-leverage starting point is Phase 1 plus Phase 2. A seaborn-like API
lives or dies by tidy data and semantic mappings. Once those are clean, every
new chart becomes easier to add.