defmodule Tucan do
@moduledoc """
A high level API interface for creating plots on top of `VegaLite`.
Tucan is an Elixir plotting library built on top of `VegaLite`,
designed to simplify the creation of interactive and visually stunning
plots. With `Tucan`, you can effortlessly generate a wide range of plots,
from simple bar charts to complex composite plots, all while enjoying the
power and flexibility of a clean composable functional API.
Tucan offers a simple API for creating most common plot types similarly
to the widely used python packages `matplotlib` and `seaborn` without
requiring the end user to be familiar with the Vega Lite grammar.
## Features
- **Versatile Plot Types** - Tucan provides an array of plot types, including
bar charts, line plots, scatter plots, histograms, and more, allowing you to
effectively represent diverse data sets.
- **Clean and consistent API** - A clean and consistent plotting API similar
to `matplotlib` or `seaborn` is provided. You should be able to create most
common plots with a single function call and minimal configuration.
- **Grouping and Faceting** - Enhance your visualizations with grouping and
faceting features, enabling you to examine patterns and trends within subgroups
of your data.
- **Customization** - Customize your plots with ease using Tucan's utilities
for adjusting plot dimensions, titles, and **themes**.
- **Thin wrapper on top of VegaLite** - All `VegaLite` functions can be used
seamlessly with `Tucan` for advanced customizations if needed.
- **Low level API** - A low level API with helper functions allow you to modify
any part of a `VegaLite` specification.
## Basic usage
All supported plots expect as first argument some data, a `VegaLite` specification
or a binary which is considered a url to some data. Additionally you can use
one of the available `Tucan.Datasets`.
```tucan
Tucan.scatter(:iris, "petal_width", "petal_length")
```
You can apply semantic grouping by a third variable by modifying the color, the
shape or the size of the points:
```tucan
Tucan.scatter(:iris, "petal_width", "petal_length", color_by: "species", shape_by: "species")
```
Alternatively you could use the helper grouping functions:
```tucan
Tucan.scatter(:iris, "petal_width", "petal_length")
|> Tucan.color_by("species")
|> Tucan.shape_by("species")
```
> #### Use the functional API carefully {: .warning}
>
> For some plot types where transformations are applied on the input data it
> is recommended to use the options instead of the functional API, since in the
> first case any required grouping will also be applied to the transformations.
## Composite plots
Tucan also provides some helper functions for generating composite plots.
`pairplot/3` can be used to plot pairwise relationships across a dataset.
```tucan
fields = ["Beak Length (mm)", "Beak Depth (mm)", "Body Mass (g)"]
Tucan.pairplot(:penguins, fields, diagonal: :density)
```
## Customization & Themes
Various methods and helper modules allow you to easily modify the style of
a plot.
```tucan
Tucan.bubble(:gapminder, "income", "health", "population",
color_by: "region",
width: 400,
tooltip: :data
)
|> Tucan.Axes.set_x_title("Gdp per Capita")
|> Tucan.Axes.set_y_title("Life expectancy")
|> Tucan.Scale.set_x_scale(:log)
|> Tucan.Grid.set_color(:x, "red")
```
Additionally `set_theme/2` allows you to set one of the supported `Tucan.Themes`.
```tucan
Tucan.density_heatmap(:penguins, "Beak Length (mm)", "Beak Depth (mm)")
|> Tucan.set_theme(:latimes)
```
## Encoding channels options
All Tucan plots are building a `VegaLite` specification based on some sane
default parameters. Notice that only a tiny subset of vega-lite configuration
options are exported in Tucan's public API. This is more than enough in most
cases. Additionally, an optional configuration option is added for every
encoding channel that is used, that allows you to add any vega-lite option
or change the default options set by Tucan.
For example:
```tucan
Tucan.bar(:weather, "date", "date",
color_by: "weather",
tooltip: true,
x: [type: :ordinal, time_unit: :month],
y: [aggregate: :count]
)
```
"""
alias Tucan.VegaLiteUtils
alias VegaLite, as: Vl
@type plotdata :: binary() | Table.Reader.t() | Tucan.Datasets.t() | VegaLite.t()
@type field :: binary()
## Custom guards
defguardp is_pos_integer(term) when is_integer(term) and term > 0
## Plots
@doc """
Creates if needed a `VegaLite` plot and adds data to it.
The behaviour of this function depends on the type of `plotdata`:
* if a `VegaLite.t()` struct is passed then it is returned unchanged.
* If it is a binary it is considered a url and the `VegaLite.data_from_url/2` is
called on a newly created `VegaLite` struct.
* if it is an atom then it is considered a `Tucan.Dataset` and it is translated to
the dataset's url. If the dataset name is invalid an exception is raised.
* in any other case it is considered a set of data values and the values are set
as data to a newly created `VegaLite` struct.
"""
@doc section: :utilities
@spec new(plotdata :: plotdata(), opts :: keyword()) :: VegaLite.t()
def new(plotdata, opts \\ []),
do: to_vega_plot(plotdata, opts)
defp to_vega_plot(%VegaLite{} = plot, _opts), do: plot
defp to_vega_plot(dataset, opts) when is_atom(dataset),
do: to_vega_plot(Tucan.Datasets.dataset(dataset), opts)
defp to_vega_plot(dataset, opts) when is_binary(dataset) do
opts
|> new_tucan_plot()
|> Vl.data_from_url(dataset)
end
defp to_vega_plot(data, opts) do
opts
|> new_tucan_plot()
|> Vl.data_from_values(data)
end
defp new_tucan_plot(opts) do
{tucan_opts, opts} = Keyword.pop(opts, :tucan)
case tucan_opts do
nil -> Vl.new(opts)
tucan_opts -> Vl.new(opts) |> VegaLiteUtils.put_in_spec("__tucan__", tucan_opts)
end
end
## Plots
# global_opts should be applicable in all plot types
@global_opts [:width, :height, :title]
@global_mark_opts [:clip, :fill_opacity, :tooltip]
histogram_opts = [
relative: [
type: :boolean,
doc: """
If set a relative frequency histogram is generated.
""",
default: false
],
orient: [
type: {:in, [:horizontal, :vertical]},
doc: """
Histogram's orientation. It specifies the axis along which the field values
are plotted.
""",
default: :horizontal
],
color_by: [
type: :string,
doc: """
The field to group observations by. This will used for coloring the histogram
if set.
""",
section: :grouping
],
maxbins: [
type: :integer,
doc: """
Maximum number of bins.
""",
dest: :bin
],
step: [
type: {:or, [:integer, :float]},
doc: """
An exact step size to use between bins. If provided, options such as `maxbins`
will be ignored.
""",
dest: :bin
],
extent: [
type: {:custom, Tucan.Options, :extent, []},
doc: """
A two-element (`[min, max]`) array indicating the range of desired bin values.
""",
dest: :bin
],
stacked: [
type: :boolean,
doc: """
If set it will stack the group histograms instead of layering one over another. Valid
only if a semantic grouping has been applied.
"""
]
]
@histogram_opts Tucan.Options.take!(
[@global_opts, @global_mark_opts, :x, :x2, :y, :color],
histogram_opts
)
@histogram_schema Tucan.Options.to_nimble_schema!(@histogram_opts)
@doc """
Plots a histogram.
See also `density/3`
## Options
#{Tucan.Options.docs(@histogram_opts)}
## Examples
Histogram of `Horsepower`
```tucan
Tucan.histogram(:cars, "Horsepower")
```
You can flip the plot by setting the `:orient` option to `:vertical`:
```tucan
Tucan.histogram(:cars, "Horsepower", orient: :vertical)
```
By setting the `:relative` flag you can get a relative frequency histogram:
```tucan
Tucan.histogram(:cars, "Horsepower", relative: true)
```
You can increase the number of bins by settings the `maxbins` or the `step` options:
```tucan
Tucan.histogram(:cars, "Horsepower", step: 5)
```
You can draw multiple histograms by grouping the observations by a second
*categorical* variable:
```tucan
Tucan.histogram(:cars, "Horsepower", color_by: "Origin", fill_opacity: 0.5)
```
By default the histograms are plotted layered, but you can also stack them:
```tucan
Tucan.histogram(:cars, "Horsepower", color_by: "Origin", fill_opacity: 0.5, stacked: true)
```
or you can facet it, in order to make the histograms more clear:
```tucan
histograms =
Tucan.histogram(:cars, "Horsepower", color_by: "Origin", tooltip: true)
|> Tucan.facet_by(:column, "Origin")
relative_histograms =
Tucan.histogram(:cars, "Horsepower", relative: true, color_by: "Origin", tooltip: true)
|> Tucan.facet_by(:column, "Origin")
Tucan.vconcat([histograms, relative_histograms])
```
"""
@doc section: :plots
@spec histogram(plotdata :: plotdata(), field :: binary(), opts :: keyword()) :: VegaLite.t()
def histogram(plotdata, field, opts \\ []) do
opts = NimbleOptions.validate!(opts, @histogram_schema)
spec_opts = take_options(opts, @histogram_opts, :spec)
mark_opts = take_options(opts, @histogram_opts, :mark)
plotdata
|> new(spec_opts ++ [tucan: [plot: :histogram]])
|> Vl.mark(:bar, mark_opts)
|> bin_count_transform(field, opts)
|> maybe_add_relative_frequency_transform(field, opts)
|> encode_field(:x, "bin_#{field}", opts, bin: [binned: true], title: field)
|> encode_field(:x2, "bin_#{field}_end", opts)
|> histogram_y_encoding(field, opts)
|> maybe_encode_field(:color, fn -> opts[:color_by] != nil end, opts[:color_by], opts, [])
|> maybe_flip_axes(opts[:orient] == :vertical)
end
defp bin_count_transform(vl, field, opts) do
bin_opts =
case take_options(opts, @histogram_opts, :bin) do
[] -> true
bin_opts -> bin_opts
end
groupby =
case opts[:color_by] do
nil -> ["bin_#{field}", "bin_#{field}_end"]
color_by -> ["bin_#{field}", "bin_#{field}_end", color_by]
end
vl
|> Vl.transform(bin: bin_opts, field: field, as: "bin_#{field}")
|> Vl.transform(
aggregate: [[op: :count, as: "count_#{field}"]],
groupby: groupby
)
end
defp maybe_add_relative_frequency_transform(vl, field, opts) do
case opts[:relative] do
false ->
vl
true ->
groupby =
case opts[:color_by] do
nil -> []
color_by -> [color_by]
end
vl
|> Vl.transform(
joinaggregate: [[op: :sum, field: "count_#{field}", as: "total_count_#{field}"]],
groupby: groupby
)
|> Vl.transform(
calculate: "datum.count_#{field}/datum.total_count_#{field}",
as: "percent_#{field}"
)
end
end
defp histogram_y_encoding(vl, field, opts) do
case opts[:relative] do
false ->
encode_field(vl, :y, "count_#{field}", opts, type: :quantitative, stack: opts[:stacked])
true ->
encode_field(vl, :y, "percent_#{field}", opts,
type: :quantitative,
axis: [format: ".1~%"],
title: "Relative Frequency",
stack: opts[:stacked]
)
end
end
density_opts = [
groupby: [
type: {:list, :string},
doc: """
The data fields to group by. If not specified, a single group containing all data
objects will be used. This is applied only on the density transform.
In most cases you only need to set `color_by` which will automatically handle the
density transform grouping. Use `groupby` only if you want to manually post-process
the generated specification, or if you want to apply grouping by more than one
variable.
If both `groupby` and `color_by` are set then only `groupby` is used for grouping
the density transform and `color_by` is used for encoding the color.
""",
dest: :density_transform
],
cumulative: [
type: :boolean,
doc: """
A boolean flag indicating whether to produce density estimates (false) or cumulative
density estimates (true).
""",
default: false,
dest: :density_transform
],
counts: [
type: :boolean,
doc: """
A boolean flag indicating if the output values should be probability estimates
(false) or smoothed counts (true).
""",
default: false,
dest: :density_transform
],
bandwidth: [
type: :float,
doc: """
The bandwidth (standard deviation) of the Gaussian kernel. If unspecified or set to
zero, the bandwidth value is automatically estimated from the input data using
Scott’s rule.
""",
dest: :density_transform
],
extent: [
type: {:custom, Tucan.Options, :extent, []},
doc: """
A `[min, max]` domain from which to sample the distribution. If unspecified, the extent
will be determined by the observed minimum and maximum values of the density value field.
""",
dest: :density_transform
],
minsteps: [
type: :integer,
doc: """
The minimum number of samples to take along the extent domain for plotting the density.
""",
default: 25,
dest: :density_transform
],
maxsteps: [
type: :integer,
doc: """
The maximum number of samples to take along the extent domain for plotting the density.
""",
default: 200,
dest: :density_transform
],
steps: [
type: :integer,
doc: """
The exact number of samples to take along the extent domain for plotting the density. If
specified, overrides both minsteps and maxsteps to set an exact number of uniform samples.
Potentially useful in conjunction with a fixed extent to ensure consistent sample points
for stacked densities.
""",
dest: :density_transform
]
]
@density_opts Tucan.Options.take!(
[
@global_opts,
@global_mark_opts,
:color_by,
:x,
:y,
:orient,
:color
],
density_opts
)
@density_schema Tucan.Options.to_nimble_schema!(@density_opts)
@doc """
Plot the distribution of a numeric variable.
Density plots allow you to visualize the distribution of a numeric variable for one
or several groups. If you want to draw the density for several groups you need to
specify the `:color_by` option which is assumed to be a categorical variable.
> ### Avoid calling `color_by/3` with a density plot {: .warning}
>
> Since the grouping variable must also be used for properly calculating the density
> transformation you **should avoid calling the `color_by/3` grouping function** after
> a `density/3` call. Instead use the `:color_by` option, which will ensure that the
> proper settings are applied to the underlying transformation.
>
> Calling `color_by/3` would produce this graph:
>
> ```tucan
> Tucan.density(:penguins, "Body Mass (g)")
> |> Tucan.color_by("Species")
> ```
>
> In the above case the density function has been calculated on the complete dataset
> and you cannot color by the `Species`. Instead you should use the `:color_by`
> option which would calculate the density function per group:
>
> ```tucan
> Tucan.density(:penguins, "Body Mass (g)", color_by: "Species", fill_opacity: 0.2)
> ```
>
> Alternatively you should use the `:groupby` option in order to group the density
> transform by the `Species` field and then apply the `color_by/3` function:
>
> ```elixir
> Tucan.density(:penguins, "Body Mass (g)", groupby: ["Species"], fill_opacity: 0.2)
> |> Tucan.color_by("Species")
> ```
See also `histogram/3`.
## Options
#{Tucan.Options.docs(@density_opts)}
## Examples
```tucan
Tucan.density(:penguins, "Body Mass (g)")
```
It is a common use case to compare the density of several groups in a dataset. Several
options exist to do so. You can plot all items on the same chart, using transparency and
annotation to make the comparison possible.
```tucan
Tucan.density(:penguins, "Body Mass (g)", color_by: "Species", fill_opacity: 0.5)
```
You can also combine it with `facet_by/4` in order to draw a different plot for each value
of the grouping variable. Notice that we need to set the `:groupby` variable in order
to correctly calculate the density plot per field's value.
```tucan
Tucan.density(:penguins, "Body Mass (g)", groupby: ["Species"])
|> Tucan.color_by("Species")
|> Tucan.facet_by(:column, "Species")
```
You can control the smoothing by setting a specific `bandwidth` value (if not set it is
automatically calculated by vega lite):
```tucan
Tucan.density(:penguins, "Body Mass (g)", color_by: "Species", bandwidth: 20.0, fill_opacity: 0.5)
```
You can plot a cumulative density distribution by setting the `:cumulative` option to `true`:
```tucan
Tucan.density(:penguins, "Body Mass (g)", cumulative: true)
```
or calculate a separate cumulative distribution for each group:
```tucan
Tucan.density(:penguins, "Body Mass (g)", cumulative: true, color_by: "Species")
|> Tucan.facet_by(:column, "Species")
```
"""
@doc section: :plots
@spec density(plotdata :: plotdata(), field :: binary(), opts :: keyword()) :: VegaLite.t()
def density(plotdata, field, opts \\ []) do
opts = NimbleOptions.validate!(opts, @density_schema)
spec_opts = take_options(opts, @histogram_opts, :spec)
mark_opts =
take_options(opts, @histogram_opts, :mark)
|> Keyword.merge(orient: :vertical)
transform_opts =
take_options(opts, @density_opts, :density_transform)
|> Keyword.merge(density: field)
|> Tucan.Keyword.put_new_conditionally(:groupby, [opts[:color_by]], fn ->
opts[:color_by] != nil
end)
plotdata
|> new(spec_opts)
|> Vl.transform(transform_opts)
|> Vl.mark(:area, mark_opts)
|> encode_field(:x, "value", opts,
type: :quantitative,
scale: [zero: false],
axis: [title: field]
)
|> encode_field(:y, "density", opts, type: :quantitative)
|> maybe_encode_field(:color, fn -> opts[:color_by] != nil end, opts[:color_by], opts, [])
|> maybe_flip_axes(opts[:orient] == :vertical)
end
stripplot_opts = [
group: [
type: :string,
doc: """
A field to be used for grouping the strip plot. If not set the plot will
be one dimensional.
"""
],
style: [
type: {:in, [:tick, :point, :jitter]},
doc: """
The style of the plot. Can be one of the following:
* `:tick` - use ticks for each data point
* `:point` - use points for each data point
* `:jitter` - use points but also apply some jittering across the other
axis
Use `:jitter` in case of many data points in order to avoid overlaps.
""",
default: :tick
]
]
@stripplot_opts Tucan.Options.take!(
[
@global_opts,
@global_mark_opts,
:orient,
:x,
:y,
:y_offset,
:color_by,
:color
],
stripplot_opts
)
@stripplot_schema Tucan.Options.to_nimble_schema!(@stripplot_opts)
@doc """
Draws a strip plot (categorical scatterplot).
A strip plot is a single-axis scatter plot used to visualize the distribution of
a numerical field. The values are plotted as dots or ticks along one axis, so
the dots with the same value may overlap.
You can use the `:jitter` mode for a better view of overlapping points. In this
case points are randomly shifted along with other axis, which has no meaning in
itself data-wise.
Typically several strip plots are placed side by side to compare the distribution
of a numerical value among several categories.
## Options
#{Tucan.Options.docs(@stripplot_opts)}
> ### Internal `VegaLite` representation {: .info}
>
> If style is set to `:tick` the following `VegaLite` representation is generated:
>
> ```elixir
> Vl.new()
> |> Vl.mark(:tick)
> |> Vl.encode_field(:x, field, type: :quantitative)
> |> Vl.encode_field(:y, opts[:group], type: :nominal)
> ```
>
> If style is set to `:jitter` then a transform is added to generate Gaussian jitter
> using the Box-Muller transform, and the `y_offset` is also encoded based on this:
>
> ```elixir
> Vl.new()
> |> Vl.mark(:point)
> |> Vl.transform(calculate: "sqrt(-2*log(random()))*cos(2*PI*random())", as: "jitter")
> |> Vl.encode_field(:x, field, type: :quantitative)
> |> Vl.encode_field(:y, opts[:group], type: :nominal)
> |> Vl.encode_field(:y_offset, "jitter", type: :quantitative)
> ```
## Examples
Assigning a single numeric variable shows the univariate distribution. The default
style is the `:tick`:
```tucan
Tucan.stripplot(:tips, "total_bill")
```
For very dense distribution it makes more sense to use the `:jitter` style in order
to reduce overlapping points:
```tucan
Tucan.stripplot(:tips, "total_bill", style: :jitter, height: 30, width: 300)
```
You can set the `:group` option in order to add a second dimension. Notice that
the field must be categorical.
```tucan
Tucan.stripplot(:tips, "total_bill", group: "day", style: :jitter)
```
The plot would be more clear if you also colored the points with the same field:
```tucan
Tucan.stripplot(:tips, "total_bill", group: "day", style: :jitter)
|> Tucan.color_by("day")
```
Or you can color by a distinct variable to show a multi-dimensional relationship:
```tucan
Tucan.stripplot(:tips, "total_bill", group: "day", style: :jitter)
|> Tucan.color_by("sex")
```
or you can color by a numerical variable:
```tucan
Tucan.stripplot(:tips, "total_bill", group: "day", style: :jitter)
|> Tucan.color_by("size", type: :ordinal)
```
You could draw the same with points but without jittering:
```tucan
Tucan.stripplot(:tips, "total_bill", group: "day", style: :point)
|> Tucan.color_by("sex")
```
or with ticks which is the default one:
```tucan
Tucan.stripplot(:tips, "total_bill", group: "day", style: :tick)
|> Tucan.color_by("sex")
```
You can set the `:orient` flag to `:vertical` to change the orientation:
```tucan
Tucan.stripplot(:tips, "total_bill", group: "day", style: :jitter, orient: :vertical)
|> Tucan.color_by("sex")
```
"""
@doc section: :plots
@spec stripplot(plotdata :: plotdata(), field :: binary(), opts :: keyword()) :: VegaLite.t()
def stripplot(plotdata, field, opts \\ []) do
opts = NimbleOptions.validate!(opts, @stripplot_schema)
spec_opts = take_options(opts, @stripplot_opts, :spec)
plotdata
|> new(spec_opts)
|> stripplot_mark(opts[:style], Keyword.take(opts, [:tooltip]))
|> encode_field(:x, field, opts, type: :quantitative)
|> maybe_encode_field(:y, fn -> opts[:group] != nil end, opts[:group], opts, type: :nominal)
|> maybe_encode_field(:color, fn -> opts[:color_by] != nil end, opts[:color_by], opts, [])
|> maybe_add_jitter(opts)
|> maybe_flip_axes(opts[:orient] == :vertical)
end
defp stripplot_mark(vl, :tick, opts), do: Vl.mark(vl, :tick, opts)
defp stripplot_mark(vl, _other, opts), do: Vl.mark(vl, :point, [size: 16] ++ opts)
defp maybe_encode_field(vl, channel, condition_fn, field, opts, extra_opts) do
case condition_fn.() do
false ->
vl
true ->
encode_field(vl, channel, field, opts, extra_opts)
end
end
defp maybe_add_jitter(vl, opts) do
case opts[:style] do
:jitter ->
vl
|> Vl.transform(calculate: "sqrt(-2*log(random()))*cos(2*PI*random())", as: "jitter")
|> encode_field(:y_offset, "jitter", opts, type: :quantitative, axis: nil)
_other ->
vl
end
end
boxplot_opts = [
group_by: [
type: :string,
doc: """
A field to be used for grouping the boxplot. It is used for adding a second dimension to
the plot. If not set the plot will be one dimensional. Notice that a grouping is automatically
applied if the `:color_by` option is set.
""",
section: :grouping
],
mode: [
type: {:in, [:tukey, :min_max]},
doc: """
The type of the box plot. Either a Tukey box plot will be created or a min-max plot.
""",
default: :tukey
],
k: [
type: :float,
doc: """
The constant used for calculating the extent of the whiskers in a Tukey boxplot. Applicable
only if `:mode` is set to `:tukey`.
""",
default: 1.5
]
]
@boxplot_opts Tucan.Options.take!(
[@global_opts, @global_mark_opts, :orient, :color_by, :x, :y, :color],
boxplot_opts
)
@boxplot_schema Tucan.Options.to_nimble_schema!(@boxplot_opts)
@doc """
Returns the specification of a box plot.
By default a one dimensional box plot of the `:field` - which must be a numerical variable - is
generated. You can add a second dimension across a categorical variable by either setting the
`:group` or `:color_by` options.
By default a Tukey box plot will be generated. In the Tukey box plot the whisker spans from
the smallest data to the largest data within the range `[Q1 - k * IQR, Q3 + k * IQR]` where
`Q1`and `Q3` are the first and third quartiles while `IQR` is the interquartile range
`(Q3-Q1)`. You can specify if needed the constant `k` which defaults to 1.5.
Additionally you can set the `mode` to `:min_max` where the lower and upper whiskers are
defined as the min and max respectively. No points will be considered as outliers for this
type of box plots. In this case the `k` value is ignored.
> #### What is a box plot {: .info}
>
> A box plot (box and whisker plot) displays the five-number summary of a set of data. The
> five-number summary is the minimum, first quartile, median, third quartile, and maximum.
> In a box plot, we draw a box from the first quartile to the third quartile. A vertical
> line goes through the box at the median.
## Options
#{Tucan.Options.docs(@boxplot_opts)}
## Examples
A one dimensional Tukey boxplot:
```tucan
Tucan.boxplot(:penguins, "Body Mass (g)")
```
You can set `:group` or `:color_by` in order to set a second dimension:
```tucan
Tucan.boxplot(:penguins, "Body Mass (g)", color_by: "Species")
```
You can set the mode to `:min_max` in order to extend the whiskers to the min and max values:
```tucan
Tucan.boxplot(:penguins, "Body Mass (g)", color_by: "Species", mode: :min_max)
```
By setting the `:orient` to `:vertical` you can change the default horizontal orientation:
```tucan
Tucan.boxplot(:penguins, "Body Mass (g)", color_by: "Species", orient: :vertical)
```
"""
@doc section: :plots
@spec boxplot(plotdata :: plotdata(), field :: binary(), opts :: keyword()) :: VegaLite.t()
def boxplot(plotdata, field, opts \\ []) do
opts = NimbleOptions.validate!(opts, @boxplot_schema)
extent =
case opts[:mode] do
:tukey -> opts[:k]
:min_max -> "min-max"
end
spec_opts = take_options(opts, @boxplot_opts, :spec)
mark_opts =
take_options(opts, @boxplot_opts, :mark)
|> Keyword.merge(extent: extent)
group_field = opts[:group_by] || opts[:color_by]
plotdata
|> new(spec_opts)
|> Vl.mark(:boxplot, mark_opts)
|> encode_field(:x, field, opts, type: :quantitative, scale: [zero: false])
|> maybe_encode_field(:y, fn -> group_field != nil end, group_field, opts, type: :nominal)
|> maybe_encode_field(:color, fn -> opts[:color_by] != nil end, opts[:color_by], opts, [])
|> maybe_flip_axes(opts[:orient] == :vertical)
end
heatmap_opts = [
aggregate: [
type: :atom,
doc: """
The statistic that will be used for aggregating the observations within a heatmap
tile. Defaults to `:mean` which in case of single data will encode the value of
the `color` data field.
Ignored if `:color` is set to `nil`.
"""
],
color_scheme: [
type: :atom,
doc: """
The colorscheme to use, for supported colorschemes check `Tucan.Scale`. Notice that
this is just a helper option for easily setting color schemes. If you need to set
specific colors or customize the scheme, use `Tucan.Scale.set_color_scheme/3`.
""",
section: :style
],
annotate: [
type: :boolean,
default: false,
doc: """
If set to `true` then the values of each cell will be included in the plot.
"""
]
]
@heatmap_opts Tucan.Options.take!(
[
@global_opts,
@global_mark_opts,
:x,
:y,
:color,
:text
],
heatmap_opts
)
@heatmap_schema Tucan.Options.to_nimble_schema!(@heatmap_opts)
@doc """
Returns the specification of a heatmap.
A heatmap is a graphical representation of data where the individual values
contained in a matrix are represented as colors.
It expects two categorical fields `x`, `y` which will be used for the axes
and a numerical field `color`. If `color` is `nil` then the color represents
the count of the observations for each `x, y`.
If an `:aggregate` is set this statistic will be used for encoding the color.
If no `:aggregate` is set the color encodes by default the `:mean` of the
data.
## Options
#{Tucan.Options.docs(@heatmap_opts)}
## Examples
A simple heatmap of two categorical variables, using a third one for the
color values.
```tucan
data = [
%{"x" => "A", "y" => "K", "value" => 0.5},
%{"x" => "A", "y" => "L", "value" => 1.5},
%{"x" => "A", "y" => "M", "value" => 4.5},
%{"x" => "B", "y" => "K", "value" => 1.5},
%{"x" => "B", "y" => "L", "value" => 2.5},
%{"x" => "B", "y" => "M", "value" => 0.5},
%{"x" => "C", "y" => "K", "value" => -1.5},
%{"x" => "C", "y" => "L", "value" => 5.5},
%{"x" => "C", "y" => "M", "value" => 1.5},
]
Tucan.heatmap(data, "x", "y", "value", width: 200, height: 200)
```
You can change the color scheme:
```tucan
Tucan.heatmap(:glue, "Task", "Model", "Score", color_scheme: :redyellowgreen, tooltip: true)
```
Heatmaps are also useful for visualizing temporal data. Let's use a heatmap to examine
how Seattle's max temperature changes over the year. On the _x-axis_ we will encode the
days of the month along the x-axis, and the months on the _y-axis_. We will aggregate
over the max temperature for the color field. (example borrowed from
[here](https://observablehq.com/@jonfroehlich/basic-time-series-plots-in-vega-lite?collection=@jonfroehlich/intro-to-vega-lite))
```tucan
Tucan.heatmap(:weather, "date", "date", "temp_max",
x: [type: :ordinal, time_unit: :date],
y: [type: :ordinal, time_unit: :month],
tooltip: true
)
|> Tucan.Scale.set_color_scheme(:redyellowblue, reverse: true)
|> Tucan.Axes.set_x_title("Day")
|> Tucan.Axes.set_y_title("Month")
|> Tucan.Legend.set_title(:color, "Avg Max Temp")
|> Tucan.set_title("Heatmap of Avg Max Temperatures in Seattle (2012-2015)")
```
You can enable annotations by setting the `:annotate` flag:
```tucan
Tucan.heatmap(:weather, "date", "date", "temp_max",
annotate: true,
x: [type: :ordinal, time_unit: :date],
y: [type: :ordinal, time_unit: :month],
text: [format: ".1f"],
tooltip: true,
width: 800
)
|> Tucan.Scale.set_color_scheme(:redyellowblue, reverse: true)
|> Tucan.Axes.set_x_title("Day")
|> Tucan.Axes.set_y_title("Month")
|> Tucan.Legend.set_title(:color, "Avg Max Temp")
|> Tucan.set_title("Heatmap of Avg Max Temperatures in Seattle (2012-2015)")
```
"""
@doc section: :plots
@spec heatmap(
plotdata :: plotdata(),
x :: binary(),
y :: binary(),
color :: nil | binary(),
opts :: keyword()
) ::
VegaLite.t()
def heatmap(plotdata, x, y, color, opts \\ []) do
opts = NimbleOptions.validate!(opts, @heatmap_schema)
heatmap_specification(plotdata, x, y, color, :color, :rect, opts, @heatmap_opts)
end
@punchcard_opts Tucan.Options.take!(
[
@global_opts,
@global_mark_opts,
:x,
:y,
:size
],
heatmap_opts
)
@punchcard_schema Tucan.Options.to_nimble_schema!(@punchcard_opts)
@doc """
Returns the specification of a punch card plot.
A punch card plot is similar to a heatmap but instead of color the third
dimension is encoded by the size of bubbles.
See also `heatmap/5`.
## Options
#{Tucan.Options.docs(@punchcard_opts)}
## Examples
```tucan
Tucan.punchcard(:weather, "date", "date", "temp_max",
tooltip: true,
x: [type: :ordinal, time_unit: :date],
y: [type: :ordinal, time_unit: :month]
)
|> Tucan.Axes.set_x_title("Day")
|> Tucan.Axes.set_y_title("Month")
|> Tucan.set_title("Punch card of Avg Max Temperatures in Seattle (2012-2015)")
```
You can add a fourth dimension by coloring the plot by a fourth variable. Notice how
we use `Tucan.Scale.set_color_scheme/3` to apply a semantically reasonable coloring and
`Tucan.Legend.set_orientation/3` to change the default position of the two legends.
```tucan
Tucan.punchcard(:weather, "date", "date", "precipitation",
tooltip: true,
x: [type: :ordinal, time_unit: :date],
y: [type: :ordinal, time_unit: :month]
)
# we need to set recursive to true since this is a layered plot
|> Tucan.color_by("temp_max", aggregate: :mean, recursive: true)
|> Tucan.Scale.set_color_scheme(:redyellowblue, reverse: true)
|> Tucan.Axes.set_x_title("Day")
|> Tucan.Axes.set_y_title("Month")
|> Tucan.Legend.set_orientation(:color, "bottom")
|> Tucan.Legend.set_orientation(:size, "bottom")
```
"""
@doc section: :plots
@spec punchcard(
plotdata :: plotdata(),
x :: binary(),
y :: binary(),
size :: nil | binary(),
opts :: keyword()
) ::
VegaLite.t()
def punchcard(plotdata, x, y, size, opts \\ []) do
opts = NimbleOptions.validate!(opts, @punchcard_schema)
heatmap_specification(plotdata, x, y, size, :size, :circle, opts, @punchcard_opts)
end
defp heatmap_specification(plotdata, x, y, z, z_encoding, mark, opts, plot_opts) do
spec_opts = take_options(opts, plot_opts, :spec)
mark_opts = take_options(opts, plot_opts, :mark)
opts =
if opts[:color_scheme] do
Keyword.update!(opts, :color, fn color_opts ->
Tucan.Keyword.deep_merge([scale: [scheme: opts[:color_scheme]]], color_opts)
end)
else
opts
end
z_fn = fn vl, encoding ->
case z do
nil ->
encode(vl, encoding, opts, type: :quantitative, aggregate: :count)
field ->
encode_field(vl, encoding, field, opts,
aggregate: opts[:aggregate] || :mean,
type: :quantitative
)
end
end
base_layer =
[
Vl.new()
|> Vl.mark(mark, mark_opts)
|> encode_field(:x, x, opts, type: :nominal)
|> encode_field(:y, y, opts, type: :nominal)
|> z_fn.(z_encoding)
]
text_layer =
if opts[:annotate] do
[
Vl.new()
|> Vl.mark(:text)
|> encode_field(:x, x, opts, type: :nominal)
|> encode_field(:y, y, opts, type: :nominal)
|> z_fn.(:text)
]
else
[]
end
plotdata
|> new(spec_opts ++ [tucan: [multilayer: true]])
|> layers(base_layer ++ text_layer)
end
density_heatmap_opts = [
z: [
type: :string,
doc: """
If set corresponds to the field that will be used for calculating the color fo the
bin using the provided aggregate. If not set (the default behaviour) the count of
observations are used for coloring the bin.
"""
],
aggregate: [
type: :atom,
doc: """
The statistic that will be used for aggregating the observations within a bin. The
`z` field must be set if `aggregate` is set.
"""
]
]
@density_heatmap_opts Tucan.Options.take!(
[
@global_opts,
@global_mark_opts,
:x,
:y,
:color
],
density_heatmap_opts
)
@density_heatmap_schema Tucan.Options.to_nimble_schema!(@density_heatmap_opts)
@doc """
Draws a density heatmap.
A density heatmap is a bivariate histogram, e.g. the `x`, `y` data are binned
within rectangles that tile the plot and then the count of observations within
each rectangle is shown with the fill color.
By default the `count` of observations within each rectangle is encoded, but you
can calculate the statistic of any field and use it instead.
Density heatmaps are a powerful visualization tool that find their best use cases
in situations where you need to explore and understand the distribution and
concentration of data points in a two-dimensional space. They are particularly
effective when dealing with large datasets, allowing you to uncover patterns,
clusters, and trends that might be difficult to discern in raw data.
## Options
#{Tucan.Options.docs(@density_heatmap_opts)}
## Examples
Let's start with a default density heatmap on the penguins dataset:
```tucan
Tucan.density_heatmap(:penguins, "Beak Length (mm)", "Beak Depth (mm)")
```
You can summarize over another field:
```tucan
Tucan.density_heatmap(:penguins, "Beak Length (mm)", "Beak Depth (mm)", z: "Body Mass (g)", aggregate: :mean)
```
"""
@doc section: :plots
@spec density_heatmap(plotdata :: plotdata(), x :: binary(), y :: binary(), opts :: keyword()) ::
VegaLite.t()
def density_heatmap(plotdata, x, y, opts \\ []) do
opts = NimbleOptions.validate!(opts, @density_heatmap_schema)
spec_opts = take_options(opts, @density_heatmap_opts, :spec)
mark_opts = take_options(opts, @density_heatmap_opts, :mark)
color_fn = fn vl ->
case opts[:z] do
nil ->
encode(vl, :color, opts, type: :quantitative, aggregate: opts[:aggregate] || :count)
field ->
encode_field(vl, :color, field, opts,
aggregate: opts[:aggregate] || :count,
type: :quantitative
)
end
end
plotdata
|> new(spec_opts)
|> Vl.mark(:rect, mark_opts)
|> encode_field(:x, x, opts, type: :quantitative, bin: true)
|> encode_field(:y, y, opts, type: :quantitative, bin: true)
|> color_fn.()
end
bar_opts = [
mode: [
type: {:in, [:stacked, :normalize, :grouped]},
doc: """
The stacking mode, applied only if `:color_by` is set. Can be one of the
following:
* `:stacked` - the default one, bars are stacked
* `:normalize` - the bars are stacked are normalized
* `:grouped` - no stacking is applied, a separate bar for each category
""",
default: :stacked
]
]
@bar_opts Tucan.Options.take!(
[
@global_opts,
@global_mark_opts,
:color_by,
:orient,
:x,
:y,
:color,
:x_offset
],
bar_opts
)
@bar_schema Tucan.Options.to_nimble_schema!(@bar_opts)
@doc """
Returns the specification of a bar chart.
A bar chart is consisted by a categorical `field` and a numerical `value` field that
defines the height of the bars. You can create a grouped bar chart by setting
the `:color_by` option.
Additionally you should specify the aggregate for the `y` values, if your dataset contains
more than one values per category.
## Options
#{Tucan.Options.docs(@bar_opts)}
## Examples
A simple bar chart:
```tucan
data = [
%{"a" => "A", "b" => 28}, %{"a" => "B", "b" => 55}, %{"a" => "C", "b" => 43},
%{"a" => "D", "b" => 91}, %{"a" => "E", "b" => 81}, %{"a" => "F", "b" => 53},
%{"a" => "G", "b" => 19}, %{"a" => "H", "b" => 87}, %{"a" => "I", "b" => 52}
]
Tucan.bar(data, "a", "b")
```
You can set a `color_by` option that will create a stacked bar chart:
```tucan
Tucan.bar(:weather, "date", "date",
color_by: "weather",
tooltip: true,
x: [type: :ordinal, time_unit: :month],
y: [aggregate: :count]
)
```
If you set the mode option to `:grouped` you will instead have a different bar
per group, you can also change the orientation by setting the `:orient` flag.
Similarly you can set the mode to `:normalize` in order to have normalized
stacked bars.
```tucan
data = [
%{"category" => "A", "group" => "x", "value" => 0.1},
%{"category" => "A", "group" => "y", "value" => 0.6},
%{"category" => "A", "group" => "z", "value" => 0.9},
%{"category" => "B", "group" => "x", "value" => 0.7},
%{"category" => "B", "group" => "y", "value" => 0.2},
%{"category" => "B", "group" => "z", "value" => 1.1},
%{"category" => "C", "group" => "x", "value" => 0.6},
%{"category" => "C", "group" => "y", "value" => 0.1},
%{"category" => "C", "group" => "z", "value" => 0.2}
]
grouped =
Tucan.bar(
data, "category", "value",
color_by: "group",
mode: :grouped,
orient: :vertical
)
normalized =
Tucan.bar(
data, "category", "value",
color_by: "group",
mode: :normalize
)
Tucan.hconcat([grouped, normalized])
```
"""
@doc section: :plots
@spec bar(plotdata :: plotdata(), field :: binary(), value :: binary(), opts :: keyword()) ::
VegaLite.t()
def bar(plotdata, field, value, opts \\ []) do
opts = NimbleOptions.validate!(opts, @bar_schema)
spec_opts = take_options(opts, @bar_opts, :spec)
mark_opts = take_options(opts, @bar_opts, :mark)
y_opts =
case opts[:mode] do
:normalize -> [stack: :normalize]
_other -> []
end
|> Keyword.merge(type: :quantitative)
plotdata
|> new(spec_opts)
|> Vl.mark(:bar, mark_opts)
|> encode_field(:x, field, opts, type: :nominal, axis: [label_angle: 0])
|> encode_field(:y, value, opts, y_opts)
|> maybe_encode_field(:color, fn -> opts[:color_by] != nil end, opts[:color_by], opts, [])
|> maybe_x_offset(opts[:color_by], opts[:mode] == :grouped, opts)
|> maybe_flip_axes(opts[:orient] == :vertical)
end
@doc """
Plot the counts of observations for a categorical variable.
Takes a categorical `field` as input and generates a count plot
visualization. By default the counts are plotted on the *y-axis*
and the categorical `field` across the *x-axis*.
This is similar to `histogram/3` but specifically for a categorical
variable.
This is a simple wrapper around `bar/4` where by default the count of
observations is mapped to the `y` variable.
> #### What is a countplot? {: .tip}
>
> A countplot is a type of bar chart used in data visualization to
> display the **frequency of occurrences of categorical data**. It is
> particularly useful for visualizing the *distribution* and *frequency*
> of different categories within a dataset.
>
> In a countplot, each unique category is represented by a bar, and the
> height of the bar corresponds to the number of occurrences of that
> category in the data.
## Options
See `bar/4`
## Examples
We will use the `:titanic` dataset on the following examples. We can
plot the number of passengers by ticket class:
```tucan
Tucan.countplot(:titanic, "Pclass")
```
You can make the bars horizontal by setting the `:orient` option:
```tucan
Tucan.countplot(:titanic, "Pclass", orient: :vertical)
```
You can set `:color_by` to group it by a second variable:
```tucan
Tucan.countplot(:titanic, "Pclass", color_by: "Survived")
```
By default the bars are stacked. You can unstack them by setting the
`:mode` to `:grouped`
```tucan
Tucan.countplot(:titanic, "Pclass", color_by: "Survived", mode: :grouped)
```
"""
@doc section: :plots
@spec countplot(plotdata :: plotdata(), field :: binary(), opts :: keyword()) :: VegaLite.t()
def countplot(plotdata, field, opts \\ []) do
y_opts =
Keyword.get(opts, :y, [])
|> Keyword.merge(aggregate: :count)
opts = Keyword.put(opts, :y, y_opts)
bar(plotdata, field, field, opts)
end
defp maybe_x_offset(vl, nil, _stacked, _opts), do: vl
defp maybe_x_offset(vl, _field, false, _opts), do: vl
defp maybe_x_offset(vl, field, true, opts), do: encode_field(vl, :x_offset, field, opts)
scatter_opts = [
point_color: [
type: :string,
doc: "The color of the points",
section: :style
],
point_shape: [
type:
{:in,
[
"circle",
"square",
"cross",
"diamond",
"triangle-up",
"triangle-down",
"triangle-right",
"triangle-left"
]},
doc: "Shape of the point marks. Circle by default.",
section: :style
],
point_size: [
type: :pos_integer,
doc: """
The pixel area of the marks. Note that this value sets the area of the symbol;
the side lengths will increase with the square root of this value.
""",
section: :style
]
]
@scatter_opts Tucan.Options.take!(
[
@global_opts,
@global_mark_opts,
:filled,
:color_by,
:shape_by,
:size_by,
:x,
:y,
:color,
:shape,
:size
],
scatter_opts
)
@scatter_schema Tucan.Options.to_nimble_schema!(@scatter_opts)
@doc """
Returns the specification of a scatter plot with possibility of several semantic
groupings.
Both `x` and `y` must be `:quantitative`.
> #### Semantic groupings {: .tip}
>
> The relationship between `x` and `y` can be shown for different subsets of the
> data using the `color_by`, `size_by` and `shape_by` parameters. This is equivalent
> to calling the corresponding functions after a `scatter/4` call.
>
> These parameters control what visual semantics are used to identify the different
> subsets. It is possible to show up to three dimensions independently by using all
> three semantic types, but this style of plot can be hard to interpret and is often
> ineffective.
>
> ```tucan
> Tucan.scatter(:tips, "total_bill", "tip",
> color_by: "day",
> shape_by: "sex",
> size_by: "size"
> )
> ```
>
> The above is equivalent to calling:
>
> ```elixir
> Tucan.scatter(:tips, "total_bill", "tip")
> |> Tucan.color_by("day", type: :nominal)
> |> Tucan.shape_by("sex", type: :nominal)
> |> Tucan.size_by("size", type: :quantitative)
> ```
>
> Using redundant semantics (i.e. both color and shape for the same variable) can be
> helpful for making graphics more accessible.
>
> ```tucan
> Tucan.scatter(:tips, "total_bill", "tip",
> color_by: "day",
> shape_by: "day"
> )
> ```
## Options
#{Tucan.Options.docs(@scatter_opts)}
## Examples
> We will use the `:tips` dataset throughout the following examples.
Drawing a scatter plot between two variables:
```tucan
Tucan.scatter(:tips, "total_bill", "tip")
```
You can modify the look of the plot by setting various styling options:
```tucan
Tucan.scatter(:tips, "total_bill", "tip",
point_color: "red",
point_shape: "triangle-up",
point_size: 10
)
```
You can combine it with `color_by/3` to color code the points with respect to
another variable:
```tucan
Tucan.scatter(:tips, "total_bill", "tip")
|> Tucan.color_by("time")
```
Assigning the same variable to `shape_by/3` will also vary the markers and create a
more accessible plot:
```tucan
Tucan.scatter(:tips, "total_bill", "tip", width: 400)
|> Tucan.color_by("time")
|> Tucan.shape_by("time")
```
Assigning `color_by/3` and `shape_by/3` to different variables will vary colors and
markers independently:
```tucan
Tucan.scatter(:tips, "total_bill", "tip", width: 400)
|> Tucan.color_by("day")
|> Tucan.shape_by("time")
```
You can also color the points by a numeric variable, the semantic mapping will be
quantitative and will use a different default palette:
```tucan
Tucan.scatter(:tips, "total_bill", "tip", width: 400)
|> Tucan.color_by("size", type: :quantitative)
```
A numeric variable can also be assigned to size to apply a semantic mapping to the
areas of the points:
```tucan
Tucan.scatter(:tips, "total_bill", "tip", width: 400, tooltip: :data)
|> Tucan.color_by("size", type: :quantitative)
|> Tucan.size_by("size", type: :quantitative)
```
You can also combine it with `facet_by/3` in order to group within additional
categorical variables, and plot them across multiple subplots.
```tucan
Tucan.scatter(:tips, "total_bill", "tip", width: 300)
|> Tucan.color_by("day")
|> Tucan.shape_by("day")
|> Tucan.facet_by(:column, "time")
```
You can also apply faceting on more than one variables, both horizontally and
vertically:
```tucan
Tucan.scatter(:tips, "total_bill", "tip", width: 300)
|> Tucan.color_by("day")
|> Tucan.shape_by("day")
|> Tucan.size_by("size")
|> Tucan.facet_by(:column, "time")
|> Tucan.facet_by(:row, "sex")
```
"""
@doc section: :plots
@spec scatter(plotdata :: plotdata(), x :: binary(), y :: binary(), opts :: keyword()) ::
VegaLite.t()
def scatter(plotdata, x, y, opts \\ []) do
opts = NimbleOptions.validate!(opts, @scatter_schema)
spec_opts = take_options(opts, @scatter_opts, :spec)
mark_opts =
take_options(opts, @scatter_opts, :mark)
|> Tucan.Keyword.put_not_nil(:color, opts[:point_color])
|> Tucan.Keyword.put_not_nil(:shape, opts[:point_shape])
|> Tucan.Keyword.put_not_nil(:size, opts[:point_size])
plotdata
|> new(spec_opts)
|> Vl.mark(:point, mark_opts)
|> encode_field(:x, x, opts, type: :quantitative, scale: [zero: false])
|> encode_field(:y, y, opts, type: :quantitative, scale: [zero: false])
|> maybe_encode_field(:color, fn -> opts[:color_by] != nil end, opts[:color_by], opts,
type: :nominal
)
|> maybe_encode_field(:shape, fn -> opts[:shape_by] != nil end, opts[:shape_by], opts,
type: :nominal
)
|> maybe_encode_field(:size, fn -> opts[:size_by] != nil end, opts[:size_by], opts,
type: :quantitative
)
end
@bubble_opts Tucan.Options.take!([
@global_opts,
@global_mark_opts,
:color_by,
:x,
:y,
:size,
:color
])
@bubble_schema Tucan.Options.to_nimble_schema!(@bubble_opts)
@doc """
Returns the specification of a bubble plot.
A bubble plot is a scatter plot with a third parameter defining the size of the dots.
All `x`, `y` and `size` must be numerical data fields.
See also `scatter/4`.
## Options
#{Tucan.Options.docs(@bubble_opts)}
## Examples
```tucan
Tucan.bubble(:gapminder, "income", "health", "population", width: 400)
|> Tucan.Axes.set_x_title("Gdp per Capita")
|> Tucan.Axes.set_y_title("Life expectancy")
```
You could use a fourth variable to color the graph. As always you can set the `tooltip` in
order to make the plot interactive:
```tucan
Tucan.bubble(:gapminder, "income", "health", "population", color_by: "region", width: 400, tooltip: :data)
|> Tucan.Axes.set_x_title("Gdp per Capita")
|> Tucan.Axes.set_y_title("Life expectancy")
```
It makes more sense to use a log scale for the _x axis_:
```tucan
Tucan.bubble(:gapminder, "income", "health", "population", color_by: "region", width: 400, tooltip: :data)
|> Tucan.Axes.set_x_title("Gdp per Capita")
|> Tucan.Axes.set_y_title("Life expectancy")
|> Tucan.Scale.set_x_scale(:log)
```
"""
@doc section: :plots
@spec bubble(
plotdata :: plotdata(),
x :: field(),
y :: field(),
size :: field(),
opts :: keyword()
) :: VegaLite.t()
def bubble(plotdata, x, y, size, opts \\ []) do
opts = NimbleOptions.validate!(opts, @bubble_schema)
spec_opts = take_options(opts, @bubble_opts, :spec)
plotdata
|> new(spec_opts)
|> Vl.mark(:circle, Keyword.take(opts, [:tooltip]))
|> encode_field(:x, x, opts, type: :quantitative, scale: [zero: false])
|> encode_field(:y, y, opts, type: :quantitative, scale: [zero: false])
|> encode_field(:size, size, opts, type: :quantitative)
|> maybe_encode_field(:color, fn -> opts[:color_by] != nil end, opts[:color_by], opts,
type: :nominal
)
end
lineplot_opts = [
group_by: [
type: :string,
doc: "A field to group by the lines without affecting the style of it.",
section: :grouping
],
points: [
type: :boolean,
doc: "Whether points will be included in the chart.",
default: false,
section: :style
],
filled: [
type: :boolean,
doc: "Whether the points will be filled or not. Valid only if `:points` is set.",
default: true,
section: :style
],
line_color: [
type: :string,
doc: "The color of the line",
section: :style
]
]
@lineplot_opts Tucan.Options.take!(
[
@global_opts,
@global_mark_opts,
:interpolate,
:tension,
:color_by,
:x,
:y,
:color
],
lineplot_opts
)
@lineplot_schema Tucan.Options.to_nimble_schema!(@lineplot_opts)
@doc """
Draw a line plot between `x` and `y`
Both `x` and `y` are considered numerical variables.
## Options
#{Tucan.Options.docs(@lineplot_opts)}
## Examples
Plotting a simple line chart of Google stock price over time. Notice how we change the
`x` axis type from the default (`:quantitative`) to `:temporal` using the generic
`:x` channel configuration option:
```tucan
Tucan.lineplot(:stocks, "date", "price", x: [type: :temporal])
|> VegaLite.transform(filter: "datum.symbol==='GOOG'")
```
You could plot all stocks of the dataset with different colors by setting the `:color_by`
option. If you do not want to color lines differently, you can pass the `:group_by` option
instead of `:color_by`:
```tucan
left = Tucan.lineplot(:stocks, "date", "price", x: [type: :temporal], color_by: "symbol")
right = Tucan.lineplot(:stocks, "date", "price", x: [type: :temporal], group_by: "symbol")
Tucan.hconcat([left, right])
```
You can also overlay the points by setting the `:points` and `:filled` opts. Notice
that below we plot by year and aggregating the `y` values:
```tucan
filled_points =
Tucan.lineplot(:stocks, "date", "price",
x: [type: :temporal, time_unit: :year],
y: [aggregate: :mean],
color_by: "symbol",
points: true,
tooltip: true,
width: 300
)
stroked_points =
Tucan.lineplot(:stocks, "date", "price",
x: [type: :temporal, time_unit: :year],
y: [aggregate: :mean],
color_by: "symbol",
points: true,
filled: false,
tooltip: true,
width: 300
)
Tucan.hconcat([filled_points, stroked_points])
```
You can use various interpolation methods. Some examples follow:
```tucan
plots =
for interpolation <- ["linear", "step", "cardinal", "monotone"] do
Tucan.lineplot(:stocks, "date", "price",
x: [type: :temporal, time_unit: :year],
y: [aggregate: :mean],
color_by: "symbol",
interpolate: interpolation
)
|> Tucan.set_title(interpolation)
end
VegaLite.new(columns: 2)
|> Tucan.concat(plots)
```
"""
@doc section: :plots
@spec lineplot(plotdata :: plotdata(), x :: field(), y :: field(), opts :: keyword()) ::
VegaLite.t()
def lineplot(plotdata, x, y, opts \\ []) do
opts = NimbleOptions.validate!(opts, @lineplot_schema)
spec_opts = take_options(opts, @lineplot_opts, :spec)
mark_opts =
take_options(opts, @lineplot_opts, :mark)
|> maybe_add_point_opts(opts[:points], opts)
|> Tucan.Keyword.put_not_nil(:color, opts[:line_color])
plotdata
|> new(spec_opts)
|> Vl.mark(:line, mark_opts)
|> encode_field(:x, x, opts, type: :quantitative)
|> encode_field(:y, y, opts, type: :quantitative)
|> maybe_encode_field(:color, fn -> opts[:color_by] != nil end, opts[:color_by], opts, [])
|> maybe_encode_field(
:detail,
fn -> opts[:group_by] != nil end,
opts[:group_by],
[detail: []],
type: :nominal
)
end
defp maybe_add_point_opts(mark_opts, false, _opts), do: mark_opts
defp maybe_add_point_opts(mark_opts, true, opts) do
point_opts =
case opts[:filled] do
true -> [point: true]
false -> [point: [filled: false, fill: "white"]]
end
Keyword.merge(mark_opts, point_opts)
end
@doc """
Returns the specification of a step chart.
This is a simple wrapper around `lineplot/4` with `:interpolate` set by default
to `"step"`. If `:interpolate` is set to any of `step, step-before, step-after` it
will be used. In any other case defaults to `step`.
## Options
Check `lineplot/4`
## Examples
```tucan
Tucan.step(:stocks, "date", "price", color_by: "symbol", width: 300, x: [type: :temporal])
|> Tucan.Scale.set_y_scale(:log)
```
"""
@doc section: :plots
@spec step(plotdata :: plotdata(), x :: field(), y :: field(), opts :: keyword()) ::
VegaLite.t()
def step(plotdata, x, y, opts \\ []) do
interpolate =
case opts[:interpolate] do
step when step in ["step", "step-before", "step-after"] -> step
_other -> "step"
end
opts = Keyword.merge(opts, interpolate: interpolate)
lineplot(plotdata, x, y, opts)
end
area_opts = [
points: [
type: :boolean,
doc: "Whether points will be included in the chart.",
default: false
],
line: [
type: :boolean,
doc: "Whether the line will be included in the chart",
default: false,
dest: :mark
],
mode: [
type: {:in, [:stacked, :normalize, :streamgraph, :no_stack]},
doc: """
The stacking mode, applied only if `:color_by` is set. Can be one of the
following:
* `:stacked` - the default one, areas are stacked
* `:normalize` - the stacked charts are normalized
* `:streamgraph` - the chart is displaced around a central axis
* `:no_stack` - no stacking is applied
""",
default: :stacked
]
]
@area_opts Tucan.Options.take!(
[
@global_opts,
@global_mark_opts,
:interpolate,
:tension,
:color_by,
:x,
:y,
:color
],
area_opts
)
@area_schema Tucan.Options.to_nimble_schema!(@area_opts)
@doc """
Returns the specification of an area plot.
## Options
#{Tucan.Options.docs(@area_opts)}
## Examples
A simple area chart of Google stock price over time. Notice how we change the
`x` axis type from the default (`:quantitative`) to `:temporal` using the generic
`:x` channel configuration option:
```tucan
Tucan.area(:stocks, "date", "price", x: [type: :temporal])
|> VegaLite.transform(filter: "datum.symbol==='GOOG'")
```
You can overlay the points and/or the line:
```tucan
Tucan.area(:stocks, "date", "price", x: [type: :temporal], points: true, line: true)
|> VegaLite.transform(filter: "datum.symbol==='GOOG'")
```
If you add the `:color_by` property then the area charts are stacked by default. Below
you can see how the generic encoding options can be used in order to modify any part
of the underlying `VegaLite` specification:
```tucan
Tucan.area(:unemployment, "date", "count",
color_by: "series",
x: [type: :temporal, time_unit: :yearmonth, axis: [format: "%Y"]],
y: [aggregate: :sum],
color: [scale: [scheme: "category20b"]],
width: 300,
height: 200
)
```
You could change the mode to `:normalize` or `:streamgraph`:
```tucan
left =
Tucan.area(:unemployment, "date", "count",
color_by: "series",
mode: :normalize,
x: [type: :temporal, time_unit: :yearmonth, axis: [format: "%Y"]],
y: [aggregate: :sum]
)
|> Tucan.set_title("normalize")
right =
Tucan.area(:unemployment, "date", "count",
color_by: "series",
mode: :streamgraph,
x: [type: :temporal, time_unit: :yearmonth, axis: [format: "%Y"]],
y: [aggregate: :sum]
)
|> Tucan.set_title("streamgraph")
Tucan.hconcat([left, right])
```
Or you could disable the stacking at all:
```tucan
Tucan.area(:stocks, "date", "price",
color_by: "symbol",
mode: :no_stack,
x: [type: :temporal],
width: 400,
fill_opacity: 0.4
)
|> Tucan.Scale.set_y_scale(:log)
```
"""
@doc section: :plots
@spec area(plotdata :: plotdata(), x :: field(), y :: field(), opts :: keyword()) ::
VegaLite.t()
def area(plotdata, x, y, opts \\ []) do
opts = NimbleOptions.validate!(opts, @area_schema)
spec_opts = take_options(opts, @area_opts, :spec)
mark_opts =
take_options(opts, @area_opts, :mark)
|> Keyword.put(:point, Keyword.get(opts, :points, false))
stack =
case opts[:mode] do
:stacked -> true
:normalize -> "normalize"
:streamgraph -> "center"
:no_stack -> false
end
plotdata
|> new(spec_opts)
|> Vl.mark(:area, mark_opts)
|> encode_field(:x, x, opts, type: :quantitative)
|> encode_field(:y, y, opts, type: :quantitative, stack: stack)
|> maybe_encode_field(:color, fn -> opts[:color_by] != nil end, opts[:color_by], opts, [])
end
@doc """
Returns the specification of a streamgraph.
This is a simple wrapper around `area/4` with `:mode` set by default
to `:streamgraph`. Any value set to the `:mode` option will be ignored.
A grouping field must also be provided which will be set as `:color_by` to
the area chart.
## Options
Check `area/4`
## Examples
```tucan
Tucan.streamgraph(:stocks, "date", "price", "symbol",
width: 300,
x: [type: :temporal],
tooltip: true
)
```
"""
@doc section: :plots
@spec streamgraph(
plotdata :: plotdata(),
x :: field(),
y :: field(),
group :: field(),
opts :: keyword()
) ::
VegaLite.t()
def streamgraph(plotdata, x, y, group, opts \\ []) do
opts = Keyword.merge(opts, mode: :streamgraph, color_by: group)
area(plotdata, x, y, opts)
end
pie_opts = [
inner_radius: [
type: :integer,
doc: """
The inner radius in pixels. `0` for a pie chart, `> 0` for a donut chart. If not
set it defaults to 0
""",
dest: :mark
],
# TODO: custom validation with supported types
aggregate: [
type: :atom,
doc: "The statistic to use (if any) for aggregating values per pie slice (e.g. `:mean`).",
dest: :theta
]
]
@pie_opts Tucan.Options.take!([@global_opts, @global_mark_opts, :theta, :color], pie_opts)
@pie_schema Tucan.Options.to_nimble_schema!(@pie_opts)
@doc """
Draws a pie chart.
A pie chart is a circle divided into sectors that each represents a proportion
of the whole. The `field` specifies the data column that contains the proportions
of each category. The chart will be colored by the `category` field.
> #### Avoid using pie charts {: .warning}
>
> Despite it's popularity pie charts should rarely be used. Pie charts are best
> suited for displaying a small number of categories and can make it challenging
> to accurately compare data. They rely on angle perception, which can lead to
> misinterpretation, and lack the precision offered by other charts like bar
> charts or line charts.
>
> Instead, opt for alternatives such as bar charts for straightforward comparisons,
> stacked area charts for cumulative effects.
>
> The following example showcases the limitations of a pie chart, compared to a
> bar chart:
>
> ```tucan
> alias VegaLite, as: Vl
>
> data = [
> %{value: 30, category: "A"},
> %{value: 33, category: "B"},
> %{value: 38, category: "C"}
> ]
>
> pie = Tucan.pie(data, "value", "category")
> bar = Tucan.bar(data, "category", "value", orient: :vertical)
>
> Tucan.hconcat([pie, bar])
> |> Tucan.set_title("Pie vs Bar chart", anchor: :middle, offset: 15)
> ```
## Options
#{Tucan.Options.docs(@pie_opts)}
## Examples
```tucan
Tucan.pie(:barley, "yield", "site", aggregate: :sum, tooltip: true)
|> Tucan.facet_by(:column, "year", type: :nominal)
```
"""
@doc section: :plots
@spec pie(plotdata :: plotdata(), field :: binary(), category :: binary(), opts :: keyword()) ::
VegaLite.t()
def pie(plotdata, field, category, opts \\ []) do
opts = NimbleOptions.validate!(opts, @pie_schema)
spec_opts = take_options(opts, @pie_opts, :spec)
mark_opts = take_options(opts, @pie_opts, :mark)
theta_opts =
opts
|> take_options(@pie_opts, :theta)
|> Keyword.merge(type: :quantitative)
plotdata
|> new(spec_opts)
|> Vl.mark(:arc, mark_opts)
|> encode_field(:theta, field, opts, theta_opts)
|> encode_field(:color, category, opts)
end
@doc """
Draw a donut chart.
A donut chart is a circular visualization that resembles a pie chart but
features a hole at its center. This central hole creates a _donut_ shape,
distinguishing it from traditional pie charts.
This is a wrapper around `pie/4` that sets by default the `:inner_radius`.
## Options
See `pie/4`
## Examples
```tucan
Tucan.donut(:barley, "yield", "site", aggregate: :sum, tooltip: true)
|> Tucan.facet_by(:column, "year", type: :nominal)
```
"""
@doc section: :plots
@spec donut(plotdata :: plotdata(), field :: binary(), category :: binary(), opts :: keyword()) ::
VegaLite.t()
def donut(plotdata, field, category, opts \\ []) do
opts = Keyword.put_new(opts, :inner_radius, 50)
pie(plotdata, field, category, opts)
end
## Composite plots
pairplot_opts = [
diagonal: [
type: {:in, [:scatter, :density, :histogram]},
default: :scatter,
doc: """
The plot type to be used for the diagonal subplots. Can be one on
`:scatter`, `:density` and `:histogram`.
"""
],
plot_fn: [
type: {:fun, 3},
doc: """
An optional function for customizing the look any subplot. It expects a
function with the following signature:
```elixir
(vl :: VegaLite.t(), row :: {binary(), integer()}, column :: {binary(), integer()})
:: VegaLite.t()
```
where both `row` and `column` are tuples containing the index and field of
the current and row and column respectively.
You are free to specify any function for every cell of the grid.
"""
]
]
@pairplot_opts Tucan.Options.take!([@global_opts], pairplot_opts)
@pairplot_schema Tucan.Options.to_nimble_schema!(@pairplot_opts)
@doc """
Plot pairwise relationships in a dataset.
This function expects an array of fields to be provided. A grid will be created
where each numeric variable in `fields` will be shared across the y-axes across
a single row and the x-axes across a single column.
> #### Numerical field types {: .warning}
>
> Notice that currently `pairplot/3` works only with numerical (`:quantitative`)
> variables. If you need to create a pair plot containing other variable types
> you need to manually build the grid using the `VegaLite` concatenation operations.
## Options
#{Tucan.Options.docs(@pairplot_opts)}
Notice that if set `width` and `height` will be applied to individual sub plots. On
the other hand `title` is applied to the composite plot.
## Examples
By default a scatter plot will be drawn for all pairwise plots:
```tucan
fields = ["petal_width", "petal_length", "sepal_width", "sepal_length"]
Tucan.pairplot(:iris, fields, width: 130, height: 130)
```
You can color the points by another field in to add some semantic mapping. Notice
that you need the `recursive` option to `true` for the grouping to be applied on all
internal subplots.
```tucan
fields = ["petal_width", "petal_length", "sepal_width", "sepal_length"]
Tucan.pairplot(:iris, fields, width: 130, height: 130)
|> Tucan.color_by("species", recursive: true)
```
By specifying the `:diagonal` option you can change the default plot for the diagonal
elements to a histogram:
```tucan
fields = ["petal_width", "petal_length", "sepal_width", "sepal_length"]
Tucan.pairplot(:iris, fields, width: 130, height: 130, diagonal: :histogram)
|> Tucan.color_by("species", recursive: true)
```
Additionally you have the option to configure a `plot_fn` with which we can go crazy and
modify any part of the grid based on our needs. `plot_fn` should accept as input a `VegaLite`
struct and two tuples containing the row and column fields and indexes. In the following
example we draw differently the diagonal, the lower and the upper grid. Notice that we don't
call `color_by/3` since we color differently the plots based on their index positions.
```tucan
Tucan.pairplot(:iris, ["petal_width", "petal_length", "sepal_width", "sepal_length"],
width: 150,
height: 150,
plot_fn: fn vl, {row_field, row_index}, {col_field, col_index} ->
cond do
# For the first two diagonal elements we plot a histogram, no
row_index == col_index and row_index < 2 ->
Tucan.histogram(vl, row_field)
row_index == 2 and col_index == 2 ->
Tucan.stripplot(vl, row_field, group: "species", style: :tick)
|> Tucan.color_by("species")
|> Tucan.Axes.put_options(:y, labels: false)
# For the other diagonal plots we plot a histogram colored_by the species
row_index == col_index ->
Tucan.histogram(vl, row_field, color_by: "species")
# For the upper part of the diagram we apply a scatter plot
row_index < col_index ->
Tucan.scatter(vl, col_field, row_field)
|> Tucan.color_by("species")
# for anything else scatter plot with a quantitative color scale
# and size
true ->
Tucan.scatter(vl, col_field, row_field)
|> Tucan.size_by("petal_width", type: :quantitative)
end
end
)
```
"""
@doc section: :composite
@spec pairplot(plotdata :: plotdata(), fields :: [binary()], opts :: keyword()) :: VegaLite.t()
def pairplot(plotdata, fields, opts \\ []) when is_list(fields) do
opts = NimbleOptions.validate!(opts, @pairplot_schema)
children =
for {row_field, row_index} <- Enum.with_index(fields),
{col_field, col_index} <- Enum.with_index(fields) do
pairplot_child_spec({row_field, row_index}, {col_field, col_index}, length(fields), opts)
end
spec_opts = Keyword.take(opts, [:title]) ++ [columns: length(fields)]
plotdata
|> new(spec_opts)
|> Vl.concat(children, :wrappable)
end
defp pairplot_child_spec({row_field, row_index}, {col_field, col_index}, fields_count, opts) do
x_axis_title = fn vl, row_index ->
if row_index == fields_count - 1 do
Tucan.Axes.put_options(vl, :x, title: col_field)
else
Tucan.Axes.put_options(vl, :x, title: nil)
end
end
y_axis_title = fn vl, col_index ->
if col_index == 0 do
Tucan.Axes.put_options(vl, :y, title: row_field)
else
Tucan.Axes.put_options(vl, :y, title: nil)
end
end
spec_opts = Keyword.take(opts, [:width, :height])
Vl.new(spec_opts)
|> pairplot_child_plot(row_field, row_index, col_field, col_index, opts)
|> x_axis_title.(row_index)
|> y_axis_title.(col_index)
end
defp pairplot_child_plot(vl, row_field, row_index, col_field, col_index, opts) do
diagonal = opts[:diagonal] || :scatter
plot_fn = opts[:plot_fn]
cond do
plot_fn != nil ->
plot_fn.(vl, {row_field, row_index}, {col_field, col_index})
row_index == col_index and diagonal == :histogram ->
Tucan.histogram(vl, row_field)
row_index == col_index and diagonal == :density ->
Tucan.density(vl, row_field)
true ->
Tucan.scatter(vl, col_field, row_field)
end
end
jointplot_opts = [
width: [
type: :integer,
default: 200,
doc: """
The dimension of the central (joint) plot. The same value is used for
both the width and height of the plot.
"""
],
ratio: [
type: :float,
default: 0.45,
doc: """
The ratio of the marginal plots secondary dimension with respect to
the joint plot dimension.
"""
],
joint: [
type: {:in, [:scatter, :density_heatmap]},
default: :scatter,
doc: """
The plot type to be used for the main (joint) plot. Can be one of
`:scatter` and `:density_heatmap`.
"""
],
joint_opts: [
type: :keyword_list,
default: [],
doc: """
Arbitrary options list for the joint plot. The supported options
depend on the selected `:joint` type.
"""
],
marginal: [
type: {:in, [:histogram, :density]},
default: :histogram,
doc: """
The plot type to be used for the marginal plots. Can be one of
`:histogram` and `:density`.
"""
],
marginal_opts: [
type: :keyword_list,
default: [],
doc: """
Arbitrary options list for the marginal plots. The supported options
depend on the selected `:marginal` type.
"""
],
spacing: [
type: :pos_integer,
doc: "The spacing between the marginals and the joint plot.",
default: 15,
section: :style
]
]
@jointplot_opts Tucan.Options.take!([:width, :title, :color_by, :fill_opacity], jointplot_opts)
@jointplot_schema Tucan.Options.to_nimble_schema!(@jointplot_opts)
@doc """
Returns the specification of a jointplot.
A jointplot is a plot of two numerical variables along with marginal univariate
graphs. If no options are set the joint is a scatter plot and the marginal are
the histograms of the two variables.
> #### Marginal plots dimensions {: .info}
>
> By default a jointplot will have a square shape, e.g. it will have the same
> width and height. The `:width` option affects the width of the central (joint)
> plot.
>
> For the marginal distributions you can the `:ratio` option which specifies
> the ratio of joint axes height to marginal axes height.
## Options
#{Tucan.Options.docs(@jointplot_opts)}
## Examples
A simple joint plot between two variables.
```tucan
Tucan.jointplot(:iris, "petal_width", "petal_length", width: 200)
```
You can also pass `:color_by` to apply a semantic grouping. If set it will be
applied both to the joint and the marginal plots.
```tucan
Tucan.jointplot(
:iris, "petal_width", "petal_length",
color_by: "species",
fill_opacity: 0.5,
width: 200
)
```
You can change the type of the join plot and the marginal distributions:
```tucan
Tucan.jointplot(
:penguins, "Beak Length (mm)", "Beak Depth (mm)",
joint: :density_heatmap,
marginal: :density,
ratio: 0.3
)
```
"""
@doc section: :composite
@spec jointplot(plotdata :: plotdata(), x :: field(), y :: field(), opts :: keyword()) ::
VegaLite.t()
def jointplot(plotdata, x, y, opts \\ []) do
opts = NimbleOptions.validate!(opts, @jointplot_schema)
# TODO: maybe enable this in the future (we need to properly set the legends for this)
if opts[:joint] == :density_heatmap and opts[:color_by] do
raise ArgumentError,
"combining a density_heatmap with the :color_by option is not supported"
end
joint_opts =
opts
|> Keyword.take([:color_by, :fill_opacity])
|> Keyword.merge(opts[:joint_opts])
joint_plot = Vl.new(width: opts[:width], height: opts[:width])
joint_plot =
case opts[:joint] do
:scatter ->
scatter(joint_plot, x, y, joint_opts)
:density_heatmap ->
joint_opts = Keyword.drop(joint_opts, [:color_by])
density_heatmap(joint_plot, x, y, joint_opts)
end
marginal_dimension = ceil(opts[:ratio] * opts[:width])
marginal_opts =
Keyword.take(opts, [:color_by, :fill_opacity])
|> Tucan.Keyword.deep_merge(x: [axis: nil])
|> Tucan.Keyword.deep_merge(opts[:marginal_opts])
{marginal_x, marginal_y} =
marginal_plots(x, y, marginal_dimension, opts[:marginal], marginal_opts)
plotdata
|> new(spacing: opts[:spacing], bounds: "flush")
|> Vl.concat(
[
marginal_x,
Vl.concat(
Vl.new(spacing: opts[:spacing], bounds: "flush"),
[joint_plot, marginal_y],
:horizontal
)
],
:vertical
)
end
defp marginal_plots(x, y, dimension, type, opts) do
marginal_x =
Vl.new(height: dimension)
|> marginal_plot(x, type, opts)
marginal_y =
Vl.new(width: dimension)
|> marginal_plot(y, type, opts ++ [orient: :vertical])
{marginal_x, marginal_y}
end
defp marginal_plot(vl, x, :histogram, opts), do: histogram(vl, x, opts)
defp marginal_plot(vl, x, :density, opts), do: density(vl, x, opts)
## Grouping functions
grouping_options = [
recursive: [
type: :boolean,
default: false,
doc: """
If set the grouping function will be applied recursively in all valid sub plots. This
includes both layers and concatenated plots.
"""
]
]
@grouping_opts Keyword.keys(grouping_options)
@grouping_schema NimbleOptions.new!(grouping_options)
@doc """
Adds a `color` encoding for the given field.
## Options
#{NimbleOptions.docs(@grouping_schema)}
`opts` can also contain an arbitrary set of vega-lite supported options that
will be passed to the underlying encoding.
"""
@doc section: :grouping
@spec color_by(vl :: VegaLite.t(), field :: binary(), opts :: keyword()) :: VegaLite.t()
def color_by(vl, field, opts \\ []), do: group_by(vl, :color, field, opts)
@doc """
Adds a `shape` encoding for the given field.
## Options
#{NimbleOptions.docs(@grouping_schema)}
`opts` can also contain an arbitrary set of vega-lite supported options that
will be passed to the underlying encoding.
"""
@doc section: :grouping
@spec shape_by(vl :: VegaLite.t(), field :: binary(), opts :: keyword()) :: VegaLite.t()
def shape_by(vl, field, opts \\ []), do: group_by(vl, :shape, field, opts)
@doc """
Adds a `stroke_dash` encoding for the given field.
## Options
#{NimbleOptions.docs(@grouping_schema)}
`opts` can also contain an arbitrary set of vega-lite supported options that
will be passed to the underlying encoding.
"""
@doc section: :grouping
@spec stroke_dash_by(vl :: VegaLite.t(), field :: binary(), opts :: keyword()) :: VegaLite.t()
def stroke_dash_by(vl, field, opts \\ []), do: group_by(vl, :stroke_dash, field, opts)
@doc """
Adds a `fill` encoding for the given field.
## Options
#{NimbleOptions.docs(@grouping_schema)}
`opts` can also contain an arbitrary set of vega-lite supported options that
will be passed to the underlying encoding.
"""
@doc section: :grouping
@spec fill_by(vl :: VegaLite.t(), field :: binary(), opts :: keyword()) :: VegaLite.t()
def fill_by(vl, field, opts \\ []), do: group_by(vl, :fill, field, opts)
@doc """
Adds a `size` encoding for the given field.
By default the type of the `field` is set to `:quantitative`. You can override it in the
`opts` by setting another `:type`.
## Options
#{NimbleOptions.docs(@grouping_schema)}
`opts` can also contain an arbitrary set of vega-lite supported options that
will be passed to the underlying encoding.
"""
@doc section: :grouping
@spec size_by(vl :: VegaLite.t(), field :: binary(), opts :: keyword()) :: VegaLite.t()
def size_by(vl, field, opts \\ []),
do: group_by(vl, :size, field, [type: :quantitative] ++ opts)
defp group_by(vl, encoding, field, opts) do
{group_opts, opts} = Keyword.split(opts, @grouping_opts)
group_opts = NimbleOptions.validate!(group_opts, @grouping_schema)
case group_opts[:recursive] do
true ->
apply_recursively(vl, fn spec ->
VegaLiteUtils.encode_field_raw(spec, encoding, field, opts)
end)
_ ->
VegaLiteUtils.validate_single_view!(vl, "#{encoding}_by", [
:layers,
:concat,
:vconcat,
:hconcat
])
Vl.encode_field(vl, encoding, field, opts)
end
end
@doc """
Apply facetting on the input plot `vl` by the given `field`.
This will create multiple plots either horizontally (`:column` faceting mode),
vertically (`:row` faceting mode) or arbitrarily (`:wrapped` mode). One plot will
be created for each distinct value of the given `field`, which must be a
categorical variable.
In the case of `:wrapped` a `:columns` option should also be provided which
will determine the number of columns of the composite plot.
`opts` is an arbitrary keyword list that will be passed to the `:row` or `:column`
encoding.
> #### Facet plots {: .info}
>
> Facet plots, also known as trellis plots or small multiples, are figures made up
> of multiple subplots which have the same set of axes, where each subplot shows
> a subset of the data.
## Examples
```tucan
Tucan.scatter(:iris, "petal_width", "petal_length")
|> Tucan.facet_by(:column, "species")
|> Tucan.color_by("species")
```
With `:wrapped` mode and custom sorting:
```tucan
Tucan.density(:movies, "IMDB Rating", color_by: "Major Genre")
|> Tucan.facet_by(:wrapped, "Major Genre", columns: 4, sort: [op: :mean, field: "IMDB Rating"])
|> Tucan.Legend.set_enabled(:color, false)
|> Tucan.set_title("Density of IMDB rating by Genre", offset: 20)
```
"""
@doc section: :grouping
@spec facet_by(
vl :: VegaLite.t(),
faceting_mode :: :row | :column,
field :: binary(),
opts :: keyword()
) :: VegaLite.t()
def facet_by(vl, faceting_mode, field, opts \\ [])
def facet_by(vl, :row, field, opts) do
Vl.encode_field(vl, :row, field, opts)
end
def facet_by(vl, :column, field, opts) do
Vl.encode_field(vl, :column, field, opts)
end
def facet_by(vl, :wrapped, field, opts) do
Vl.encode_field(vl, :facet, field, opts)
end
defp apply_recursively(%VegaLite{} = vl, fun) do
put_in(vl.spec, do_apply_recursively(vl.spec, fun))
end
defp do_apply_recursively(%{"layer" => layers} = spec, fun) do
layers = do_apply_recursively(layers, fun)
Map.put(spec, "layer", layers)
end
defp do_apply_recursively(%{"vconcat" => vconcat} = spec, fun) do
vconcat = do_apply_recursively(vconcat, fun)
Map.put(spec, "vconcat", vconcat)
end
defp do_apply_recursively(%{"hconcat" => hconcat} = spec, fun) do
hconcat = do_apply_recursively(hconcat, fun)
Map.put(spec, "hconcat", hconcat)
end
defp do_apply_recursively(%{"concat" => concat} = spec, fun) do
concat = do_apply_recursively(concat, fun)
Map.put(spec, "concat", concat)
end
defp do_apply_recursively(spec, fun) when is_map(spec) do
fun.(spec)
end
defp do_apply_recursively(spec, fun) when is_list(spec) do
Enum.map(spec, fn item -> do_apply_recursively(item, fun) end)
end
## Utilities functions
line_opts = [
stroke_width: [
type: :integer,
doc: "The stroke width in pixels",
dest: :mark,
section: :style,
default: 1
],
line_color: [
type: :string,
doc: "The color of the line",
section: :style,
default: "black"
],
aggregate: [
type: :atom,
doc: "The aggregate to used for calculating the line's coordinate",
default: :mean
]
]
@line_opts Tucan.Options.take!([:color_by, :color], line_opts)
@line_schema Tucan.Options.to_nimble_schema!(@line_opts)
@doc """
Adds a vertical or horizontal ruler at the given position.
`position` can either be a number representing a coordinate of the _x/y-axis_ or a
binary representing a field. In the latter case an aggregation can also
be provided which will be used for aggregating the field distribution
to a single number. If not set defaults to `:mean`.
`axis` specifies the orientation of the line. Use `:x` for a vertical
line and `:y` for a horizontal one.
See also `vruler/3`, `hruler/3`.
## Options
#{Tucan.Options.docs(@line_opts)}
## Examples
You can add a vertical ruler to any _x-axis_ point:
```tucan
Tucan.scatter(:iris, "petal_width", "petal_length")
|> Tucan.ruler(:x, 1.1, stroke_width: 3, line_color: "blue")
|> Tucan.ruler(:x, 1.4, line_color: "green")
```
Additionally you can can add a vertical line to an aggregated value of
a data field. For example:
```tucan
Tucan.scatter(:iris, "petal_width", "petal_length")
|> Tucan.ruler(:x, "petal_width", line_color: "red")
```
You can add multiple lines for each group of the data if you pass the
`color_by` option. Also you can combine vertical with horizontal
lines.
```tucan
Tucan.scatter(:iris, "petal_width", "petal_length", color_by: "species")
|> Tucan.ruler(:x, "petal_width", color_by: "species", stroke_width: 3)
|> Tucan.ruler(:y, "petal_length", color_by: "species")
```
"""
@doc section: :utilities
@spec ruler(
vl :: VegaLite.t(),
axis :: :x | :y,
position :: number() | binary(),
opts :: keyword()
) :: VegaLite.t()
def ruler(vl, axis, position, opts) when axis in [:x, :y] do
opts = NimbleOptions.validate!(opts, @line_schema)
mark_opts =
take_options(opts, @line_opts, :mark)
|> Keyword.merge(color: opts[:line_color])
ruler =
Vl.new()
|> Vl.mark(:rule, mark_opts)
|> encode_ruler(axis, position, opts)
|> maybe_encode_field(
:color,
fn -> opts[:color_by] != nil and is_binary(position) end,
opts[:color_by],
opts,
[]
)
VegaLiteUtils.append_layers(vl, ruler)
end
@doc """
Adds a vertical line at the given `x` position.
For supported options check `line/4`.
"""
@doc section: :utilities
@spec vruler(vl :: VegaLite.t(), position :: number() | binary(), opts :: keyword()) ::
VegaLite.t()
def vruler(vl, x, opts \\ []) do
ruler(vl, :x, x, opts)
end
@doc """
Adds a horizontal line at the given `h` position.
For supported options check `line/4`.
"""
@doc section: :utilities
@spec hruler(vl :: VegaLite.t(), position :: number() | binary(), opts :: keyword()) ::
VegaLite.t()
def hruler(vl, y, opts \\ []) do
ruler(vl, :y, y, opts)
end
defp encode_ruler(vl, channel, number, _opts) when is_number(number),
do: Vl.encode(vl, channel, datum: number)
defp encode_ruler(vl, channel, field, opts) when is_binary(field) do
Vl.encode_field(vl, channel, field, type: :quantitative, aggregate: opts[:aggregate])
end
@doc """
Concatenates horizontally the given plots.
"""
@doc section: :utilities
@spec hconcat(vl :: VegaLite.t(), plots :: [VegaLite.t()]) :: VegaLite.t()
def hconcat(vl \\ Vl.new(), plots) when is_list(plots) do
VegaLite.concat(vl, plots, :horizontal)
end
@doc """
Concatenates vertically the given plots.
"""
@doc section: :utilities
@spec vconcat(vl :: VegaLite.t(), plots :: [VegaLite.t()]) :: VegaLite.t()
def vconcat(vl \\ Vl.new(), plots) when is_list(plots) do
VegaLite.concat(vl, plots, :vertical)
end
@doc """
Concatenates the given plots.
This corresponds to the general concatenation of vega-lite (wrappable).
"""
@doc section: :utilities
@spec concat(vl :: VegaLite.t(), plots :: [VegaLite.t()]) :: VegaLite.t()
def concat(vl \\ Vl.new(), plots) when is_list(plots) do
VegaLite.concat(vl, plots, :wrappable)
end
@doc """
Creates a layered plot.
This is a simple wrapper around `VegaLite.layers/2` which by default adds
the layers under an empty plot.
"""
@doc section: :utilities
@spec layers(vl :: VegaLite.t(), plots :: [VegaLite.t()]) :: VegaLite.t()
def layers(vl \\ Vl.new(), plots) do
VegaLite.layers(vl, plots)
end
@doc """
Flips the axes of the provided chart.
This works for both one dimensional and two dimensional charts. All positional channels
that are defined will be flipped.
This is used internally by plots that support setting orientation.
"""
@doc section: :utilities
@spec flip_axes(vl :: VegaLite.t()) :: VegaLite.t()
def flip_axes(vl) when is_struct(vl, VegaLite) do
axis_pairs = [{:x, :y}, {:x2, :y2}, {:x_offset, :y_offset}]
new_vl = VegaLiteUtils.drop_encoding_channels(vl, [:x, :y, :x2, :y2, :x_offset, :y_offset])
Enum.reduce(axis_pairs, new_vl, fn {left, right}, new_vl ->
new_vl
|> copy_encoding(left, right, vl)
|> copy_encoding(right, left, vl)
end)
|> maybe_flip_mark_orient()
end
defp maybe_flip_mark_orient(%VegaLite{spec: %{"mark" => %{"orient" => orient}}} = vl),
do:
update_in(vl.spec, fn spec ->
new_orient =
case orient do
"vertical" -> "horizontal"
"horizontal" -> "vertical"
end
mark_opts = Map.merge(spec["mark"], %{"orient" => new_orient})
Map.put(spec, "mark", mark_opts)
end)
defp maybe_flip_mark_orient(vl), do: vl
# copies to left channel, the right channel options from the vl_origin specification
defp copy_encoding(vl, left, right, vl_origin) do
case VegaLiteUtils.has_encoding?(vl_origin, left) do
false ->
vl
true ->
opts = VegaLiteUtils.encoding_options(vl_origin, left) || []
VegaLiteUtils.encode_raw(vl, right, opts)
end
end
## Styling functions
@doc """
Sets the plot size.
This sets both width and height at once.
"""
@doc section: :styling
@spec set_size(vl :: VegaLite.t(), width :: pos_integer(), height :: pos_integer()) ::
VegaLite.t()
def set_size(vl, width, height)
when is_struct(vl, VegaLite) and is_pos_integer(width) and is_pos_integer(height) do
vl
|> set_width(width)
|> set_height(height)
end
@doc """
Sets the width of the plot (in pixels).
"""
@doc section: :styling
@spec set_width(vl :: VegaLite.t(), width :: pos_integer()) :: VegaLite.t()
def set_width(vl, width) when is_struct(vl, VegaLite) and is_pos_integer(width) do
update_in(vl.spec, fn spec -> Map.merge(spec, %{"width" => width}) end)
end
@doc """
Sets the height of the plot (in pixels).
"""
@doc section: :styling
@spec set_height(vl :: VegaLite.t(), height :: pos_integer()) :: VegaLite.t()
def set_height(vl, height) when is_struct(vl, VegaLite) and is_pos_integer(height) do
update_in(vl.spec, fn spec -> Map.merge(spec, %{"height" => height}) end)
end
@doc """
Sets the title of the plot.
You can optionally pass any title option supported by vega-lite to customize the
style of it.
## Examples
```tucan
Tucan.scatter(:iris, "petal_width", "petal_length")
|> Tucan.set_title("My awesome plot",
color: "red",
subtitle: "with a subtitle",
subtitle_color: "green",
anchor: "start"
)
```
"""
@doc section: :styling
@spec set_title(vl :: VegaLite.t(), title :: binary(), opts :: keyword()) :: VegaLite.t()
def set_title(vl, title, opts \\ [])
when is_struct(vl, VegaLite) and is_binary(title) and is_list(opts) do
title_opts = Keyword.merge(opts, text: title)
VegaLiteUtils.put_in_spec(vl, :title, title_opts)
end
@doc """
Sets the plot's theme.
Check `Tucan.Themes` for more details on theming.
"""
@doc section: :styling
@spec set_theme(vl :: VegaLite.t(), theme :: atom()) :: VegaLite.t()
def set_theme(vl, theme) do
theme = Tucan.Themes.theme(theme)
Vl.config(vl, theme)
end
## Private functions
defp maybe_flip_axes(vl, false), do: vl
defp maybe_flip_axes(vl, true), do: flip_axes(vl)
defp take_options(opts, schema, dest) do
dest_opts =
schema
|> Enum.filter(fn {_key, opts} ->
opts[:dest] == dest
end)
|> Keyword.keys()
Keyword.take(opts, dest_opts)
end
# we use encode_field and encode instead of Vl.encode_field and Vl.encode in all
# tucan plots for the following reason:
#
# - we want to support setting custom vega-lite options on each encoding
# that may be included in the specification.
# - these options are passed in the options of the plots as encoding: [options]
# e.g. x: [...], y: []
# - by having this custom function we can ensure that:
# - the encoding options are extracted by the opts on each call and merged
# with the extra_opts the function call may set
# - if they are missing the tests will raise ensuring that we have properly
# set all possible options for each plot type
# - they are set with the proper precedence and deep merged with the extra
defp encode_field(vl, encoding, field, opts, extra_opts \\ []) do
encoding_opts = Tucan.Keyword.deep_merge(extra_opts, Keyword.fetch!(opts, encoding))
Vl.encode_field(vl, encoding, field, encoding_opts)
end
defp encode(vl, encoding, opts, extra_opts) do
encoding_opts = Tucan.Keyword.deep_merge(extra_opts, Keyword.fetch!(opts, encoding))
Vl.encode(vl, encoding, encoding_opts)
end
end