defmodule StreamData do
@moduledoc """
Functions to create and combine generators.
A generator is a `StreamData` struct. Generators can be created through the
functions exposed in this module, like `constant/1`, and by combining other
generators through functions like `bind/2`.
Similar to the `Stream` module, the functions in this module return a lazy
construct. We can get values out of a generator by enumerating the generator.
Generators always generate an infinite stream of values (which are randomized
most of the time).
For example, to get an infinite stream of integers that starts with small
integers and progressively grows the boundaries, you can use `integer/0`:
Enum.take(StreamData.integer(), 10)
#=> [-1, 0, -3, 4, -4, 5, -1, -3, 5, 8]
As you can see above, values emitted by a generator are not unique.
In many applications of generators, the longer the generator runs, the larger
the generated values will be. What "larger" means depends on the generator: for
integers it means a bigger number, while for lists it may mean a list with more
elements. This is controlled by a parameter
that we call the **generation size** (see the "Generation size" section below).
StreamData is often used to generate random values. It is also the foundation
for property-based testing. See `ExUnitProperties` for more information.
## Enumeration
Generators implement the `Enumerable` protocol. The enumeration starts with a
small generation size, which increases when the enumeration continues (up to a
fixed maximum size).
Since generators are proper streams, functions from the `Stream` module can be
used to stream values out of them. For example, to build an infinite stream of
positive even integers, you can do:
StreamData.integer()
|> Stream.filter(& &1 > 0)
|> Stream.map(& &1 * 2)
|> Enum.take(10)
#=> [4, 6, 4, 10, 14, 16, 4, 16, 36, 16]
Generators that are manipulated via the `Stream` and `Enum` modules are no
longer **shrinkable** (see the section about shrinking below). If you want
generation through the `Enumerable` protocol to be reproducible, see `seeded/2`.
## Generation size
Generators have access to a generation parameter called the **generation
size**, which is a non-negative integer. This parameter is meant to bound the
data generated by each generator in a way that is completely up to the
generator. For example, a generator that generates integers can use the `size`
parameter to generate integers inside the `-size..size` range. In a similar
way, a generator that generates lists could use this parameter to generate a
list with `0` to `size` elements. During composition, it is common for the
"parent generator" to pass the size to the composed generators.
Generators can access the generation size through the `sized/1` function and
can be resized to a fixed generation size using `resize/2`.
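For instance, one way to observe the effect of the generation size is to fix it
with `resize/2` (a sketch; the exact values generated depend on the random seed):

```elixir
# Integers bound by a small, fixed generation size: all values lie in -2..2.
small_ints = StreamData.resize(StreamData.integer(), 2)

# A sized generator whose maximum list length follows the generation size.
lists =
  StreamData.sized(fn size ->
    StreamData.list_of(StreamData.integer(), max_length: size)
  end)

Enum.take(small_ints, 5)
```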
## Shrinking
`StreamData` generators are also shrinkable. The idea behind shrinking is
to find the simplest value that respects a certain condition. For example,
during property-based tests, we use shrinking to find the integer closest to 0
or the smallest list that makes a test fail. By reporting the simplest data
structure that triggers an error, the failure becomes easier to understand
and reproduce.
Each generator has its own logic to shrink values. Those are outlined in each
generator documentation.
Note that the generation size is not related in any way to shrinking: while
intuitively one may think that shrinking just means decreasing the generation
size, in reality the shrinking rule is bound to each generated value. One way
to look at it is that shrinking a list is always the same, regardless of its
generated length.
## Special generators
Some Elixir types are implicitly converted to `StreamData` generators when
composed or used in property-based testing. These types are:
* atoms - they generate themselves. For example, `:foo` is equivalent to
`StreamData.constant(:foo)`.
* tuples of generators - they generate tuples where each value is a value
generated by the corresponding generator, exactly like described in
`tuple/1`. For example, `{StreamData.integer(), StreamData.boolean()}`
generates entries like `{10, false}`.
Note that, outside of those contexts, *these terms must be explicitly converted
to StreamData generators*: on their own, they are not full-fledged generators. For
example, atoms cannot be enumerated directly as they don't implement the `Enumerable` protocol.
However, `StreamData.constant(:foo)` is enumerable as it has been wrapped in
a `StreamData` function.
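For example, a tuple of generators can be passed directly wherever a generator
is expected when composing (a sketch; generated values depend on the seed):

```elixir
# The tuple {integer(), boolean()} is implicitly treated as a generator
# of two-element tuples when passed to a combinator such as map/2.
data =
  StreamData.map({StreamData.integer(), StreamData.boolean()}, fn {int, bool} ->
    {abs(int), bool}
  end)
```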
"""
alias StreamData.LazyTree
@typep seed() :: :rand.state()
@typep size() :: non_neg_integer()
@typep generator_fun(a) :: (seed(), size() -> LazyTree.t(a))
@typedoc """
An opaque type that represents a `StreamData` generator that generates values
of type `a`.
"""
@opaque t(a) :: %__MODULE__{generator: generator_fun(a)} | atom() | tuple()
# TODO: remove once we depend on OTP 20+ since :exs64 is deprecated.
if String.to_integer(System.otp_release()) >= 20 do
@rand_algorithm :exsp
else
@rand_algorithm :exs64
end
defstruct [:generator]
defmodule FilterTooNarrowError do
defexception [:max_consecutive_failures, :last_generated_value]
def message(exception) do
%{
max_consecutive_failures: max_consecutive_failures,
last_generated_value: last_generated_value
} = exception
max_consecutive_failures_part =
if max_consecutive_failures do
" (#{max_consecutive_failures} elements in this case)"
else
""
end
last_element_part =
case last_generated_value do
{:value, value} -> " The last element to be filtered out was: #{inspect(value)}."
:none -> ""
end
"""
too many consecutive elements#{max_consecutive_failures_part} were filtered out.
#{last_element_part} To avoid this:
* make sure the generation space contains enough values that the chance of a generated
value being filtered out is small. For example, don't generate all integers and filter
out odd ones in order to have a generator of even integers (since you'd be taking out
half the generation space).
* keep an eye on how the generation size affects the generator being filtered. For
example, you might be filtering out only a handful of values from the generation space,
but small generation sizes might make the generation space much smaller hence increasing
the probability of values that you'd filter out being generated.
* try to restructure your generator so that instead of generating many values and taking
out the ones you don't want, you instead generate values and turn all of them into
values that are suitable. For example, multiply integers by two to have a generator of
even values instead of filtering out all odd integers.
"""
end
end
defmodule TooManyDuplicatesError do
defexception [:max_tries, :remaining_to_generate, :generated]
def message(%{max_tries: max_tries, remaining_to_generate: remaining, generated: generated}) do
"too many (#{max_tries}) non-unique elements were generated consecutively. " <>
"Make sure to avoid generating from a small space of data (such as only a " <>
"handful of terms) and make sure a small generation size doesn't affect " <>
"uniqueness too heavily. There were still #{remaining} elements left to " <>
"generate, while the generated elements were:\n\n#{inspect(generated)}"
end
end
### Minimal interface
## Helpers
@compile {:inline, new: 1}
defp new(generator) when is_function(generator, 2) do
%__MODULE__{generator: generator}
end
# We support multiple types of generators through call/3: this is basically a
# poor implementation of a protocol (which we don't want to add just for
# this).
@doc false
@spec __call__(StreamData.t(a), seed(), size()) :: LazyTree.t(a) when a: term()
def __call__(data, seed, size) do
call(data, seed, size)
end
@compile {:inline, call: 3}
defp call(%__MODULE__{generator: generator}, seed, size) do
%LazyTree{} = generator.(seed, size)
end
defp call(atom, _seed, _size) when is_atom(atom) do
lazy_tree_constant(atom)
end
defp call(tuple, seed, size) when is_tuple(tuple) do
case tuple_size(tuple) do
0 ->
lazy_tree_constant({})
tuple_size ->
{trees, _seed} =
Enum.map_reduce(0..(tuple_size - 1), seed, fn index, acc ->
{seed1, seed2} = split_seed(acc)
data = elem(tuple, index)
{call(data, seed1, size), seed2}
end)
trees
|> LazyTree.zip()
|> LazyTree.map(&List.to_tuple/1)
end
end
defp call(other, _seed, _size) do
raise ArgumentError,
"expected a generator, which can be a %StreamData{} struct, an atom, " <>
"or a tuple with generators in it, but got:\n\n #{inspect(other)}\n\n" <>
"If you want to use a term as a \"constant\" generator, wrap it in a call to " <>
"StreamData.constant/1 instead."
end
## Generators
@compile {:inline, constant: 1}
@doc """
A generator that always generates the given term.
## Examples
iex> Enum.take(StreamData.constant(:some_term), 3)
[:some_term, :some_term, :some_term]
## Shrinking
This generator doesn't shrink.
"""
@spec constant(a) :: t(a) when a: var
def constant(term) do
new(fn _seed, _size -> lazy_tree_constant(term) end)
end
## Combinators
@compile {:inline, map: 2}
@doc """
Maps the given function `fun` over the given generator `data`.
Returns a new generator that returns elements from `data` after applying `fun`
to them.
## Examples
iex> data = StreamData.map(StreamData.integer(), &Integer.to_string/1)
iex> Enum.take(data, 3)
["1", "0", "3"]
## Shrinking
This generator shrinks exactly like `data`, but with `fun` mapped over the
shrunk data.
"""
@spec map(t(a), (a -> b)) :: t(b) when a: term(), b: term()
def map(data, fun) when is_function(fun, 1) do
new(fn seed, size ->
data
|> call(seed, size)
|> LazyTree.map(fun)
end)
end
@doc """
Binds each element generated by `data` to a new generator returned by
applying `fun`, or filters out the generated element.
Works similarly to `bind/2` but allows filtering out unwanted values. It takes
a generator `data` and invokes `fun` with each element generated by `data`.
`fun` must return one of:
* `{:cont, generator}` - `generator` is then used to generate the next
element
* `:skip` - the value generated by `data` is filtered out and a new element
is generated
Since this function acts as a filter as well, it behaves similarly to
`filter/3`: when more than `max_consecutive_failures` elements are filtered
out (that is, `fun` returns `:skip`), a `StreamData.FilterTooNarrowError` is
raised. See the documentation for `filter/3` for suggestions on how to avoid
such errors.
The function can accept one or two arguments. If a two-argument function is
passed, the second argument will be the number of tries left before raising
`StreamData.FilterTooNarrowError`.
## Examples
Say we wanted to create a generator that generates two-element tuples where
the first element is a list of integers with an even number of members and the
second element is a member of that list. We can do that by generating a list
and, if it has even length, taking an element out of it, otherwise filtering
it out.
require Integer
list_data = StreamData.list_of(StreamData.integer(), min_length: 1)
data =
StreamData.bind_filter(list_data, fn
list when Integer.is_even(length(list)) ->
inner_data = StreamData.bind(StreamData.member_of(list), fn member ->
StreamData.constant({list, member})
end)
{:cont, inner_data}
_odd_list ->
:skip
end)
Enum.at(data, 0)
#=> {[-6, -7, -4, 5, -9, 8, 7, -9], 5}
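The two-argument form can use the number of remaining tries to stop skipping
before `StreamData.FilterTooNarrowError` is raised. A sketch (the fix-up on
the last try is a made-up strategy for illustration):

```elixir
data =
  StreamData.bind_filter(StreamData.integer(), fn
    int, _tries_left when rem(int, 2) == 0 -> {:cont, StreamData.constant(int)}
    # On the last try, repair the odd value instead of skipping it.
    int, 1 -> {:cont, StreamData.constant(int + 1)}
    _int, _tries_left -> :skip
  end)
```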
## Shrinking
This generator shrinks like `bind/2` but values that are skipped are not used
for shrinking (similarly to how `filter/3` works).
"""
@spec bind_filter(
t(a),
(a -> {:cont, t(b)} | :skip) | (a, non_neg_integer() -> {:cont, t(b)} | :skip),
non_neg_integer()
) :: t(b)
when a: term(),
b: term()
def bind_filter(data, fun, max_consecutive_failures \\ 10)
def bind_filter(data, fun, max_consecutive_failures) when is_function(fun, 1) do
bind_filter(data, fn elem, _tries_left -> fun.(elem) end, max_consecutive_failures)
end
def bind_filter(data, fun, max_consecutive_failures)
when is_function(fun, 2) and is_integer(max_consecutive_failures) and
max_consecutive_failures >= 0 do
new(fn seed, size ->
case bind_filter(seed, size, data, fun, max_consecutive_failures) do
{:ok, lazy_tree} ->
lazy_tree
:too_many_failures ->
raise FilterTooNarrowError,
max_consecutive_failures: max_consecutive_failures,
last_generated_value: :none
{:too_many_failures, last_generated_value} ->
raise FilterTooNarrowError,
max_consecutive_failures: max_consecutive_failures,
last_generated_value: {:value, last_generated_value}
end
end)
end
defp bind_filter(_seed, _size, _data, _mapper, _tries_left = 0) do
:too_many_failures
end
defp bind_filter(seed, size, data, mapper, tries_left) do
fun = fn elem ->
mapper.(elem, tries_left)
end
{seed1, seed2} = split_seed(seed)
lazy_tree = call(data, seed1, size)
case LazyTree.filter_map(lazy_tree, fun) do
{:ok, filter_mapped_tree} ->
tree =
filter_mapped_tree
|> LazyTree.map(&call(&1, seed2, size))
|> LazyTree.flatten()
{:ok, tree}
:error when tries_left == 1 ->
{:too_many_failures, lazy_tree.root}
:error ->
bind_filter(seed2, size, data, mapper, tries_left - 1)
end
end
@compile {:inline, bind: 2}
@doc """
Binds each element generated by `data` to a new generator returned by applying `fun`.
This function is the basic mechanism for composing generators. It takes a
generator `data` and invokes `fun` with each element in `data`. `fun` must
return a new *generator* that is effectively used to generate items from
now on.
## Examples
Say we wanted to create a generator that returns two-element tuples where the
first element is a non-empty list, and the second element is a random element from that
list. To do that, we can first generate a list and then bind a function to
that list; this function will return the list and a random element from it.
StreamData.bind(StreamData.list_of(StreamData.integer(), min_length: 1), fn list ->
StreamData.bind(StreamData.member_of(list), fn elem ->
StreamData.constant({list, elem})
end)
end)
## Shrinking
The generator returned by `bind/2` shrinks by first shrinking the value
generated by the inner generator and then by shrinking the outer generator
given as `data`. When `data` shrinks, `fun` is once more applied on the
shrunk value and returns a whole new generator, which will most likely
emit new items.
"""
@spec bind(t(a), (a -> t(b))) :: t(b) when a: term(), b: term()
def bind(data, fun) when is_function(fun, 1) do
bind_filter(data, fn generated_term -> {:cont, fun.(generated_term)} end)
end
@doc """
Filters the given generator `data` according to the given `predicate` function.
Only elements generated by `data` that pass the filter are kept in the
resulting generator.
If the filter is too strict, it can happen that too few values generated by `data` satisfy it.
In case more than `max_consecutive_failures` consecutive values don't satisfy the filter, a
`StreamData.FilterTooNarrowError` will be raised. There are a few ways you can avoid risking
`StreamData.FilterTooNarrowError` errors.
* Try to make sure that your filter filters out only a small subset of the elements generated
by `data`. For example, having something like `StreamData.filter(StreamData.integer(), &(&1 !=
0))` is usually fine because only a very tiny part of the generation space (integers) is
being filtered out.
* Keep an eye on how the generation size affects the generator being filtered. For example,
take something like `StreamData.filter(StreamData.positive_integer(), &(&1 not in 1..5))`.
While it seems like this filter is not that strict (as we're filtering out only a handful of
numbers out of all natural numbers), this filter will fail with small generation sizes.
Since `positive_integer/0` generates integers in `1..size`, if `size` is small (for
example, less than 10) then the probability of generating many consecutive values in `1..5`
is high.
* Try to restructure your generator so that instead of generating many values and taking out
the ones you don't want, you instead generate values and turn all of them into values that
are suitable. A good example is a generator for even integers. You could write it as
def even_integers() do
StreamData.filter(StreamData.integer(), &Integer.is_even/1)
end
but this would generate many unused values, increasing the likelihood of
`StreamData.FilterTooNarrowError` errors and performing inefficiently. Instead, you can use
`map/2` to turn all integers into even integers:
def even_integers() do
StreamData.map(StreamData.integer(), &(&1 * 2))
end
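If a somewhat strict filter is unavoidable, the optional third argument can
raise the number of tolerated consecutive failures above the default of `25`
(a sketch; the threshold here is arbitrary):

```elixir
# Allow up to 100 consecutive filtered-out values before raising
# StreamData.FilterTooNarrowError.
multiples_of_seven =
  StreamData.filter(StreamData.integer(), &(rem(&1, 7) == 0), 100)
```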
## Shrinking
Each generated value shrinks only to values that satisfy `predicate` as
well.
"""
@spec filter(t(a), (a -> as_boolean(term())), non_neg_integer()) :: t(a) when a: term()
def filter(data, predicate, max_consecutive_failures \\ 25)
when is_function(predicate, 1) and is_integer(max_consecutive_failures) and
max_consecutive_failures >= 0 do
bind_filter_fun = fn term ->
if predicate.(term), do: {:cont, constant(term)}, else: :skip
end
bind_filter(data, bind_filter_fun, max_consecutive_failures)
end
### Rich API
@compile {:inline, integer: 1}
@doc """
Generates an integer in the given `range`.
The generation size is ignored since the integer always lies inside `range`.
## Examples
Enum.take(StreamData.integer(4..8), 3)
#=> [6, 7, 7]
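On Elixir 1.12 and later, stepped ranges are supported as well. For example,
using a range whose bounds are multiples of the step (generated values depend
on the seed):

```elixir
# Generates even integers in 0..10.
Enum.take(StreamData.integer(0..10//2), 3)
```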
## Shrinking
Shrinks towards the smallest absolute value that still lies in `range`.
"""
@spec integer(Range.t()) :: t(integer())
# Range step syntax was introduced in Elixir v1.12.0
if Version.compare(System.version(), "1.12.0") == :lt do
def integer(left..right = _range) do
{lower, upper} = order(left, right)
new(fn seed, _size ->
{init, _next_seed} = uniform_in_range(lower, upper, seed)
integer_lazy_tree(init, lower, upper)
end)
end
else
# Keep the original, somewhat more efficient implementation
# for ranges with a step of 1
def integer(%Range{first: left, last: right, step: 1} = _range) do
if left > right do
raise "cannot generate elements from an empty range"
end
new(fn seed, _size ->
{init, _next_seed} = uniform_in_range(left, right, seed)
integer_lazy_tree(init, left, right)
end)
end
def integer(%Range{first: left, last: right, step: step} = _range) do
require Integer
lower_stepless = Integer.floor_div(left, step)
upper_stepless = Integer.floor_div(right, step)
if lower_stepless > upper_stepless do
raise "cannot generate elements from an empty range"
end
fn seed, _size ->
{init, _next_seed} = uniform_in_range(lower_stepless, upper_stepless, seed)
integer_lazy_tree(init, lower_stepless, upper_stepless)
end
|> new()
|> map(fn result -> result * step end)
end
end
defp integer_lazy_tree(int, lower, upper) do
lazy_tree(int, &integer_lazy_tree(int, lower, upper, _current = int, &1, &2))
end
defp integer_lazy_tree(_int, _lower, _upper, _current, {:halt, acc}, _fun) do
{:halted, acc}
end
defp integer_lazy_tree(int, lower, upper, current, {:suspend, acc}, fun) do
{:suspended, acc, &integer_lazy_tree(int, lower, upper, current, &1, fun)}
end
defp integer_lazy_tree(int, lower, upper, current, {:cont, acc}, fun) do
case int - current do
^int ->
{:done, acc}
to_emit when to_emit >= lower and to_emit <= upper ->
lazy_tree = integer_lazy_tree(to_emit, lower, upper)
integer_lazy_tree(int, lower, upper, div(current, 2), fun.(lazy_tree, acc), fun)
_ ->
integer_lazy_tree(int, lower, upper, div(current, 2), {:cont, acc}, fun)
end
end
## Generator modifiers
@compile {:inline, resize: 2, sized: 1, scale: 2}
@doc """
Resizes the given generator `data` to have fixed generation size `new_size`.
The new generator will ignore the generation size and always use `new_size`.
See the "Generation size" section in the documentation for `StreamData` for
more information about the generation size.
## Examples
data = StreamData.resize(StreamData.integer(), 10)
Enum.take(data, 3)
#=> [4, -5, -9]
"""
@spec resize(t(a), size()) :: t(a) when a: term()
def resize(data, new_size) when is_integer(new_size) and new_size >= 0 do
new(fn seed, _size -> call(data, seed, new_size) end)
end
@doc """
Returns the generator returned by calling `fun` with the generation size.
`fun` takes the generation size and has to return a generator that can use
that size to its advantage.
See the "Generation size" section in the documentation for `StreamData` for
more information about the generation size.
## Examples
Let's build a generator that generates integers in double the range `integer/0`
does:
data = StreamData.sized(fn size ->
StreamData.resize(StreamData.integer(), size * 2)
end)
Enum.take(data, 3)
#=> [0, -1, 5]
"""
@spec sized((size() -> t(a))) :: t(a) when a: term()
def sized(fun) when is_function(fun, 1) do
new(fn seed, size ->
call(fun.(size), seed, size)
end)
end
@doc """
Scales the generation size of the given generator `data` according to
`size_changer`.
When generating data from `data`, the generation size will be the result of
calling `size_changer` with the generation size as its argument. This is
useful, for example, when a generator needs to grow faster or slower than
the default.
See the "Generation size" section in the documentation for `StreamData` for
more information about the generation size.
## Examples
Let's create a generator that generates much smaller integers than `integer/0`
when size grows. We can do this by scaling the generation size to the
logarithm of the generation size.
data = StreamData.scale(StreamData.integer(), fn size ->
trunc(:math.log(size))
end)
Enum.take(data, 3)
#=> [0, 0, -1]
Another interesting example is creating a generator with a fixed maximum
generation size. For example, say we want to generate binaries but we never
want them to be larger than 64 bytes:
small_binaries = StreamData.scale(StreamData.binary(), fn size ->
min(size, 64)
end)
"""
@spec scale(t(a), (size() -> size())) :: t(a) when a: term()
def scale(data, size_changer) when is_function(size_changer, 1) do
new(fn seed, size ->
new_size = size_changer.(size)
call(data, seed, new_size)
end)
end
@doc """
Makes the values generated by `data` not shrink.
## Examples
Let's build a generator of bytes (integers in the `0..255` range). We can
build this on top of `integer/1`, but for our purposes, it doesn't make sense for
a byte to shrink towards `0`:
byte = StreamData.unshrinkable(StreamData.integer(0..255))
Enum.take(byte, 3)
#=> [190, 181, 178]
## Shrinking
The generator returned by `unshrinkable/1` generates the same values as `data`,
but such values will not shrink.
"""
@spec unshrinkable(t(a)) :: t(a) when a: term()
def unshrinkable(data) do
new(fn seed, size ->
%LazyTree{call(data, seed, size) | children: []}
end)
end
@doc """
Calls the provided zero-argument function to generate values.
## Examples
Generating a UUID
uuid = StreamData.repeatedly(&Ecto.UUID.generate/0)
Enum.take(uuid, 3)
#=> ["2712ec5b-bc50-4b4a-8a8a-ca85d37a457b", "2092570d-8fb0-4e67-acbe-92db4c8a2bae", "1bef1fb1-8f86-46ac-a49e-3bffaa51e40b"]
Generating a unique integer
integer = StreamData.repeatedly(&System.unique_integer([:positive, :monotonic]))
Enum.take(integer, 3)
#=> [1, 2, 3]
## Shrinking
By nature, this generator is not shrinkable.
"""
@doc since: "0.6.0"
@spec repeatedly((-> returns)) :: t(returns) when returns: term()
def repeatedly(fun) when is_function(fun, 0) do
new(fn _seed, _size ->
%LazyTree{root: fun.()}
end)
end
@doc """
Makes the given generator `data` always use the same given `seed` when generating.
This function is useful when you want a generator to have a predictable generating
behaviour. It's especially useful when using a generator with the `Enumerable` protocol
since you can't set the seed specifically in that case (while you can with `check_all/3`
for example).
`seed` must be an integer.
## Examples
int = StreamData.seeded(StreamData.integer(), 10)
Enum.take(int, 3)
#=> [-1, -2, 1]
Enum.take(int, 4)
#=> [-1, -2, 1, 2]
"""
@spec seeded(t(a), integer()) :: t(a) when a: term()
def seeded(data, seed) when is_integer(seed) do
seed = new_seed({0, 0, seed})
new(fn _seed, size ->
call(data, seed, size)
end)
end
@doc """
Generates values from different generators with specified probability.
`frequencies` is a list of `{frequency, data}` pairs where `frequency` is a
positive integer and `data` is a generator. The resulting generator will generate
data from one of the generators in `frequencies`, with probability
`frequency / sum_of_frequencies`.
## Examples
Let's build a generator that returns a binary around 25% of the time and an
integer around 75% of the time. We'll use `integer/0` first so that generated values
will shrink towards integers.
ints_and_some_bins = StreamData.frequency([
{3, StreamData.integer()},
{1, StreamData.binary()},
])
Enum.take(ints_and_some_bins, 3)
#=> ["", -2, -1]
## Shrinking
Each generated value is shrunk, and then this generator shrinks towards
values generated by generators earlier in the list of `frequencies`.
"""
# Right now, it shrinks by first shrinking the generated value, and then
# shrinking towards earlier generators in "frequencies". Clojure shrinks
# towards earlier generators *first*, and then shrinks the generated value.
# An implementation that does this can be:
#
# new(fn seed, size ->
#   {frequency, next_seed} = uniform_in_range(0, sum - 1, seed)
# index = pick_index(Enum.map(frequencies, &elem(&1, 0)), frequency)
# {_frequency, data} = Enum.fetch!(frequencies, index)
#
# tree = call(data, next_seed, size)
#
# earlier_children =
# frequencies
# |> Stream.take(index)
# |> Stream.map(&call(elem(&1, 1), seed2, size))
#
#   %LazyTree{root: tree.root, children: Stream.concat(earlier_children, tree.children)}
# end)
#
@spec frequency([{pos_integer(), t(a)}]) :: t(a) when a: term()
def frequency(frequencies) when is_list(frequencies) do
sum = List.foldl(frequencies, 0, fn {frequency, _data}, acc -> acc + frequency end)
bind(integer(0..(sum - 1)), &pick_frequency(frequencies, &1))
end
defp pick_frequency([{frequency, data} | rest], int) do
if int < frequency do
data
else
pick_frequency(rest, int - frequency)
end
end
@doc """
Generates values out of one of the given `datas`.
`datas` must be a list of generators. The values generated by this generator
are values generated by generators in `datas`, chosen each time at random.
## Examples
data = StreamData.one_of([StreamData.integer(), StreamData.binary()])
Enum.take(data, 3)
#=> [-1, <<28>>, ""]
## Shrinking
The generated value will be shrunk first according to the generator that
generated it, and then this generator will shrink towards earlier generators
in `datas`.
"""
@spec one_of([t(a)]) :: t(a) when a: term()
def one_of([_ | _] = datas) do
datas = List.to_tuple(datas)
bind(integer(0..(tuple_size(datas) - 1)), fn index -> elem(datas, index) end)
end
@doc """
Generates elements taken randomly out of `enum`.
`enum` must be a non-empty and **finite** enumerable. If given an empty
enumerable, this function raises an error. If given an infinite enumerable,
this function will not terminate.
## Examples
Enum.take(StreamData.member_of([:ok, 4, "hello"]), 3)
#=> [4, 4, "hello"]
## Shrinking
This generator shrinks towards elements that appear earlier in `enum`.
"""
@spec member_of(Enumerable.t()) :: t(term())
def member_of(enum) do
enum_length = Enum.count(enum)
if enum_length == 0 do
raise "cannot generate elements from an empty enumerable"
end
map(integer(0..(enum_length - 1)), fn index -> Enum.fetch!(enum, index) end)
end
## Compound data types
@doc """
Generates lists where each value is generated by the given `data`.
Each generated list can contain duplicate elements. The length of the
generated list is bound by the generation size. If the generation size is `0`,
the empty list will always be generated. Note that the accepted options
provide finer control over the size of the generated list. See the "Options"
section below.
## Options
* `:length` - (integer or range) if an integer, the exact length the
generated lists should be; if a range, the range in which the length of
the generated lists should be. If provided, `:min_length` and
`:max_length` are ignored.
* `:min_length` - (integer) the minimum length of the generated lists.
* `:max_length` - (integer) the maximum length of the generated lists.
## Examples
Enum.take(StreamData.list_of(StreamData.binary()), 3)
#=> [[""], [], ["", "w"]]
Enum.take(StreamData.list_of(StreamData.integer(), length: 3), 3)
#=> [[0, 0, -1], [2, -1, 1], [0, 3, -3]]
Enum.take(StreamData.list_of(StreamData.integer(), max_length: 1), 3)
#=> [[1], [], []]
## Shrinking
This generator shrinks by taking elements out of the generated list and also
by shrinking the elements of the generated list. Shrinking still respects any
possible length-related option: for example, if `:min_length` is provided, all
shrunk lists will have at least `:min_length` elements.
"""
# We could have an implementation that relies on fixed_list/1 and List.duplicate/2,
# it would look like this:
#
# new(fn seed, size ->
# {length, next_seed} = uniform_in_range(0..size, seed)
# data
# |> List.duplicate(length)
# |> fixed_list()
# |> call(next_seed, size)
# |> LazyTree.map(&list_lazy_tree/1)
# |> LazyTree.flatten()
# end)
#
@spec list_of(t(a), keyword()) :: t([a]) when a: term()
def list_of(data, options) do
list_length_range_fun = list_length_range_fun(options)
new(fn seed, size ->
{min_length, max_length} = list_length_range_fun.(size)
{length, next_seed} = uniform_in_range(min_length, max_length, seed)
data
|> call_n_times(next_seed, size, length, [])
|> LazyTree.zip()
|> LazyTree.map(&list_lazy_tree(&1, min_length))
|> LazyTree.flatten()
end)
end
@doc """
Generates lists where each value is generated by the given `data`.
The same as calling `list_of/2` with `[]` as options.
"""
@spec list_of(t(a)) :: t([a]) when a: term()
def list_of(data) do
new(fn seed, size ->
{length, next_seed} = uniform_in_range(0, size, seed)
data
|> call_n_times(next_seed, size, length, [])
|> LazyTree.zip()
|> LazyTree.map(&list_lazy_tree(&1, 0))
|> LazyTree.flatten()
end)
end
defp list_length_range_fun(options) do
{min, max} =
case Keyword.fetch(options, :length) do
{:ok, length} when is_integer(length) and length >= 0 ->
{length, length}
{:ok, min..max} when min >= 0 and max >= 0 ->
order(min, max)
{:ok, other} ->
raise ArgumentError,
":length must be a non-negative integer or a range " <>
"of non-negative integers, got: #{inspect(other)}"
:error ->
min_length = Keyword.get(options, :min_length, 0)
max_length = Keyword.get(options, :max_length, :infinity)
unless is_integer(min_length) and min_length >= 0 do
raise ArgumentError,
":min_length must be a non-negative integer, got: #{inspect(min_length)}"
end
unless (is_integer(max_length) and max_length >= 0) or max_length == :infinity do
raise ArgumentError,
":max_length must be a non-negative integer or :infinity, got: #{inspect(max_length)}"
end
{min_length, max_length}
end
fn size -> {min, max |> min(size) |> max(min)} end
end
defp call_n_times(_data, _seed, _size, 0, acc) do
acc
end
defp call_n_times(data, seed, size, length, acc) do
{seed1, seed2} = split_seed(seed)
call_n_times(data, seed2, size, length - 1, [call(data, seed1, size) | acc])
end
defp list_lazy_tree(list, min_length) do
length = length(list)
if length == min_length do
lazy_tree_constant(list)
else
children =
Stream.map(0..(length - 1), fn index ->
list_lazy_tree(List.delete_at(list, index), min_length)
end)
lazy_tree(list, children)
end
end
@doc """
Generates a list of elements generated by `data` without duplicates (possibly
according to a given uniqueness function).
This generator will generate lists where each list is unique according to the
value returned by applying the given uniqueness function to each element
(similarly to how `Enum.uniq_by/2` works). If more than `:max_tries`
consecutive elements are generated that are considered duplicates according to
the uniqueness function, a
`StreamData.TooManyDuplicatesError` error is raised. For this reason, make sure
the uniqueness function doesn't map generated elements onto a small
space of values. The uniqueness function and the max number of tries can be
customized via options.
## Options
* `:uniq_fun` - (a function of arity one) a function that is called with
each generated element and whose return value is used as the value to
compare with other values for uniqueness (similarly to `Enum.uniq_by/2`).
* `:max_tries` - (non-negative integer) the maximum number of times that
this generator tries to generate the next element of the list before
giving up and raising a `StreamData.TooManyDuplicatesError` in case it
can't find a unique element to generate. Note that the generation size
often affects this: for example, a generator like
`uniq_list_of(integer(), min_length: 4)` is bound to fail at a generation
size of `1`, because `integer/0` generates in `-size..size`, which at that
size yields only three distinct values (`-1`, `0`, and `1`). Use `resize/2`
or `scale/2` to manipulate the size (for example, by enforcing a minimum
generation size of `3`) in such cases.
* `:length` - (non-negative integer) same as in `list_of/2`.
* `:min_length` - (non-negative integer) same as in `list_of/2`.
* `:max_length` - (non-negative integer) same as in `list_of/2`.
## Examples
data = StreamData.uniq_list_of(StreamData.integer())
Enum.take(data, 3)
#=> [[1], [], [2, 3, 1]]
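If the generation size is too small for the `:min_length` you require (see
the `:max_tries` option above), one way to enlarge it is with `scale/2`.
A sketch (exact generated values will vary):
data =
StreamData.integer()
|> StreamData.scale(&max(&1, 3))
|> StreamData.uniq_list_of(min_length: 4)
Enum.take(data, 1)
#=> a list containing one list of at least four unique integers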
## Shrinking
This generator shrinks like `list_of/1`, but the shrunk values are unique
according to the `:uniq_fun` option as well.
"""
@spec uniq_list_of(t(a), keyword()) :: t([a]) when a: term()
def uniq_list_of(data, options \\ []) do
uniq_fun = Keyword.get(options, :uniq_fun, & &1)
max_tries = Keyword.get(options, :max_tries, 10)
list_length_range_fun = list_length_range_fun(options)
new(fn seed, size ->
{min_length, max_length} = list_length_range_fun.(size)
{length, next_seed} = uniform_in_range(min_length, max_length, seed)
data
|> uniq_list_of(
uniq_fun,
next_seed,
size,
_seen = MapSet.new(),
max_tries,
max_tries,
length,
[]
)
|> LazyTree.zip()
|> LazyTree.map(&list_lazy_tree(&1, min_length))
|> LazyTree.flatten()
|> LazyTree.map(&Enum.uniq_by(&1, uniq_fun))
|> LazyTree.filter(&(length(&1) >= min_length))
end)
end
defp uniq_list_of(
_data,
_uniq_fun,
_seed,
_size,
seen,
_tries_left = 0,
max_tries,
remaining,
_acc
) do
raise TooManyDuplicatesError,
max_tries: max_tries,
remaining_to_generate: remaining,
generated: seen
end
defp uniq_list_of(
_data,
_uniq_fun,
_seed,
_size,
_seen,
_tries_left,
_max_tries,
_remaining = 0,
acc
) do
acc
end
defp uniq_list_of(data, uniq_fun, seed, size, seen, tries_left, max_tries, remaining, acc) do
{seed1, seed2} = split_seed(seed)
tree = call(data, seed1, size)
key = uniq_fun.(tree.root)
if MapSet.member?(seen, key) do
uniq_list_of(data, uniq_fun, seed2, size, seen, tries_left - 1, max_tries, remaining, acc)
else
uniq_list_of(
data,
uniq_fun,
seed2,
size,
MapSet.put(seen, key),
max_tries,
max_tries,
remaining - 1,
[tree | acc]
)
end
end
@doc ~S"""
Generates non-empty improper lists where elements of the list are generated
out of `first` and the improper ending out of `improper`.
## Examples
data = StreamData.nonempty_improper_list_of(StreamData.byte(), StreamData.binary())
Enum.take(data, 3)
#=> [[42], [56 | <<140, 137>>], [226 | "j"]]
## Shrinking
Shrinks towards smaller lists (that are still non-empty, having the improper
ending) and towards shrunk elements of the list and a shrunk improper
ending.
"""
@spec nonempty_improper_list_of(t(a), t(b)) :: t(nonempty_improper_list(a, b))
when a: term(),
b: term()
def nonempty_improper_list_of(first, improper) do
map({list_of(first, min_length: 1), improper}, fn
{list, ending} ->
list ++ ending
end)
end
@doc """
Generates lists of elements out of `first` with a chance of them being
improper with the improper ending taken out of `improper`.
Behaves similarly to `nonempty_improper_list_of/2` but can generate empty
lists and proper lists as well.
## Examples
data = StreamData.maybe_improper_list_of(StreamData.byte(), StreamData.binary())
Enum.take(data, 3)
#=> [[60 | "."], [], [<<212>>]]
## Shrinking
Shrinks towards smaller lists and shrunk elements in those lists, and
ultimately towards proper lists.
"""
@spec maybe_improper_list_of(t(a), t(b)) :: t(maybe_improper_list(a, b))
when a: term(),
b: term()
def maybe_improper_list_of(first, improper) do
frequency([
{2, list_of(first)},
{1, nonempty_improper_list_of(first, improper)}
])
end
@doc """
Generates a list of fixed length where each element is generated from the
corresponding generator in `data`.
## Examples
data = StreamData.fixed_list([StreamData.integer(), StreamData.binary()])
Enum.take(data, 3)
#=> [[1, <<164>>], [2, ".T"], [1, ""]]
## Shrinking
Shrinks by shrinking each element in the generated list according to the
corresponding generator. Shrunk lists never lose elements.
"""
@spec fixed_list([t(a)]) :: t([a]) when a: term()
def fixed_list(datas) when is_list(datas) do
new(fn seed, size ->
{trees, _seed} =
Enum.map_reduce(datas, seed, fn data, acc ->
{seed1, seed2} = split_seed(acc)
{call(data, seed1, size), seed2}
end)
LazyTree.zip(trees)
end)
end
@doc """
Generates tuples where each element is taken out of the corresponding
generator in the `tuple_datas` tuple.
## Examples
data = StreamData.tuple({StreamData.integer(), StreamData.binary()})
Enum.take(data, 3)
#=> [{-1, <<170>>}, {1, "<"}, {1, ""}]
## Shrinking
Shrinks by shrinking each element in the generated tuple according to the
corresponding generator.
"""
@spec tuple(tuple()) :: t(tuple())
def tuple(tuple_datas) when is_tuple(tuple_datas) do
new(fn seed, size -> call(tuple_datas, seed, size) end)
end
@doc """
Generates maps with keys from `key_data` and values from `value_data`.
Since maps require keys to be unique, this generator behaves similarly to
`uniq_list_of/2`: if more than `:max_tries` consecutive duplicate keys are
generated, it raises a `StreamData.TooManyDuplicatesError` exception.
## Options
* `:length` - (non-negative integer) same as in `list_of/2`.
* `:min_length` - (non-negative integer) same as in `list_of/2`.
* `:max_length` - (non-negative integer) same as in `list_of/2`.
## Examples
Enum.take(StreamData.map_of(StreamData.integer(), StreamData.boolean()), 3)
#=> [%{}, %{1 => false}, %{-2 => true, -1 => false}]
## Shrinking
Shrinks towards smaller maps and towards shrunk keys and values according
to the respective generators.
"""
@spec map_of(t(key), t(value), keyword()) :: t(%{optional(key) => value})
when key: term(),
value: term()
def map_of(key_data, value_data, options \\ []) do
options = Keyword.put(options, :uniq_fun, fn {key, _value} -> key end)
{key_data, value_data}
|> uniq_list_of(options)
|> map(&Map.new/1)
end
@doc """
Generates maps with fixed keys and generated values.
`data_map` is a map or keyword list of `fixed_key => data` pairs. Maps generated by this
generator will have the same keys as `data_map` and values corresponding to values generated by
the generator under those keys.
See also `optional_map/1`.
## Examples
data = StreamData.fixed_map(%{
integer: StreamData.integer(),
binary: StreamData.binary(),
})
Enum.take(data, 3)
#=> [%{binary: "", integer: 1}, %{binary: "", integer: -2}, %{binary: "R1^", integer: -3}]
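`data_map` can also be given as a keyword list; the resulting maps are the
same (generated values will vary):
data = StreamData.fixed_map(integer: StreamData.integer())
Enum.take(data, 2)
#=> a list of two maps, each with an `:integer` key and an integer value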
## Shrinking
This generator shrinks by shrinking the values of the generated map.
"""
@spec fixed_map(map() | keyword()) :: t(map())
def fixed_map(data)
def fixed_map(data_map) when is_list(data_map) or is_map(data_map) do
data_map
|> Enum.map(fn {key, value_data} -> {constant(key), value_data} end)
|> fixed_list()
|> map(&Map.new/1)
end
@doc """
Generates maps with fixed but optional keys and generated values.
`data_map` is a map or keyword list of `fixed_key => data` pairs. Maps generated by this
generator will have a subset of the keys of `data_map` and values corresponding to the values
generated by the generator under those keys.
By default, all keys are considered optional. A list of exactly which keys
are optional can be provided as the second argument (available since
StreamData 0.6.0), allowing for a map of mixed optional and required keys.
See also `fixed_map/1`.
## Examples
data = StreamData.optional_map(%{
integer: StreamData.integer(),
binary: StreamData.binary(),
})
Enum.take(data, 3)
#=> [%{binary: "", integer: 1}, %{integer: -2}, %{binary: "R1^"}]
data = StreamData.optional_map(%{
integer: StreamData.integer(),
binary: StreamData.binary(),
}, [:integer])
Enum.take(data, 3)
#=> [%{binary: ""}, %{binary: "R1^", integer: -2}, %{binary: "R2^"}]
## Shrinking
This generator shrinks by first shrinking the map by taking out keys until the map is empty, and
then by shrinking the generated values.
"""
@spec optional_map(map() | keyword(), list(any) | nil) :: t(map())
def optional_map(data, optional_keys \\ nil)
def optional_map(data, optional_keys) when is_list(data) do
optional_map(Map.new(data), optional_keys)
end
def optional_map(data_map, optional_keys) when is_map(data_map) do
keys = Map.keys(data_map)
subkeys_data = sublist(optional_keys || keys)
constant_keys =
if optional_keys do
keys -- optional_keys
else
[]
end
new(fn seed, size ->
{seed1, seed2} = split_seed(seed)
subkeys_tree = call(subkeys_data, seed1, size)
# This map contains the constant keys and the data value for the map we are going to
# generate. Since we are only going to take keys away, doing this here instead of generating
# the map first with all the keys and then taking away values saves us from generating keys
# that we are never going to use.
base_data_map = Map.take(data_map, constant_keys ++ subkeys_tree.root)
call(fixed_map(base_data_map), seed2, size)
|> LazyTree.map(fn fixed_map ->
LazyTree.map(subkeys_tree, &Map.take(fixed_map, constant_keys ++ &1))
end)
|> LazyTree.flatten()
end)
end
# Generates sublists of `list` by generating one boolean per element and
# keeping the elements whose boolean is `true` (preserving order).
defp sublist(list) do
map(list_of(boolean(), length: length(list)), fn indexes_to_keep ->
for {elem, true} <- Enum.zip(list, indexes_to_keep), do: elem
end)
end
@doc """
Generates keyword lists where values are generated by `value_data`.
Keys are always atoms.
## Examples
Enum.take(StreamData.keyword_of(StreamData.integer()), 3)
#=> [[], [sY: 1], [t: -1]]
## Shrinking
This generator shrinks equivalently to a list of key-value tuples generated by
`list_of/1`, that is, by shrinking the values in each tuple and also reducing
the size of the generated keyword list.
"""
@spec keyword_of(t(a)) :: t(keyword(a)) when a: term()
def keyword_of(value_data) do
list_of({atom(:alphanumeric), value_data})
end
@doc """
Generates sets where values are generated by `data`.
## Options
* `:max_tries` - (non-negative integer) the maximum number of times that
this generator tries to generate the next element of the set before
giving up and raising a `StreamData.TooManyDuplicatesError` in case it
can't find a unique element to generate.
## Examples
Enum.take(StreamData.mapset_of(StreamData.integer()), 3)
#=> [#MapSet<[-1]>, #MapSet<[1, 2]>, #MapSet<[-3, 2, 3]>]
## Shrinking
This generator shrinks in the same way as `uniq_list_of/2`, by removing
elements and shrinking elements as well.
"""
@spec mapset_of(t(a), keyword()) :: t(MapSet.t(a)) when a: term()
def mapset_of(data, options \\ []) do
options = Keyword.take(options, [:max_tries])
data
|> uniq_list_of(options)
|> map(&MapSet.new/1)
end
@doc """
Constrains the given `enum_data` to be non-empty.
`enum_data` must be a generator that emits enumerables, such as lists
and maps. `nonempty/1` will filter out enumerables that are empty
(`Enum.empty?/1` returns `true`).
## Examples
Enum.take(StreamData.nonempty(StreamData.list_of(StreamData.integer())), 3)
#=> [[1], [-1, 0], [2, 1, -2]]
"""
@spec nonempty(t(Enumerable.t())) :: t(Enumerable.t())
def nonempty(enum_data) do
filter(enum_data, &(not Enum.empty?(&1)))
end
@doc ~S"""
Generates trees of values generated by `leaf_data` and `subtree_fun`.
`leaf_data` generates the leaf nodes. `subtree_fun` is a function that
`tree/2` calls whenever an inner node of the tree needs to be generated. It
takes a generator `child_gen` for child nodes and returns a generator for an
inner node that uses `child_gen` to go "one level deeper" in the tree. Leaves
and inner nodes are generated with a frequency of 1:2.
This is best explained with an example. Say that we want to generate binary
trees of integers, and that we represent binary trees as either an integer (a
leaf) or a `%Branch{}` struct:
defmodule Branch do
defstruct [:left, :right]
end
Now, we can generate trees by using the `integer()` generator for the leaf
nodes and a `subtree_fun` function that generates the inner nodes (that is,
`%Branch{}` structs whose children are themselves trees):
tree_data =
StreamData.tree(StreamData.integer(), fn child_data ->
StreamData.map({child_data, child_data}, fn {left, right} ->
%Branch{left: left, right: right}
end)
end)
Enum.at(StreamData.resize(tree_data, 10), 0)
#=> %Branch{left: %Branch{left: 4, right: -1}, right: -2}
## Examples
A common example is a nested list:
data = StreamData.tree(StreamData.integer(), &StreamData.list_of/1)
Enum.at(StreamData.resize(data, 10), 0)
#=> [[], '\t', '\a', [1, 2], -3, [-7, [10]]]
A more complex example is generating data that could represent the Elixir
equivalent of a JSON document. The code below is slightly simplified
compared to the JSON spec.
scalar_generator =
StreamData.one_of([
StreamData.integer(),
StreamData.boolean(),
StreamData.string(:ascii),
StreamData.constant(nil)
])
json_generator =
StreamData.tree(scalar_generator, fn nested_generator ->
StreamData.one_of([
StreamData.list_of(nested_generator),
StreamData.map_of(StreamData.string(:ascii, min_length: 1), nested_generator)
])
end)
Enum.at(StreamData.resize(json_generator, 10), 0)
#=> [%{"#" => "5"}, true, %{"4|B" => nil, "7" => true, "yt(3y" => 4}, [[false]]]
## Shrinking
Shrinks values and shrinks towards less deep trees.
"""
@spec tree(t(a), (child_data :: t(a | b) -> t(b))) :: t(a | b) when a: term(), b: term()
def tree(leaf_data, subtree_fun) do
new(fn seed, size ->
leaf_data = resize(leaf_data, size)
{seed1, seed2} = split_seed(seed)
nodes_on_each_level = random_pseudofactors(trunc(:math.pow(size, 1.1)), seed1)
data =
Enum.reduce(nodes_on_each_level, leaf_data, fn nodes_on_this_level, data_acc ->
frequency([
{1, data_acc},
{2, resize(subtree_fun.(data_acc), nodes_on_this_level)}
])
end)
call(data, seed2, size)
end)
end
# Splits `n` into a list of random "pseudo-factors", used by `tree/2` to
# decide how many nodes to generate on each nesting level.
defp random_pseudofactors(n, _seed) when n < 2 do
[n]
end
defp random_pseudofactors(n, seed) do
{seed1, seed2} = split_seed(seed)
{factor, _seed} = :rand.uniform_s(trunc(:math.log2(n)), seed1)
if factor == 1 do
[n]
else
[factor | random_pseudofactors(div(n, factor), seed2)]
end
end
## Data types
@doc """
Generates boolean values.
## Examples
Enum.take(StreamData.boolean(), 3)
#=> [true, true, false]
## Shrinking
Shrinks towards `false`.
"""
@spec boolean() :: t(boolean())
def boolean() do
new(fn seed, _size ->
case uniform_in_range(0, 1, seed) do
{1, _} -> lazy_tree(true, [lazy_tree_constant(false)])
{0, _} -> lazy_tree_constant(false)
end
end)
end
@doc """
Generates integers bound by the generation size.
## Examples
Enum.take(StreamData.integer(), 3)
#=> [1, -1, -3]
## Shrinking
Generated values shrink towards `0`.
"""
@spec integer() :: t(integer())
def integer() do
new(fn seed, size ->
{init, _next_seed} = uniform_in_range(-size, size, seed)
integer_lazy_tree(init, -size, size)
end)
end
@doc """
Generates positive integers bound by the generation size.
## Examples
Enum.take(StreamData.positive_integer(), 3)
#=> [1, 1, 3]
## Shrinking
Generated values shrink towards `1`.
"""
@spec positive_integer() :: t(pos_integer())
def positive_integer() do
new(fn seed, size ->
size = max(size, 1)
{init, _next_seed} = uniform_in_range(1, size, seed)
integer_lazy_tree(init, 1, size)
end)
end
@doc """
Generates non-negative integers bound by the generation size.
## Examples
Enum.take(StreamData.non_negative_integer(), 3)
#=> [0, 2, 0]
## Shrinking
Generated values shrink towards `0`.
"""
@doc since: "0.6.0"
@spec non_negative_integer() :: t(non_neg_integer())
def non_negative_integer() do
new(fn seed, size ->
size = max(size, 0)
{init, _next_seed} = uniform_in_range(0, size, seed)
integer_lazy_tree(init, 0, size)
end)
end
@doc """
Generates floats according to the given `options`.
The complexity of the generated floats grows proportionally to the generation size.
## Options
* `:min` - (float) if present, the generated floats will be greater than or equal to this
value.
* `:max` - (float) if present, the generated floats will be less than or equal to this value.
If neither `:min` nor `:max` is provided, unbounded floats will be generated.
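## Examples
Since generated floats are random, exact outputs will vary:
data = StreamData.float(min: 0.0, max: 1.0)
Enum.take(data, 3)
#=> a list of three floats between `0.0` and `1.0`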
## Shrinking
Values generated by this generator will shrink towards simpler floats. Such values are not
guaranteed to shrink towards smaller or larger values (but they will never violate the `:min` or
`:max` options).
"""
@spec float(keyword()) :: t(float())
def float(options \\ []) do
case {Keyword.get(options, :min), Keyword.get(options, :max)} do
{nil, nil} ->
bind(boolean(), fn negative? ->
map(positive_float_without_bounds(), &if(negative?, do: -&1, else: &1))
end)
{min, nil} ->
map(positive_float_without_bounds(), &(&1 + min))
{nil, max} ->
map(positive_float_without_bounds(), &(-&1 + max))
{min, max} when min <= max ->
float_with_bounds(min, max)
end
end
defp positive_float_without_bounds() do
sized(fn size ->
abs_exp = min(size, 1023)
decimal_part = float_in_0_to_1(abs_exp)
bind(boolean(), fn negative_exp? ->
if negative_exp? do
decimal_part
else
int_part = power_of_two_with_zero(abs_exp)
map({decimal_part, int_part}, fn {decimal, int} -> decimal + int end)
end
end)
end)
end
defp float_with_bounds(min, max) do
sized(fn size ->
exponent_data = integer(0..min(size, 1023))
bind(exponent_data, fn exponent ->
map(float_in_0_to_1(exponent), fn float -> float * (max - min) + min end)
end)
end)
end
defp float_in_0_to_1(abs_exp) do
factor = :math.pow(2, -abs_exp)
map(power_of_two_with_zero(abs_exp), &(&1 * factor))
end
defp power_of_two_with_zero(abs_exp) do
new(fn seed, _size ->
{integer, _} = uniform_in_range(0, power_of_two(abs_exp), seed)
powers = Stream.map(abs_exp..0, &lazy_tree_constant(power_of_two(&1)))
lazy_tree(integer, Enum.concat(powers, [lazy_tree_constant(0)]))
end)
end
defp power_of_two(0), do: 1
defp power_of_two(n), do: 2 * power_of_two(n - 1)
@doc """
Generates bytes.
A byte is an integer between `0` and `255`.
## Examples
Enum.take(StreamData.byte(), 3)
#=> [102, 161, 13]
## Shrinking
Values generated by this generator shrink like integers, so towards bytes
closer to `0`.
"""
@spec byte() :: t(byte())
def byte() do
integer(0..255)
end
@doc """
Generates binaries.
The length of the generated binaries is limited by the generation size.
## Options
* `:length` - (non-negative integer) sets the exact length of the generated
binaries (same as in `list_of/2`).
* `:min_length` - (non-negative integer) sets the minimum length of the
generated binaries (same as in `list_of/2`). Ignored if `:length` is
present.
* `:max_length` - (non-negative integer) sets the maximum length of the
generated binaries (same as in `list_of/2`). Ignored if `:length` is
present.
## Examples
Enum.take(StreamData.binary(), 3)
#=> [<<1>>, "", "@Q"]
## Shrinking
Values generated by this generator shrink by becoming smaller binaries and by
having individual bytes that shrink towards `0`.
"""
@spec binary(keyword()) :: t(binary())
def binary(options \\ []) do
list_options = Keyword.take(options, [:length, :min_length, :max_length])
map(list_of(byte(), list_options), &:binary.list_to_bin/1)
end
@doc """
Generates bitstrings.
The length of the generated bitstring is limited by the generation size.
## Options
* `:length` - (non-negative integer) sets the exact length of the generated
bitstrings (same as in `list_of/2`).
* `:min_length` - (non-negative integer) sets the minimum length of the
generated bitstrings (same as in `list_of/2`). Ignored if `:length` is
present.
* `:max_length` - (non-negative integer) sets the maximum length of the
generated bitstrings (same as in `list_of/2`). Ignored if `:length` is
present.
## Examples
Enum.take(StreamData.bitstring(), 3)
#=> [<<0::size(1)>>, <<2::size(2)>>, <<5::size(3)>>]
## Shrinking
Values generated by this generator shrink by becoming smaller bitstrings and
by having the individual bits go towards `0`.
"""
@spec bitstring(keyword()) :: t(bitstring())
def bitstring(options \\ []) do
list_options = Keyword.take(options, [:length, :min_length, :max_length])
map(list_of(integer(0..1), list_options), fn bits ->
Enum.reduce(bits, <<>>, fn bit, acc -> <<bit::1, acc::bits>> end)
end)
end
@ascii_chars ?\s..?~
# "UTF-8 prohibits encoding character numbers between U+D800 and U+DFFF"
@utf8_chars [0..0xD7FF, 0xE000..0x10FFFF]
@alphanumeric_chars [?a..?z, ?A..?Z, ?0..?9]
@printable_chars [
?\n,
?\r,
?\t,
?\v,
?\b,
?\f,
?\e,
?\d,
?\a,
0x20..0x7E,
0xA0..0xD7FF,
0xE000..0xFFFD,
0x10000..0x10FFFF
]
@doc """
Generates an integer corresponding to a valid UTF-8 codepoint of the given kind.
`kind` can be:
* `:ascii` - only ASCII characters are generated. Shrinks towards lower codepoints.
* `:alphanumeric` - only alphanumeric characters (`?a..?z`, `?A..?Z`, `?0..?9`)
are generated. Shrinks towards `?a` following the order shown previously.
* `:printable` - only printable codepoints
(`String.printable?(<<codepoint::utf8>>)` returns `true`)
are generated. Shrinks towards lower codepoints.
* `:utf8` - all valid codepoints (`<<codepoint::utf8>>` does not raise)
are generated. Shrinks towards lower codepoints.
Defaults to `:utf8`.
## Examples
Enum.take(StreamData.codepoint(), 3)
#=> [889941, 349615, 1060099]
Enum.take(StreamData.codepoint(:ascii), 3)
#=> ~c"Kk:"
"""
@doc since: "0.6.0"
@spec codepoint(:ascii | :alphanumeric | :printable | :utf8) :: t(char())
def codepoint(kind \\ :utf8)
def codepoint(:ascii), do: integer(@ascii_chars)
def codepoint(:alphanumeric), do: codepoint_with_frequency(@alphanumeric_chars)
def codepoint(:printable), do: codepoint_with_frequency(@printable_chars)
def codepoint(:utf8), do: codepoint_with_frequency(@utf8_chars)
defp codepoint_with_frequency(chars_or_ranges) do
chars_or_ranges
|> Enum.map(fn
%Range{} = range ->
{Enum.count(range), integer(range)}
codepoint when is_integer(codepoint) ->
{1, constant(codepoint)}
end)
|> frequency()
end
@doc """
Generates a string of the given kind or from the given characters.
`kind_or_codepoints` can be:
* `:ascii` - strings containing only ASCII characters are generated. Such
strings shrink towards lower codepoints.
* `:alphanumeric` - strings containing only alphanumeric characters
(`?a..?z`, `?A..?Z`, `?0..?9`) are generated. Such strings shrink towards
`?a` following the order shown previously.
* `:printable` - printable strings (`String.printable?/1` returns `true`)
are generated. Such strings shrink towards lower codepoints.
* `:utf8` - valid strings (`String.valid?/1` returns `true`)
are generated. Such strings shrink towards lower codepoints. *Available
since 0.6.0.*
* a range - strings with characters from the range are generated. Such
strings shrink towards characters that appear earlier in the range.
* a list of ranges or single codepoints - strings with characters from the
ranges or codepoints are generated. Such strings shrink towards earlier
elements of the given list and towards the beginning of ranges.
## Options
See the documentation of `list_of/2` for the possible values of options.
## Examples
Enum.take(StreamData.string(:ascii), 3)
#=> ["c", "9A", ""]
Enum.take(StreamData.string(Enum.concat([?a..?c, ?l..?o])), 3)
#=> ["c", "oa", "lb"]
## Shrinking
Shrinks towards smaller strings and as described in the description of the
possible values of `kind_or_codepoints` above.
"""
@spec string(
:ascii
| :alphanumeric
| :printable
| :utf8
| Range.t()
| [Range.t() | pos_integer()]
) ::
t(String.t())
def string(kind_or_codepoints, options \\ [])
def string(atom, options) when atom in [:ascii, :alphanumeric, :printable, :utf8] do
atom
|> codepoint()
|> string_from_codepoint_data(options)
end
def string(%Range{} = codepoints_range, options) do
string_from_codepoint_data(integer(codepoints_range), options)
end
def string(codepoints, options) when is_list(codepoints) and is_list(options) do
codepoints
|> codepoint_with_frequency()
|> string_from_codepoint_data(options)
end
def string(other, _options) do
raise ArgumentError,
"unsupported string kind, has to be one of :ascii, " <>
":alphanumeric, :printable, :utf8, a range, or a list of " <>
"ranges or single codepoints, got: #{inspect(other)}"
end
defp string_from_codepoint_data(codepoint_data, options) do
codepoint_data
|> list_of(options)
|> map(&List.to_string/1)
end
@doc """
Generates atoms of various `kind`s.
`kind` can be:
* `:alphanumeric` - this generates alphanumeric atoms that don't need to be quoted when
written as literals. For example, it will generate `:foo` but not `:"foo bar"`.
* `:alias` - generates Elixir aliases like `Foo` or `Foo.Bar.Baz`.
These are some of the most common kinds of atoms usually used in Elixir applications. If you
need completely arbitrary atoms, you can use a combination of `map/2`, `String.to_atom/1`,
and string-focused generators to transform arbitrary strings into atoms:
printable_atom =
StreamData.map(
StreamData.string(:printable, max_length: 255),
&String.to_atom/1
)
Bear in mind the [system limit](http://erlang.org/doc/efficiency_guide/advanced.html#system-limits)
of 255 characters in an atom when doing so.
## Examples
Enum.take(StreamData.atom(:alphanumeric), 3)
#=> [:xF, :y, :B_]
## Shrinking
Shrinks towards smaller atoms and towards "simpler" letters (like towards only alphabet
letters).
"""
@spec atom(:alphanumeric | :alias) :: t(atom())
def atom(kind)
@unquoted_atom_characters [?a..?z, ?A..?Z, ?0..?9, ?_, ?@]
def atom(:alphanumeric) do
starting_char =
frequency([
{4, integer(?a..?z)},
{2, integer(?A..?Z)},
{1, constant(?_)}
])
# We limit the size to 254 so that adding the first character doesn't
# break the system limit of 255 chars in an atom.
rest = scale(string(@unquoted_atom_characters), &min(&1, 254))
{starting_char, rest}
|> map(fn {first, rest} -> String.to_atom(<<first, rest::binary>>) end)
|> scale_with_exponent(0.75)
end
@alias_atom_characters [?a..?z, ?A..?Z, ?0..?9, ?_]
def atom(:alias) do
sized(fn size ->
max_list_length =
size
|> min(100)
|> :math.pow(0.75)
|> Float.ceil()
|> trunc()
first_letter = integer(?A..?Z)
other_letters = string(@alias_atom_characters, max_length: trunc(255 / max_list_length))
{first_letter, other_letters}
|> map(fn {first, rest} -> <<first, rest::binary>> end)
|> list_of(length: 1..max_list_length)
|> map(&Module.concat/1)
end)
end
@doc """
Generates iolists.
Iolists are values of the `t:iolist/0` type.
## Examples
Enum.take(StreamData.iolist(), 3)
#=> [[164 | ""], [225], ["" | ""]]
## Shrinking
Shrinks towards smaller and less nested lists and towards bytes instead of
binaries.
"""
@spec iolist() :: t(iolist())
def iolist() do
iolist_or_chardata_tree(byte(), binary())
end
@doc """
Generates iodata.
Iodata are values of the `t:iodata/0` type.
## Examples
Enum.take(StreamData.iodata(), 3)
#=> [[""], <<198>>, [115, 172]]
## Shrinking
Shrinks towards less nested iodata and ultimately towards smaller binaries.
"""
@spec iodata() :: t(iodata())
def iodata() do
frequency([
{3, binary()},
{2, iolist()}
])
end
@doc """
Generates chardata.
Chardata are values of the `t:IO.chardata/0` type.
## Examples
Enum.take(StreamData.chardata(), 3)
#=> ["", [""], [12174]]
## Shrinking
Shrinks towards less nested chardata and ultimately towards smaller binaries.
"""
@doc since: "0.6.0"
@spec chardata() :: t(IO.chardata())
def chardata() do
frequency([
{3, string(:utf8)},
{2, iolist_or_chardata_tree(codepoint(:utf8), string(:utf8))}
])
end
defp iolist_or_chardata_tree(int_type, binary_type) do
# We use binaries that scale slower, otherwise we end up with iodata that
# has big binaries at many levels of nesting.
scaled_binary = scale_with_exponent(binary_type, 0.6)
improper_ending = one_of([scaled_binary, constant([])])
tree = tree(one_of([int_type, scaled_binary]), &maybe_improper_list_of(&1, improper_ending))
map(tree, &List.wrap/1)
end
@doc """
Generates any term.
The terms that this generator can generate are simple terms or compound terms. The simple terms
are:
* integers (through `integer/0`)
* binaries (through `binary/1`)
* floats (through `float/0`)
* booleans (through `boolean/0`)
* atoms (through `atom/1`)
* references (which are not shrinkable)
Compound terms are terms that contain other terms (which are generated recursively with
`term/0`):
* lists (through `list_of/2`)
* maps (through `map_of/2`)
* tuples
## Examples
Enum.take(StreamData.term(), 3)
#=> [0.5119003572251588, {{true, ""}}, :WJg]
## Shrinking
The terms generated by this generator shrink based on the generator used to create them (see the
list of possible generated terms above).
"""
@spec term() :: t(simple | [simple] | %{optional(simple) => simple} | tuple())
when simple: boolean() | integer() | binary() | float() | atom() | reference()
def term() do
ref = new(fn _seed, _size -> lazy_tree_constant(make_ref()) end)
simple_term = one_of([boolean(), integer(), binary(), float(), atom(:alphanumeric), ref])
tree(simple_term, fn leaf ->
one_of([list_of(leaf), map_of(leaf, leaf), one_to_four_element_tuple(leaf)])
end)
end
defp one_to_four_element_tuple(leaf) do
bind(integer(0..9), fn
int when int >= 6 -> {leaf, leaf, leaf}
int when int >= 3 -> {leaf, leaf}
int when int >= 1 -> {leaf}
_ -> {}
end)
end
defp scale_with_exponent(data, exponent) do
scale(data, fn size -> trunc(:math.pow(size, exponent)) end)
end
@doc """
Checks the behaviour of a given function on values generated by `data`.
This function takes a generator and a function `fun` and verifies that the
function "holds" for all generated data. `fun` is called with each generated
value and can return one of:
* `{:ok, term}` - means that the function "holds" for the given value. `term`
can be anything and will be used for internal purposes by `StreamData`.
* `{:error, term}` - means that the function doesn't hold for the given
value. `term` is the term that will be shrunk to find the minimal value
for which `fun` doesn't hold. See below for more information on shrinking.
When a value is found for which `fun` doesn't hold (returns `{:error, term}`),
`check_all/3` tries to shrink that value in order to find a minimal value that
still doesn't satisfy `fun`.
The return value of this function is one of:
* `{:ok, ok_map}` - if all generated values satisfy `fun`. `ok_map` is a map
of metadata that contains no keys for now.
* `{:error, error_map}` - if a generated value doesn't satisfy `fun`.
`error_map` is a map of metadata that contains the following keys:
* `:original_failure` - if `fun` returned `{:error, term}` for a generated
value, this key in the map will be `term`.
* `:shrunk_failure` - the value returned in `{:error, term}` by `fun`
when invoked with the smallest failing value that was generated.
* `:nodes_visited` - the number of nodes (a positive integer) visited in
the shrinking tree in order to find the smallest value. See also the
`:max_shrinking_steps` option.
* `:successful_runs` - the number of successful runs before a failing value was found.
## Options
This function takes the following options:
* `:initial_seed` - three-element tuple with three integers that is used as
the initial random seed that drives the random generation. This option is
required.
* `:initial_size` - (non-negative integer) the initial generation size used
to start generating values. The generation size is then incremented by `1`
on each iteration. See the "Generation size" section of the module
documentation for more information on generation size. Defaults to `1`.
* `:max_runs` - (non-negative integer) the total number of elements to
generate out of `data` and check through `fun`. Defaults to `100`.
* `:max_run_time` - (non-negative integer) the total amount of time (in milliseconds)
to run a given check for. This is not used by default, so unless a value
is given, then the length of the check will be determined by `:max_runs`.
If both `:max_runs` and `:max_run_time` are given, then the check will finish at
whichever comes first, `:max_runs` or `:max_run_time`.
* `:max_shrinking_steps` - (non-negative integer) the maximum numbers of
shrinking steps to perform in case `check_all/3` finds an element that
doesn't satisfy `fun`. Defaults to `100`.
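For instance, to bound a check by wall-clock time rather than by a number of
runs, you can set `:max_runs` to `:infinity` and pass `:max_run_time`. This is
a sketch; the one-second budget is an arbitrary choice:

```elixir
options = [
  initial_seed: :os.timestamp(),
  max_runs: :infinity,
  max_run_time: 1_000
]

# Every integer satisfies this check, so the check runs until the time
# budget is exhausted and then reports success.
StreamData.check_all(StreamData.integer(), options, fn int ->
  if is_integer(int), do: {:ok, nil}, else: {:error, int}
end)
#=> {:ok, %{}}
```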
## Examples
Let's try out a contrived example: we want to verify that the `integer/0`
generator never generates nonzero multiples of `11`. This verification is
broken by design because `integer/0` is likely to generate such multiples at
some point, but it shows the capabilities of `check_all/3`. For the sake of
the example, let's say we want the values that fail to be represented as
strings instead of the original integers that failed. We can implement what
we described like this:
options = [initial_seed: :os.timestamp()]
{:error, metadata} = StreamData.check_all(StreamData.integer(), options, fn int ->
if int == 0 or rem(int, 11) != 0 do
{:ok, nil}
else
{:error, Integer.to_string(int)}
end
end)
metadata.nodes_visited
#=> 7
metadata.original_failure
#=> 22
metadata.shrunk_failure
#=> 11
As we can see, the function we passed to `check_all/3` "failed" for `int =
22`, and `check_all/3` was able to shrink this value to the smallest failing
value, which in this case is `11`.
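When every generated value satisfies `fun`, the check instead stops after
`:max_runs` values and returns `{:ok, ok_map}`, where `ok_map` currently
contains no keys. For example:

```elixir
options = [initial_seed: :os.timestamp(), max_runs: 50]

# The square of an integer is never negative, so this check always passes.
StreamData.check_all(StreamData.integer(), options, fn int ->
  if int * int >= 0, do: {:ok, nil}, else: {:error, int}
end)
#=> {:ok, %{}}
```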
"""
@spec check_all(t(a), Keyword.t(), (a -> {:ok, term()} | {:error, b})) ::
{:ok, map()} | {:error, map()}
when a: term(),
b: term()
def check_all(data, options, fun) when is_list(options) and is_function(fun, 1) do
seed = new_seed(Keyword.fetch!(options, :initial_seed))
size = Keyword.get(options, :initial_size, 1)
max_shrinking_steps = Keyword.get(options, :max_shrinking_steps, 100)
start_time = System.system_time(:millisecond)
config = %{max_shrinking_steps: max_shrinking_steps}
config =
case {Keyword.get(options, :max_runs), Keyword.get(options, :max_run_time, :infinity)} do
{:infinity, :infinity} ->
raise ArgumentError,
"both the :max_runs and :max_run_time options are set to :infinity. " <>
"This would result in an infinite loop. Be sure to set at least one of " <>
"these options to an integer to avoid this error. Note that :max_run_time " <>
"defaults to :infinity, so if you set \"max_runs: :infinity\" then you need " <>
"to explicitly set :max_run_time to an integer."
{max_runs, :infinity} ->
Map.put(config, :max_runs, max_runs || 100)
{:infinity, max_run_time} ->
Map.put(config, :max_end_time, start_time + max_run_time)
{max_runs, max_run_time} ->
Map.merge(config, %{max_end_time: start_time + max_run_time, max_runs: max_runs})
end
check_all(data, seed, size, fun, _runs = 0, start_time, config)
end
defp check_all(_data, _seed, _size, _fun, _runs, current_time, %{max_end_time: end_time})
when current_time >= end_time do
{:ok, %{}}
end
defp check_all(_data, _seed, _size, _fun, runs, _current_time, %{max_runs: runs}) do
{:ok, %{}}
end
defp check_all(data, seed, size, fun, runs, _current_time, config) do
{seed1, seed2} = split_seed(seed)
%LazyTree{root: root, children: children} = call(data, seed1, size)
case fun.(root) do
{:ok, _term} ->
check_all(data, seed2, size + 1, fun, runs + 1, System.system_time(:millisecond), config)
{:error, reason} ->
shrinking_result =
shrink_failure(shrink_initial_cont(children), nil, reason, fun, 0, config)
|> Map.put(:original_failure, reason)
|> Map.put(:successful_runs, runs)
{:error, shrinking_result}
end
end
defp shrink_initial_cont(nodes) do
&Enumerable.reduce(nodes, &1, fn elem, acc -> {:suspend, [elem | acc]} end)
end
defp shrink_failure(_cont, _parent_cont, smallest, _fun, nodes_visited, %{
max_shrinking_steps: nodes_visited
}) do
%{shrunk_failure: smallest, nodes_visited: nodes_visited}
end
# We try to get the next node out of the current nodes. If the current nodes
# are exhausted, we check whether we have a parent continuation to fall back
# to: if we do, it means we were visiting the children of a node and that
# node has no more children, so we recurse on the siblings of that node.
# Otherwise, we return the smallest failure found so far. If we do get the
# next node out of the current nodes, we check whether it also fails: if it
# does, we "go down" and recurse on the children of that node, keeping the
# sibling continuation around so that we can resume from the siblings if the
# children all pass. If it doesn't fail, we move on to its siblings.
defp shrink_failure(cont, parent_cont, smallest, fun, nodes_visited, config) do
case cont.({:cont, []}) do
# If this list of nodes is over, we backtrack to the parent nodes and
# keep shrinking.
{state, _acc} when state in [:halted, :done] and not is_nil(parent_cont) ->
shrink_failure(parent_cont, nil, smallest, fun, nodes_visited, config)
# If this list of nodes is over and we don't have parent nodes, we
# return what we have now.
{state, _acc} when state in [:halted, :done] ->
%{shrunk_failure: smallest, nodes_visited: nodes_visited}
{:suspended, [child], cont} ->
case fun.(child.root) do
# If this child passes the property, we don't go down this node's tree
# anymore and move on to its siblings (`cont` is the enumerable representing
# the rest of the children).
{:ok, _term} ->
shrink_failure(cont, parent_cont, smallest, fun, nodes_visited + 1, config)
# If this child still fails, we update the smallest failure to this failure.
# Then, we go down to the children of this node and update the parent continuations
# so that if we encounter a success in the children then we can resume from the
# siblings of this child.
{:error, reason} ->
shrink_failure(
shrink_initial_cont(child.children),
cont,
reason,
fun,
nodes_visited + 1,
config
)
end
end
end
defp new_seed({int1, int2, int3} = tuple)
when is_integer(int1) and is_integer(int2) and is_integer(int3) do
:rand.seed_s(@rand_algorithm, tuple)
end
defp new_seed({_, _} = exported_seed) do
:rand.seed_s(exported_seed)
end
@compile {
:inline,
split_seed: 1, order: 2, uniform_in_range: 3, lazy_tree: 2, lazy_tree_constant: 1
}
defp split_seed(seed) do
{int, seed} = :rand.uniform_s(1_000_000_000, seed)
new_seed = :rand.seed_s(@rand_algorithm, {int, 0, 0})
{new_seed, seed}
end
defp order(left, right) when left > right, do: {right, left}
defp order(left, right), do: {left, right}
defp uniform_in_range(left, right, seed) do
{random_int, next_seed} = :rand.uniform_s(right - left + 1, seed)
{random_int - 1 + left, next_seed}
end
defp lazy_tree(root, children) do
%LazyTree{root: root, children: children}
end
defp lazy_tree_constant(term) do
%LazyTree{root: term}
end
# This is the implementation of Enumerable.reduce/3. It's here because it
# needs split_seed/1 and call/3 which are private.
@doc false
def __reduce__(data, acc, fun) do
reduce(data, acc, fun, new_seed(:os.timestamp()), _initial_size = 1, _max_size = 100)
end
defp reduce(_data, {:halt, acc}, _fun, _seed, _size, _max_size) do
{:halted, acc}
end
defp reduce(data, {:suspend, acc}, fun, seed, size, max_size) do
{:suspended, acc, &reduce(data, &1, fun, seed, size, max_size)}
end
defp reduce(data, {:cont, acc}, fun, seed, size, max_size) do
{seed1, seed2} = split_seed(seed)
%LazyTree{root: next} = call(data, seed1, size)
reduce(data, fun.(next, acc), fun, seed2, min(max_size, size + 1), max_size)
end
## Enumerable
defimpl Enumerable do
def reduce(data, acc, fun), do: @for.__reduce__(data, acc, fun)
def count(_data), do: {:error, __MODULE__}
def member?(_data, _term), do: {:error, __MODULE__}
def slice(_data), do: {:error, __MODULE__}
end
## Inspect
defimpl Inspect do
def inspect(%StreamData{generator: generator}, opts) do
case @protocol.inspect(generator, opts) do
"#Function<" <> rest -> "#StreamData<" <> rest
other -> "#StreamData<#{other}>"
end
end
end
end