# Changelog
## v1.0.0-rc.0 (2026-06-08)
This is the first release candidate for evision **v1.0.0**, the first major
version. Two headline changes land together:
- evision now targets **OpenCV 5.0.0**. This is a breaking upgrade (C++17, the
legacy C API removed, modules reorganised, new core element types, and a
rewritten DNN engine); see OpenCV's release notes for the upstream details.
- **`Evision.Backend` is now a working Nx backend**, implementing 72 of the 85
`Nx.Backend` callbacks with results verified against
`Nx.BinaryBackend`.
### Added
- **`Evision.Backend` is now a working Nx backend
([#48](https://github.com/cocoa-xu/evision/issues/48)).** 72 of the 85
`Nx.Backend` callbacks are implemented, each verified element-for-element
against `Nx.BinaryBackend` across dtypes, ranks, axes, and broadcasting, so
`cv::Mat` can back `Nx` tensors for most workflows. Newly implemented
callbacks include:
  - Reductions: `sum`, `product`, `reduce_max`, `reduce_min`, `all`, `any`, and
    `argmax`/`argmin` (backed by `cv::reduceArgMax`/`reduceArgMin`, with Nx
    tie-break semantics).
  - Cumulative ops: `cumulative_sum`, `cumulative_product`, `cumulative_min`,
    `cumulative_max`.
  - Sorting: `sort` and `argsort` along any axis, stable in both directions,
    including the wide-integer depths (`u32`/`s64`/`u64`) that `cv::sort`
    rejects.
  - Indexing: `take`, `gather`, `indexed_add`, and `indexed_put`, with
    Nx-compatible bounds checking.
  - Elementwise unary math and `select`.
  - `init/1` (required by the `Nx.Backend` behaviour since Nx 0.7) and scalar
    (0-dimensional) tensors in `from_binary`.
- OpenCV 5.0's new native depths are mapped to Nx dtypes: `CV_64S` to `{:s, 64}`,
`CV_32U` to `{:u, 32}`, `CV_64U` to `{:u, 64}`, and `CV_16BF` to `{:bf, 16}`.
64-bit values now round-trip through `cv::Mat` losslessly (the old s64-to-s32
downcast that truncated values above 2^31 is gone), and `Evision.Mat.at/2`
returns full-width 64-bit values.
- Haar/HOG parity: `Evision.CascadeClassifier` and `Evision.HOGDescriptor` build
again via the contrib `xobjdetect` module, where OpenCV 5.0 moved them.
- `mix evision.backend.bench`, a benchmark task for the Evision backend with an
optional Torchx comparison.
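
A minimal sketch of what the backend support looks like in use, assuming a project that already depends on `:nx` and on this evision release. The tensor values are illustrative, and the comparison against `Nx.BinaryBackend` only mirrors the verification approach described above; it is not taken from evision's test suite.

```elixir
# Assumes :nx and :evision (this release) are available as dependencies.
# Allocate a tensor whose storage is a cv::Mat via Evision.Backend.
t = Nx.tensor([[3, 1, 2], [9, 7, 8]], type: {:s, 64}, backend: Evision.Backend)

# Run two of the newly implemented callbacks on the Evision backend.
sorted = Nx.sort(t, axis: 1)
summed = Nx.sum(t, axes: [0])

# Recompute the same results on Nx.BinaryBackend as a reference.
ref = Nx.backend_copy(t, Nx.BinaryBackend)

sort_matches? = Nx.to_flat_list(sorted) == Nx.to_flat_list(Nx.sort(ref, axis: 1))
sum_matches? = Nx.to_flat_list(summed) == Nx.to_flat_list(Nx.sum(ref, axes: [0]))

IO.inspect({sort_matches?, sum_matches?})
```

Callbacks that are not yet implemented can still be reached by copying the tensor to another backend first, as done for `ref` here.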
### Changed
- Uses OpenCV 5.0.0.
- Requires Nx `~> 0.12.1`.
- Module reorganisation follows OpenCV 5.0. Most classes keep their `Evision.*`
names, but a few feature detectors moved to the contrib `xfeatures2d` module
and are now under `Evision.XFeatures2D.*`: `AKAZE`, `KAZE`,
`AgastFeatureDetector`, and `BRISK`.
- Multi-channel `raw_type` codes changed. OpenCV 5.0 bumped `CV_CN_SHIFT` from 3
to 5, so a multi-channel `Mat`'s integer type code differs (for example
`cv_8UC3` is now 64, was 16). `Evision.Constant.cv_8UC3/0` and friends compute
the correct 5.0 values; code that hardcoded these numbers must be updated.
- `Evision.VideoWriter.write/2` now returns a boolean. OpenCV 5.0 changed
`cv::VideoWriter::write` to return a success flag instead of `void`, so the
call no longer returns the writer. Code that threaded the returned writer
through successive calls should keep the original writer handle and call
`write/2` on it for each frame.
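
The `raw_type` change above follows from how OpenCV packs depth and channel count into one integer. The sketch below reproduces that arithmetic in plain Elixir; `make_type` is a hypothetical helper written for illustration (modelled on OpenCV's `CV_MAKETYPE` macro), not an evision function.

```elixir
import Bitwise

# Hypothetical helper modelled on OpenCV's CV_MAKETYPE macro:
#   type = depth + ((channels - 1) << CV_CN_SHIFT)
make_type = fn depth, channels, cn_shift ->
  depth + ((channels - 1) <<< cn_shift)
end

# Depth code for CV_8U.
cv_8u = 0

# CV_CN_SHIFT was 3 in OpenCV 4.x and is 5 in OpenCV 5.0.
old_code = make_type.(cv_8u, 3, 3)
new_code = make_type.(cv_8u, 3, 5)

IO.inspect({old_code, new_code})
# => {16, 64}
```

In application code, calling `Evision.Constant.cv_8UC3/0` avoids hardcoding either number.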
### Removed
- OpenCV 4.x support and the multi-version selection mechanism. evision targets
OpenCV 5.0.x only.
- The DNN Darknet and Caffe importers (removed upstream in 5.0). Convert
affected models to ONNX and load them with `Evision.DNN.readNetFromONNX/1` instead.
- The DNN Halide backend (removed upstream in 5.0).
### Fixed
- `Evision.Backend` N-dimensional broadcasting now matches Nx semantics for
rank-differing and multi-axis broadcasts (e.g. `{3, 4}` to `{2, 3, 4}`,
`{2, 1, 4}` to `{2, 3, 4}`) and honours Nx's explicit `:axes`, so broadcasts
that are not right-aligned are correct. Previously every elementwise binary op
(`add`/`subtract`/`divide`/`min`/`max`/comparisons) could silently disagree
with `Nx.BinaryBackend`, and on AArch64 a divide-by-zero in the tiling path
returned garbage instead of trapping.
- `Evision.Backend` `logical_and`/`logical_or`/`logical_xor` now use truthiness
semantics, so they are correct for non-boolean inputs (`logical_and(2, 1)` is
`1`, not `0`) and `logical_xor` no longer raises on non-`u8` types.
- `Evision.Backend` integer scalar operands in `multiply`/`divide` no longer
raise a `to_nx/2` clause error (OpenCV only treats a 1x1 operand as a scalar
when it is `f64`).
- Building from source no longer re-runs the OpenCV install on every
`mix compile`; the cmake-config gate now matches OpenCV 5.0's install path.
- 32-bit ARMv8 Nerves targets (rpi3/rpi3a/rpi0_2, Cortex-A53 built as armv7hf)
compile again. OpenCV 5.0.0's `v_floor` NEON fast path uses the AArch64-only
`vcvtmq_s32_f32` intrinsic under `#if __ARM_ARCH > 7`, which GCC's AArch32
`arm_neon.h` does not provide; the fast path is now gated on `__aarch64__` so
AArch32 keeps the portable floor fallback.
- Building without contrib modules compiles again against OpenCV 5.0.0. The
hand-written `cv::stereo::MatchQuasiDense` vector converter was gated on
`HAVE_OPENCV_STEREO`, but in OpenCV 5.0 `stereo` is a new main module (so that
macro is always defined) while the quasi-dense stereo types moved to the
contrib `xstereo` module. The converter is now gated on `HAVE_OPENCV_XSTEREO`,
so a build without contrib modules no longer references the absent
`cv::stereo` namespace.
- Nerves ARM targets build again with OpenCV 5.0.0. Nerves cross toolchains set
the C/C++ compiler but not the ASM compiler, so cmake's `enable_language(ASM)`
falls back to the host x86 assembler and hands it OpenCV's bundled MLAS AArch64
`.S` GEMM kernels, which it cannot assemble (`no such instruction: 'fmla
v5.4s...'`). This broke both the 32-bit boards (which report an arm64 processor
name) and the 64-bit rpi5. MLAS is now skipped when it needs ASM and the build
is cross-compiling, and the DNN module falls back to its built-in SGEMM.
- Building on FreeBSD compiles again with OpenCV 5.0.0. Intel IPP ICV's
vendored safestring header declares a 3-argument `memset_s` that conflicts
with FreeBSD libc's C11 4-argument `memset_s`, so the optional IPP
accelerator is now disabled on FreeBSD.
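
To illustrate the truthiness semantics mentioned in the `logical_and` fix, the sketch below contrasts a bitwise AND with a truthiness-based AND on the same operands. Both helpers are hypothetical and written in plain Elixir. That a bitwise operation was the source of the old `0` result is an assumption made for the example, not something these notes state.

```elixir
import Bitwise

# Hypothetical helper: map any non-zero number to 1 and zero to 0.
truthy = fn x -> if(x != 0, do: 1, else: 0) end

# Hypothetical truthiness-based AND: both operands must be non-zero.
logical_and = fn a, b -> truthy.(a) &&& truthy.(b) end

# A bitwise AND of 2 (0b10) and 1 (0b01) has no bits in common.
bitwise_result = 2 &&& 1

# Truthiness treats both operands as "true", so the result is 1.
logical_result = logical_and.(2, 1)

IO.inspect({bitwise_result, logical_result})
# => {0, 1}
```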
### Performance
- `Evision.Backend` elementwise loops are parallelised with `cv::parallel_for_`,
with stripe counts sized to the thread pool rather than the range length (a
naive port dispatched one block per element and ran some ops slower in
parallel than serially). Read-only NIF inputs are marked `INPUT_ONLY` so a
cv-owned source `Mat` is shared instead of deep-copied; a no-op reshape of a
2 MB tensor drops from ~110us to ~3us.
- Reductions read their input in its native dtype and promote per element to the
wide accumulator, dropping a separate cast pass, and a new leading-axis path
avoids transposing first: `reduce_max` over axis 0 of a 512x1024 tensor drops
from ~3.2ms to ~0.25ms.
- Conv gains an im2row + GEMM fast path for the common 2-D case (single batch
group, no input dilation, `f32`/`f64`), reaching parity with the libtorch
backend; other shapes fall back to the general N-d kernel.
- Scalar-operand elementwise ops (`add`/`subtract`/`multiply`/`divide`/`min`/
`max`/comparisons/`pow`/`atan2`/`quotient`/`remainder`/shifts) take a fast
path that casts the scalar to a single element instead of materialising a full
broadcast array: scalar `add` on a 2048x2048 tensor drops from ~20ms to ~1.6ms.
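
The stripe-sizing idea in the first item can be sketched in plain Elixir. The real change lives in evision's C++ code around `cv::parallel_for_`; the `stripes` helper below is hypothetical and only shows the partitioning principle: the number of work units follows the worker count, not the number of elements. The worker count from `System.schedulers_online/0` and the summing workload are stand-ins chosen for the example.

```elixir
# Hypothetical helper: split the index range 0..(n - 1) into at most
# `workers` contiguous stripes of near-equal length, as {start, length} pairs.
stripes = fn n, workers ->
  count = max(min(n, workers), 1)
  base = div(n, count)
  extra = rem(n, count)

  {ranges, _next_start} =
    Enum.map_reduce(0..(count - 1), 0, fn i, start ->
      # The first `extra` stripes take one additional element each.
      len = base + if(i < extra, do: 1, else: 0)
      {{start, len}, start + len}
    end)

  ranges
end

workers = System.schedulers_online()
n = 1_000_000

# One stripe per worker instead of one block per element.
sized = stripes.(n, workers)

# Each stripe is processed by one task; here the "work" is summing indices.
total =
  sized
  |> Task.async_stream(
    fn {start, len} -> Enum.sum(start..(start + len - 1)//1) end,
    max_concurrency: workers
  )
  |> Enum.reduce(0, fn {:ok, part}, acc -> part + acc end)

IO.inspect({length(sized), total})
```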

Changelogs for `v0.2.x` are in
[CHANGELOG.v0.2.md](https://github.com/cocoa-xu/evision/blob/main/CHANGELOG.v0.2.md);
`v0.1.x` is in
[CHANGELOG.v0.1.md](https://github.com/cocoa-xu/evision/blob/main/CHANGELOG.v0.1.md).