# Changelog

## v0.9.2 (2024-11-16)

### Enhancements

  * Support cross-compilation for use with Nerves
  * Optimize LU with a custom call

## v0.9.1 (2024-10-08)

### Enhancements

  * Improve compilation times of native code

### Bug fixes

  * Fix encoding of binary floats

## v0.9.0 (2024-09-26)

### Enhancements

  * Overall improvements to the `Nx.Defn` compiler
  * Compiled functions now work across BEAM nodes
  * Add `cache: "path/to/file"` for disk caching JIT/compiled functions

### Bug fixes

  * Use a single thread pool for MLIR contexts

## v0.8.0 (2024-08-19)

  * Add `EXLA.to_mlir_module/2`
  * The precompiled XLA CUDA binaries now require CUDA 12.1+ and cuDNN 9.1+
  * Rename `XLA_TARGET` value "cuda120" to "cuda12"
  * `XLA_TARGET` automatically defaults to "cuda12" when a CUDA installation is detected
  * Allow NIF modules to be upgradable

## v0.7.1 (2024-02-27)

  * Add CustomCallOp for QR decomposition
  * Minor improvements to the MLIR modules generated
  * Add MLIR context pooling for better concurrency

## v0.7.0 (2024-02-22)

  * Update to latest Nx
  * Introduce a `:mlir`-based compiler and use it by default. The previous `:xla`-based compiler is deprecated. You can temporarily revert to the previous compiler by setting `config :exla, :compiler_mode, :xla`
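
A minimal sketch of the temporary fallback mentioned in this release, assuming a standard Mix project that loads `config/config.exs` (the file location is an assumption, not something this changelog specifies):

```elixir
# config/config.exs (assumed location; use whichever config file your project loads)
import Config

# Temporarily revert to the deprecated `:xla`-based compiler.
config :exla, :compiler_mode, :xla
```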

## v0.6.4 (2023-11-13)

  * Update to latest Nx
  * Allow `:automatic_transfers` configuration on client
  * Do not discard client/device in `EXLA.Backend` when it is host
  * Always sort NaN last
  * Improve the `:axes` option in `gather`, `indexed_add`, and `indexed_put`

## v0.6.3 (2023-11-09)

  * Update to latest Nx
  * Fix mixed device usage on `EXLA.Backend`

## v0.6.1 (2023-09-12)

  * Update to latest Nx

## v0.6.0 (2023-08-15)

  * Allow cross-device transfers on host
  * Update dependencies to OpenXLA
  * Update to latest Nx

## v0.5.3 (2023-04-14)

  * Fix compilation issue on certain macOS systems caused by the O3 optimization level
  * Fix optimization which would cause EXLA to return a complete tuple instead of a subset

## v0.5.2 (2023-03-21)

  * Automatically transfer tensors between nodes

## v0.5.1 (2023-02-18)

  * Support `top_k`

## v0.5.0 (2023-02-10)

  * Optimize `backend_transfer`/`backend_copy` within EXLA backends
  * Use relative symlinks on compilation whenever possible

## v0.4.2 (2023-01-13)

### Enhancements

  * Automatically transfer from `:host` to other devices
  * Support `lazy_transfers: :always` on `EXLA.jit/3` and `EXLA.compile/2`
  * Run hooks concurrently once they have received the data

### Bug fixes

  * Respect default `EXLA.Backend` client when jitting argumentless operations
  * Do not pick client without devices when loading initial client
  * Consider the first conditional of a `cond` part of the current scope

## v0.4.1 (2022-12-07)

### Enhancements

  * [EXLA] Require Nx ~> 0.4.1
  * [EXLA] Update `XLA` to TF2.11
  * [EXLA.Defn] Send telemetry event after XLA compilation
  * [EXLA.Op] Add optimization barriers as operations

### Bug fixes

  * [EXLA] Validate backend options
  * [EXLA.Backend] Fix `Nx.{any,all}` with `:keep_axes`
  * [EXLA.Backend] Make SVD return `V` instead of `transpose(V)`
  * [EXLA.Backend] Preserve NaNs in `window` and `reduce` operations

## v0.4.0 (2022-10-25)

### Enhancements

  * Support zero copy binaries
  * Redirect group leader for EXLA hooks

### Bug fixes

  * Always hoist `cond` expressions
  * Fix conditional inside `Nx.map`

## v0.3.0 (2022-08-13)

### Enhancements

  * Support `debug: true` option on `defn` compiler
  * Allow specifying preferred clients via the application environment
  * Support new callbacks added in Nx v0.3.0

### Deprecations

  * Deprecate `set_as_nx_default`

## v0.2.3 (2022-07-05)

### Bug fixes

  * Fix predicate handling inside `cond`/`while`
  * Set Nx backend globally

## v0.2.2 (2022-06-15)

### Bug fixes

  * Fix invalid cache expiration when `defn` received functions as arguments

## v0.2.1 (2022-06-04)

### Enhancements

  * Implement `EXLA.Backend.to_batched_list/3`

### Bug fixes

  * Improve support for non-finite values in `EXLA` compiler
  * Fix segmentation fault while deallocating tensors

## v0.2.0 (2022-04-28)

First release.