# Changelog
## v0.9.2 (2024-11-16)
### Enhancements
* Support cross-compilation for use with Nerves
* Optimize LU with a custom call
## v0.9.1 (2024-10-08)
### Enhancements
* Improve compilation times of native code
### Bug fixes
* Fix encoding of binary floats
## v0.9.0 (2024-09-26)
### Enhancements
* Overall improvements to the Nx.Defn compiler
* Compiled functions now work across BEAM nodes
* Add `cache: "path/to/file"` for disk caching JIT/compiled functions
### Bug fixes
* Use a single thread pool for MLIR contexts
## v0.8.0 (2024-08-19)
* Add `EXLA.to_mlir_module/2`
* The precompiled XLA CUDA binaries now require CUDA 12.1+ and cuDNN 9.1+
* Renamed `XLA_TARGET` value "cuda120" to "cuda12"
* `XLA_TARGET` automatically defaults to "cuda12" when CUDA installation is detected
* Allow NIF modules to be upgradable
## v0.7.1 (2024-02-27)
* Add CustomCallOp for QR decomposition
* Minor improvements to the MLIR modules generated
* MLIR Context pooling for better concurrency
## v0.7.0 (2024-02-22)
* Update to latest Nx
* Introduce a `:mlir` based compiler and use it by default. The previous `:xla` based compiler is deprecatead. You can temporarily revert to the previous compiler by setting `config :exla, :compiler_mode, :xla`
## v0.6.4 (2023-11-13)
* Update to latest Nx
* Allow `:automatic_transfers` configuration on client
* Do not discard client/device in `EXLA.Backend` when it is host
* Always sort NaN last
* Improve the `:axes` option in `gather`, `indexed_add`, and `indexed_put`
## v0.6.3 (2023-11-09)
* Update to latest Nx
* Fix mixed device usage on EXLA.Backend
## v0.6.1 (2023-09-12)
* Update to latest Nx
## v0.6.0 (2023-08-15)
* Allow cross-device transfers on host
* Update dependencies to OpenXLA
* Update to latest Nx
## v0.5.3 (2023-04-14)
* Fix compilation issue on certain macOS caused by O3
* Fix optimization which would cause EXLA to return a complete tuple instead of a subset
## v0.5.2 (2023-03-21)
* Automatically transfer tensors between nodes
## v0.5.1 (2023-02-18)
* Support `top_k`
## v0.5.0 (2023-02-10)
* Optimize backend_transfer/backend_copy within EXLA backends
* Use relative symlinks on compilation whenever possible
## v0.4.2 (2023-01-13)
### Enhancements
* Automatically transfer from `:host` to other devices
* Support `lazy_transfers: :always` on `EXLA.jit/3` and `EXLA.compile/2`
* Run hooks concurrently once they have received the data
### Bug fixes
* Respect default `EXLA.Backend` client when jitting argumentless operations
* Do not pick client without devices when loading initial client
* Consider the first conditional of a `cond` part of the current scope
## v0.4.1 (2022-12-07)
### Enhancements
* [EXLA] Require Nx ~> 0.4.1
* [EXLA] Update `XLA` to TF2.11
* [EXLA.Defn] Send telemetry event after XLA compilation
* [EXLA.Op] Add optimization barriers as operations
### Bug fixes
* [EXLA] Validate backend options
* [EXLA.Backend] Fix `Nx.{any,all}` with `:keep_axes`
* [EXLA.Backend] Make SVD return `V` instead of `transpose(V)`
* [EXLA.Backend] Preserve NaNs in `window` and `reduce` operations
## v0.4.0 (2022-10-25)
### Enhancements
* Support zero copy binaries
* Redirect group leader for EXLA hooks
### Bug fixes
* Always hoist `cond` expressions
* Fix conditional inside `Nx.map`
## v0.3.0 (2022-08-13)
### Enhancements
* Support `debug: true` option on `defn` compiler
* Allow specifying preferred clients via the application environment
* Support new callbacks added in Nx v0.3.0
### Deprecations
* Deprecate `set_as_nx_default`
## v0.2.3 (2022-07-05)
### Bug fixes
* Fix predicate handling inside `cond`/`while`
* Set Nx backend globally
## v0.2.2 (2022-06-15)
### Bug fixes
* Fix invalid cache expiration when defn received functions as arguments
## v0.2.1 (2022-06-04)
### Enhancements
* Implement `EXLA.Backend.to_batched_list/3`
### Bug fixes
* Improve support for non-finite values in `EXLA` compiler
* Fix segmentation fault while deallocating tensors
## v0.2.0 (2022-04-28)
First release.