# Changelog
## 0.5.0 - 2026-05-20
Initial public release. Native Elixir Whisper speech-to-text backed by
CTranslate2 through a Rustler NIF over `ct2rs::sys::Whisper`. No Python.
### Features
- `WhisperCt2.load_model/2` loads a CTranslate2-converted Whisper model
directory and returns a `%WhisperCt2.Model{}` with resolved `:device`
and `:compute_type`.
- `WhisperCt2.transcribe/3` accepts `{:pcm_f32, binary}` (mono, 16 kHz,
little-endian f32) and returns a `%WhisperCt2.Transcription{}` whose
`:segments` carry absolute start/end times, `:no_speech_prob`,
`:avg_logprob`, the underlying token IDs, and optional per-word timing.
- `WhisperCt2.transcribe_batch/3` stacks every chunk of every input into
one encoder forward pass, which substantially speeds up
diarization-driven workflows with many short turns.
- `:initial_prompt` and `:prefix` bias decoding; `:word_timestamps` adds a
batched DTW alignment pass attaching `%WhisperCt2.Word{}` entries;
`:with_timestamps` toggles `<|t_..|>` segment timestamps for plain-text
fine-tunes.
- English-only checkpoints (`*.en`) use the `[<|startoftranscript|>]`
prompt; multilingual checkpoints use `[sot, lang, transcribe]`, that is,
`<|startoftranscript|>`, a language token, then `<|transcribe|>`.
- `WhisperCt2.Pcm.slice/4` carves sub-windows out of an already-decoded
f32 buffer with loud bounds checking.
- `WhisperCt2.available_devices/0` reports CPU/CUDA device counts and the
build's CUDA-support flag.
- Structured `%WhisperCt2.Error{}` taxonomy: `:invalid_request`,
`:load_error`, `:inference_error`, `:runtime_error`, `:nif_panic`,
`:native_error`.
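
The `{:pcm_f32, binary}` input format above (mono, 16 kHz, little-endian f32) can be produced in plain Elixir. The following is a minimal sketch, not part of the library: the `PcmExample` module and its function names are hypothetical, and it only builds the input tuple. The argument order and return shape of `WhisperCt2.transcribe/3` are not stated in this changelog, so no call is shown.

```elixir
defmodule PcmExample do
  # Hypothetical helper module for illustration; not shipped with WhisperCt2.
  @sample_rate 16_000

  # Pack a list of numeric samples into a little-endian 32-bit float binary.
  def encode_f32(samples) when is_list(samples) do
    for s <- samples, into: <<>>, do: <<s::float-32-little>>
  end

  # Duration in seconds of a mono 16 kHz f32 buffer (4 bytes per sample).
  def duration_seconds(pcm) when is_binary(pcm) and rem(byte_size(pcm), 4) == 0 do
    byte_size(pcm) / 4 / @sample_rate
  end
end

# One second of silence at 16 kHz, wrapped in the tagged tuple described above.
pcm = PcmExample.encode_f32(List.duplicate(0.0, 16_000))
input = {:pcm_f32, pcm}
```

Checking that `byte_size(pcm)` is a multiple of 4 before handing the buffer over is a cheap way to catch a truncated or mis-encoded stream early.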
### Backends
- Precompiled NIF artefacts via `rustler_precompiled` for
`aarch64-apple-darwin` (Accelerate), `x86_64-unknown-linux-gnu`
(oneDNN, optional `mkl` variant), and `aarch64-unknown-linux-gnu`
(oneDNN). CUDA is loaded lazily via `cuda-dynamic` on every Linux
artefact, so one binary runs on CPU-only and CUDA hosts alike.
- Opt into a source build with `WHISPER_CT2_BUILD=1`, or pick the MKL
artefact on x86_64 Linux with `WHISPER_CT2_VARIANT=mkl`.