# erli18n

[![Hex.pm](https://img.shields.io/hexpm/v/erli18n.svg)](https://hex.pm/packages/erli18n)
[![HexDocs](https://img.shields.io/badge/hex-docs-8e44ad.svg)](https://hexdocs.pm/erli18n/)
[![CI](https://github.com/eagle-head/erli18n/actions/workflows/ci.yml/badge.svg)](https://github.com/eagle-head/erli18n/actions/workflows/ci.yml)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![OTP 27+](https://img.shields.io/badge/OTP-27%2B-a90533)](https://www.erlang.org/downloads)

Modern, GNU `gettext`-compatible internationalization (i18n) for Erlang/OTP, in pure Erlang.

> ### Why erli18n?
>
> It's first-class `gettext` i18n for Erlang/OTP, **natively**: no polyglot build, no routing through Elixir, no stalled dependency.
>
> - 📦 **Drop-in `.po` / `.pot`**: load the exact files your translators already produce in Poedit, Crowdin, Transifex, Weblate, or `xgettext`.
> - 🌍 **Real CLDR pluralization**: a true `Plural-Forms` evaluator, with CLDR rules inlined for **49 locales**.
> - ⚡ **Lock-free lookups**: reads run straight from ETS in your own process; only writes go through a `gen_server`. No bottleneck on the hot path.

## Quickstart

Add the dependency to `rebar.config`:

```erlang
{deps, [{erli18n, "0.2.0"}]}.
```

Then load a catalog and translate:

```erlang
application:ensure_all_started(erli18n).

%% Load a `.po` catalog for a (domain, locale). Parse -> compile plural rule ->
%% validate against CLDR -> insert: one atomic step. Returns {ok, NewlyLoaded}
%% (or {ok, already} if it was already loaded).
{ok, _Loaded} = erli18n_server:ensure_loaded(my_domain, <<"pt_BR">>,
    <<"priv/locale/pt_BR/LC_MESSAGES/my_domain.po">>).

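%% Singular. The /utf8 flag encodes the non-ASCII literal as UTF-8 so that it
%% can match UTF-8 catalog output (a bare <<"Olá">> would be Latin-1 bytes).
<<"Olá, mundo"/utf8>> = erli18n:gettext(my_domain, <<"Hello, world">>, <<"pt_BR">>).

%% Plural. ngettext returns the correct plural FORM for N (it selects the
%% form; the `f` family below splices the number in; see "Interpolation").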
<<"arquivo">>  = erli18n:ngettext(my_domain, <<"file">>, <<"files">>, 1,  <<"pt_BR">>).
<<"arquivos">> = erli18n:ngettext(my_domain, <<"file">>, <<"files">>, 42, <<"pt_BR">>).

%% Contextual. The same source word, disambiguated by a msgctxt.
<<"Maio">> = erli18n:pgettext(my_domain, <<"month">>, <<"May">>, <<"pt_BR">>).
<<"pode">> = erli18n:pgettext(my_domain, <<"verb">>,  <<"May">>, <<"pt_BR">>).

%% Interpolating. The `f`-suffix family resolves the translation, then
%% splices named `%{var}` placeholders from a Bindings map (see below).
<<"3 arquivos">> = erli18n:ngettextf(my_domain, <<"%{count} file">>,
    <<"%{count} files">>, 3, <<"pt_BR">>, #{}).   %% count => 3 auto-bound
```

That is the whole surface: `gettext` (singular), `ngettext` (plural), `pgettext` (contextual), and `npgettext` (contextual + plural), each with `d` / `dc` domain-explicit variants. Together they cover the full GNU gettext C-macro family, as Erlang functions. Each also has an interpolating `f`-suffix sibling (`gettextf`, `ngettextf`, `pgettextf`, `npgettextf`) that splices named `%{var}` values into the resolved string.

## Common patterns

**Set the locale once per process** (e.g. one web request); every lookup in that process then uses it, with no locale argument to thread around. App-wide, `set_default_locale/1` does the same for processes that never call `setlocale/1`:

```erlang
erli18n:setlocale(<<"pt_BR">>),                                  %% this process
%% erli18n:set_default_locale(<<"pt_BR">>),                      %% (or: app-wide default)

<<"Olá, mundo"/utf8>> = erli18n:gettext(my_domain, <<"Hello, world">>),
<<"arquivos">>        = erli18n:ngettext(my_domain, <<"file">>, <<"files">>, 42).
```

**Set a default domain** so the shortest forms work without naming it each time:

```erlang
erli18n:textdomain(my_domain),
<<"Olá, mundo"/utf8>> = erli18n:gettext(<<"Hello, world">>).   %% default domain + resolved locale
```

**Format a pluralized count** with the `f`-suffix `ngettextf`: it selects the plural form *and* splices the number in. The count is auto-bound as `%{count}`, so the translator controls where the number lands in each language:

```erlang
%% Source: msgid "%{count} file" / msgid_plural "%{count} files"
%% pt_BR:  msgstr[0] "%{count} arquivo" / msgstr[1] "%{count} arquivos"
<<"3 arquivos">> = erli18n:ngettextf(my_domain,
    <<"%{count} file">>, <<"%{count} files">>, 3, <<"pt_BR">>, #{}).
```

**Context + plural together** (`npgettext`: domain, context, singular, plural, N, locale):

```erlang
<<"comentários"/utf8>> = erli18n:npgettext(my_domain, <<"ui">>,
    <<"comment">>, <<"comments">>, 5, <<"pt_BR">>).
```

**Load several catalogs at startup** in one batch:

```erlang
%% Each entry is {Domain, Locale, PoPath, Opts}; the result is one
%% {Domain, Locale, {ok, NewlyLoaded} | {ok, already} | {error, _}} per entry.
Results = erli18n_server:ensure_loaded_many([
    {my_domain, <<"pt_BR">>, <<"priv/locale/pt_BR/LC_MESSAGES/my_domain.po">>, #{}},
    {my_domain, <<"en_US">>, <<"priv/locale/en_US/LC_MESSAGES/my_domain.po">>, #{}}
]).
```

**Observe at runtime with telemetry** (optional). For example, get notified whenever a lookup falls through to the source string:

```erlang
telemetry:attach(<<"erli18n-misses">>, [erli18n, lookup, miss],
    fun(_Event, _Measurements, Metadata, _Config) ->
        logger:info("i18n miss: ~p", [Metadata])
    end, undefined).
```

## Interpolation

Every lookup family has an interpolating `f`-suffix sibling (`gettextf`, `ngettextf`, `pgettextf`, `npgettextf`, plus the `d` / `dc` aliases) that takes a trailing `Bindings :: map()`. Each `f` function resolves the translation exactly like its non-`f` sibling, then substitutes **named `%{var}` placeholders** in the result:

```erlang
erli18n:setlocale(<<"pt_BR">>),

%% Source msgid "Hello, %{name}!" with pt_BR msgstr "Olá, %{name}!"
<<"Olá, Ada!"/utf8>> = erli18n:gettextf(my_domain, <<"Hello, %{name}!">>,
    #{name => <<"Ada">>}).
```

Named placeholders (rather than positional `~s`) decouple the wording from argument order: a translator can move `%{name}` anywhere in the sentence (or repeat it) and the binding still resolves by name. Binding keys are atoms; values may be a binary, an iolist/string, an integer, a float, or an atom, and are coerced to UTF-8 text. **Plural members auto-bind `count => N`**, so `%{count}` is always available without passing it yourself (a caller-supplied `count` wins):

```erlang
%% pt_BR msgstr[1] "%{count} arquivos"; count auto-bound to 42
<<"42 arquivos">> = erli18n:ngettextf(my_domain,
    <<"%{count} file">>, <<"%{count} files">>, 42, <<"pt_BR">>, #{}).
```

**Escaping.** A literal percent is `%%`; to emit a literal `%{name}` un-substituted, write `%%{name}` (the `%%` collapses to `%`, leaving `{name}` untouched):

```erlang
<<"100% sure">>   = erli18n:gettextf(<<"100%% sure">>, #{}).
<<"use %{name}">> = erli18n:gettextf(<<"use %%{name}">>, #{name => <<"X">>}).
```

**Missing bindings: `lenient` vs `strict`.** The `f` functions on `erli18n` are **lenient**: an unbound `%{name}` is left in place literally and nothing crashes. Lenient interpolation is total and fail-soft: for any input and any bindings it returns a binary and never raises. When you want an unbound placeholder to be a hard error instead, call `erli18n_interp:format/3` directly with the `strict` policy:

```erlang
%% Lenient (the f-family default): unknown placeholder stays literal.
<<"Hi %{who}">> = erli18n:gettextf(<<"Hi %{who}">>, #{}).

%% Strict: opt in via erli18n_interp:format/3, which raises on a missing binding.
erli18n_interp:format(<<"Hi %{who}">>, #{}, #{on_missing => strict}).
%% ** exception error: {erli18n_interp, {missing_binding, who}}
```
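
To make the rules in this section concrete, here is a small self-contained sketch of lenient `%{var}` substitution. It is not the code behind `erli18n_interp`: the module name `interp_sketch` is hypothetical, binding values are assumed to be binaries only (the real functions also coerce other types), and any byte sequence between `%{` and `}` is treated as a placeholder name, which may differ from the library's own parsing rules.

```erlang
-module(interp_sketch).
-export([format/2]).

%% Lenient substitution: "%%" becomes "%", "%{name}" is replaced when
%% `name` is bound, and unbound placeholders are left in place.
-spec format(binary(), #{atom() => binary()}) -> binary().
format(Template, Bindings) ->
    iolist_to_binary(scan(Template, Bindings)).

%% "%%" must be checked before "%{" so that "%%{name}" stays literal.
scan(<<"%%", Rest/binary>>, B) ->
    [$% | scan(Rest, B)];
scan(<<"%{", Rest/binary>>, B) ->
    case binary:split(Rest, <<"}">>) of
        [Name, After] -> [lookup(Name, B) | scan(After, B)];
        [_NoClosingBrace] -> [<<"%{">> | scan(Rest, B)]
    end;
scan(<<Byte, Rest/binary>>, B) ->
    %% Bytes are copied through unchanged, so UTF-8 text stays intact.
    [Byte | scan(Rest, B)];
scan(<<>>, _B) ->
    [].

%% Resolve a placeholder name; fall back to the literal "%{Name}".
lookup(Name, B) ->
    Literal = <<"%{", Name/binary, "}">>,
    try binary_to_existing_atom(Name, utf8) of
        Key -> maps:get(Key, B, Literal)
    catch
        error:badarg -> Literal
    end.
```

Using `binary_to_existing_atom/2` instead of `binary_to_atom/2` is a deliberate choice in this sketch: placeholder names come from translation files, so they should never be able to create new atoms.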

> ### Bidi / RTL caveat
>
> Interpolation does **not** auto-insert Unicode bidi isolation marks (U+2066–U+2069) around spliced values. Placing an RTL value (Arabic, Hebrew) into an LTR sentence (or the reverse) can reorder neighbouring punctuation under the Unicode Bidirectional Algorithm. If you mix directions, isolate the values yourself until a future version offers opt-in isolation.
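
One way to isolate a value yourself is to wrap it in FIRST STRONG ISOLATE (U+2068) and POP DIRECTIONAL ISOLATE (U+2069) before putting it in the bindings map. The sketch below assumes the value is already a UTF-8 binary; the module name `bidi_sketch` and the function `isolate/1` are hypothetical helpers, not part of `erli18n`:

```erlang
-module(bidi_sketch).
-export([isolate/1]).

%% Wrap a UTF-8 binary as FSI (U+2068) ++ Value ++ PDI (U+2069), so the
%% value's direction is resolved independently of the surrounding text.
-spec isolate(binary()) -> binary().
isolate(Value) when is_binary(Value) ->
    <<16#2068/utf8, Value/binary, 16#2069/utf8>>.
```

A caller would then pass, for example, `#{name => bidi_sketch:isolate(Name)}` as the bindings of an `f`-suffix function.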

## Core concepts

A few things worth knowing before you reach for the API:

- **Locale is per-process.** `erli18n:setlocale(<<"pt_BR">>)` sets the locale for the *calling* process (stored in its process dictionary); `which_locale/0` reads it back. It is **not** inherited by processes you `spawn`. When a process hasn't set one, lookups fall back to the application-wide default. Passing the locale explicitly always wins.
- **Catalogs are keyed by domain + locale.** A *domain* is a gettext text domain (e.g. `my_domain`), your way of grouping translations. You load each `(domain, locale)` catalog once; lookups then target a domain explicitly or use the default.
- **The `.po` header drives pluralization.** Each catalog's `Plural-Forms` header is the runtime source of truth for plural selection. CLDR rules (inlined for **49 locales**) are consulted only at load time, to emit a telemetry warning when a header diverges from CLDR, never to override it.
- **Misses degrade gracefully.** A lookup with no catalog, no entry, or an empty translation returns the original `msgid` (or `msgid_plural`), so your UI never shows a blank. And a crash of the catalog server does **not** wipe loaded translations: the ETS table is held by a dedicated owner/heir, so it survives and is handed back intact on restart.
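
To illustrate what "the header drives pluralization" means in practice, the sketch below hard-codes one rule, `nplurals=2; plural=(n > 1);`, which the GNU gettext manual lists for Brazilian Portuguese and French. It assumes the catalog header carries exactly that rule; `erli18n` evaluates whatever expression the header contains, and the module name `plural_sketch` is hypothetical:

```erlang
-module(plural_sketch).
-export([select/2]).

%% Index produced by "plural=(n > 1)": 0 for N = 0 or 1, otherwise 1.
plural_index(N) when is_integer(N), N > 1 -> 1;
plural_index(N) when is_integer(N), N >= 0 -> 0.

%% Forms holds the msgstr[0], msgstr[1], ... values of one catalog entry.
-spec select([binary()], non_neg_integer()) -> binary().
select(Forms, N) ->
    %% lists:nth/2 is 1-based, while msgstr indices are 0-based.
    lists:nth(plural_index(N) + 1, Forms).
```

With the forms `[<<"arquivo">>, <<"arquivos">>]`, counts 0 and 1 select `<<"arquivo">>` and every larger count selects `<<"arquivos">>`, which matches the `ngettext` results in the Quickstart.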

## Why erli18n

Most Erlang projects today either reach for the venerable but [largely-stalled `gettexter`](https://github.com/seriyps/gettexter), or route strings through Elixir's `gettext` (which forces a polyglot build). `erli18n` is for projects that want **first-class i18n in pure Erlang/OTP** without giving up compatibility with the standard `gettext` translation tooling.

- **Drop-in `.po` / `.pot` compatibility**: a hand-written parser that handles real-world catalogs: contexts, plurals, fuzzy entries, charsets, BOMs, and obsolete entries. Works with Poedit, Crowdin, Transifex, Weblate, and `msgfmt` out of the box. (The exact `.po`-semantics decisions are documented in [`CHANGELOG.md`](CHANGELOG.md).)
- **CLDR-backed pluralization**: a real evaluator for the `Plural-Forms` C-expression, with CLDR plural rules inlined for **49 locales**.
- **The full gettext API**: `gettext` / `ngettext` / `pgettext` / `npgettext`, plus the `d` / `dc` domain-explicit variants, and an interpolating `f`-suffix family (`gettextf`, …) for named `%{var}` substitution.
- **Optional, first-class observability**: **7** [`telemetry`](https://github.com/beam-telemetry/telemetry) events (catalog load/reload/unload spans, lookup misses, plural divergence, rate-limited memory warnings). `telemetry` is an *optional* dependency: events fire only when your app ships it.
- **A lock-free hot path**: `lookup_*` reads run directly from ETS in the *calling* process; only writes (loading and reloading catalogs) go through the owning `gen_server`. No process bottleneck on the read side.
- **Heavily tested**: Common Test suites, PropEr property-based tests, fuzzing, and a parity suite that checks output byte-for-byte against GNU `msgfmt` as a ground-truth oracle. 100% behavioral coverage.

String **extraction** uses the standard GNU `xgettext` CLI, the same model as Spring `MessageSource`, Django, Rails I18n, and Symfony Translation. Compile-time key checking is intentionally out of scope; runtime lookup plus tests is the mainstream pattern.

## Installation

```erlang
{deps, [
    {erli18n, "0.2.0"}
]}.
```

For [`telemetry`](https://github.com/beam-telemetry/telemetry) observability (optional; `erli18n` runs fine without it), add it too:

```erlang
{deps, [
    {erli18n, "0.2.0"},
    {telemetry, "~> 1.3"}
]}.
```

## Compatibility

|                      | OTP 27 (minimum) | OTP 28 |
| -------------------- | :--------------: | :----: |
| Tier-1 (CI)          |        ✅        |   ✅   |

OTP 27 is the floor because the public modules use the native `-doc` / `-moduledoc` documentation attributes (EEP-59), which only compile on OTP 27+; on OTP 25.3 / 26 the compiler rejects them with `attribute doc after function definitions`. CI exercises OTP 27 and 28 on every push.

## Status

**Initial development (`0.2.0`).** Per [SemVer 2.0.0 §4](https://semver.org/#spec-item-4), the public API is functional but may change on a minor bump (`0.2.0` → `0.3.0`); patch bumps (`0.2.0` → `0.2.1`) stay backward-compatible. The criteria for a stable `1.0.0` are in [`CHANGELOG.md`](CHANGELOG.md).

## Documentation

- **API reference**: published on [HexDocs](https://hexdocs.pm/erli18n/), generated from the native `-doc` / `-moduledoc` attributes (OTP 27+ documentation). Every public module and function is documented there.
- **Changelog & design decisions**: [`CHANGELOG.md`](CHANGELOG.md) records each release, the versioning policy, and the `.po`-semantics and pluralization decisions behind the implementation.
- **Examples**: the `.po` fixtures under [`test/`](test/) cover plural forms, contexts, fuzzy entries, encodings, and edge cases, and serve as a practical reference for what `erli18n` accepts.

## Development

```sh
git clone git@github.com:eagle-head/erli18n.git
cd erli18n
rebar3 compile
bin/quality-gate.sh --fast    # ~30s:  compile + xref + erlfmt + elvis + hank + elp lint
bin/quality-gate.sh --full    # ~5min: + dialyzer + eqwalize-all + Common Test (+ coverage)
```

See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the full setup: toolchain pinning with `mise`, git hooks, local CI emulation with `act`, and the contribution workflow.

## Security

To report a vulnerability, see [`SECURITY.md`](SECURITY.md). Please do **not** open a public GitHub issue for security reports.

## License

[Apache License 2.0](LICENSE) (SPDX: `Apache-2.0`).

## References

- [GNU gettext manual](https://www.gnu.org/software/gettext/manual/gettext.html): `.po` format and runtime semantics.
- [Unicode CLDR plural rules](https://cldr.unicode.org/index/cldr-spec/plural-rules): pluralization data source.
- [`telemetry`](https://github.com/beam-telemetry/telemetry): the observability framework.
- [`gettexter`](https://github.com/seriyps/gettexter): historical Erlang gettext library whose API surface `erli18n` mirrors for easy migration.