# BEAM Lab Languages
Linguistic metadata for human languages: grammatical gender, writing direction, canonical and native names, and BCP 47 normalization. Curated, compile-time data with zero runtime dependencies.
Sibling library to [`beamlab_countries`](https://hex.pm/packages/beamlab_countries) — `beamlab_countries` knows where languages are spoken, `beamlab_languages` knows what they are like.
## What it answers
- "Does Russian use grammatical gender? If so, what genders?"
- "Is Arabic written right-to-left?"
- "What's the canonical English name of `fr`? The endonym?"
- "Does the user's locale string `en-US` collapse to a base I can use as a key?"
## Installation
```elixir
defp deps do
[
{:beamlab_languages, "~> 0.1"}
]
end
```
Then `mix deps.get`.
## Quick start
```elixir
BeamlabLanguages.has_gender?("fr")
# true
BeamlabLanguages.genders("de")
# ["m", "f", "n"]
BeamlabLanguages.direction("ar")
# :rtl
BeamlabLanguages.name("ja")
# "Japanese"
BeamlabLanguages.native_name("ja")
# "日本語"
BeamlabLanguages.normalize("en-US")
# "en"
BeamlabLanguages.get("fr")
# %BeamlabLanguages.Language{
# code: "fr",
# name: "French",
# native_name: "Français",
# direction: :ltr,
# has_gender: true,
# genders: ["m", "f"]
# }
```
Every function that takes a language code runs `normalize/1` internally, so `"en-US"`, `"FR"`, and `" fr "` all work. Predicates (`has_gender?/1`, `known?/1`) return `false` for `nil` or unknown input rather than raising — handy in form-validation paths.
## Documentation
Full API docs at [HexDocs](https://hexdocs.pm/beamlab_languages).
## Coverage
v1 covers 50+ languages: the top-spoken languages worldwide plus all CEFR / JLPT / HSK targets. The data lives in `priv/data/languages.json` — open a PR to add more or correct an entry.
## Roadmap (planned, not in v1)
These are intentionally deferred so v1 ships small. The v1 API is shaped to leave room for them:
- Localized language names — `BeamlabLanguages.name("fr", in: "es")` → `"francés"`
- Plural rules (CLDR categories: `:zero`, `:one`, `:two`, `:few`, `:many`, `:other`)
- Articles (definite/indefinite, by gender)
- Case marking (Slavic, Finnic, etc.)
- Noun classes (Bantu)
- Scripts / writing systems per language
- IPA inventory
- Honorific levels (Japanese / Korean)
## Non-goals
- **Not a CLDR wrapper.** No locale formatting (numbers, dates, currencies). That belongs elsewhere.
- **Not a translation API.** Knows what languages *are*; doesn't translate text.
- **No GenServer / Agent / ETS.** All data is compile-time.
## Contributing
1. Fork it
2. Create a feature branch (`git checkout -b my-new-feature`)
3. Edit `priv/data/languages.json` and/or code
4. `mix test` and `mix format`
5. Open a PR
## License
MIT — see [LICENSE.md](./LICENSE.md).