# 🌍 Lang

Represent languages in Gleam!

[![Package Version](https://img.shields.io/hexpm/v/lang)](https://hex.pm/packages/lang)
[![Hex Docs](https://img.shields.io/badge/hex-docs-ffaff3)](https://hexdocs.pm/lang/)

**Lang** provides a `Language` type representing the ISO 639 languages that have both two-letter (ISO 639-1) and three-letter (ISO 639-2) codes, with helper functions to work with them.

Lang is designed as a minimal, foundational library so that different libraries and applications can interoperate through a common `Language` type.

```sh
gleam add lang@1
```

```gleam
import gleam/io
import gleam/list
import gleam/string
import lang

pub fn main() {
  let language = lang.En

  lang.to_iso639_1(language)
  // -> "en"

  lang.to_iso639_2(language)
  // -> "eng"

  lang.to_name(language)
  // -> "English"

  case lang.from_iso639_1("KO") {
    Ok(korean) -> {
      io.println("Language: " <> lang.to_name(korean))
      // -> "Language: Korean"
    }
    Error(_) -> io.println("Unknown language code")
  }

  let assert Ok(german_terminological) = lang.from_iso639_2("deu")
  let assert Ok(german_bibliographic) = lang.from_iso639_2("ger")

  // Both return the same language (lang.De)
  german_terminological == german_bibliographic
  // -> True

  // Iterate over all languages
  lang.all
  |> list.filter(fn(l) { lang.to_iso639_1(l) |> string.starts_with("a") })
  |> list.map(lang.to_name)
  // -> ["Afar", "Abkhazian", "Afrikaans", "Akan", ..]
}
```

## Design Philosophy

Lang is intentionally minimal:

- Provides ISO 639-1 and ISO 639-2 language identification
- Keeps the surface area small by not modelling regional variants (`en-US`, `en-GB`)
- Stays focused by excluding script information (`zh-Hans`, `zh-Hant`)
- Avoids complexity such as translations or localisation

These choices keep the library lean, leaving additional functionality to application code or higher-level libraries.

## ISO 639-2 Bibliographic vs. Terminological Codes

Some languages have two ISO 639-2 codes:

- **Terminological (T)** - The modern canonical form, e.g. `deu` for German
- **Bibliographic (B)** - The form used in libraries/catalogs, e.g. `ger` for German

```gleam
// to_iso639_2 returns the T code (or the B code if no T code exists)
lang.to_iso639_2(lang.En)  // "eng"
lang.to_iso639_2(lang.De)  // "deu" (not "ger")
lang.to_iso639_2(lang.Zh)  // "zho" (not "chi")

// from_iso639_2 accepts both bibliographic and terminological codes
lang.from_iso639_2("deu")  // Ok(De)
lang.from_iso639_2("ger")  // Ok(De) - same result
```

## Examples

The following examples show how you might use `lang`.

### Content Management System with Regional Variants

```gleam
import lang.{type Language}

pub type ContentLocale {
  ContentLocale(language: Language, region: String)
}

// Define your locales
pub const en_us = ContentLocale(lang.En, "US")

pub const en_gb = ContentLocale(lang.En, "GB")

pub const pt_br = ContentLocale(lang.Pt, "BR")

pub const pt_pt = ContentLocale(lang.Pt, "PT")

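// `Date` and the `format_*_date` helpers are placeholders for your own
// date-handling code; they are not provided by `lang`.
pub fn format_date(date: Date, locale: ContentLocale) -> String {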
  case locale.language, locale.region {
    lang.En, "US" -> format_us_date(date)
    lang.En, "GB" -> format_uk_date(date)
    lang.Pt, "BR" -> format_br_date(date)
    lang.Pt, "PT" -> format_pt_date(date)
    language, _ -> format_default_date(date, language)
  }
}
```
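
A locale type like the one above can also be rendered as a language tag, for example for a `content-language` header. The following is a minimal sketch rather than part of `lang`: `to_tag` is a hypothetical helper, and it assumes the region is already a valid uppercase region code.

```gleam
import lang.{type Language}

pub type ContentLocale {
  ContentLocale(language: Language, region: String)
}

/// Hypothetical helper: joins the two-letter language code and the region
/// with a hyphen. The region string is used as-is, without validation.
pub fn to_tag(locale: ContentLocale) -> String {
  lang.to_iso639_1(locale.language) <> "-" <> locale.region
}

pub fn example() -> String {
  to_tag(ContentLocale(lang.Pt, "BR"))
  // -> "pt-BR"
}
```

Keeping the region as a separate field means the `Language` value stays directly comparable across libraries, while the tag is only assembled at the edge of the application.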

### HTTP Headers

```gleam
import gleam/http/request
import gleam/http/response
import gleam/list
import gleam/result
import gleam/string
import lang.{type Language}

pub fn set_content_language(
  resp: response.Response(a),
  language: Language,
) -> response.Response(a) {
  response.set_header(resp, "content-language", lang.to_iso639_1(language))
}

pub fn get_accept_language(req: request.Request(a)) -> Language {
  let fallback = lang.En

  // Take the primary subtag of the first entry:
  // "en-US,en;q=0.9,es;q=0.8" -> "en"
  // Falls back to English if the header is missing or not recognised.
  request.get_header(req, "accept-language")
  |> result.unwrap(lang.to_iso639_1(fallback))
  |> string.split(",")
  |> list.first
  |> result.try(fn(first) {
    first
    |> string.split("-")
    |> list.first
  })
  |> result.try(lang.from_iso639_1)
  |> result.unwrap(fallback)
}
```

### Multilingual CLI Application

```gleam
import argv
import gleam/io
import gleam/list
import gleam/result
import gleam/string
import lang.{type Language}

pub fn main() {
  let language =
    argv.load().arguments
    |> list.find(fn(arg) { string.starts_with(arg, "--lang=") })
    |> result.map(fn(arg) { string.drop_start(arg, 7) })
    |> result.try(lang.from_iso639_1)
    |> result.unwrap(get_system_language())

  print_welcome(language)
}

fn print_welcome(language: Language) {
  case language {
    lang.En -> io.println("Welcome!")
    lang.Es -> io.println("¡Bienvenido!")
    lang.Fr -> io.println("Bienvenue!")
    lang.De -> io.println("Willkommen!")
    lang.Ja -> io.println("ようこそ!")
    lang.Tw -> io.println("Akwaaba!")
    lang.Zh -> io.println("欢迎!")
    _ -> io.println("Welcome!")
  }
}

fn get_system_language() -> Language {
  todo as "Implementation would check LANG environment variable"
}
```
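
The `get_system_language` stub above is left unimplemented. As one possible starting point, the sketch below parses a POSIX-style locale value such as `en_US.UTF-8` into a `Language`. `parse_locale_value` is a hypothetical helper, not part of `lang`, and reading the `LANG` environment variable itself is left to whichever environment package your application uses.

```gleam
import gleam/list
import gleam/result
import gleam/string
import lang.{type Language}

/// Hypothetical helper: extracts the language from a locale value of the
/// assumed form "language_REGION.encoding", e.g. "en_US.UTF-8".
pub fn parse_locale_value(value: String) -> Result(Language, Nil) {
  value
  // Drop the encoding suffix: "en_US.UTF-8" -> "en_US"
  |> string.split(".")
  |> list.first
  // Drop the region: "en_US" -> "en"
  |> result.try(fn(locale) {
    locale
    |> string.split("_")
    |> list.first
  })
  // Look up the remaining two-letter code
  |> result.try(lang.from_iso639_1)
}
```

A caller could then combine this with a default, for example `parse_locale_value(value) |> result.unwrap(lang.En)`, so that values which do not start with a two-letter code fall back to a known language.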

## Development

This library is code-generated from the official ISO 639 data.

The `language-codes-full.csv` file is available on [DataHub](https://datahub.io/core/language-codes).

The code generation modules along with the CSV file are located in `dev/`.

After making changes, run the following:

```sh
gleam run -m codegen  # Regenerate src/lang module
gleam build           # Build the package
gleam docs build      # Build docs
gleam test            # Run the tests
```

If the `language-codes-full.csv` file is re-downloaded, update the date in `dev/INFO.md`.