# onigleam
[Package on Hex](https://hex.pm/packages/onigleam)
[Documentation on HexDocs](https://hexdocs.pm/onigleam/)
A Gleam library for converting [Oniguruma](https://github.com/kkos/oniguruma) regex patterns to patterns compatible with `gleam_regexp`. It is intended for working with TextMate grammars in Gleam, since TextMate grammars write their syntax highlighting rules in Oniguruma's regex syntax.
> **Attribution:** This library is a Gleam port of [oniguruma-to-es](https://github.com/slevithan/oniguruma-to-es) and [oniguruma-parser](https://github.com/slevithan/oniguruma-parser) by [Steven Levithan](https://github.com/slevithan).
>
> This port was developed with LLM assistance (Claude).
```sh
gleam add onigleam@1
```
## Quick Start
```gleam
import onigleam
import onigleam/options
import gleam/dict
import gleam/regexp
// Convert a TextMate-style pattern with named capture groups
let assert Ok(result) = onigleam.convert(
  "(?<keyword>fn|let|pub)\\s+(?<name>[a-z_]\\w*)"
)
// Named groups become numbered, with a mapping preserved
result.pattern
// "(fn|let|pub)\\s+([a-z_]\\w*)"
dict.get(result.capture_names, "keyword") // Ok(1)
dict.get(result.capture_names, "name") // Ok(2)
// Compile and use directly
let assert Ok(re) = onigleam.to_regexp(
  "(?<num>\\d+)",
  options.default_options(),
)
let assert [match] = regexp.scan(re, "value: 42")
match.content // "42"
```
## Usage
### Named Capture Groups
Oniguruma's named capture groups `(?<name>...)` are converted to standard numbered groups, since `gleam_regexp` doesn't expose named groups. The name-to-number mapping is returned so you can still reference captures by name:
```gleam
import onigleam
import gleam/dict
let assert Ok(result) = onigleam.convert(
  "(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})"
)
result.pattern
// "(\\d{4})-(\\d{2})-(\\d{2})"
dict.get(result.capture_names, "year") // Ok(1)
dict.get(result.capture_names, "month") // Ok(2)
dict.get(result.capture_names, "day") // Ok(3)
```
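
To read a capture by name after matching, the mapping can be combined with the `submatches` list that `gleam_regexp` returns. The sketch below is one way to do that. `named_capture` is a hypothetical helper, not part of onigleam, and the code assumes that `regexp_options` can be passed directly to `regexp.compile`:

```gleam
import gleam/dict.{type Dict}
import gleam/list
import gleam/option.{type Option, None}
import gleam/regexp.{type Match}
import gleam/result
import onigleam

/// Hypothetical helper (not part of onigleam): fetch a capture by its
/// original name using the name -> group number mapping.
pub fn named_capture(
  match: Match,
  capture_names: Dict(String, Int),
  name: String,
) -> Option(String) {
  dict.get(capture_names, name)
  |> result.try(fn(group) {
    // Group numbers start at 1, so group n is the n-th submatch.
    match.submatches |> list.drop(group - 1) |> list.first
  })
  // Unknown names and groups that did not participate both yield None.
  |> result.unwrap(None)
}

pub fn main() {
  let assert Ok(converted) =
    onigleam.convert("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})")
  // Assumption: `regexp_options` is a `gleam/regexp` Options value.
  let assert Ok(re) =
    regexp.compile(converted.pattern, with: converted.regexp_options)
  let assert [match] = regexp.scan(re, "released 2024-05-17")
  named_capture(match, converted.capture_names, "month")
}
```

The helper returns `None` rather than an error so that a missing name and an unmatched optional group are handled the same way by callers.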
### Unicode and Hex Escapes
Oniguruma's various escape formats are converted to their literal characters:
```gleam
import onigleam
// Hex escapes
let assert Ok(r1) = onigleam.convert("\\x41\\x42\\x43")
r1.pattern // "ABC"
// Unicode escapes
let assert Ok(r2) = onigleam.convert("caf\\u00e9")
r2.pattern // "café"
```
### TextMate Grammar Patterns
TextMate grammars sometimes reference capture groups that don't exist in the current pattern (orphan backreferences). This typically happens in a rule's `end` pattern, which can refer back to groups captured by the rule's `begin` pattern. Use `convert_textmate` to handle these gracefully:
```gleam
import onigleam
// An `end` pattern for a quoted string: \1 refers to the quote captured by
// the rule's `begin` pattern, so this pattern has no capture group of its own
// Normal conversion would fail, but convert_textmate allows it
let assert Ok(result) = onigleam.convert_textmate(
  "\\1"
)
// Returns Ok with a warning about the orphan backref
```
### Flags and Options
```gleam
import onigleam
import onigleam/options
// Case-insensitive matching
let assert Ok(result) = onigleam.convert_with_flags(
  "(?<tag>html|body|div)",
  "i"
)
result.regexp_options.case_insensitive // True
// Full control with options builder
let opts = options.default_options()
  |> options.with_flags("i")
  |> options.allow_orphan_backrefs
let assert Ok(result) = onigleam.to_regexp_details(
  "(?<open><\\w+>).*?(?<close></\\w+>)",
  opts,
)
```
## API Reference
### Main Functions
| Function | Description |
|----------|-------------|
| `convert(pattern)` | Convert with default options |
| `convert_with_flags(pattern, flags)` | Convert with Oniguruma flags |
| `convert_textmate(pattern)` | Convert with TextMate-friendly options |
| `to_regexp(pattern, options)` | Convert and compile to `Regexp` |
| `to_regexp_details(pattern, options)` | Convert with full result details |
| `format_error(error)` | Format error as human-readable string |
### ConversionResult
```gleam
pub type ConversionResult {
ConversionResult(
pattern: String, // Generated pattern string
regexp_options: Options, // Options for gleam_regexp
capture_names: Dict(String, Int), // Name -> group number mapping
warnings: List(String), // Any warnings generated
)
}
```
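
As a rough illustration of how these fields fit together, the sketch below prints any warnings and then compiles the generated pattern itself. `check_converted` is a hypothetical helper, and the code assumes that `regexp_options` is the `Options` type from `gleam/regexp`, as the field comment suggests:

```gleam
import gleam/io
import gleam/list
import gleam/regexp
import onigleam

/// Hypothetical helper: convert `pattern`, report any warnings on stderr,
/// then check whether the compiled result matches somewhere in `input`.
pub fn check_converted(pattern: String, input: String) -> Bool {
  let assert Ok(result) = onigleam.convert(pattern)

  // Surface warnings before relying on the generated pattern.
  list.each(result.warnings, fn(warning) {
    io.println_error("warning: " <> warning)
  })

  // Assumption: `regexp_options` can be passed straight to `regexp.compile`.
  let assert Ok(re) =
    regexp.compile(result.pattern, with: result.regexp_options)
  regexp.check(re, input)
}
```

In application code the two `let assert` lines would normally become `case` expressions so that a failed conversion or compilation does not crash the program.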
## Supported Features
| Feature | Status | Notes |
|---------|--------|-------|
| Literals, escapes | Supported | Direct mapping |
| Character classes `[abc]` | Supported | Including ranges, negation |
| Quantifiers `*`, `+`, `?`, `{n,m}` | Supported | Greedy and lazy |
| Capturing groups `(...)` | Supported | Named groups converted to numbered |
| Non-capturing groups `(?:...)` | Supported | Direct mapping |
| Lookahead `(?=...)`, `(?!...)` | Supported | Both positive and negative |
| Lookbehind `(?<=...)`, `(?<!...)` | Supported | Both positive and negative |
| Anchors `^`, `$`, `\A`, `\z` | Supported | Direct mapping |
| Word boundaries `\b`, `\B` | Supported | Platform differences may apply |
| Character shorthands `\d`, `\w`, `\s` | Supported | Direct mapping |
| Alternation `a\|b` | Supported | Direct mapping |
| Unicode escapes `\uHHHH` | Supported | Converted to literal |
| Hex escapes `\xHH` | Supported | Converted to literal |
### Unsupported Features (Will Error)
| Feature | Why |
|---------|-----|
| Atomic groups `(?>...)` | Cannot emulate in gleam_regexp |
| Possessive quantifiers `*+`, `++` | Cannot emulate in gleam_regexp |
| Recursion `\g<0>` | Not supported by underlying engines |
| Subroutines `\g<name>` | Not supported by underlying engines |
| Search start `\G` | Requires stateful regex |
| Absence functions `(?~...)` | Cannot emulate |
### Partial Support / Workarounds
| Feature | Handling |
|---------|----------|
| Named captures | Converted to numbered; mapping returned |
| dotAll mode | `.` replaced with `[\s\S]` when enabled |
| Flag modifiers `(?i:...)` | Flags applied during transformation |
| `\K` directive | Warning issued; full match returned |
## Platform Compatibility
This library generates patterns compatible with both:
- Erlang's `re` module (PCRE)
- JavaScript's `RegExp`
Run tests on both targets:
```sh
gleam test --target erlang
gleam test --target javascript
```
## Error Handling
```gleam
import gleam/io
import onigleam

pub fn main() {
  case onigleam.convert("(?>atomic)") {
    Ok(result) -> io.println(result.pattern)
    // Prints "Atomic groups are not supported. ..."
    Error(err) -> io.println(onigleam.format_error(err))
  }
}
```
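
A grammar usually contains many patterns, and it can be convenient to convert them all and collect the failures instead of stopping at the first one. The following is a minimal sketch of that approach. `convert_all` is a hypothetical helper built only from `convert` and `format_error`:

```gleam
import gleam/list
import gleam/result
import onigleam

/// Hypothetical helper: convert every pattern and return a tuple of
/// successful conversions and human-readable failure messages.
pub fn convert_all(patterns: List(String)) {
  patterns
  |> list.map(fn(pattern) {
    onigleam.convert(pattern)
    // Prefix each message with the pattern that caused it.
    |> result.map_error(fn(err) {
      pattern <> ": " <> onigleam.format_error(err)
    })
  })
  // Splits the list into #(ok_values, error_values).
  |> result.partition
}
```

Calling `convert_all(["\\d+", "(?>atomic)"])` should yield one conversion and one failure message, since atomic groups are listed as unsupported above.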
## Development
```sh
gleam test
gleam test --target javascript # Test on JavaScript target
gleam test --target erlang # Test on Erlang target
```
Further documentation can be found at <https://hexdocs.pm/onigleam>.
## License
MIT License. See [LICENSE](LICENSE) for details.