# Expath
[Hex.pm](https://hex.pm/packages/expath)
[Documentation](https://hexdocs.pm/expath)
[CI](https://github.com/wearecococo/expath/actions)
**Lightning-fast XML parsing and XPath querying for Elixir, powered by Rust NIFs.**
Expath processes XML through Rust's `sxd-document` and `sxd-xpath` libraries and runs **2-10x faster** than SweetXml in the benchmarks below, with the largest gains on large documents.
## ✨ Key Features
- **🚀 Blazing Fast**: 2-10x faster than SweetXml with Rust-powered NIFs
- **🔄 Parse-Once, Query-Many**: Efficient document reuse for multiple XPath queries
- **🛡️ Battle-Tested**: Built on proven Rust XML libraries (sxd-document, sxd-xpath)
- **🎯 Simple API**: Clean, intuitive interface with comprehensive documentation
- **⚡ Thread-Safe**: Safe concurrent access to parsed documents
- **🔧 Zero Dependencies**: No external XML parsers required
## 🚀 Quick Start
### Installation
Add `expath` to your list of dependencies in `mix.exs`:
```elixir
def deps do
  [
    {:expath, "~> 0.1.0"}
  ]
end
```
Then run:
```bash
mix deps.get
mix deps.compile
```
### Basic Usage
```elixir
# Simple XPath query
xml = """
<library>
  <book id="1">
    <title>The Great Gatsby</title>
    <author>F. Scott Fitzgerald</author>
  </book>
  <book id="2">
    <title>1984</title>
    <author>George Orwell</author>
  </book>
</library>
"""
# Extract all book titles
{:ok, titles} = Expath.select(xml, "//title/text()")
# => ["The Great Gatsby", "1984"]
# Find specific book
{:ok, [title]} = Expath.select(xml, "//book[@id='1']/title/text()")
# => ["The Great Gatsby"]
# Count books
{:ok, [count]} = Expath.select(xml, "count(//book)")
# => ["2"]
```
### Parse-Once, Query-Many (Recommended for Multiple Queries)
```elixir
# Parse document once
{:ok, doc} = Expath.new(xml)
# Run multiple queries efficiently
{:ok, titles} = Expath.query(doc, "//title/text()")
{:ok, authors} = Expath.query(doc, "//author/text()")
{:ok, book_count} = Expath.query(doc, "count(//book)")
# The document is freed automatically once it is garbage collected
```
## 📊 Performance Benchmarks
Real-world performance comparison with SweetXml across different document sizes:
| Document Size | Speed Improvement | Use Case |
|---------------|------------------|----------|
| Small (644B) | **2-3x faster** | API responses, config files |
| Medium (5.6KB) | **2.3x faster** | RSS feeds, small datasets |
| Large (904KB) | **8-10x faster** | Large documents, bulk processing |
### Benchmark Results Summary
```
*** Large XML Performance ***
Expath (Rust NIFs) 78.27 iterations/sec (12.78 ms avg)
SweetXml 7.77 iterations/sec (128.64 ms avg)
Comparison: Expath is 10.07x faster
```
Run your own benchmarks:
```bash
mix run bench/benchmark.exs
```
## 📖 API Reference
### Core Functions
#### `Expath.select/2` - Single Query
Perfect for one-off XPath queries.
```elixir
Expath.select(xml_string, xpath_expression)
# Returns: {:ok, results} | {:error, reason}
```
#### `Expath.new/1` - Parse Document
Creates a reusable document for multiple queries.
```elixir
{:ok, doc} = Expath.new(xml_string)
# Returns: {:ok, %Expath.Document{}} | {:error, reason}
```
#### `Expath.query/2` - Query Parsed Document
Query a previously parsed document.
```elixir
{:ok, results} = Expath.query(document, xpath_expression)
# Returns: {:ok, results} | {:error, reason}
```
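The pattern matches in the examples above raise a `MatchError` when a call returns `{:error, reason}`. A minimal sketch of propagating errors with `with` instead might look like the following. `TitleLookup` and its injectable `parse`/`query` arguments are hypothetical names used for illustration and are not part of Expath.

```elixir
defmodule TitleLookup do
  # Hypothetical helper, not part of Expath.
  # `parse` and `query` default to the documented Expath functions; they are
  # arguments only so the control flow can be exercised with stand-ins.
  def titles(xml, parse \\ &Expath.new/1, query \\ &Expath.query/2) do
    # Any step that does not match {:ok, _} is returned unchanged,
    # so {:error, reason} reaches the caller instead of raising.
    with {:ok, doc} <- parse.(xml),
         {:ok, titles} <- query.(doc, "//title/text()") do
      {:ok, titles}
    end
  end
end
```

Calling `TitleLookup.titles(xml)` then returns either `{:ok, titles}` or the first `{:error, reason}` encountered.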
### XPath Support
Expath supports the full XPath 1.0 specification:
```elixir
# Node selection
Expath.select(xml, "//book") # All book elements
Expath.select(xml, "/library/book[1]") # First book
Expath.select(xml, "//book[@id='1']") # Book with id="1"
# Text extraction
Expath.select(xml, "//title/text()") # All title text
Expath.select(xml, "//book/@id") # All id attributes
# Functions
Expath.select(xml, "count(//book)") # Count books
Expath.select(xml, "//book[position()=1]") # First book
Expath.select(xml, "//book[contains(@class,'fiction')]") # Contains filter
# Complex expressions
Expath.select(xml, "//book[price > 10]/title/text()") # Conditional selection
```
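The `count(//book)` example under Basic Usage shows its result as `["2"]`, which suggests that numeric results arrive as strings. Assuming that holds, a small conversion helper could look like this sketch. `ResultCast` is a hypothetical module, not part of Expath.

```elixir
defmodule ResultCast do
  # Hypothetical helper, not part of Expath. Assumes a numeric XPath result
  # arrives as a single-element list holding a string, e.g. ["2"].
  def to_number([value]) when is_binary(value) do
    # XPath 1.0 numbers are floating point, so parse as a float and
    # reject any trailing non-numeric text.
    case Float.parse(value) do
      {number, ""} -> {:ok, number}
      _ -> {:error, :not_a_number}
    end
  end

  # Empty lists, multi-element lists, and non-list values are rejected.
  def to_number(_other), do: {:error, :not_a_number}
end
```

With this in place, the result list from `Expath.select(xml, "count(//book)")` can be passed to `ResultCast.to_number/1` before doing arithmetic on it.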
## Error Handling
Expath provides detailed error information:
```elixir
# Invalid XML (detected during query)
{:error, :invalid_xml} = Expath.select("<root><unclosed>", "/*")
# Invalid XPath expression
{:error, :invalid_xpath} = Expath.select(xml, "//[invalid")
# XPath evaluation errors
{:error, :xpath_error} = Expath.query(doc, "unknown-function()")
```
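In application code it is often convenient to turn these atoms into readable messages. The sketch below assumes the three error atoms shown above are the ones of interest and falls back to `inspect/1` for anything else. `XmlErrors` is a hypothetical module, not part of Expath.

```elixir
defmodule XmlErrors do
  # Hypothetical helper, not part of Expath: maps the error atoms shown
  # in this README to log-friendly messages.
  def describe(:invalid_xml), do: "the input is not well-formed XML"
  def describe(:invalid_xpath), do: "the XPath expression could not be parsed"
  def describe(:xpath_error), do: "the XPath expression failed during evaluation"
  # Fallback for any reason not listed in this README.
  def describe(other), do: "unexpected error: #{inspect(other)}"

  # Formats the result tuple of a select or query call.
  def format({:ok, results}), do: "matched #{length(results)} node(s)"
  def format({:error, reason}), do: "query failed: " <> describe(reason)
end
```

A call such as `xml |> Expath.select("//title/text()") |> XmlErrors.format()` then yields a single string suitable for logging.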
## Performance
Expath is designed for high-performance XML processing:
- **Native Speed**: Rust NIFs provide near-native performance
- **Zero-Copy**: Efficient string handling between Elixir and Rust
- **Resource Caching**: Parse once, query many times without re-parsing
- **Memory Efficient**: Automatic memory management via Erlang garbage collection
### Performance Example
```elixir
# Large XML document
xml = File.read!("large_document.xml")
# Parse once (expensive operation)
{:ok, doc} = Expath.new(xml)
# Multiple queries (very fast - no re-parsing)
Enum.each(1..1000, fn _i ->
  {:ok, _results} = Expath.query(doc, "//some/xpath")
end)
```
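To see how much re-parsing costs on your own documents, a rough timing sketch built on `:timer.tc/1` might look like this. `Timing` is a hypothetical module, not part of Expath, and the numbers it produces are only indicative; the benchmark script in this repository is the better tool for careful measurements.

```elixir
defmodule Timing do
  # Hypothetical helper, not part of Expath: runs `fun` `n` times and
  # returns the average wall-clock time per call in microseconds.
  def average_us(fun, n) when is_function(fun, 0) and n > 0 do
    {total_us, _result} = :timer.tc(fn -> Enum.each(1..n, fn _ -> fun.() end) end)
    total_us / n
  end

  # Compares one-shot select (parses on every call) with querying a
  # document that was parsed once up front.
  def compare(xml, xpath, n \\ 100) do
    {:ok, doc} = Expath.new(xml)

    %{
      select_us: average_us(fn -> Expath.select(xml, xpath) end, n),
      query_us: average_us(fn -> Expath.query(doc, xpath) end, n)
    }
  end
end
```

For example, `Timing.compare(File.read!("large_document.xml"), "//some/xpath")` returns a map with both averages.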
## Platform Support
Expath supports all platforms where Rust and Erlang are available:
- **Linux** (x86_64, aarch64)
- **macOS** (Intel, Apple Silicon)
- **Windows** (x86_64)
### Apple Silicon (M1/M2) Setup
Expath includes special configuration for Apple Silicon Macs. If you encounter linking issues, ensure you have:
1. Native Erlang installation (not x86_64 via Rosetta)
2. Native Rust toolchain for aarch64-apple-darwin
The included Cargo configuration handles the necessary linker flags automatically.
## Examples
### RSS Feed Processing
```elixir
defmodule RSSProcessor do
  def process_feed(rss_xml) do
    {:ok, doc} = Expath.new(rss_xml)
    {:ok, titles} = Expath.query(doc, "//item/title/text()")
    {:ok, links} = Expath.query(doc, "//item/link/text()")
    {:ok, descriptions} = Expath.query(doc, "//item/description/text()")

    # Zip the three result lists element-wise into {title, link, description}
    [titles, links, descriptions]
    |> Enum.zip()
    |> Enum.map(fn {title, link, description} ->
      %{title: title, link: link, description: description}
    end)
  end
end
```
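`Enum.zip/1` stops at the shortest list, so a feed in which some item lacks a `description` would silently misalign or drop fields. One way to guard against that, sketched here under the assumption that each query returns one entry per matching node, is to check the list lengths before zipping. `SafeZip` is a hypothetical module, not part of Expath.

```elixir
defmodule SafeZip do
  # Hypothetical helper, not part of Expath: zips per-field result lists
  # only when every list has the same length.
  def zip_fields(fields) when is_list(fields) do
    lengths = Enum.map(fields, &length/1)

    case Enum.uniq(lengths) do
      # All lists share one length: safe to zip element-wise.
      [_single_length] -> {:ok, Enum.zip(fields)}
      # No field lists at all: nothing to zip.
      [] -> {:ok, []}
      # Lengths differ: report them instead of truncating silently.
      _mismatch -> {:error, {:length_mismatch, lengths}}
    end
  end
end
```

In `process_feed/1`, the zip step could call `SafeZip.zip_fields([titles, links, descriptions])` and handle the `{:error, {:length_mismatch, lengths}}` case explicitly.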
### Configuration File Parsing
```elixir
defmodule ConfigParser do
  def parse_config(xml_config) do
    {:ok, doc} = Expath.new(xml_config)
    # Each query returns a list; match the single expected attribute value
    {:ok, [database_host]} = Expath.query(doc, "//database/@host")
    {:ok, [database_port]} = Expath.query(doc, "//database/@port")
    {:ok, features} = Expath.query(doc, "//features/feature/@name")

    %{
      database: %{host: database_host, port: database_port},
      features: features
    }
  end
end
```
### Data Extraction Pipeline
```elixir
defmodule DataExtractor do
  def extract_products(xml_data) do
    {:ok, doc} = Expath.new(xml_data)

    # Extract in parallel using cached document
    tasks = [
      Task.async(fn -> Expath.query(doc, "//product/@id") end),
      Task.async(fn -> Expath.query(doc, "//product/name/text()") end),
      Task.async(fn -> Expath.query(doc, "//product/price/text()") end),
      Task.async(fn -> Expath.query(doc, "//product/category/text()") end)
    ]

    [ids, names, prices, categories] =
      tasks
      |> Enum.map(&Task.await/1)
      |> Enum.map(fn {:ok, results} -> results end)

    [ids, names, prices, categories]
    |> Enum.zip()
    |> Enum.map(fn {id, name, price, category} ->
      %{id: id, name: name, price: price, category: category}
    end)
  end
end
```
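When the set of XPath expressions is only known at runtime, `Task.async_stream/3` can stand in for the hand-written task list. The following is a sketch under that assumption. `ParallelQueries` and its injectable `query` argument are hypothetical and not part of Expath.

```elixir
defmodule ParallelQueries do
  # Hypothetical helper, not part of Expath. `query` defaults to the
  # documented Expath.query/2 and is an argument only so the concurrency
  # logic can be exercised with a stand-in.
  def run(doc, xpaths, query \\ &Expath.query/2) do
    xpaths
    # Runs the queries concurrently; results come back in input order.
    |> Task.async_stream(fn xpath -> query.(doc, xpath) end, ordered: true)
    # Each stream element is {:ok, value}, where value is the query's own
    # {:ok, results} | {:error, reason} tuple.
    |> Enum.map(fn {:ok, result} -> result end)
  end
end
```

`ParallelQueries.run(doc, ["//product/@id", "//product/name/text()"])` returns one result tuple per expression, in the same order as the input list.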
## Development
### Prerequisites
- Elixir 1.18 or later
- Erlang/OTP 27 or later
- Rust 1.70 or later
- C compiler (gcc, clang, or MSVC)
### Building from Source
```bash
git clone https://github.com/wearecococo/expath.git
cd expath
mix deps.get
mix compile
```
### Running Tests
```bash
mix test
```
### Building Documentation
```bash
mix docs
```
### Docker Development
For cross-platform testing or if you prefer containerized development, Expath includes comprehensive Docker support:
#### Quick Start with Docker
```bash
# Run all tests in Linux container
./scripts/docker-test.sh
# Or use docker-compose for specific tasks
docker-compose run test
docker-compose run benchmark
docker-compose run quality
```
#### Available Docker Services
- **`dev`**: Development environment with all dependencies
- **`test`**: Run the full test suite
- **`benchmark`**: Execute performance benchmarks
- **`quality`**: Run code quality checks (Credo)
#### Docker Commands
```bash
# Build and test everything
docker-compose up test
# Run interactive development shell
docker-compose run dev iex -S mix
# Execute benchmarks
docker-compose run benchmark
# Check code quality
docker-compose run quality
# Clean up containers
docker-compose down --volumes
```
#### Multi-Architecture Testing
The Docker setup supports testing on different architectures:
```bash
# Test on current architecture
docker-compose run test
# Build for specific platform (requires BuildKit)
DOCKER_PLATFORM=linux/amd64 docker-compose run test
```
This is particularly useful for confirming that the NIFs build and run correctly on each target platform before deployment.
## Contributing
1. Fork the repository
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Write tests for your changes
4. Ensure all tests pass (`mix test`)
5. Commit your changes (`git commit -am 'Add some feature'`)
6. Push to the branch (`git push origin my-new-feature`)
7. Create a Pull Request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Built on top of the excellent [sxd-document](https://crates.io/crates/sxd-document) and [sxd-xpath](https://crates.io/crates/sxd-xpath) Rust crates
- Uses [Rustler](https://github.com/rusterlium/rustler) for safe Elixir-Rust interoperability
- Inspired by the need for high-performance XML processing in Elixir applications
## Changelog
### v0.1.0 (Initial Release)
- High-performance XML parsing via Rust NIFs
- Full XPath 1.0 support
- Parse-once, query-many Document resource API
- Comprehensive error handling
- Apple Silicon support
- Complete test suite and documentation