# Meeseeks

Meeseeks is an Elixir library for parsing and extracting data from HTML and XML with CSS or XPath selectors.

import Meeseeks.CSS

html = HTTPoison.get!("").body

for story <- Meeseeks.all(html, css("tr.athing")) do
  title = Meeseeks.one(story, css(".title a"))

  %{
    title: Meeseeks.text(title),
    url: Meeseeks.attr(title, "href")
  }
end
#=> [%{title: "...", url: "..."}, %{title: "...", url: "..."}, ...]

## Features

- Friendly API
- Browser-grade HTML5 parser
- Permissive XML parser
- CSS and XPath selectors
- Supports custom selectors
- Helpers to extract data from selections

## Compatibility

Meeseeks requires a minimum combination of Elixir 1.12.0 and Erlang/OTP 23.0, and is tested with a maximum combination of Elixir 1.14.0 and Erlang/OTP 25.0.

## Installation

Meeseeks depends on the Rust library `html5ever` via `meeseeks_html5ever`, but because `meeseeks_html5ever` provides pre-compiled NIFs via `rustler_precompiled`, **you do not need to have Rust installed** to use Meeseeks.

To install Meeseeks, add it to your `mix.exs`:

defp deps do
  [
    {:meeseeks, "~> 0.17.0"}
  ]
end

Then run `mix deps.get`.

### Force Compilation

If you need to force compilation of the Rust NIF for some reason, see the force compilation instructions for `meeseeks_html5ever`.

## Getting Started

### Parse

Start by parsing a source (an HTML/XML string or a `Meeseeks.TupleTree`) into a `Meeseeks.Document` so that it can be queried.

`Meeseeks.parse/1` parses the source as HTML, but `Meeseeks.parse/2` accepts a second argument of either `:html`, `:xml`, or `:tuple_tree` that specifies how the source is parsed.

document = Meeseeks.parse("<div id=main><p>1</p><p>2</p><p>3</p></div>")
#=> #Meeseeks.Document<{...}>
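
The other two source types mentioned above can be parsed the same way. The following is a minimal sketch, assuming Meeseeks `~> 0.17.0` is fetched with `Mix.install/1` in a standalone script; the sample XML and tuple tree are made up for illustration.

```elixir
# Fetch Meeseeks for a standalone script (assumes network access on first run).
Mix.install([{:meeseeks, "~> 0.17.0"}])

# Parse a string with the XML parser instead of the HTML5 parser.
xml_doc = Meeseeks.parse("<feed><entry>1</entry></feed>", :xml)

# Parse an existing tuple tree of the form {tag, attributes, children}.
tree_doc = Meeseeks.parse({"p", [], ["1"]}, :tuple_tree)

# Both values should be Meeseeks.Document structs that can be queried.
IO.inspect(xml_doc)
IO.inspect(tree_doc)
```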

The selection functions accept an unparsed source, parsing it as HTML, but parsing is expensive so parse ahead of time when running multiple selections on the same document.

### Select

Next, use one of Meeseeks's selection functions - `fetch_all`, `all`, `fetch_one`, or `one` - to search for nodes.

All of these functions accept a queryable (a source, a document, or a `Meeseeks.Result`), one or more `Meeseeks.Selector`s, and optionally an initial context.

`all` returns a (possibly empty) list of results representing every node matching one of the provided selectors, while `one` returns a result representing the first node to match a selector (depth-first) or nil if there is no match.

`fetch_all` and `fetch_one` work like `all` and `one` respectively, but wrap the result in `{:ok, ...}` if there is a match or return `{:error, %Meeseeks.Error{type: :select, reason: :no_match}}` if there is not.
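
The difference between the plain and `fetch_` variants shows up most clearly when nothing matches. The sketch below assumes Meeseeks `~> 0.17.0` installed via `Mix.install/1`; `describe` is a hypothetical helper written for this example, not part of Meeseeks.

```elixir
Mix.install([{:meeseeks, "~> 0.17.0"}])

import Meeseeks.CSS

document = Meeseeks.parse("<div id=main><p>1</p><p>2</p><p>3</p></div>")

# Hypothetical helper: turn a fetch_one result into a simple tagged value.
describe = fn selector ->
  case Meeseeks.fetch_one(document, selector) do
    {:ok, result} ->
      {:found, Meeseeks.text(result)}

    {:error, %Meeseeks.Error{type: :select, reason: :no_match}} ->
      :not_found
  end
end

IO.inspect(describe.(css("#main p")))
IO.inspect(describe.(css("#main table")))

# Without the fetch_ prefix, a miss is an empty list or nil.
IO.inspect(Meeseeks.all(document, css("table")))
IO.inspect(Meeseeks.one(document, css("table")))
```

Matching on the error tuple keeps the no-match case explicit, which can be easier to handle in a `with` chain than checking for `nil`.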

To generate selectors, use the `css` macro provided by `Meeseeks.CSS` or the `xpath` macro provided by `Meeseeks.XPath`.

import Meeseeks.CSS
result = Meeseeks.one(document, css("#main p"))
#=> #Meeseeks.Result<{ <p>1</p> }>

import Meeseeks.XPath
result = Meeseeks.one(document, xpath("//*[@id='main']//p"))
#=> #Meeseeks.Result<{ <p>1</p> }>

### Extract

Retrieve information from the `Meeseeks.Result` with an extractor.

The included extractors are `attr`, `attrs`, `data`, `dataset`, `html`, `own_text`, `tag`, `text`, and `tree`.

Meeseeks.tag(result)
#=> "p"
Meeseeks.text(result)
#=> "1"
Meeseeks.tree(result)
#=> {"p", [], ["1"]}
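
A few of the other extractors are easier to see on an element that has attributes and nested markup. This is a rough sketch, assuming Meeseeks `~> 0.17.0` installed via `Mix.install/1`; the sample HTML is invented for illustration, and the results are printed rather than asserted here.

```elixir
Mix.install([{:meeseeks, "~> 0.17.0"}])

import Meeseeks.CSS

# Illustrative markup, not taken from the Meeseeks documentation.
html = ~s(<a id="home" href="/index.html" data-kind="nav">Home <b>page</b></a>)

# Selection functions accept an unparsed source and parse it as HTML.
link = Meeseeks.one(html, css("a"))

# A single attribute value, looked up by name.
IO.inspect(Meeseeks.attr(link, "href"))

# All attributes of the element.
IO.inspect(Meeseeks.attrs(link))

# Values taken from the element's data-* attributes.
IO.inspect(Meeseeks.dataset(link))

# Text of the element itself versus text including its descendants.
IO.inspect(Meeseeks.own_text(link))
IO.inspect(Meeseeks.text(link))
```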

The extractors `html` and `tree` work on `Meeseeks.Document` in addition to `Meeseeks.Result`.

Meeseeks.html(document)
#=> "<html><head></head><body><div id=\"main\"><p>1</p><p>2</p><p>3</p></div></body></html>"

## Guides

- [Meeseeks vs. Floki](guides/
- [CSS Selectors](guides/
- [XPath Selectors](guides/
- [Custom Selectors](guides/
- [Deployment](guides/

## Contributing

If you are interested in contributing, please read the contribution guidelines.

## License

Meeseeks is licensed under the MIT license.