
An OTP library for parsing HTML documents.

This library attempts to follow the HTML 5.2 specification
for tokenizing and parsing the HTML syntax as closely as possible.
As a result, common markup errors that browsers accept are also accepted here and sanitized.

The output from `htmerl:sax/2` is identical to the XML SAX events produced
by `xmerl_sax_parser`, except that all names and values are UTF-8 binaries
rather than lists (strings).
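
To illustrate the difference in representation, here is a minimal sketch that converts a binary-based event into the list-based form by turning every binary into a character list. The module name `sax_compat` and the function `to_lists/1` are hypothetical and not part of `htmerl`. The sketch assumes events consist only of atoms, tuples, lists, and valid UTF-8 binaries, as in the events shown in this document.

```erlang
-module(sax_compat).
-export([to_lists/1]).

%% Hypothetical helper, not part of htmerl.
%% Recursively replaces every binary in an event term with a character list.
%% Assumes all binaries are valid UTF-8.
to_lists(Bin) when is_binary(Bin) ->
    unicode:characters_to_list(Bin);
to_lists(Tuple) when is_tuple(Tuple) ->
    list_to_tuple([to_lists(E) || E <- tuple_to_list(Tuple)]);
to_lists(List) when is_list(List) ->
    [to_lists(E) || E <- List];
to_lists(Other) ->
    %% Atoms such as startDocument pass through unchanged.
    Other.
```

For example, `sax_compat:to_lists({characters, <<"Hello">>})` yields `{characters, "Hello"}`. This kind of conversion is only worth doing when existing code expects list-based events; otherwise working with the binaries directly avoids the extra copying.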

There are two ways to use `htmerl`.
Firstly, you can build a tree directly from the parsed input. Note that the parser adds the missing "head" element.

    1> htmerl:simple(<<"<!DOCTYPE html><html><body>Hello</body></html>">>).

Secondly, you can use it as a SAX parser. Calling `htmerl:sax/1` returns a list of SAX events,
while `htmerl:sax/2` calls a user-defined function for each event.

    2> htmerl:sax(<<"<!DOCTYPE html><html><body>Hello</body></html>">>).

Or with a user-defined function and state:

    3> F = fun(E, _, S) -> io:format("Event: ~p~n", [E]), S end,
    Opts = [{event_fun, F}, {user_state, []}],
    htmerl:sax(<<"<!DOCTYPE html><html><body>Hello</body></html>">>, Opts).
    Event: startDocument
    Event: {startDTD,<<"html">>,<<>>,<<>>}
    Event: endDTD
    Event: {startPrefixMapping,<<>>,<<"">>}
    Event: {startElement,<<"">>,<<"html">>,
    Event: {startElement,<<"">>,<<"head">>,
    Event: {endElement,<<"">>,<<"head">>,
    Event: {startElement,<<"">>,<<"body">>,
    Event: {characters,<<"Hello">>}
    Event: {endElement,<<"">>,<<"body">>,
    Event: {endElement,<<"">>,<<"html">>,
    Event: {endPrefixMapping,<<>>}
    Event: endDocument
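
The event function above only prints each event and returns the state unchanged. As a sketch of how the state argument can be used to accumulate a result, the module below collects the text of all `characters` events. The module name `text_collector` and its functions are hypothetical and not part of `htmerl`; the three-argument shape `(Event, Location, State)` is taken from the fun shown above, and the second argument is ignored here.

```erlang
-module(text_collector).
-export([event/3, text/1]).

%% Hypothetical event fun with the (Event, Location, State) shape used above.
%% The state is a list of text chunks, newest first.
event({characters, Bin}, _Location, Acc) when is_binary(Bin) ->
    [Bin | Acc];
event(_Other, _Location, Acc) ->
    %% Every other event leaves the state unchanged.
    Acc.

%% Folds event/3 over a plain list of events and returns the collected
%% text as a single binary, in document order.
text(Events) ->
    Chunks = lists:foldl(fun(E, Acc) -> event(E, undefined, Acc) end,
                         [], Events),
    iolist_to_binary(lists:reverse(Chunks)).
```

Assuming the options behave as in the session above, `fun text_collector:event/3` could be passed as `event_fun` together with `{user_state, []}`. The exact shape of the value returned by `htmerl:sax/2` is not shown in this document, so it is best checked in the shell before pattern matching on it.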


To build the library with rebar3:

    $ rebar3 compile