# HtmlSanitizeEx [](https://github.com/rrrene/html_sanitize_ex/actions/workflows/ci-workflow.yml) [](http://inch-ci.org/github/rrrene/html_sanitize_ex)
`html_sanitize_ex` provides a fast and straightforward HTML Sanitizer written in Elixir which lets you include HTML authored by third-parties in your web application while protecting against XSS.
It is the first Hex package to come out of the [elixirstatus.com](http://elixirstatus.com) project, where it will be used to sanitize user announcements from the Elixir community.
## What can it do?
`html_sanitize_ex` parses a given HTML string and, based on the used [Scrubber](https://github.com/rrrene/html_sanitize_ex/tree/master/lib/html_sanitize_ex/scrubber), either completely strips it from HTML tags or sanitizes it by only allowing certain HTML elements and attributes to be present.
## Installation
Add html_sanitize_ex as a dependency in your `mix.exs` file.
```elixir
defp deps do
[{:html_sanitize_ex, "~> 1.4"}]
end
```
After adding you are done, run `mix deps.get` in your shell to fetch the new dependency.
The only dependency of `html_sanitize_ex` is `mochiweb` which is used to parse HTML.
## Usage
Depending on the scrubber you select, it can strip all tags from the given string:
```elixir
text = "<a href=\"javascript:alert('XSS');\">text here</a>"
HtmlSanitizeEx.strip_tags(text)
# => "text here"
```
Or allow certain basic HTML elements to remain:
```elixir
text = "<h1>Hello <script>World!</script></h1>"
HtmlSanitizeEx.basic_html(text)
# => "<h1>Hello World!</h1>"
```
There are built-in scrubbers that cover common use cases, but you can also
easily define custom scrubbers (see the next section).
The following default scrubbing options exist:
```elixir
HtmlSanitizeEx.basic_html(html)
HtmlSanitizeEx.html5(html)
HtmlSanitizeEx.markdown_html(html)
HtmlSanitizeEx.strip_tags(html)
```
There is also one scrubber primarily used for testing:
```elixir
HtmlSanitizeEx.noscrub(html)
```
Before using or extending a built-in scrubber, you should verify that it functions in the way
you expect. The built-in scrubbers are located in
[/lib/html_sanitize_ex/scrubber](https://github.com/rrrene/html_sanitize_ex/tree/master/lib/html_sanitize_ex/scrubber)
## Custom Scrubbers
A custom scrubber has the advantage of allowing you to support only the minimum
functionality needed for your use case.
With a custom scrubber, you define which tags, attributes, and uri schemes (e.g.
`https`, `mailto`, `javascript`, etc.) are allowed. Anything not allowed can
then be stripped out.
Here is an example of a custom scrubber which allows only `p`, `h1`, and
`a` tags, and restricts the `href` attribute to only the `https` and `mailto`
[URI schemes](https://en.wikipedia.org/wiki/List_of_URI_schemes). It also
removes CDATA sections and comments.
```elixir
defmodule MyProject.MyScrubber do
use HtmlSanitizeEx
allow_tag_with_these_attributes("p", [])
allow_tag_with_these_attributes("h1", [])
allow_tag_with_uri_attributes("a", ["href"], ["https", "mailto"])
end
```
Then, you can use the scrubber in your project by calling `MyProject.MyScrubber.sanitize/1`:
```elixir
text = "<h1>Hello <script>World!</script></h1>"
MyProject.MyScrubber.sanitize(text)
# => "<h1>Hello World!</h1>"
```
A great way to make a custom scrubber is to use one the of built-in scrubbers closest to your use case as a template.
The built in scrubbers are located in
[/lib/html_sanitize_ex/scrubber](https://github.com/rrrene/html_sanitize_ex/tree/master/lib/html_sanitize_ex/scrubber)
## Extending Scrubbers
Let's say you love `HtmlSanitizeEx.basic_html/1`, you just need it to also support the `small` tag (for whatever reason).
You can extend any scrubber by using the `:extend` option.
```elixir
defmodule MyProject.MyScrubber do
use HtmlSanitizeEx, extend: :basic_html
allow_tag_with_these_attributes("small", [])
end
```
You can extend `:basic_html`, `:html5`, `:markdown_html` and `:strip_tags` to extend built-in functionality and you can also extend any custom scrubber you created:
```elixir
defmodule MyProject.MyOtherScrubber do
use HtmlSanitizeEx, extend: MyProject.MyScrubber
allow_tag_with_these_attributes("p", ["class"])
end
```
The result is a scrubber that works like the built-in BasicHTML scrubber, but also allows `small` tags and `class` attributes on `<p>` tags.
## Contributing
1. [Fork it!](http://github.com/rrrene/html_sanitize_ex/fork)
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create new Pull Request
## Author
René Föhring (@rrrene)
## License
html_sanitize_ex is released under the MIT License. See the LICENSE file for further
details.