# DocxParse

Judith is a Word docx transpiler: it converts the XML inside a docx file to HTML.  
The docx format is just a zip archive of XML files. Judith takes those XML files and outputs HTML with inline CSS that comes close to the look of the original Word document. Judith is not feature complete, but the result should get close enough.
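
To make the "zip archive of XML files" point concrete, here is a minimal sketch that pulls the main document part out of a docx using Erlang's built-in `:zip` module. The `DocxPeek` module and its `main_part/1` function are hypothetical names used for illustration only, not part of this package, and the sketch assumes the body lives at the conventional `word/document.xml` path.

```elixir
defmodule DocxPeek do
  # Hypothetical helper for illustration; not part of DocxParse.
  # Assumption: the document body is stored at the conventional path below.
  @main_part ~c"word/document.xml"

  # `archive` is either a charlist file path or the raw bytes of the archive,
  # which are the two forms :zip.unzip/2 accepts.
  def main_part(archive) do
    # :memory keeps the extracted data in memory instead of writing to disk;
    # :file_list restricts extraction to the one entry we care about.
    case :zip.unzip(archive, [:memory, {:file_list, [@main_part]}]) do
      {:ok, [{_name, xml}]} -> {:ok, xml}
      {:ok, []} -> {:error, :missing_document_xml}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

When reading from disk, convert the path first, e.g. `DocxPeek.main_part(String.to_charlist("./test/sample_doc.docx"))`, because an Elixir string (a binary) would be treated as archive contents rather than a file name.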

## Installation

If [available in Hex](https://hex.pm/docs/publish), the package can be installed
by adding `docx_parse` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:docx_parse, "~> 0.1.0"}
  ]
end
```

Documentation can be generated with [ExDoc](https://github.com/elixir-lang/ex_doc)
and published on [HexDocs](https://hexdocs.pm). Once published, the docs can
be found at [https://hexdocs.pm/docx_parse](https://hexdocs.pm/docx_parse).

## Usage
```elixir
alias Remote.DocxParse
alias Remote.Docx.Document

# Convert a .docx file to an HTML string.
{:ok, html} = DocxParse.document_to_html("./test/sample_doc.docx")
# html now holds the "html output" string
```

## Reasoning
The purpose of this package is to support the basic functionality of the Word docx format and convert it to HTML. It is important to remember that Word and HTML do not necessarily have equivalent features. For example, HTML doesn't easily support Word-style tiered lists out of the box, but it does support bold and italics.

The resulting HTML will not be perfectly semantic. When combined with inline styles and a stylesheet, the result will look very close to the original Word document.
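
As a rough illustration of the inline-style approach for features that do map cleanly, such as bold and italics, the sketch below turns a set of formatting flags into a `<span>` with inline CSS. The `InlineStyleSketch` module, the flag names, and the CSS mapping are all assumptions made for this example; they do not describe how this package is actually implemented.

```elixir
defmodule InlineStyleSketch do
  # Hypothetical mapping from formatting flags to CSS declarations.
  # The flag names are illustrative, not taken from DocxParse.
  @css [
    bold: "font-weight: bold",
    italic: "font-style: italic",
    underline: "text-decoration: underline"
  ]

  # Wraps `text` in a <span> carrying one CSS declaration per enabled flag.
  # Text without any enabled flag is returned escaped but unwrapped.
  def span(text, props) do
    declarations = for {flag, css} <- @css, Map.get(props, flag, false), do: css
    escaped = escape(text)

    case declarations do
      [] ->
        escaped

      _ ->
        style = Enum.join(declarations, "; ")
        ~s(<span style="#{style}">#{escaped}</span>)
    end
  end

  # Minimal HTML escaping; "&" must be replaced first.
  defp escape(text) do
    text
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
  end
end
```

For instance, `InlineStyleSketch.span("Hi", %{bold: true})` yields a span styled with `font-weight: bold`. The markup is presentational rather than semantic (no `<strong>` or `<em>`), which is the trade-off described above.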