
# ExGherkin
Elixir implementation of the `Gherkin` language.

Status: Feature-complete, but more documentation work is still required.

This project aims to be at least functionally equivalent to
the [official `ruby` implementation](https://github.com/cucumber/cucumber/tree/master/gherkin/ruby).
To verify this, detailed tests compare the output generated by the
official tools with the output generated by this implementation.

This project is intentionally kept in its own dedicated `repo` so that others, such as [white_bread](https://github.com/meadsteve/white-bread) and [cabbage](https://github.com/cabbage-ex/cabbage), can leverage this `parser` and
benefit from multi-language support and other features.

## Overview
Most likely, you arrived at this repo because you use the package [ex_cucumber](https://github.com/Ajwah/ex_cucumber).
In that case, the main things you would want from this repo are:
  * Multi Language Support:
    * `gherkin-languages.terms` (resource file): This binary contains all the
    international language support listed at: https://cucumber.io/docs/gherkin/reference/#overview
    * `gherkin-languages.json` (source file): The latest version can be
    downloaded from https://github.com/cucumber/cucumber/blob/master/gherkin/gherkin-languages.json. Use
    `mix gherkin_languages` to generate the resource file after configuring `source` and `resource`
    in `config.exs`, as exemplified below.
    * Feel free to introduce `gherkin keyword` aliases of your own that benefit your
    business domain.

  * Grammar:
    * Formal Grammar Specification: https://github.com/Ajwah/ex-gherkin/blob/master/src/parser.yrl
    * Copious Examples: https://github.com/Ajwah/ex-gherkin/tree/master/test/support/testdata

## Configuration
```elixir
import Config
gherkin_languages = "gherkin-languages"

config :ex_gherkin,
  file: %{
    # to be downloaded. Serves as input for mix task: `mix gherkin_languages`
    source: "#{gherkin_languages}.json",

    # to be generated with the mix task: `mix gherkin_languages`
    resource: "#{gherkin_languages}.few.terms"
  },
  # homonyms: keywords whose meaning (`Given`, `When`, `Then`, `And`, `But`) depends on the context
  homonyms: ["Агар ", "* ", "अनी ", "Tha ", "Þá ", "Ða ", "Þa "],

  # This is only beneficial for development purposes and can be skipped
  debug: %{
    tokenizer: false,
    prepare: false,
    parser: false,
    format_message: false,
    parser_raise: false
  }
```
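
As a rough illustration of the custom keyword aliases mentioned in the overview, the sketch below appends an alias to one step keyword of one language. It assumes the decoded `gherkin-languages.json` is a map from language codes to lists of keywords, as in the upstream file. The module `AliasSketch`, the function `add_alias/4` and the alias `"Assuming "` are hypothetical and not part of this library; decoding and re-encoding the JSON file is left out.

```elixir
defmodule AliasSketch do
  # Hypothetical helper, not part of ExGherkin.
  # Appends `alias_keyword` to the keyword list of `step` ("given", "when", ...)
  # for language `lang`, dropping duplicates. Raises if the language or step
  # is missing from the map.
  def add_alias(languages, lang, step, alias_keyword) do
    update_in(languages, [lang, step], fn keywords ->
      Enum.uniq(keywords ++ [alias_keyword])
    end)
  end
end

# Hand-written excerpt in the shape of a decoded gherkin-languages.json.
languages = %{
  "en" => %{
    "given" => ["* ", "Given "],
    "when" => ["* ", "When "],
    "then" => ["* ", "Then "]
  }
}

updated = AliasSketch.add_alias(languages, "en", "given", "Assuming ")
IO.inspect(updated["en"]["given"])
# prints ["* ", "Given ", "Assuming "]
```

Note the trailing space in the alias, which mirrors the existing keywords. The edited data would still have to be written back to the file configured as `source` before running `mix gherkin_languages`.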

Below this point are more arcane details that would mainly interest maintainers and those who want to use this library
directly to parse `feature`-files.

## Tools

1. Mix tasks:
    1. `mix gherkin_languages`:
       * input: `gherkin-languages.json`
       * output: `gherkin-languages.terms`
       * accepts a variety of options to control:
         * which subset of languages is included
         * homonyms
    2. `mix ast_ndjson`:
       * input: `*.feature`
       * output: `*.feature.ast.ndjson`
       * dependency: `gherkin`-executable (see below)
2. `generate-tokens`:
   * input: `*.feature`
   * output: `*.feature.tokens`
   * dependency: `gherkin`-executable (see below)

The `gherkin`-executable referred to above can be installed with:
* `gem install cucumber`

## API
```elixir
"""
Feature: Minimal

  Scenario: minimalistic
    Given the minimalism

"""
|> ExGherkin.prepare()
|> ExGherkin.run()
```

```elixir
[path: "path_to.feature"]
|> ExGherkin.prepare()
|> ExGherkin.run()
```
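
Building on the `path:` form above, the following is a minimal sketch of how several `feature` files might be parsed in one go. The module `FeatureBatch` and its functions are hypothetical helpers, not part of this library. What `ExGherkin.run/1` returns is not documented here, so the sketch simply collects whatever it returns per file. The parse function is injectable so the directory walk can be tried without the library.

```elixir
defmodule FeatureBatch do
  # Hypothetical helper, not part of ExGherkin.
  # Finds every `*.feature` file under `dir` (recursively) and returns a map
  # of file path => result of `parse_fun` for that path.
  def parse_all(dir, parse_fun \\ &parse_file/1) do
    dir
    |> Path.join("**/*.feature")
    |> Path.wildcard()
    |> Map.new(fn path -> {path, parse_fun.(path)} end)
  end

  # Default parse step: the same pipeline as the `path:` example above.
  def parse_file(path) do
    [path: path]
    |> ExGherkin.prepare()
    |> ExGherkin.run()
  end
end
```

Inside a project that depends on this library, `FeatureBatch.parse_all("features")` would run the default pipeline over a (hypothetical) `features` directory.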

For fine-grained control, there are two parts:

1. `Scanner.tokenize/1,2` tokenizes the `feature` file. It leverages
`gherkin-languages.terms` to provide `i18n` support. This file can be
regenerated with the `mix` task briefly documented above, i.e.
`mix gherkin_languages`. With this task, one can select a subset
of what is contained in [gherkin-languages.json](https://github.com/cucumber/cucumber/blob/master/gherkin/gherkin-languages.json) and/or incorporate one's own
domain-language-specific keywords to denote `Given`, `When`, `Then` etc. An
example of this is the file `gherkin-languages.few.terms`, which was
primarily introduced to bring down the compile time of this project.
`Scanner.tokenize/1,2` [effectively iterates](https://github.com/Ajwah/ex-gherkin/blob/aa32dad70911cf5a7ead186a944dedafc10e2dd1/lib/scanner/scanner.ex#L59-L62) over this file,
generating functions that pattern-match the `i18n` keywords.

2. `Parser.run/1` converts the tokens obtained into an `AST`.
It leverages the `yecc` parser generator under the hood.


Kindly consult the test-files for more detailed usage.
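
The idea behind step 1, generating pattern-matching functions from a list of keywords at compile time, can be sketched as follows. This is not the scanner's actual code: the module `KeywordMatcher`, the function `match/1` and the hand-picked English and French keywords are assumptions made for illustration, whereas the real scanner reads its keywords from the resource file.

```elixir
defmodule KeywordMatcher do
  # Hypothetical, hand-picked subset of keywords (English and French).
  @keywords %{
    given: ["Given ", "Soit "],
    when: ["When ", "Quand "],
    then: ["Then ", "Alors "]
  }

  # At compile time, emit one `match/1` clause per keyword. Each clause
  # pattern-matches the keyword as a binary prefix of the line.
  for {type, words} <- @keywords, word <- words do
    def match(unquote(word) <> rest), do: {unquote(type), unquote(word), rest}
  end

  # Fallback for lines that start with none of the keywords.
  def match(_line), do: :no_match
end

IO.inspect(KeywordMatcher.match("Given the minimalism"))
# prints {:given, "Given ", "the minimalism"}
IO.inspect(KeywordMatcher.match("Quand je clique"))
# prints {:when, "Quand ", "je clique"}
IO.inspect(KeywordMatcher.match("Feature: Minimal"))
# prints :no_match
```

Because every keyword becomes its own function clause, a larger keyword list means more clauses to compile, which is consistent with a reduced `.few.terms` file shortening compile time.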

## Road Forward
* [ ] Introduce more detailed documentation:
    * [ ] Better examples of how to use this parser effectively
    * [ ] Create online documentation to clarify `Syntax`-errors
    * [ ] Provide a detailed account of the various tests implemented
    to prove that this is functionally equivalent to the `ruby`
    implementation.
* [ ] Take feedback from the official team so that this `repo` can
eventually be included in the [mono-repo](https://github.com/cucumber/cucumber).
* [ ] CI/CD.
* [X] Implement `Cucumber` using this tool. See: [ex_cucumber](https://github.com/Ajwah/ex_cucumber)
* [X] Publish to Hex.