# FuzzyCompare

## Getting started

To compare two strings, call `FuzzyCompare.similarity/2`:

    iex> FuzzyCompare.similarity("Oscar-Claude Monet", "monet, claude")

## Inner workings

Imagine you had to [match some names](

Try to match the following list of painters:

  * `"Oscar-Claude Monet"`
  * `"Edouard Manet"`
  * `"Monet, Claude"`

A human can easily see that some of these names are the same name with the
words reordered, while others are different names that merely look similar.

A first approach could be to compare the strings with a string similarity
function such as the Jaro distance, available in Elixir as
`String.jaro_distance/2`:

    iex> String.jaro_distance("Oscar-Claude Monet", "Monet, Claude")

    iex> String.jaro_distance("Oscar-Claude Monet", "Edouard Manet")

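This is not an improvement over exact equality: the Jaro distance depends on
character positions, so a reordered name is not guaranteed to score higher
than a different but similar-looking one.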

In order to improve the results this library uses two different approaches,
`FuzzyCompare.ChunkSet` and `FuzzyCompare.SortedChunks`.

### Sorted chunks

This approach yields good results when the words within a string have been
shuffled around. The strategy splits each string into words, sorts the words,
and compares the resulting sorted strings.

    iex> FuzzyCompare.SortedChunks.substring_similarity("Oscar-Claude Monet", "Monet, Claude")

    iex> FuzzyCompare.SortedChunks.substring_similarity("Oscar-Claude Monet", "Edouard Manet")
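
The idea of sorting words before comparing can be sketched with the standard
library alone. The module `SortedWordsSketch` and its functions are
hypothetical names used for illustration only; this is not the
`FuzzyCompare.SortedChunks` implementation, and the normalization rules
(lowercasing, splitting on anything that is not a letter or digit) are
assumptions.

```elixir
defmodule SortedWordsSketch do
  # Hypothetical helper for illustration; not part of FuzzyCompare.

  # Lowercase the string, split it into words on any run of characters
  # that are neither letters nor digits, sort the words, and join them.
  def normalize(string) do
    string
    |> String.downcase()
    |> String.split(~r/[^\p{L}\p{N}]+/u, trim: true)
    |> Enum.sort()
    |> Enum.join(" ")
  end

  # Compare the normalized forms, so that word order no longer matters.
  def similarity(left, right) do
    String.jaro_distance(normalize(left), normalize(right))
  end
end

SortedWordsSketch.normalize("Monet, Claude")
# => "claude monet"
```

With this sketch, `"Monet, Claude"` and `"Claude Monet"` normalize to the same
string and therefore compare as identical.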

### Chunkset

The chunkset approach works best when the strings contain additional
substrings that are irrelevant to what is being searched for.

    iex> FuzzyCompare.ChunkSet.standard_similarity("Claude Monet", "Alice Hoschedé was the wife of Claude Monet")
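
One way to picture a set-based comparison is the following minimal sketch. It
assumes a token-set style strategy: take the words both strings share, append
each string's remaining words, and keep the best pairwise score. The module
`ChunkSetSketch` is a hypothetical name, and the actual
`FuzzyCompare.ChunkSet` implementation may differ in both normalization and
scoring.

```elixir
defmodule ChunkSetSketch do
  # Hypothetical module for illustration; not the FuzzyCompare.ChunkSet code.

  # Turn a string into a set of lowercase words.
  defp words(string) do
    string
    |> String.downcase()
    |> String.split(~r/[^\p{L}\p{N}]+/u, trim: true)
    |> MapSet.new()
  end

  # Render a set of words as a single string in sorted order.
  defp join(set), do: set |> Enum.sort() |> Enum.join(" ")

  def similarity(left, right) do
    {l, r} = {words(left), words(right)}

    # Words shared by both strings, then each side's shared + unique words.
    common = join(MapSet.intersection(l, r))
    left_full = String.trim(common <> " " <> join(MapSet.difference(l, r)))
    right_full = String.trim(common <> " " <> join(MapSet.difference(r, l)))

    # Keep the best of the three pairwise comparisons.
    [
      String.jaro_distance(common, left_full),
      String.jaro_distance(common, right_full),
      String.jaro_distance(left_full, right_full)
    ]
    |> Enum.max()
  end
end

ChunkSetSketch.similarity(
  "Claude Monet",
  "Alice Hoschedé was the wife of Claude Monet"
)
# => 1.0
```

In this sketch the score is `1.0` because every word of the shorter string
also appears in the longer one, so the extra words do not lower the result.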

### Substring comparison

Should one of the strings be much longer than the other, the library will
attempt to compare matching substrings only.
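
As a rough illustration of comparing against substrings only, the sketch
below slides a window the length of the shorter string across the longer one
and keeps the best Jaro score. This is an assumption about how such a
comparison could work, not a description of the library's actual logic;
`SubstringSketch` and `best_window_similarity/2` are hypothetical names.

```elixir
defmodule SubstringSketch do
  # Hypothetical helper for illustration; not FuzzyCompare's substring logic.

  def best_window_similarity(left, right) do
    # Decide which string is the shorter one.
    {short, long} =
      if String.length(left) <= String.length(right),
        do: {left, right},
        else: {right, left}

    short_len = String.length(short)
    last_start = String.length(long) - short_len

    # Compare the short string against every window of the same length
    # in the long string and keep the highest score.
    0..last_start
    |> Enum.map(fn start ->
      String.jaro_distance(short, String.slice(long, start, short_len))
    end)
    |> Enum.max()
  end
end

SubstringSketch.best_window_similarity(
  "Claude Monet",
  "Alice Hoschedé was the wife of Claude Monet"
)
# => 1.0
```

The result is `1.0` here because the shorter string occurs verbatim at the
end of the longer one. Comparing every window costs time proportional to the
length of the longer string, which is one reason a real implementation might
choose candidate substrings more selectively.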

## Credits

This library is inspired by a [seatgeek blogpost from 2011](