# Changelog
## 3.0.2 (2022-10-25)
- Ensure that an escaped field is included in the results when it is the last field on a last line without a trailing newline
## 3.0.1 (2022-10-25)
- Ensure that stray escape quotes and unterminated escape sequences on a last line without a newline produce errors
## 3.0.0 (2022-10-25)
- Replace the parallel parser/lexer with a binary-matching parser that has better performance.
- A new `:field_transform` option takes a function that is applied to every field as it is decoded
- The escape character can now be specified using the `:escape_character` option. [Closes #59](https://github.com/beatrichartz/csv/issues/59)
- The library will now reparse lines that follow an error such as an unterminated escape sequence. This ensures that all
valid rows are returned in normal mode
- Encoding checks have been removed because they can either be done using `:field_transform` or outside the library
- Better docs
### Upgrading from 2.x
- **Parallelism has been removed**, alongside its options `:num_workers` and `:worker_work_ratio`. You can safely remove them.
- **`StrayQuoteError` is now `StrayEscapeCharacterError`**. If you catch this error in your code, you need to rename it.
- **The `:strip_fields` option needs to be replaced** with the `:field_transform` option:
```elixir
File.stream!("data.csv") |> CSV.decode(field_transform: &String.trim/1)
```
- **`:validate_row_length` now defaults to `false`**. This option produces an error for rows of differing lengths. Set it
to `true` to get the same behaviour as in 2.x.
- **`:escape_formulas` is now `:unescape_formulas` for `decode` and `decode!`.** It is still `:escape_formulas` for
`encode`. Change `:escape_formulas` to `:unescape_formulas` in `decode` and `decode!` calls to get the same behaviour as in 2.x.
- **`:escape_max_lines` now defaults to `10`** instead of `1000`. To get the same behaviour as in 2.x, use:
```elixir
File.stream!("data.csv") |> CSV.decode(escape_max_lines: 1000)
```
- **`:replace` has been removed**. `CSV` will now return fields with incorrect encoding as-is.
You can use the new `:field_transform` option to provide a function that transforms fields while they are being parsed.
This allows you, for example, to replace incorrectly encoded characters:
```elixir
defp replace_bad_encoding(field) do
  if String.valid?(field) do
    field
  else
    field
    |> String.codepoints()
    |> Enum.map(fn codepoint -> if String.valid?(codepoint), do: codepoint, else: "?" end)
    |> Enum.join()
  end
end
```
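
The option changes listed above could be collected in one place. The following is only a sketch: `CsvUpgrade` and `migrate_decode_opts/1` are hypothetical names and not part of the library, and the helper assumes that 2.x options are passed around as a keyword list. It drops the removed options, renames `:escape_formulas`, turns `strip_fields: true` into a `:field_transform`, and fills in the 2.x defaults when they are not set explicitly.

```elixir
defmodule CsvUpgrade do
  # Hypothetical helper for illustration; not part of the CSV library.

  # Options that no longer exist in 3.x. Dropping `:replace` means fields
  # with incorrect encoding are returned as-is unless a `:field_transform`
  # handles them.
  @removed [:num_workers, :worker_work_ratio, :replace]

  # Translates a 2.x keyword list of decode options into 3.x options.
  def migrate_decode_opts(opts) do
    opts
    |> Keyword.drop(@removed)
    |> rename(:escape_formulas, :unescape_formulas)
    |> strip_fields_to_transform()
    # Restore the 2.x defaults unless the caller set these explicitly.
    |> Keyword.put_new(:validate_row_length, true)
    |> Keyword.put_new(:escape_max_lines, 1000)
  end

  # Moves the value stored under `old` to `new`, if `old` is present.
  defp rename(opts, old, new) do
    case Keyword.pop(opts, old) do
      {nil, rest} -> rest
      {value, rest} -> Keyword.put(rest, new, value)
    end
  end

  # `strip_fields: true` becomes a trimming `:field_transform`.
  defp strip_fields_to_transform(opts) do
    case Keyword.pop(opts, :strip_fields) do
      {true, rest} -> Keyword.put_new(rest, :field_transform, &String.trim/1)
      {_, rest} -> rest
    end
  end
end
```

With such a helper, an existing call might become `CSV.decode(stream, CsvUpgrade.migrate_decode_opts(old_opts))`, where `old_opts` stands for whatever options the 2.x call used.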
## 2.5.0 (2022-09-17)
- Optional parameter `escape_formulas` to prevent CSV injection. [Fixes #103](https://github.com/beatrichartz/csv/issues/103) reported by [@maennchen](https://github.com/maennchen). Contributed by [@maennchen](https://github.com/maennchen) in [PR #104](https://github.com/beatrichartz/csv/pull/104).
- Optional parameter `force_quotes` to force quotes when encoding. Contributed by [@stuart](https://github.com/stuart).
- Bugfix to pass non UTF-8 lines through in normal mode so other lines can be processed, [Fixes #107](https://github.com/beatrichartz/csv/pull/107). Contributed by [@al2o3cr](https://github.com/al2o3cr).
- Allow encoding keyword lists, specifying headers as values. Contributed by [@michaelchu](https://github.com/michaelchu).
- Better docs thanks to [@kianmeng](https://github.com/kianmeng)
## 2.4.1 (2020-09-12)
- Fix unnecessary escaping of delimiters when encoding. [Fixes #70](https://github.com/beatrichartz/csv/issues/70),
reported by [@karmajunkie](https://github.com/karmajunkie).
## 2.4.0 (2020-09-12)
- Fix [StrayQuoteError](https://hexdocs.pm/csv/CSV.StrayQuoteError.html) not getting
passed the correct arguments in strict mode. [Fixes #96](https://github.com/beatrichartz/csv/issues/96).
- When headers are present multiple times and the `:headers` option is set to `true`, parse the values into a list.
Contributed by [@MrAlexLau](https://github.com/MrAlexLau) in [PR #97](https://github.com/beatrichartz/csv/pull/97).
## 2.3.1 (2019-03-30)
- Fix [StrayQuoteError](https://hexdocs.pm/csv/CSV.StrayQuoteError.html) incorrectly
getting raised when escape sequences end in new lines. [Fixes #89](https://github.com/beatrichartz/csv/issues/89).
Raised by [@rockwood](https://github.com/rockwood) in [Issue #96](https://github.com/beatrichartz/csv/issues/96).
## 2.3.0 (2019-03-17)
- Add [StrayQuoteError](https://hexdocs.pm/csv/CSV.StrayQuoteError.html) which gets
raised when a row has stray quotes rather than [EscapeSequenceError](https://hexdocs.pm/csv/CSV.EscapeSequenceError.html#content)
to help with common encoding errors.
## 2.2.0 (2019-03-03)
- Make syntax compatible with latest Elixir releases
- Add [`validate_row_length:` option](https://hexdocs.pm/csv/CSV.html#decode/2-options) defaulting to true to allow
disabling validation of row length.
## 2.0.0 (2017-05-29)
- Make [`decode`](https://hexdocs.pm/csv/CSV.html#decode/2) return row and
error tuples instead of raising errors directly
- Make old behaviour of raising errors directly available
via [`decode!`](https://hexdocs.pm/csv/CSV.html#decode!/2)
- Improve error messages for escape sequences
- Rewrite parts of the pipeline to be more modular
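
Code that consumes `decode` output after this change has to handle both kinds of tuples. The sketch below assumes the tuples have the shapes `{:ok, row}` and `{:error, message}`. `DecodeResults` is a hypothetical helper, and the `results` list is a hand-written stand-in for what `decode` would yield rather than real library output.

```elixir
defmodule DecodeResults do
  # Hypothetical helper for illustration; not part of the CSV library.

  # Splits an enumerable of `{:ok, row}` / `{:error, message}` tuples
  # into a list of rows and a list of error messages, preserving order.
  def split(results) do
    {oks, errors} = Enum.split_with(results, &match?({:ok, _}, &1))

    {
      Enum.map(oks, fn {:ok, row} -> row end),
      Enum.map(errors, fn {:error, message} -> message end)
    }
  end
end

# Stand-in data; in real code this would come from piping a stream into decode.
results = [{:ok, ["a", "b"]}, {:error, "some parse error"}, {:ok, ["c", "d"]}]

{rows, errors} = DecodeResults.split(results)
IO.inspect(rows, label: "rows")
IO.inspect(errors, label: "errors")
```

Code that would rather fail on the first bad row can use `decode!` instead, which keeps the old raising behaviour.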
## 1.4.4 (2016-11-12)
- Load [`parallel_stream`](https://github.com/beatrichartz/parallel_stream)
as an app dependency to avoid load level errors.
See [issue #56](https://github.com/beatrichartz/csv/issues/56) reported
by [@luk3thomas](https://github.com/luk3thomas)
## 1.4.3 (2016-08-27)
- Fix a case where lines would not be aggregated correctly
[see #52](https://github.com/beatrichartz/csv/issues/52) reported by
[@yury-dymov](https://github.com/yury-dymov)
## 1.4.2 (2016-06-20)
- Update dependency on [`parallel_stream`](https://github.com/beatrichartz/parallel_stream)
## 1.4.1 (2016-05-21)
- Fix condition where rows would be dropped when decoding from stateful streams.
[See #39](https://github.com/beatrichartz/csv/issues/39) reported by
[@moxley](https://github.com/moxley)
## 1.4.0 (2016-04-03)
- Add option to specify headers in encode, added [in #34](https://github.com/beatrichartz/csv/issues/34)
by [@barruumrex](https://github.com/barruumrex)
## 1.3.3 (2016-03-25)
- Fix empty streams raising a lexer error - raised [in #28](https://github.com/beatrichartz/csv/issues/28)
by [@kiliancs](https://github.com/kiliancs)
## 1.3.2 (2016-03-08)
- Clean up by removing some unused defaults in function headers to get rid of
compile-time warnings
## 1.3.1 (2016-03-08)
- Fix `:strip_cells` not stripping cells when multiple options are specified - #29 by [@tomjoro](https://github.com/tomjoro)
## 1.3.0 (2016-03-01)
- Now supports linebreaks inside escaped fields (#13)
- Raises an error when row length mismatches across rows
- Uses [parallel_stream](https://github.com/beatrichartz/parallel_stream) for parallelism
## 1.2.4 (2016-02-06)
- Fix encoding of double quotes
## 1.2.3 (2016-01-19)
- Fix a condition where `headers: true` would enumerate the whole file once before parsing
## 1.2.2 (2016-01-02)
- Fix the default `num_pipes` argument so that it is evaluated at runtime, depending on the scheduler
- Test utf-8 files with BOM
- Syntax and mix updates for elixir 1.2
## 1.2.1 (2015-10-17)
- Decoder performance optimisations
## 1.2.0 (2015-10-11)
- Use `Stream.transform/4` - incompatible with Elixir < `1.1.0`
## 1.1.5 (2015-10-11)
- Decoder refactor from `Stream.resource/3` to `Stream.transform/3` in order to
get more predictable stream behaviour
- Rows now get processed in order
- Fix a bug where stream would get evaluated before being decoded
## 1.1.4 (2015-09-13)
- Fix a bug where headers could be out of order
## 1.1.3 (2015-09-12)
- Fix a bug where headers could get parsed as the first row
## 1.1.2 (2015-09-05)
- Fix a bug where calls to decode with `num_pipes: 1` would yield varying
results due to leftover state in the decoder message queue
## 1.1.1 (2015-07-14)
- Rescue from errors in stream producer to get more predictable behaviour
in case of failure
## 1.1.0 (2015-07-12)
- Better error messages when encountering invalid encodings
## 1.0.1 (2015-07-11)
- Indicate `consolidate_protocols` for better encoding performance
## 1.0.0 (2015-05-24)
- Use bytes as separators
## 0.2.3 (2015-05-24)
- Add benchmarking
## 0.2.2 (2015-05-20)
- Use utf-8 bytes instead of codepoints for multi-byte parsing
## 0.2.1 (2015-05-20)
- Fix handling of multi-byte utf-8 characters
## 0.2.0 (2015-03-25)
- Implement encoder protocol