README.md

# bencoding

An encoder/decoder for the [Bencoding](https://www.bittorrent.org/beps/bep_0003.html) data format in pure Erlang.

Bencoding is the serialization format used by the [BitTorrent](https://www.bittorrent.org/) protocol
for torrent files and tracker communication.


## Installation

### Erlang (rebar3)

Add `bencoding` to your `deps` in `rebar.config`:

```erlang
{deps, [bencoding]}.
```

### Elixir (Mix)

Add `bencoding` to your `deps` in `mix.exs`:

```elixir
defp deps do
  [
    {:bencoding, "~> 1.0"}
  ]
end
```

See [hex.pm/packages/bencoding](https://hex.pm/packages/bencoding) for the latest version info.


## Type Mapping

Bencoding defines four data types. This library maps them to Erlang types as follows:

| Bencoding Type | Erlang Type | Example (Bencoded) | Example (Erlang) |
|---|---|---|---|
| Integer | `integer()` | `i42e` | `42` |
| Byte String | `binary()` | `4:spam` | `<<"spam">>` |
| List | `list()` | `l4:spami42ee` | `[<<"spam">>, 42]` |
| Dictionary | `map()` | `d3:cow3:mooe` | `#{<<"cow">> => <<"moo">>}` |

**Notes:**
- Bencoded strings are decoded as **binaries** (`<<...>>`), not as Erlang charlists.
- Dictionary keys are always byte strings (binaries).
- Dictionary keys are encoded in **sorted order** (sorted as raw strings, not alphanumerics), as required by BEP 3.
- Encoding and decoding are **roundtrip-safe**: `decode(encode(Value))` yields the original value.


## API

The library exports two functions:

### `bencoding:encode/1`

Encodes an Erlang term into a bencoded binary.

```erlang
%% Encode a string
{ok, <<"4:spam">>} = bencoding:encode(<<"spam">>).

%% Encode an integer
{ok, <<"i42e">>} = bencoding:encode(42).

%% Encode a list
{ok, <<"l4:spam4:eggse">>} = bencoding:encode([<<"spam">>, <<"eggs">>]).

%% Encode a dictionary (keys are automatically sorted)
{ok, <<"d3:cow3:moo4:spam4:eggse">>} = bencoding:encode(#{
    <<"spam">> => <<"eggs">>,
    <<"cow">> => <<"moo">>
}).

%% Encode an empty dictionary
{ok, <<"de">>} = bencoding:encode(#{}).

%% Encode nested structures
{ok, Encoded} = bencoding:encode(#{
    <<"name">> => <<"example">>,
    <<"files">> => [<<"a.txt">>, <<"b.txt">>],
    <<"size">> => 1024
}),
<<"d5:filesl5:a.txt5:b.txte4:name7:example4:sizei1024ee">> = Encoded.
```

**Returns:** `{ok, Binary}` on success.


### `bencoding:decode/1`

Decodes a bencoded binary into an Erlang term.

```erlang
%% Decode a string
{ok, <<"spam">>, <<>>} = bencoding:decode(<<"4:spam">>).

%% Decode an integer
{ok, 42, <<>>} = bencoding:decode(<<"i42e">>).

%% Decode a list
{ok, [<<"spam">>, <<"eggs">>], <<>>} = bencoding:decode(<<"l4:spam4:eggse">>).

%% Decode a dictionary
{ok, #{<<"cow">> := <<"moo">>}, <<>>} = bencoding:decode(<<"d3:cow3:mooe">>).
```

**Returns:** `{ok, Value, Rest}` on success, where `Rest` is the remaining (not yet decoded) binary data.

The `Rest` value is useful when decoding data that may have trailing content:

```erlang
{ok, 42, <<"extra">>} = bencoding:decode(<<"i42eextra">>).
```


## Error Handling

The decoder detects and rejects certain invalid inputs as defined by the Bencoding specification:

```erlang
%% Negative zero is not allowed
{error, negative_zero} = bencoding:decode(<<"i-0e">>).

%% Leading zeros are not allowed
{error, leading_zero} = bencoding:decode(<<"i03e">>).
{error, leading_zero} = bencoding:decode(<<"i-03e">>).
```

Other malformed inputs (e.g. missing terminators, invalid characters) will result in
a `function_clause` error or a `badmatch` error.


## Practical Example: Reading a Torrent File

```erlang
%% Read and decode a .torrent file
{ok, FileContent} = file:read_file("example.torrent"),
{ok, Torrent, <<>>} = bencoding:decode(FileContent),

%% Access the info dictionary
Info = maps:get(<<"info">>, Torrent),

%% Get the file name
Name = maps:get(<<"name">>, Info),
io:format("Torrent name: ~s~n", [Name]),

%% Get the announce URL
Announce = maps:get(<<"announce">>, Torrent),
io:format("Tracker: ~s~n", [Announce]).
```


## Using from Elixir

The library can be used directly from Elixir without any wrapper:

```elixir
# Encoding
{:ok, encoded} = :bencoding.encode("spam")
# => {:ok, "4:spam"}

# Encoding a map (keys are sorted automatically)
{:ok, encoded} = :bencoding.encode(%{"cow" => "moo", "spam" => "eggs"})
# => {:ok, "d3:cow3:moo4:spam4:eggse"}

# Decoding
{:ok, value, rest} = :bencoding.decode("d3:cow3:mooe")
# => {:ok, %{"cow" => "moo"}, ""}

# Reading a torrent file
{:ok, content} = File.read("example.torrent")
{:ok, torrent, ""} = :bencoding.decode(content)
announce = Map.get(torrent, "announce")

# Error handling
{:error, :negative_zero} = :bencoding.decode("i-0e")
{:error, :leading_zero} = :bencoding.decode("i03e")
```


## Specification Compliance

This library follows the Bencoding specification as defined in
[BEP 3: The BitTorrent Protocol Specification](https://www.bittorrent.org/beps/bep_0003.html):

- Integers are encoded as `i<integer in base 10>e`. Integers have no size limitation.
- Byte strings are encoded as `<length>:<contents>`
- Lists are encoded as `l<elements>e`
- Dictionaries are encoded as `d<pairs>e` with keys sorted as raw strings (not alphanumerics), as required by BEP 3
- Leading zeros in integers are rejected (`i03e` is invalid, `i0e` is valid)
- Negative zero (`i-0e`) is rejected

**Note for torrent client developers:** BEP 3 states that the info-hash
(the SHA-1 hash of the bencoded `info` dictionary) must be computed from
the bencoded form as found in the `.torrent` file. Clients must
either reject invalid metainfo files or extract the info substring directly.
They must not perform a decode-encode roundtrip on invalid data.


## License

MIT — see [LICENSE](LICENSE) for details.


## Issues

If you find a bug or miss a feature, please open an issue:
https://github.com/ratopi/bencoding/issues