# MiniPB

Minimal data-driven protobuf encoder/decoder for Elixir. No code generation, no
build step, zero dependencies. A single-file library (~700 lines) that reads
standard `protoc` descriptor sets at runtime.

## Quick Start

```elixir
# 1. Generate a descriptor set with protoc
#    protoc --descriptor_set_out=schema.binpb --include_imports your.proto

# 2. Decode and compile
{:ok, descriptor_set} = MiniPB.decode_descriptor_set(File.read!("schema.binpb"))
schema = MiniPB.compile(descriptor_set)

# 3. Encode
{:ok, iodata} = MiniPB.encode(schema, :"mypackage.Person", %{
  name: "Alice",
  id: 42
})

# 4. Decode
{:ok, person} = MiniPB.decode(schema, :"mypackage.Person", IO.iodata_to_binary(iodata))
# => %{name: "Alice", id: 42}
```

## Installation

Add `minipb` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:minipb, "~> 0.1.0"}
  ]
end
```

## How It Works

MiniPB uses a three-step pipeline:

1. **Decode** -- `MiniPB.decode_descriptor_set/1` decodes a binary
   `FileDescriptorSet` (the output of `protoc --descriptor_set_out`) using a
   hardcoded bootstrap schema. This solves the chicken-and-egg problem of
   needing a protobuf decoder to read the protobuf schema. A bang variant
   `decode_descriptor_set!/1` is also available.

2. **Compile** -- `MiniPB.compile/1` walks the descriptor set and builds indexed
   lookup tables (`fields_by_name`, `fields_by_number`, enum mappings) so that
   fields resolve with a single map lookup instead of a list scan. This is not
   code generation -- just indexing lists into maps.

3. **Encode/Decode** -- `MiniPB.encode/3` and `MiniPB.decode/3` (or
   `decode/4` with options) use the compiled schema to serialize and
   deserialize Elixir maps.

You can also skip `decode_descriptor_set/1` and define the descriptor set as a
plain Elixir map directly (see [Schema Format](#schema-format) below).
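
The Compile step can be pictured with a minimal sketch. `IndexSketch` and `index_fields/1` are hypothetical names used for illustration and are not MiniPB's internals; only the `fields_by_name` and `fields_by_number` table names come from the description above.

```elixir
defmodule IndexSketch do
  # Hypothetical helper, not part of MiniPB: index one message
  # descriptor's field list by name and by number.
  def index_fields(%{field: fields}) do
    %{
      fields_by_name: Map.new(fields, fn f -> {f.name, f} end),
      fields_by_number: Map.new(fields, fn f -> {f.number, f} end)
    }
  end
end

person = %{
  name: :Person,
  field: [
    %{name: :name, number: 1, type: :TYPE_STRING},
    %{name: :id, number: 2, type: :TYPE_INT32}
  ]
}

index = IndexSketch.index_fields(person)

# The encoder would look fields up by name, the decoder by wire number.
index.fields_by_name[:name].number
# => 1
index.fields_by_number[2].name
# => :id
```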

## Data Conventions

### Scalar Fields

Missing fields are omitted from decoded maps by default. Pass `defaults: true`
to `decode/4` to populate missing scalar, enum, repeated, and map fields with
their proto3 default values:

```elixir
{:ok, person} = MiniPB.decode(schema, :"mypackage.Person", data, defaults: true)
# => %{name: "", id: 0, role: :UNKNOWN, scores: [], tags: %{}}
```

Singular message fields and oneofs are never populated by `:defaults`.
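
One way to picture the `:defaults` option is as a merge in which decoded values win over per-field defaults. This is an assumption about the observable behavior, not MiniPB's actual code; the field set is borrowed from the example above.

```elixir
# Defaults for the hypothetical Person message from the example above.
defaults = %{name: "", id: 0, role: :UNKNOWN, scores: [], tags: %{}}

# What a decode without `defaults: true` might return when only `name`
# was present on the wire.
decoded = %{name: "Alice"}

# Keys present in `decoded` override the defaults.
with_defaults = Map.merge(defaults, decoded)
# with_defaults.name => "Alice"
# with_defaults.id   => 0
```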

### Repeated Fields

Decoded as lists. Packed encoding is handled transparently.

```elixir
%{scores: [100, 95, 88]}
```
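
For reference, packed encoding stores all values of a repeated scalar field in one length-delimited record. The sketch below follows the standard protobuf wire format; `PackedSketch` is a hypothetical module, and field number 4 for `scores` is an assumption.

```elixir
defmodule PackedSketch do
  import Bitwise

  # Base-128 varint for a non-negative integer: 7 payload bits per byte,
  # high bit set on every byte except the last.
  def varint(n) when n >= 0 and n < 128, do: <<n>>
  def varint(n) when n >= 128, do: <<1::1, n::7>> <> varint(n >>> 7)

  # One record: tag (field number and wire type 2), payload length,
  # then the values back to back.
  def packed(field_number, values) do
    payload = for v <- values, into: <<>>, do: varint(v)
    varint((field_number <<< 3) ||| 2) <> varint(byte_size(payload)) <> payload
  end
end

# Assumes `scores` is field number 4.
PackedSketch.packed(4, [100, 95, 88])
# => <<34, 3, 100, 95, 88>>
```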

### Enums

Enum values are atoms on both encode and decode. Encoding also accepts raw
integers, and decoding falls back to the raw integer when the number has no
name in the enum.

```elixir
# Encode -- both work:
%{role: :ADMIN}
%{role: 1}

# Decode:
%{role: :ADMIN}   # known value
%{role: 42}       # unknown value
```
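
The fallback rule can be sketched with two lookup tables. The table names and the choice to raise on an unknown atom are assumptions made for this sketch, not documented MiniPB behavior.

```elixir
# Hypothetical lookup tables for the Role enum.
by_name = %{UNKNOWN: 0, ADMIN: 1}
by_number = %{0 => :UNKNOWN, 1 => :ADMIN}

# Atoms are translated to numbers; integers pass through unchanged.
encode_enum = fn
  value when is_atom(value) -> Map.fetch!(by_name, value)
  value when is_integer(value) -> value
end

# Known numbers become atoms; unknown numbers are returned as-is.
decode_enum = fn number -> Map.get(by_number, number, number) end

encode_enum.(:ADMIN) # => 1
encode_enum.(1)      # => 1
decode_enum.(1)      # => :ADMIN
decode_enum.(42)     # => 42
```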

### Oneofs

A set oneof is a `{field_name, value}` tuple stored under the oneof's name:

```elixir
# Encode:
%{companion: {:pet, %{name: "Rex"}}}

# Decode:
%{companion: {:pet, %{name: "Rex"}}}
```

If no oneof field is set, the key is absent from the map.
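
Because the key is absent when nothing is set, consuming code can pattern match on all three situations. A small sketch, where `describe` is a hypothetical helper:

```elixir
describe = fn person ->
  case person do
    # The :pet member of the oneof is set.
    %{companion: {:pet, pet}} -> "pet named #{pet.name}"
    # Some other member of the oneof is set.
    %{companion: {tag, _value}} -> "other companion: #{tag}"
    # No member is set, so the :companion key is missing.
    %{} -> "no companion"
  end
end

describe.(%{companion: {:pet, %{name: "Rex"}}})
# => "pet named Rex"
describe.(%{name: "Alice"})
# => "no companion"
```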

### Maps

Plain Elixir maps. The `map<K,V>` desugaring into repeated `MapEntry` messages
is handled internally.

```elixir
# Encode:
%{tags: %{"team" => 1, "level" => 5}}

# Decode:
%{tags: %{"team" => 1, "level" => 5}}
```
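
The desugaring mentioned above can be sketched as a round trip between a map and a list of entry messages. In protobuf, each map entry is a message with a `key` field (number 1) and a `value` field (number 2); `MapEntrySketch` is a hypothetical module, not MiniPB's internal code.

```elixir
defmodule MapEntrySketch do
  # Map -> list of entry messages, as a repeated field would carry them.
  def to_entries(map) do
    Enum.map(map, fn {k, v} -> %{key: k, value: v} end)
  end

  # List of entry messages -> map. A later duplicate key overwrites an
  # earlier one, matching protobuf's last-one-wins rule for map fields.
  def from_entries(entries) do
    Map.new(entries, fn %{key: k, value: v} -> {k, v} end)
  end
end

tags = %{"team" => 1, "level" => 5}
entries = MapEntrySketch.to_entries(tags)
MapEntrySketch.from_entries(entries) == tags
# => true
```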

## Schema Format

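The schema mirrors `google.protobuf.FileDescriptorSet` using atom keys. Proto
identifiers (package, message, field, and enum names) and descriptor enum values
such as `:TYPE_STRING` are atoms; file names stay strings.
`MiniPB.decode_descriptor_set/1` produces this structure from a `protoc`
descriptor set; you can also write it by hand.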

```elixir
descriptor_set = %{
  file: [
    %{
      name: "test.proto",
      package: :test,
      syntax: :proto3,
      message_type: [
        %{
          name: :Person,
          field: [
            %{name: :name, number: 1, type: :TYPE_STRING, label: :LABEL_OPTIONAL},
            %{name: :id,   number: 2, type: :TYPE_INT32,  label: :LABEL_OPTIONAL},
            %{name: :role, number: 3, type: :TYPE_ENUM,   label: :LABEL_OPTIONAL,
              type_name: :"test.Role"},
          ]
        }
      ],
      enum_type: [
        %{
          name: :Role,
          value: [
            %{name: :UNKNOWN, number: 0},
            %{name: :ADMIN,   number: 1},
          ]
        }
      ]
    }
  ]
}

schema = MiniPB.compile(descriptor_set)
```

See [PLAN.md](PLAN.md) for the full schema reference including all field types,
labels, oneofs, nested messages, and map entries.

## Supported Types

| Proto Type    | Wire Format      | Elixir Type       |
|---------------|------------------|-------------------|
| `double`      | 64-bit LE        | `float`           |
| `float`       | 32-bit LE        | `float`           |
| `int32/64`    | varint           | `integer`         |
| `uint32/64`   | varint           | `integer`         |
| `sint32/64`   | zigzag varint    | `integer`         |
| `fixed32/64`  | 32/64-bit LE     | `integer`         |
| `sfixed32/64` | 32/64-bit LE     | `integer`         |
| `bool`        | varint           | `boolean`         |
| `string`      | length-delimited | `String.t()`      |
| `bytes`       | length-delimited | `binary`          |
| `enum`        | varint           | `atom \| integer` |
| `message`     | length-delimited | `map`             |

Groups (`TYPE_GROUP`) are not supported.
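
The "zigzag varint" entry in the table refers to the standard protobuf mapping that interleaves negative and non-negative integers, so that small magnitudes produce short varints. A minimal sketch of that mapping follows; `ZigZagSketch` is a hypothetical module, not MiniPB's code.

```elixir
defmodule ZigZagSketch do
  import Bitwise

  # 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4, ...
  def encode(n) when n >= 0, do: n <<< 1
  def encode(n) when n < 0, do: (-n <<< 1) - 1

  # Inverse mapping: the low bit carries the sign, the rest the magnitude.
  def decode(z) when z >= 0, do: bxor(z >>> 1, -(z &&& 1))
end

Enum.map([0, -1, 1, -2, 2], &ZigZagSketch.encode/1)
# => [0, 1, 2, 3, 4]
ZigZagSketch.decode(3)
# => -2
```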

## License

MIT