MiniPB

Minimal data-driven protobuf encoder/decoder for Elixir. No code generation, no build step, zero dependencies. A single-file library (~700 lines) that reads standard protoc descriptor sets at runtime.

Quick Start

# 1. Generate a descriptor set with protoc
#    protoc --descriptor_set_out=schema.binpb --include_imports your.proto

# 2. Decode and compile
{:ok, descriptor_set} = MiniPB.decode_descriptor_set(File.read!("schema.binpb"))
schema = MiniPB.compile(descriptor_set)

# 3. Encode
{:ok, iodata} = MiniPB.encode(schema, :"mypackage.Person", %{
  name: "Alice",
  id: 42
})

# 4. Decode
{:ok, person} = MiniPB.decode(schema, :"mypackage.Person", IO.iodata_to_binary(iodata))
# => %{name: "Alice", id: 42}

Installation

Add minipb to your list of dependencies in mix.exs:

def deps do
  [
    {:minipb, "~> 0.1.0"}
  ]
end

How It Works

MiniPB uses a three-step pipeline:

  1. Decode: MiniPB.decode_descriptor_set/1 decodes a binary FileDescriptorSet (the output of protoc --descriptor_set_out) using a hardcoded bootstrap schema. This solves the chicken-and-egg problem of needing a protobuf decoder to read the protobuf schema. A bang variant decode_descriptor_set!/1 is also available.

  2. Compile: MiniPB.compile/1 walks the descriptor set and builds indexed lookup tables (fields_by_name, fields_by_number, enum mappings) for O(1) field resolution. This is not code generation, just indexing lists into maps.

  3. Encode/Decode: MiniPB.encode/3 and MiniPB.decode/3 (or decode/4 with options) use the compiled schema to serialize and deserialize Elixir maps.

You can also skip decode_descriptor_set/1 and define the descriptor set as a plain Elixir map directly (see Schema Format below).
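
To make the "indexing lists into maps" idea from the compile step concrete, here is a minimal sketch. IndexSketch and index_fields/1 are hypothetical names used only for illustration; they are not part of MiniPB, and the library's real compiled schema may be shaped differently.

```elixir
defmodule IndexSketch do
  # Hypothetical helper, not MiniPB's implementation: index one message
  # descriptor's field list by name and by number so that lookups do not
  # have to scan the list.
  def index_fields(%{field: fields}) do
    %{
      fields_by_name: Map.new(fields, fn f -> {f.name, f} end),
      fields_by_number: Map.new(fields, fn f -> {f.number, f} end)
    }
  end
end

message = %{
  name: :Person,
  field: [
    %{name: :name, number: 1, type: :TYPE_STRING, label: :LABEL_OPTIONAL},
    %{name: :id, number: 2, type: :TYPE_INT32, label: :LABEL_OPTIONAL}
  ]
}

index = IndexSketch.index_fields(message)
index.fields_by_number[2].name
# => :id
index.fields_by_name[:name].type
# => :TYPE_STRING
```

An encoder would typically look fields up by name, while a decoder looks them up by the field number read from the wire, which is why both indexes are useful.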

Data Conventions

Scalar Fields

Missing fields are omitted from decoded maps by default. Pass defaults: true to decode/4 to populate missing fields with proto3 default values:

{:ok, person} = MiniPB.decode(schema, :"mypackage.Person", data, defaults: true)
# => %{name: "", id: 0, role: :UNKNOWN, scores: [], tags: %{}}

Singular message fields and oneofs are never populated by :defaults.

Repeated Fields

Decoded as lists. Packed encoding is handled transparently.

%{scores: [100, 95, 88]}
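
For readers unfamiliar with packed encoding, the sketch below shows what the wire bytes look like for a packed repeated varint field. PackedSketch is a hypothetical module, and the field number 4 for scores is an assumption made for the example; MiniPB does this work internally and its code may differ.

```elixir
defmodule PackedSketch do
  import Bitwise

  # Base-128 varint for a non-negative integer: 7 bits per byte,
  # with the high bit set on every byte except the last.
  def varint(n) when n >= 0 and n < 128, do: <<n>>
  def varint(n) when n >= 128, do: <<1::1, (n &&& 0x7F)::7, varint(n >>> 7)::binary>>

  # Packed repeated field: one tag with wire type 2 (length-delimited),
  # then the payload length, then all values concatenated.
  def encode_packed(field_number, values) do
    payload = for v <- values, into: <<>>, do: varint(v)
    tag = varint((field_number <<< 3) ||| 2)
    <<tag::binary, varint(byte_size(payload))::binary, payload::binary>>
  end
end

# Assumes scores is field number 4 (hypothetical).
PackedSketch.encode_packed(4, [100, 95, 88])
# => <<34, 3, 100, 95, 88>>
```

The unpacked form would instead repeat the tag before every element, so packing saves one tag per value.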

Enums

Enum values are atoms on decode, and encode accepts either the atom or the raw integer. Unknown values fall back to raw integers on decode.

# Encode -- both work:
%{role: :ADMIN}
%{role: 1}

# Decode:
%{role: :ADMIN}   # known value
%{role: 42}       # unknown value
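
The fallback rule can be pictured as a map lookup that returns the input when nothing matches. This is a sketch only: EnumSketch is a hypothetical module, its tables are hand-written for the Role enum used in this README, and raising on an unknown atom is an assumption rather than documented MiniPB behavior.

```elixir
defmodule EnumSketch do
  # Hypothetical lookup tables for the Role enum (UNKNOWN = 0, ADMIN = 1).
  @by_number %{0 => :UNKNOWN, 1 => :ADMIN}
  @by_name %{UNKNOWN: 0, ADMIN: 1}

  # Decode: known numbers become atoms, unknown numbers pass through.
  def decode(number) when is_integer(number), do: Map.get(@by_number, number, number)

  # Encode: atoms are resolved to numbers, integers are used as-is.
  # Assumption: an unknown atom raises KeyError.
  def encode(name) when is_atom(name), do: Map.fetch!(@by_name, name)
  def encode(number) when is_integer(number), do: number
end

EnumSketch.decode(1)
# => :ADMIN
EnumSketch.decode(42)
# => 42
EnumSketch.encode(:ADMIN)
# => 1
```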

Oneofs

Tagged tuples under the oneof name:

# Encode:
%{companion: {:pet, %{name: "Rex"}}}

# Decode:
%{companion: {:pet, %{name: "Rex"}}}

If no oneof field is set, the key is absent from the map.

Maps

Plain Elixir maps. The map<K,V> desugaring into repeated MapEntry messages is handled internally.

# Encode:
%{tags: %{"team" => 1, "level" => 5}}

# Decode:
%{tags: %{"team" => 1, "level" => 5}}
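
As background on the desugaring: on the wire, a map<K,V> field is a repeated message whose entries carry the key in field 1 and the value in field 2. The sketch below shows the conversion in both directions using plain maps with :key and :value keys. MapEntrySketch and its functions are hypothetical names, not MiniPB APIs.

```elixir
defmodule MapEntrySketch do
  # Elixir map -> list of entry messages (one per key/value pair).
  def to_entries(map) when is_map(map) do
    Enum.map(map, fn {k, v} -> %{key: k, value: v} end)
  end

  # List of entry messages -> Elixir map. If a key repeats,
  # the last entry wins, because Map.new/2 overwrites earlier keys.
  def from_entries(entries) when is_list(entries) do
    Map.new(entries, fn %{key: k, value: v} -> {k, v} end)
  end
end

tags = %{"team" => 1, "level" => 5}

tags
|> MapEntrySketch.to_entries()
|> MapEntrySketch.from_entries()
# => %{"level" => 5, "team" => 1}
```

Entry order on the wire is not significant, so the round trip yields a map equal to the input.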

Schema Format

The schema mirrors google.protobuf.FileDescriptorSet using atom keys and atom values for all proto names. MiniPB.decode_descriptor_set/1 produces this structure from a protoc image; you can also write it by hand.

descriptor_set = %{
  file: [
    %{
      name: "test.proto",
      package: :test,
      syntax: :proto3,
      message_type: [
        %{
          name: :Person,
          field: [
            %{name: :name, number: 1, type: :TYPE_STRING, label: :LABEL_OPTIONAL},
            %{name: :id,   number: 2, type: :TYPE_INT32,  label: :LABEL_OPTIONAL},
            %{name: :role, number: 3, type: :TYPE_ENUM,   label: :LABEL_OPTIONAL,
              type_name: :"test.Role"},
          ]
        }
      ],
      enum_type: [
        %{
          name: :Role,
          value: [
            %{name: :UNKNOWN, number: 0},
            %{name: :ADMIN,   number: 1},
          ]
        }
      ]
    }
  ]
}

schema = MiniPB.compile(descriptor_set)

See PLAN.md for the full schema reference, including all field types, labels, oneofs, nested messages, and map entries.

Supported Types

Proto Type     Wire Format        Elixir Type
double         64-bit LE          float
float          32-bit LE          float
int32/64       varint             integer
uint32/64      varint             integer
sint32/64      zigzag varint      integer
fixed32/64     32/64-bit LE       integer
sfixed32/64    32/64-bit LE       integer
bool           varint             boolean
string         length-delimited   String.t()
bytes          length-delimited   binary
enum           varint             atom | integer
message        length-delimited   map

Groups (TYPE_GROUP) are not supported.
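
The "zigzag varint" entry for sint32/64 refers to a mapping that interleaves negative and non-negative integers so that values of small magnitude stay small when varint-encoded. A minimal sketch of the mapping follows; ZigZagSketch is a hypothetical module and not MiniPB's code.

```elixir
defmodule ZigZagSketch do
  # Signed -> unsigned: 0, -1, 1, -2, 2, ... map to 0, 1, 2, 3, 4, ...
  def encode(n) when is_integer(n) and n >= 0, do: 2 * n
  def encode(n) when is_integer(n) and n < 0, do: -2 * n - 1

  # Unsigned -> signed: even values are non-negative, odd values are negative.
  def decode(z) when is_integer(z) and z >= 0 and rem(z, 2) == 0, do: div(z, 2)
  def decode(z) when is_integer(z) and z >= 0, do: -div(z + 1, 2)
end

Enum.map([0, -1, 1, -2, 2], &ZigZagSketch.encode/1)
# => [0, 1, 2, 3, 4]
ZigZagSketch.decode(3)
# => -2
```

Without this step, a negative int32 or int64 is always encoded as a ten-byte varint, which is why the sint types exist.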

License

MIT