Elixir Protocol Buffer
Warning: only protocol buffers 3 is supported. Use protobuf-elixir if you need support for version 2 (protobuf-elixir was a major inspiration for this project).
This is a protocol buffer encoder and decoder. Its goal is to be fast at the cost of larger generated files. This is achieved by generating a significant part of the encoding and decoding logic at generation time with the protoc plugin.
Encoding and decoding performance is ~3-4x faster than protobuf-elixir. For example, take the %Everything structure used in our tests, which has all field types, including all array types (with 2 values per array) and a few maps: pbuf takes ~14µs to encode and ~24µs to decode, versus 66µs and 67µs for protobuf-elixir. However, the generated .beam file is quite a bit larger: 19K vs 7K.
(Note that there is limited support for version 2 syntax, but only enough to allow the protoc plugin to bootstrap itself. This may or may not provide all the version 2 support you need.)
Installation
Assuming you already have protoc installed, you'll want to run:
$ mix escript.install hex pbuf
to install the pbuf elixir generator. This will place protoc-gen-fast-elixir in your ~/.mix/escripts/ folder, which must be on your $PATH.
You can then generate elixir files using the protoc command with the -fast-elixir_out=PATH flag:
protoc --fast-elixir_out=generated/ myschema.proto
Note the name fast-elixir_out. This allows you to also have the protobuf-elixir generator installed alongside it in order to support proto2 syntax.
Encoding
The generated code is normal Elixir modules with a defstruct. Use new/1 to create new instances:
user = Models.User.new(name: "leto", age: 2000)
And Pbuf.encode!/1 and Pbuf.encode_to_iodata!/1 to encode them:
data = Pbuf.encode!(user)
Only structures generated by protoc can be passed to encode!/1 and encode_to_iodata!/1; you cannot pass maps or other structures.
These functions will raise a Pbuf.Encoder.Error on invalid data (such as assigning a float to a bool field). There are currently no non-raising functions.
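If you need a non-raising result today, you can wrap the call yourself. A minimal sketch, assuming the Models.User module generated above (SafeEncode is a hypothetical helper, not part of pbuf):

```elixir
defmodule SafeEncode do
  # Wraps Pbuf.encode!/1 and converts a raised
  # Pbuf.Encoder.Error into an {:error, _} tuple.
  def encode(struct) do
    {:ok, Pbuf.encode!(struct)}
  rescue
    err in Pbuf.Encoder.Error -> {:error, err}
  end
end

# SafeEncode.encode(Models.User.new(name: "leto", age: 2000))
```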
Decoding
Decoding is done via Pbuf.decode!/2:
user = Pbuf.decode!(Models.User, data)
As an alternative, you can also use: Models.User.decode!(data).
Unlike encoding, there are non-raising versions of decode!:
# or use Models.User.decode(data)
case Pbuf.decode(Models.User, data) do
{:ok, user} -> ...
{:error, err} -> # err is a %Pbuf.Decode.Error{}
end
Decoding truly invalid data (as opposed to simply unexpected types) can still raise.
Enumerations
A field declared as an enum should be set to the atom representation of the protocol buffer name, or the integer value. For example, a message defined as:
message User {
UserType type = 1;
}
enum UserType {
USER_TYPE_UNKNOWN = 0;
USER_TYPE_PENDING = 1;
USER_TYPE_NORMAL = 2;
USER_TYPE_DELETED = 3;
}
Should be used as:
user = User.new(type: :USER_TYPE_PENDING)
# OR
user = User.new(type: 1)
(Casing is preserved from the proto file.)
Advanced Enums
You'll likely want to map your protocol buffer enums to specific atoms. With a bit of work, the generator can do this for you.
First, you'll need to specify a custom option, say in options.proto:
syntax = "proto2";
import 'google/protobuf/descriptor.proto';
extend google.protobuf.EnumValueOptions {
optional ErlangEnumValueOptions erlang = 78832;
}
message ErlangEnumValueOptions {
optional string atom = 1;
}
You can then import this .proto file like any other and use the option:
import 'options.proto';
enum HTTPMethod {
HTTP_METHOD_GET = 0 [(erlang).atom = 'get'];
HTTP_METHOD_POST = 1 [(erlang).atom = 'post'];
}
The value will now be :get and :post rather than :HTTP_METHOD_GET and :HTTP_METHOD_POST.
For this to work, Google's proto definitions must be available when you run protoc:
protoc -I=/usr/local/include/proto/ -I=. ...
They are available from the protocol buffer releases, e.g.: https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protoc-3.6.1-osx-x86_64.zip.
Oneofs
The value of a oneof field must be set to a tuple where the first element is the name of the field and the second is the value. Given:
message Event {
oneof event_oneof {
Commit commit = 1;
Wiki wiki = 2;
}
}
Then valid values for event_oneof are: nil, {:commit, Commit.t} or {:wiki, Wiki.t}.
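As a sketch, assuming Event, Commit, and Wiki are the modules generated from the definition above (handle_commit/1 and handle_wiki/1 are hypothetical callbacks):

```elixir
# Set the oneof with a {field_name, value} tuple:
event = Event.new(event_oneof: {:commit, Commit.new()})

# Pattern match to see which branch, if any, is set:
case event.event_oneof do
  {:commit, commit} -> handle_commit(commit)
  {:wiki, wiki} -> handle_wiki(wiki)
  nil -> :not_set
end
```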
Jason and Oneofs
Generated structures have a @derive Jason.Encoder. For simple messages, this means you can use Jason.encode(struct) to generate a json representation of your messages.
This fails for oneofs, since Jason can't encode tuples ({:type, value}). For this reason, a oneof can also be specified using a map, following either pattern:
%{oneof: :commit, value: Commit.t}
# or
%{commit: Commit.t}
Note, however, that decoding always produces the tuple form.
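As a sketch (again assuming the generated Event and Commit modules from the previous section): the map forms keep the struct Jason-encodable, while the tuple form makes Jason.encode/1 return an error:

```elixir
# Tuple form: fine for Pbuf, but Jason cannot serialize tuples.
event = Event.new(event_oneof: {:commit, Commit.new()})
{:error, _} = Jason.encode(event)

# Map form: a Jason-friendly way to express the same oneof.
event = Event.new(event_oneof: %{oneof: :commit, value: Commit.new()})
{:ok, _json} = Jason.encode(event)
```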
Json Message Encoding
It's possible to skip generating @derive Jason.Encoder on a per-message basis by using a custom option, say in options.proto:
syntax = "proto2";
import 'google/protobuf/descriptor.proto';
extend google.protobuf.MessageOptions {
optional int32 json_message = 78832;
}
And then use it in your message:
message Something {
option (json_message) = 0;
...
}
Json Field Encoding
It's possible to automatically encode and decode a bytes field to and from Json. First, define a FieldOptions extension:
syntax = "proto2";
import 'google/protobuf/descriptor.proto';
extend google.protobuf.FieldOptions {
optional int32 json_field = 78832;
}
And then use it in your field definition:
message Something {
bytes data = 1 [(json_field) = 1];
}
Which results in:
something = [data: %{over: 9000}]
|> Something.new()
|> Something.encode!()
|> Something.decode!()
something.data == %{"over" => 9000}
Note that if you instead assign json_field a value of 2, the decoded map keys will be atoms rather than strings.
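For example, a sketch assuming a hypothetical Something2 message identical to Something but declared with [(json_field) = 2]:

```elixir
something = [data: %{over: 9000}]
|> Something2.new()
|> Something2.encode!()
|> Something2.decode!()

# keys come back as atoms rather than strings:
something.data == %{over: 9000}
```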
What's Ugly?
There are two distinctly ugly parts of the code. The first is pretty much anything to do with oneof fields. The second is the decoding of maps.