CBOR
Implementation of RFC 8949CBOR (Concise Binary Object Representation) for Elixir.
This is a fork of excbor which modernizes the codebase, and makes decisions on handling data types that the original library had punted on.
Migrating from the previous version
This library is a fork of the no longer maintained excbor project.
For those migrating from previous versions of this library there are breaking changes that you should be aware of.
The module Cbor has been renamed to CBOR.
CBOR.decode will return a three item tuple of the form {:ok, decoded, rest}, instead of returning the decoded object. In the wild there are APIs that concat CBOR objects together. The rest variable includes any leftover information from the decoding operation in case you need to decode multiple objects.
Atoms will be encoded/decoded as strings, except for the special case of :__undefined__ which has no direct translation to elixir but has semantic meaning in CBOR.
Elixir/Erlang does not have a concept of infinity, negative infinity or NaN. In order to encode or decode these values we will return a struct of the form %CBOR.Tag{tag: :float, value: (:inf|:"-inf"|:nan)}
If you want to encode a raw binary value, you can use the CBOR.Tag struct with a tag of :bytes and the binary as the :value field.
Upgrading from 1.x to 2.0
Version 2.0 brings the library up to RFC 8949 (December 2020) and the related ecosystem RFCs (RFC 8943, the cbor-sets-spec). The wire format is still compatible with 7049, but encoders now produce spec-correct output where there's a clean alternative.
Encoder changes (wire output may differ)
Dateencodes as tag 1004 (RFC 8943 full-date string), not tag 0. The old form was technically invalid CBOR per RFC 8949 §3.4.1.MapSetencodes as tag 258 (cbor-sets-spec) wrapping a CBOR array, preserving set semantics on round-trip. Previously emitted as a bare array.- Floats encode in the shortest IEEE 754 form (binary16/32/64) that exactly preserves the value (RFC 8949 §4.1). Previously always 64-bit.
- Map keys are emitted in bytewise lexicographic order of their encoded form (RFC 8949 §4.2.1) — deterministic by default.
The encoder follows §4.2.1 (map-key sort) and §4.1 (shortest float form)
but does not implement the §4.2.2 rule that integer-valued floats
must be encoded as integers — CBOR.encode(1.0) emits a binary16 float,
not the integer 1. Callers that need byte-identical Core Deterministic
output for COSE/CWT/content-addressed use cases should encode
integer-valued floats manually as integers before calling
CBOR.encode/1.
Decoder changes
- Tag 1 (epoch-based date/time) auto-decodes to
DateTime. Previously passed through as%CBOR.Tag{tag: 1, value: ...}. Passdecode_epoch_time: falseto keep the raw integer/float. -
New built-in decoders: tag 100 (
Date), tag 1004 (Date), tag 258 (MapSet), tag 55799 (self-described CBOR — strip the marker).
Reading 1.x data
The decoder remains backward-compatible with everything 1.x emitted:
tag 0 + bare-date strings still decode to Date, and untagged arrays
still decode to List.
Maps with colliding keys
CBOR.encode/1 now raises ArgumentError if two distinct Elixir keys
encode to the same wire bytes — :foo/"foo" (atoms encode as text
strings), {1, 2}/[1, 2] (tuples encode as arrays), or any custom
struct whose defimpl CBOR.Encoder produces a key already present.
v1.x silently emitted invalid CBOR with both keys, violating RFC 8949
§5.6. The error message names both colliding Elixir keys.
If you hit this on upgrade, the fix is at the call site — pick one shape and stick to it. For atom-vs-string ambiguity, normalize to strings before encoding:
map |> Map.new(fn {k, v} -> {to_string(k), v} end) |> CBOR.encode()Minimum runtime
Elixir 1.17 and Erlang/OTP 27 are now required.
Requirements
- Elixir 1.17 or later
- Erlang/OTP 27 or later
Installation
def deps do
[
{:cbor, "~> 2.0"}
]
endUsage
This library follows the standard API for CBOR libraries by exposing two methods
on the CBOR module CBOR.encode/1 and CBOR.decode/1.
Encoding
iex(1)> CBOR.encode([1, [2, 3]])
<<130, 1, 130, 2, 3>>Decoding
iex(2)> CBOR.decode(<<130, 1, 130, 2, 3>>)
{:ok, [1, [2, 3]], ""}Design Notes
Given that Elixir has more available data types than are supported in CBOR, decisions were made so that encoding complex data structures succeed without throwing errors. My thoughts are collected below so you can understand why encoding and decoding of a value does not necessarily return exactly the same value.
Atoms
The only atoms that will be directly encoded are true, falsenil and __undefined__. Every other atom will be converted to a string before being encoded. We surround undefined with double underscores so that you only encode an undefined value when you clearly intend to do so.
Keyword List, Range, Tuple
These structures are converted to Lists before being encoded — CBOR has no native representation for any of them. Round-tripping a tuple gives back a list, etc.
MapSet
MapSet round-trips via tag 258 (the cbor-sets-spec)
wrapping a CBOR array. Set semantics are preserved across encode/decode.
Date
Date encodes as tag 1004 (RFC 8943 full-date string). The decoder also
accepts the older tag-0-with-bare-date-string form for compatibility with
data emitted by 1.x of this library.
Time
CBOR has no IANA-registered tag for time-of-day. Time continues to
encode as tag 0 + ISO 8601 partial-time string — technically out-of-spec
for tag 0 (which requires a full date-time per RFC 8949 §3.4.1) but
preserves round-trip behaviour. Strict-mode decoding rejects this form.
DateTime
Tag 0 (RFC 3339 date-time) decodes to DateTime. For Z-form input, the
result is in UTC. For non-Z input (e.g. "2024-01-01T00:00:00+05:00"),
the decoded DateTime carries the original offset via its utc_offset
and synthetic time_zone fields (e.g. "+05:00") — round-trip through
encode produces the same wire bytes.
DateTime.compare/2, DateTime.to_unix/1, and arithmetic operations
work equivalently across both forms (same instant, different
presentation). Inter-zone conversion via DateTime.shift_zone/2
requires a tz database (e.g. tzdata); the synthetic time_zone
field is an ISO 8601 offset string, not an IANA zone name.
NaiveDateTime
NaiveDateTime will be treated as if they are UTC.
Special Values
Elixir and erlang have no concept of infinity, negative infinity and NaN. If you want to encode those values, we have a special struct CBOR.Tag which you can use to represent those values.
%CBOR.Tag{tag: :float, value: :inf}
%CBOR.Tag{tag: :float, value: :"-inf"}
%CBOR.Tag{tag: :float, value: :nan}CBOR.Tag is also useful if you want to extend CBOR for internal applications
Decoder Options
CBOR.decode/2 accepts a keyword list of options:
:tag_decoders— list of modules implementingCBOR.TagDecoderfor tags this library doesn't natively handle. Built-in tag numbers (0,1,2,3,32,100,258,1004,55799) are sealed — registering a decoder for one raisesArgumentErrorat decode time.:decode_epoch_time—true(default) auto-decodes tag 1 toDateTime. Setfalseto receive the raw integer/float wrapped in%CBOR.Tag{tag: 1, value: ...}.:on_duplicate_key—:last_wins(default),:first_wins, or:error. The:erroroption returns{:error, {:duplicate_key, key}}.:max_depth— positive integer, default256. Rejects inputs whose longest root-to-leaf chain of CBOR data items exceeds this, returning{:error, {:max_depth_exceeded, limit}}. Each container, tag wrapper, and primitive on the chain counts as one level. Defends against hostile depth-bomb input (a few-byte payload that allocates super-linearly).:strict—false(default). Set totrueto reject not-well-formed CBOR (per RFC 8949 §3 and Appendix F) as typed errors. Catches reserved two-byte simple values, indefinite-length on major types 0/1/6, stray break codes, nested indefinite-length string chunks, tag 2/3 with non-byte-string content, tag 32 with non-URI-reference content, and tag 24 inner content that is not exactly one well-formed CBOR data item (RFC 8949 §3.4.5.1).
Options are validated up front: unknown option keys, wrong-type values,
out-of-set values, and non-positive :max_depth raise ArgumentError
naming the option and its expected shape, rather than surfacing as a
misleading {:not_well_formed, _} decode error.
Example:
CBOR.decode(bytes, strict: true, on_duplicate_key: :error)Rendering errors
CBOR.format_error/1 turns a decode_error() term into a
human-readable string suitable for logs and operator surfaces. One
clause per typed variant, with RFC 8949 section references inline so
triagers can reach for the spec without re-parsing the reason atom.
case CBOR.decode(bytes, strict: true) do
{:ok, value, ""} -> value
{:error, reason} -> Logger.warning("CBOR decode failed: " <> CBOR.format_error(reason))
endThe returned strings are for human consumption — don't pattern-match on them. Wording may improve in a patch release.
Custom Encoding
If you want to encode something that is not supported out of the box you can implement the CBOR.Encoder protocol for the module. You only have to implement a single CBOR.Encoder.encode_into/2 function. An example for encoding a Money struct is given below.
defimpl CBOR.Encoder, for: Money do
def encode_into(money, acc) do
money |> Money.to_string() |> CBOR.Encoder.encode_into(acc)
end
endCustom Tag Decoding
For tags this library does not decode natively, implement
CBOR.TagDecoder and pass the module via :tag_decoders. Example for
binary UUIDs (tag 37):
defmodule MyApp.UUIDDecoder do
@behaviour CBOR.TagDecoder
@impl true
def tag_number, do: 37
@impl true
def decode(%CBOR.Tag{tag: :bytes, value: bytes}) when byte_size(bytes) == 16 do
{:ok, bytes}
end
def decode(_), do: :error
end
CBOR.decode(bytes, tag_decoders: [MyApp.UUIDDecoder])
Built-in tag numbers cannot be overridden. Two user modules registering
for the same tag also raise ArgumentError.
Built-in: tag 24 (Encoded CBOR data item)
Tag 24 (RFC 8949 §3.4.5.1) wraps a byte string that itself contains CBOR. By default the library leaves it wrapped:
%CBOR.Tag{tag: 24, value: %CBOR.Tag{tag: :bytes, value: <inner_bytes>}}
Pass CBOR.TagDecoders.EncodedCBOR to recursively decode the inner
data item:
CBOR.decode(bytes, tag_decoders: [CBOR.TagDecoders.EncodedCBOR])
The inner decode inherits the outer call's options (:max_depth,
:strict, :on_duplicate_key, :tag_decoders, :decode_epoch_time),
so nested tag-24 wrappers respect the outer depth budget. In strict
mode, trailing bytes or malformed inner CBOR surface as
{:error, {:tag_decoder_failed, 24, reason}}. Strict mode also
validates the inner content even without this decoder registered,
in which case errors surface as {:error, {:invalid_tag, 24, reason}}
and the success-path result stays wrapped (strict validates without
auto-unwrapping — opt into EncodedCBOR for the unwrap).
Why are built-in tags sealed?
Tags 0, 1, 2, 3, 32, 100, 258, 1004, and 55799 have
RFC-defined semantics (RFC 8949 §3.4 + RFC 8943 + cbor-sets-spec).
Allowing user code to override them would create interop hazards: the
same wire bytes would decode differently in different applications.
The library raises ArgumentError at registration to surface the
collision at the source rather than in production.
If you need custom handling for one of these tag numbers — for example, consuming non-conforming wire data from a legacy peer — there are two paths:
- Decode normally and post-process. The library falls back to
%CBOR.Tag{tag: N, value: <content>}whenever content fails built-in validation (e.g. tag 1 with a string instead of an epoch number, tag 2 with non-byte-string content). Pattern-match on the wrap and apply the legacy interpretation in your consumer. - Use a non-built-in tag for your own data. CBOR's tag namespace
is
uint64; pick something outside the IANA-registered range andCBOR.TagDecodergives you full control.
API stability
This library follows Semantic Versioning. Within a 2.x line:
Stable:
-
The shape of
decode_error()— variants will not be removed or renamed. New variants may be added; pattern-matching consumers should include a default clause. -
The exception classes
CBOR.encode/1raises (Protocol.UndefinedError,ArgumentError), as documented in its docstring.
Not stable:
-
Strings returned by
CBOR.format_error/1are for human consumption. Don't pattern-match on them — wording may improve in a patch release. -
Specific reasons within the strict-mode "documented partial-coverage
gaps". Inputs currently surfacing as
:malformed_header/:malformed/:truncated(BEAM-class catch-alls translated by the public rescue) may reclassify to typed{:invalid_tag, _, _}or more specific{:not_well_formed, _}reasons in a future minor as strict-mode coverage grows. The fence tests inoptions_test.exsunder"strict option (documented partial-coverage gaps)"pin the current boundary; treat the catch-alls as non-final classifications. - Encoder wire output may shift if RFC 8949 §4.2.2 (Core Deterministic integer-valued floats as integers) is later implemented. Decoded values still round-trip, but byte-for-byte equivalence with a prior release is not promised.
Documentation
Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/cbor.
Copyright and License
Copyright (c) 2019-2026 Thomas Cioppettini
This work is free. You can redistribute it and/or modify it under the terms of the MIT License. See the LICENSE.md file for more details.