xb5_elixir

CIElixir VersionsLast commit

Elixir wrapper around xb5, an Erlang library of B-tree-based (order 5) sorted containers. Provides idiomatic Elixir structs with full Enumerable, Collectable, and Inspect support.

If you need ordered collections, benchmarks show xb5 running 1.2–2.5× faster than Erlang/OTP's gb_sets and gb_trees for most operations, with equal or lower heap usage — gains that largely carry over to xb5_elixir.

Three modules are provided:

Installation

Latest version

def deps do
  [
    {:xb5_elixir, "~> 0.1"}
  ]
end

Usage

API reference

Xb5.Bag

An ordered multiset, allowing multiple occurrences of the same value. Supports order-statistic operations such as percentile/3 and percentile_bracket/3.

push/2 always inserts a new copy; put/2 is idempotent.

iex> bag = Xb5.Bag.new() |> Xb5.Bag.push(:x) |> Xb5.Bag.push(:x)
Xb5.Bag.new([:x, :x])
iex> Xb5.Bag.size(bag)
2
iex> Xb5.Bag.put(bag, :x)
Xb5.Bag.new([:x, :x])

Order statistics on numerical data:

iex> latencies = Xb5.Bag.new([12, 14, 14, 15, 15, 18, 20, 25, 30, 100])
iex> Xb5.Bag.count(latencies, 15)
2
iex> Xb5.Bag.at(latencies, 0)
12
iex> Xb5.Bag.at(latencies, 4)
15
iex> Xb5.Bag.at(latencies, -1)
100
iex> Xb5.Bag.percentile_bracket(latencies, 0.00)
{:exact, 12}
iex> Xb5.Bag.percentile_bracket(latencies, 0.50)
{:between, 15, 18, 0.5}
iex> Xb5.Bag.percentile(latencies, 0.50)
16.5
iex> Xb5.Bag.percentile(latencies, 0.75)
23.75

percentile_bracket/3 returns {:exact, x} when the percentile falls exactly on one element, or {:between, lo, hi, t} when it falls between two — where t is the linear interpolation weight. The default method is inclusive (equivalent to Excel PERCENTILE.INC); exclusive and nearest_rank are also available via the opts argument. percentile/3 performs the interpolation and returns {:value, result}.

Xb5.Set

An ordered set. Supports all standard set operations and efficient traversal in sorted order.

iex> set = Xb5.Set.new([3, 1, 2, 1])
Xb5.Set.new([1, 2, 3])
iex> Xb5.Set.member?(set, 2)
true
iex> Xb5.Set.first!(set)
1
iex> Xb5.Set.last!(set)
3
iex> Xb5.Set.lower(set, 2)
{:ok, 1}
iex> Xb5.Set.higher(set, 2)
{:ok, 3}
iex> a = Xb5.Set.new(1..5)
iex> b = Xb5.Set.new(3..7)
iex> Xb5.Set.union(a, b)
Xb5.Set.new([1, 2, 3, 4, 5, 6, 7])
iex> Xb5.Set.difference(a, b)
Xb5.Set.new([1, 2])
iex> Xb5.Set.intersection(a, b)
Xb5.Set.new([3, 4, 5])

Xb5.Tree

An ordered key-value store. Its API closely mirrors Map.

iex> tree = Xb5.Tree.new(a: 1, b: 2, c: 3)
Xb5.Tree.new([a: 1, b: 2, c: 3])
iex> Xb5.Tree.get(tree, :a)
1
iex> Xb5.Tree.fetch(tree, :d)
:error
iex> Xb5.Tree.first!(tree)
{:a, 1}
iex> Xb5.Tree.last!(tree)
{:c, 3}
iex> tree = Xb5.Tree.put(tree, :d, 4)
Xb5.Tree.new([a: 1, b: 2, c: 3, d: 4])
iex> Xb5.Tree.keys(tree)
[:a, :b, :c, :d]
iex> Xb5.Tree.lower(tree, :c)
{:b, 2}
iex> Xb5.Tree.higher(tree, :b)
{:c, 3}

Comparison semantics

All three collections sort and compare using Erlang term order with == rather than ===. This means 1 and 1.0 are treated as the same element — unlike Map and MapSet, which use strict === equality.

This carries over to Enumerable: Enum.member?/2 uses == for whole elements in Xb5.Bag and Xb5.Set. For Xb5.Tree, which enumerates {key, value} pairs, the key is matched with == but the value is matched strictly with ===:

iex> Xb5.Set.new([1, 2, 3]) |> Enum.member?(1.0)
true
iex> tree = Xb5.Tree.new([{1, :a}])
iex> Enum.member?(tree, {1.0, :a})
true
iex> Enum.member?(tree, {1.0, :b})
false

Streaming

All three modules provide stream/2 for lazy traversal in ascending or descending order, and stream_from/3 to start at a given value (or key, for Xb5.Tree):

iex> set = Xb5.Set.new(1..5)
iex> Xb5.Set.stream(set, :desc) |> Enum.to_list()
[5, 4, 3, 2, 1]
iex> Xb5.Set.stream_from(set, 3) |> Enum.to_list()
[3, 4, 5]
iex> tree = Xb5.Tree.new([a: 1, b: 2, c: 3, d: 4])
iex> Xb5.Tree.stream_from(tree, :c, :desc) |> Enum.to_list()
[c: 3, b: 2, a: 1]

Xb5.Bag additionally provides stream_from_index/2 for index-based streaming (O(log n) to reach the start position):

iex> bag = Xb5.Bag.new([10, 20, 20, 30, 40])
iex> Xb5.Bag.stream_from_index(bag, 2) |> Enum.to_list()
[20, 30, 40]
iex> Xb5.Bag.stream_from_index(bag, -2) |> Enum.to_list()
[30, 40]

Protocols

All three modules implement Enumerable, Collectable, and Inspect, making them compatible with Enum, Stream, comprehensions, and Kernel.inspect/1.

iex> set = Xb5.Set.new(1..10)
iex> set |> Enum.filter(&rem(&1, 2) == 0) |> Enum.sum()
30
iex> tree = Xb5.Tree.new([a: 1, b: 2, c: 3])
iex> for {k, v} <- tree, do: {k, v * 2}
[a: 2, b: 4, c: 6]
iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Enum.into([4, 5], bag)
Xb5.Bag.new([1, 2, 3, 4, 5])

Enum.count/1 and Enum.member?/2 run in O(1) and O(log n) respectively for all three. Xb5.Bag additionally provides O(log n) random access: because it is an order-statistic tree, Enum.at/2, Enum.slice/3 and other index-based operations locate elements by rank without traversing from the start.

Erlang interop

The Elixir structs and the Erlang xb5 opaque records are two different packages around the same content: a size integer and a root node term. new/1 extracts that content from an Erlang record via :xb5_*.unwrap/1 (which does a quick read-only validation pass); unwrap!/1 hands it back as %{size: n, root: node}. No tree data is duplicated.

iex> erlang_set = :xb5_sets.from_list([1, 2, 3])
iex> elixir_set = Xb5.Set.new(erlang_set)
Xb5.Set.new([1, 2, 3])
iex> Xb5.Set.unwrap!(elixir_set).size
3

Benchmarks

See the xb5 benchmark report for detailed comparisons against gb_sets/gb_trees across 50+ operations and collection sizes up to 15,000 elements. A second report on an Intel i5-3550 is also available.

License

MIT License. Copyright (c) 2025–2026 Guilherme Andrade.