Grove

Conflict-free Replicated Tree CRDTs for Elixir.

Grove is a CRDT library for building collaborative editors, structured-document editors, and any application that needs real-time synchronization of tree-structured data (including ASTs) across distributed nodes.

Features

Architecture Overview

flowchart TB
subgraph Database
JSON[(Plain JSON/AST)]
end
subgraph "Grove Session"
SM[Session Manager]
Tree[CRDT Tree]
end
subgraph "LiveView Replicas"
LV1[LiveView A]
LV2[LiveView B]
LV3[LiveView C]
end
JSON -->|from_data| Tree
Tree -->|to_data| JSON
LV1 <-->|ops| SM
LV2 <-->|ops| SM
LV3 <-->|ops| SM
SM --- Tree

Installation

Add grove to your list of dependencies in mix.exs:

def deps do
[
{:grove, "~> 0.2.0"}
]
end

Grove requires Elixir ~> 1.20 and is tested against Erlang/OTP 29.

Quick Start

alias Grove.{Tree, Node}
# Create a tree for a replica (the replica id makes node ids globally unique)
tree = Tree.new("node_1")
# Build nodes; a node's parent is set with :parent_id
root = Node.new("root", :function, attrs: %{name: "hello", arity: 1})
param = Node.new("param_1", :param, parent_id: "root", attrs: %{name: "name"})
body = Node.new("body_1", :body, parent_id: "root")
# Insert them
tree =
tree
|> Tree.put_node(root)
|> Tree.put_node(param)
|> Tree.put_node(body)
# Move a node under a new parent (cycle-safe; returns {:error, reason} if invalid)
tree = Tree.move_node(tree, "param_1", "body_1")
# Query
Tree.get_node(tree, "param_1")
Tree.children(tree, "body_1") # => ["param_1"]

Core Concepts

Plain Data Conversion

Grove separates CRDT state from application data. Store plain maps/JSON in your database and use Grove only for real-time sync:

# Load plain data and convert to a CRDT tree
plain = %{
id: "form_1",
type: :form,
attrs: %{title: "Survey"},
children: [
%{id: "field_1", type: :text, attrs: %{label: "Name"}}
]
}
tree = Grove.from_data(plain, replica_id: "node_a")
# After editing, convert back to plain data for storage
data = Grove.to_data(tree)
# Round-trip guarantee when all nodes have explicit ids
^plain = Grove.to_data(Grove.from_data(plain, replica_id: "a"))

Replica IDs

Every node in a Grove cluster needs a unique replica id. It is used to generate globally unique node identifiers and to resolve concurrent operations deterministically.

tree = Tree.new(System.get_env("NODE_ID") || "node_1")

Node Structure

Each node in the tree has:

FieldDescription
idUnique identifier (string)
typeNode type (atom)
attrsNode attributes (map)
parent_idParent node reference
childrenOrdered list of child node ids
deleted?Tombstone flag

Operations & Sync

Grove is operation-based: each mutation records an operation and tracks it in the tree's pending_ops. To synchronize, you broadcast those operations to other replicas and apply incoming events with Grove.Tree.apply_remote/2. The Phoenix LiveView integration wires this up for you (broadcast on change, apply on receive), so most applications never touch the low-level event plumbing directly.

Batch Operations

Group multiple operations into a single atomic unit for undo and broadcast:

# Add a composite "address" group as one undoable action.
# Tree.batch/3 returns {updated_tree, delta}; the delta is what you broadcast.
{tree, _delta} =
Tree.batch(tree, fn t ->
addr = Node.new("addr", :group, parent_id: "form_1", attrs: %{label: "Address"})
street = Node.new("street", :text, parent_id: "addr", attrs: %{label: "Street"})
city = Node.new("city", :text, parent_id: "addr", attrs: %{label: "City"})
t
|> Tree.put_node(addr)
|> Tree.put_node(street)
|> Tree.put_node(city)
end)
# A single undo reverts the whole batch
{:ok, tree} = Tree.undo(tree)

Operation Metadata

Attach context to operations for audit trails and version history:

tree =
Tree.update_node(
tree,
"field_1",
fn node -> Node.update_attrs(node, %{label: "Full name"}) end,
meta: %{user_id: "u1", user_name: "Alice"}
)
# Read history and pull metadata back out
tree
|> Tree.history(where: [user_id: "u1"])
|> Enum.map(fn entry ->
meta = Tree.operation_meta(entry.operation)
"#{meta[:user_name]} edited #{Enum.join(entry.node_ids, ", ")}"
end)

Query API

Find and traverse nodes in the tree:

# Find by id
Tree.get_node(tree, "field_1")
# Find by type
Tree.find_nodes(tree, type: :text)
# Find by attribute
Tree.find_nodes(tree, where: [required: true])
# Combine filters
Tree.find_nodes(tree, type: :text, where: [required: true])
# Path to a node (for breadcrumbs)
Tree.path_to(tree, "field_1") # => ["form_1", "addr", "field_1"]
# Traversal
Tree.root(tree)
Tree.parent(tree, "field_1")
Tree.children(tree, "addr")
Tree.siblings(tree, "field_1")
Tree.ancestors(tree, "field_1")

Move Operation Semantics

The move operation is the most complex part of tree CRDTs. Grove implements the algorithm from "A highly-available move operation for replicated trees" (Kleppmann et al., 2021) to handle concurrent moves safely.

Cycle Prevention

When two replicas concurrently move nodes in conflicting ways, Grove guarantees no cycle is ever created:

# Replica A: move X under Y
tree_a = Tree.move_node(tree_a, "X", "Y")
# Replica B: move Y under X (concurrent)
tree_b = Tree.move_node(tree_b, "Y", "X")
# After sync, one move "wins" deterministically (by Lamport/HLC timestamp);
# the losing move is discarded so no cycle forms.

Undo / Redo

Grove maintains per-tree undo/redo stacks. Every mutation pushes its inverse onto the undo stack; a new operation clears the redo stack.

{:ok, tree} = Tree.undo(tree) # or {:error, :empty_stack}
{:ok, tree} = Tree.redo(tree) # or {:error, :empty_stack}
Tree.can_undo?(tree) # => boolean
Tree.can_redo?(tree) # => boolean

Undo/redo is informed by "A Generic Undo Support for State-Based CRDTs" (Yu et al., 2019).

Schema DSL

Define node types with per-field CRDT semantics. Each field can use a different CRDT type, and schemas support versioned migrations.

defmodule MyApp.FormSchema do
use Grove.Schema
version 1
node :dropdown do
field :label, :lww_register, default: "Untitled"
field :options, :rga_list, default: [] # ordered list, concurrent inserts merge
field :tags, :or_set, default: MapSet.new() # set, adds/removes merge
end
node :text_field do
field :label, :lww_register, default: ""
field :placeholder, :lww_register, default: ""
end
end
# Create a node whose attrs are backed by per-field CRDTs
node = MyApp.FormSchema.new_node(:dropdown, "node_1", %{label: "Country"})
# Update a single field (CRDT-aware)
node = MyApp.FormSchema.update_field(node, :label, "Region", "node_1")
# Introspection
MyApp.FormSchema.node_types() # => [:dropdown, :text_field]
MyApp.FormSchema.fields(:dropdown) # => field definitions

Schema migrations:

defmodule MyApp.FormSchemaV2 do
use Grove.Schema
version 2
node :dropdown do
field :label, :lww_register
field :required, :lww_register, default: false # new in v2
end
migrate 1 do
add_field :dropdown, :required, :lww_register, default: false
end
end

Field CRDT types:

TypeMerge BehaviorUse Case
:lww_registerLast writer winsSimple values (default)
:mv_registerKeep all concurrent valuesExpose conflicts to the user
:or_setAdd-wins setTags, categories
:rga_listInterleave concurrent insertsOrdered options
:counterSum of incrementsCounts, votes

Phoenix LiveView Integration

Grove provides first-class LiveView integration: mount_tree/3 subscribes the socket to a document topic, apply_and_broadcast/2 applies a local change and broadcasts the delta, and apply_remote/2 applies an incoming delta.

defmodule MyAppWeb.EditorLive do
use MyAppWeb, :live_view
def mount(%{"id" => doc_id}, _session, socket) do
socket =
Grove.LiveView.mount_tree(socket, doc_id,
replica_id: socket.id,
pubsub: MyApp.PubSub
)
{:ok, socket}
end
def handle_event("update_field", %{"node_id" => node_id, "value" => value}, socket) do
socket =
Grove.LiveView.apply_and_broadcast(socket, fn tree ->
updated =
Grove.Tree.update_node(tree, node_id, fn node ->
Grove.Node.update_attrs(node, %{value: value})
end)
delta = {:update_attrs, node_id, %{value: value}}
{updated, delta}
end)
{:noreply, socket}
end
# Incoming changes from other replicas
def handle_info({:grove_delta, delta, from_pid}, socket) when from_pid != self() do
{:noreply, Grove.LiveView.apply_remote(socket, delta)}
end
end

The current tree is available with Grove.LiveView.get_tree/1, and local-only updates can be made with Grove.LiveView.update_tree/2.

Presence Tracking

When the optional phoenix_pubsub/phoenix_live_view dependencies are present, Grove tracks per-replica cursors, selections, and focus (backed by Phoenix.Tracker). The helpers operate on the socket:

socket = Grove.set_cursor(socket, "node_123", 4)
socket = Grove.set_selection(socket, "node_123", "node_124")
socket = Grove.set_focus(socket, "node_123")
Grove.get_presences(socket) # all replicas' presence (with auto-assigned colors)
Grove.get_my_presence(socket) # this replica's presence

Session Management

Grove.Session runs a document as a supervised process that owns the tree, fans out changes to subscribers, and persists periodically:

{:ok, session} = Grove.Session.get_or_start("doc_123", replica_id: "node_1")
Grove.Session.join(session, self())
# Apply a mutation to the shared tree
Grove.Session.mutate(session, fn tree ->
Grove.Tree.put_node(tree, Grove.Node.new("n1", :text))
end)
Grove.Session.get_state(session) # current Grove.Tree
Grove.Session.info(session) # %{replicas: ..., last_activity: ..., ...}
Grove.Session.leave(session, self())

Session lifecycle:

Distributed Erlang Integration

Grove starts a :pg-based membership process (Grove.Cluster.Membership) in its supervision tree, and an optional Grove.Replication.Server provides periodic delta sync between replicas over :pg or Phoenix PubSub. This lets Grove cluster natively across BEAM nodes without any external coordination service.

Persistence

Grove ships pluggable storage adapters and a document-storage strategy that combines snapshots with an event log:

# Snapshot the current tree and restore it later.
# create_snapshot/1 succeeds at a "critical" version, returning {:ok, snapshot}.
{:ok, snapshot} = Grove.Tree.create_snapshot(tree)
tree = Grove.Tree.from_snapshot(snapshot, "node_1")

Serialization uses a compact columnar encoding (RLE + zlib) for tree events (Grove.Tree.EventSerializer).

Performance

Grove is optimized for real-time collaborative editing:

OperationTime ComplexitySpace Complexity
InsertO(1) amortizedO(1)
MoveO(log n)O(1)
Apply OpO(1) amortizedO(1)
Find NodeO(1)
TraverseO(n)O(depth)

Benchmarks

Date: 2026-06-14
Elixir: 1.20.1, OTP: 29
Scenario ips Average Memory
Sequential (fast-path) 918 1.09 ms 1.36 MB
Collaborative (mixed) 781 1.28 ms 1.68 MB
High-concurrency (slow-path) 834 1.20 ms 1.51 MB
Local Operations ips Average
get_node 9.59 M 104 ns
put_node 2.08 M 481 ns
update_node 1.11 M 900 ns

Note: The collaborative and high-concurrency scenarios generate their workload with an unseeded Enum.shuffle, so those rows are not reproducible run-to-run — treat them as indicative, not exact. The fast-path/local/scalability figures are deterministic.

Key insight: For sequential editing (the common case), Eg-walker applies operations with near-zero CRDT overhead. The fast-path/slow-path gap grows with the degree of genuine concurrency in the workload.

See the benchmarks guide for detailed analysis.

Eg-walker Optimization

Grove implements the Eg-walker algorithm (EuroSys 2025 Best Paper) which optimizes for the common case: sequential editing.

Most collaborative editing is sequential, so this optimization provides significant real-world performance benefits.

Testing

Grove.Testing provides utilities for verifying CRDT convergence:

defmodule MyApp.CrdtTest do
use ExUnit.Case
import Grove.Testing
test "concurrent edits converge" do
# Create N isolated replicas of a CRDT module
[a, b] = create_replicas(Grove.Set.ORSet, 2)
# Make concurrent edits on each replica
a = Grove.Set.ORSet.add(a, "x")
b = Grove.Set.ORSet.add(b, "y")
# Sync all replicas and assert convergence
[a, b] = sync_all([a, b])
assert_convergent([a, b])
end
end
FunctionDescription
create_replicas/3Create N isolated replicas of a CRDT module
sync/2Sync two replicas bidirectionally
sync_all/1Sync all replicas in a list
assert_convergent/1Assert all replicas have the same value
replica_values/1Materialize each replica's value
mutate_replica/3Apply a function to one replica by index
random_operations/3Generate random operations for fuzzing

Roadmap

The following are planned but not yet implemented. See guides/roadmap.md for the full list.

Research Background

Grove is based on peer-reviewed research in distributed systems and CRDTs:

Core Papers

Foundational CRDT Research

Undo/Redo

See the architecture guide for implementation details.

Comparison with Other Libraries

FeatureGroveAutomergeYjsLoro
LanguageElixirRust/JSJSRust/JS
Tree CRDT✅ First-class✅ JSON✅ XML✅ Movable tree
Move operation❌ (research only)❌ (removed)
Eg-walker optimization✅ (event-graph replay)
Undo/redo✅ local✅ local
Rich text
BEAM native
Phoenix integration

Contributing

Contributions are welcome!

# Clone the repository
git clone https://gitlab.com/tristanperalta/grove.git
cd grove
# Install dependencies
mix deps.get
# Run tests
mix test
# Run benchmarks
mix run benchmarks/eg_walker_bench.exs

License

Grove is released under the MIT License. See LICENSE for details.