Grove
Conflict-free Replicated AST (Abstract Syntax Tree) data types for Elixir.
Grove is a specialized CRDT library for building collaborative code editors, structured document editors, and any application requiring real-time synchronization of tree-structured data across distributed nodes.
Features
- Operation-based Tree CRDTs — Efficient synchronization with minimal bandwidth
- Eg-walker Optimization — 10x faster for sequential editing (the common case)
- Move Operations — Refactor and reorganize AST nodes without creating cycles
- Undo/Redo Support — First-class support for editor integration
- Phoenix LiveView Integration — Real-time collaboration in web applications
- Distributed Erlang Support — Native clustering with BEAM distribution
- Tombstone Garbage Collection — Automatic cleanup of deleted nodes
- Plain Data Conversion — Store JSON in your database, use CRDTs only for sync
Architecture Overview
flowchart TB
subgraph Database
JSON[(Plain JSON/AST)]
end
subgraph "Grove Session"
SM[Session Manager]
Tree[CRDT Tree]
end
subgraph "LiveView Replicas"
LV1[LiveView A]
LV2[LiveView B]
LV3[LiveView C]
end
JSON -->|from_data| Tree
Tree -->|to_data| JSON
LV1 <-->|ops| SM
LV2 <-->|ops| SM
LV3 <-->|ops| SM
SM --- TreeInstallation
Add grove to your list of dependencies in mix.exs:
def deps do
[
{:grove, "~> 0.1.0"}
]
endQuick Start
# Create a new AST CRDT with a replica ID
tree = Grove.new(replica_id: "node_1")
# Insert nodes
tree = tree
|> Grove.insert_node(:root, :function, %{name: "hello", arity: 1})
|> Grove.insert_node("node_1_1", :param, %{name: "name"}, parent: :root)
|> Grove.insert_node("node_1_2", :body, %{}, parent: :root)
# Move a node (safe, prevents cycles)
tree = Grove.move_node(tree, "node_1_1", new_parent: "node_1_2")
# Generate operations for sync
{tree, ops} = Grove.flush_operations(tree)
# On another replica, apply received operations
other_tree = Grove.apply_operations(other_tree, ops)
# Both trees converge to the same state
Grove.to_ast(tree) == Grove.to_ast(other_tree) # => trueCore Concepts
Plain Data Conversion
Grove separates CRDT state from application data. Store plain JSON/maps in your database and only use Grove for real-time sync:
# Load plain data from database, convert to CRDT tree
plain_ast = MyRepo.get!(Document, id).data
tree = Grove.from_data(plain_ast, replica_id: socket.id)
# After editing, convert back to plain data for storage
plain_ast = Grove.to_data(tree)
MyRepo.update!(document, %{data: plain_ast})
# Round-trip guarantee
assert Grove.to_data(Grove.from_data(data, replica_id: "a")) == dataReplica IDs
Every node in a Grove cluster needs a unique replica ID. This is used to generate globally unique node identifiers and resolve concurrent operations deterministically.
tree = Grove.new(replica_id: System.get_env("NODE_ID") || UUID.uuid4())Node Structure
Each node in the tree has:
| Field | Description |
|---|---|
id | Unique identifier (replica_id + lamport_timestamp) |
type | AST node type (atom) |
value | Node content (map) — uses LWW-Register semantics |
parent_id | Parent node reference |
children | Ordered list of child node IDs |
deleted? | Tombstone flag for deleted nodes |
Operations
Grove uses operation-based CRDTs where each modification generates an operation that can be sent to other replicas:
# Operations are automatically generated
{tree, ops} = Grove.flush_operations(tree)
# ops might contain:
[
%Grove.Op.Insert{id: "node_1_3", type: :expr, value: %{}, parent: "node_1_2", position: 0},
%Grove.Op.Move{id: "node_1_1", new_parent: "node_1_2", position: 1},
%Grove.Op.Update{id: "node_1_2", value: %{line: 5}}
]Batch Operations
Group multiple operations into a single atomic unit for undo and broadcast:
# Adding a composite field (address) as one undoable action
tree = Grove.batch(tree, fn t ->
t
|> Grove.insert_node("addr", :group, %{label: "Address"}, parent: "form")
|> Grove.insert_node("street", :text, %{label: "Street"}, parent: "addr")
|> Grove.insert_node("city", :text, %{label: "City"}, parent: "addr")
|> Grove.insert_node("zip", :text, %{label: "ZIP"}, parent: "addr")
end)
# Single undo reverts all 4 inserts
{tree, ops} = Grove.undo(tree)Operation Metadata
Attach context to operations for audit trails and version history:
tree = Grove.update_node(tree, node_id, %{label: "New Label"},
meta: %{
user_id: current_user.id,
user_name: current_user.name,
timestamp: DateTime.utc_now()
}
)
# Get operation history with metadata
Grove.history(tree)
|> Enum.map(fn op -> "#{op.meta.user_name} edited #{op.node_id}" end)Query API
Find and traverse nodes in the tree:
# Find by ID
Grove.get_node(tree, "field_123")
# Find by type
Grove.find_nodes(tree, type: :dropdown)
# Find by attribute
Grove.find_nodes(tree, where: [required: true])
# Get path to node (for breadcrumbs)
Grove.path_to(tree, "field_123")
# => ["form_root", "step_1", "field_123"]
# Traversal
Grove.parent(tree, "field_123")
Grove.children(tree, "field_123")
Grove.siblings(tree, "field_123")
Grove.ancestors(tree, "field_123")
Grove.descendants(tree, "field_123")Move Operation Semantics
The move operation is the most complex part of tree CRDTs. Grove implements the algorithm from "A highly-available move operation for replicated trees" (Kleppmann et al., 2021) to handle concurrent moves safely.
Cycle Prevention
When two replicas concurrently move nodes in conflicting ways, Grove prevents cycles:
graph LR
subgraph "Before"
A1[A] --> B1[B] --> C1[C]
end
subgraph "Concurrent Moves"
R1[Replica 1: move B under C]
R2[Replica 2: move C under B]
end
subgraph "After Sync"
A2[A] --> B2[B] --> C2[C]
end
style R1 fill:#ffcccc
style R2 fill:#ccccff# Replica A: move X under Y
tree_a = Grove.move_node(tree_a, "X", new_parent: "Y")
# Replica B: move Y under X (concurrent)
tree_b = Grove.move_node(tree_b, "Y", new_parent: "X")
# After sync, one move "wins" deterministically based on Lamport timestamps
# No cycles are ever createdMove vs Delete Conflicts
When one replica moves a node while another deletes its destination:
# Replica A: move X under Y
# Replica B: delete Y (concurrent)
# Result: X is moved to Y's former parent (or root if Y was a root child)Undo/Redo
Grove provides undo/redo support based on "A Generic Undo Support for State-Based CRDTs" (Yu et al., 2019).
# Enable undo tracking
tree = Grove.new(replica_id: "node_1", undo: true)
# Make some changes
tree = tree
|> Grove.insert_node("n1", :expr, %{value: 1}, parent: :root)
|> Grove.insert_node("n2", :expr, %{value: 2}, parent: :root)
|> Grove.delete_node("n1")
# Undo the last operation
{tree, ops} = Grove.undo(tree)
# "n1" is restored, and ops are generated to sync the undo
# Redo
{tree, ops} = Grove.redo(tree)Selective Undo
Undo a specific operation rather than just the most recent:
# Get operation history
history = Grove.history(tree)
# Undo a specific operation by ID
{tree, ops} = Grove.undo(tree, operation_id: "op_12345")Phoenix LiveView Integration
Grove provides first-class LiveView integration with automatic subscription and operation handling.
sequenceDiagram
participant User A
participant LiveView A
participant Session Manager
participant LiveView B
participant User B
User A->>LiveView A: Edit node
LiveView A->>Session Manager: apply_op(doc_id, op)
Session Manager->>LiveView B: {:grove_ops, ops}
LiveView B->>User B: Re-render
Session Manager->>LiveView A: {:grove_ops, ops}
LiveView A->>User A: Re-renderUsing Grove.LiveView Hooks
defmodule MyAppWeb.EditorLive do
use MyAppWeb, :live_view
on_mount {Grove.LiveView, :subscribe}
def mount(%{"id" => id}, _session, socket) do
# Load plain data and convert to CRDT tree
data = MyRepo.get!(Document, id).data
tree = Grove.from_data(data, replica_id: socket.id)
{:ok, assign(socket, doc_id: id, tree: tree)}
end
def handle_event("insert_node", params, socket) do
tree = Grove.insert_node(
socket.assigns.tree,
params["id"],
String.to_atom(params["type"]),
params["value"],
parent: params["parent_id"]
)
# Flush and broadcast
{tree, ops} = Grove.flush_operations(tree)
Grove.Session.apply_op(socket.assigns.doc_id, ops)
{:noreply, assign(socket, tree: tree)}
end
def handle_info({:grove_ops, ops}, socket) do
tree = Grove.apply_operations(socket.assigns.tree, ops)
{:noreply, assign(socket, tree: tree)}
end
endSession Management
Grove manages document sessions with automatic lifecycle handling:
# Sessions are started automatically when first replica joins
Grove.Session.join(doc_id, self())
# Get current tree state
tree = Grove.Session.get_state(doc_id)
# Get session info
Grove.Session.info(doc_id)
# => %{
# replicas: 3,
# last_activity: ~U[2024-01-15 10:30:00Z],
# pending_ops: 0
# }
# Leave session (automatic on process exit)
Grove.Session.leave(doc_id, self())Session lifecycle:
- Grace period: Sessions stay alive for 5 minutes after last replica leaves
- Periodic persistence: Auto-saves every 30 seconds when dirty
- Operation buffering: Debounces rapid updates
Presence Tracking
Built-in cursor and selection tracking for collaborative UIs:
# Track what each user is focused on
tree = Grove.set_focus(tree, "node_123")
tree = Grove.set_selection(tree, ["node_123", "node_124"])
# Get all replica presence (with auto-assigned colors)
Grove.get_presence(tree)
# => %{
# "replica_a" => %{focus: "node_1", selection: [], user: "Alice", color: "#4CAF50"},
# "replica_b" => %{focus: "node_3", selection: ["node_3", "node_4"], user: "Bob", color: "#2196F3"}
# }graph TD
subgraph "Collaborative View"
A[Form Root]
B[Step 1 - Alice editing]
C[Step 2 - Bob editing]
D[Field 1]
E[Field 2 - Selected by Bob]
end
A --> B
A --> C
B --> D
C --> E
style B fill:#4CAF50,color:white
style C fill:#2196F3,color:white
style E stroke:#2196F3,stroke-width:3pxLiveView Hooks for Editor Integration
// assets/js/hooks/grove_editor.js
export const GroveEditor = {
mounted() {
this.editor = createEditor(this.el);
this.editor.on("change", (delta) => {
this.pushEvent("editor_change", delta);
});
this.handleEvent("apply_ops", (ops) => {
this.editor.applyOps(ops);
});
}
}Distributed Erlang Integration
Grove works natively with Erlang distribution for clustering:
defmodule MyApp.DocumentCluster do
use GenServer
def start_link(document_id) do
GenServer.start_link(__MODULE__, document_id, name: via(document_id))
end
def init(document_id) do
# Join the cluster for this document
:pg.join(:grove_documents, document_id, self())
tree = Grove.new(replica_id: node_id())
{:ok, %{document_id: document_id, tree: tree}}
end
def handle_cast({:ops, ops, from_node}, state) do
tree = Grove.apply_operations(state.tree, ops)
{:noreply, %{state | tree: tree}}
end
def handle_call({:mutate, fun}, _from, state) do
tree = fun.(state.tree)
{tree, ops} = Grove.flush_operations(tree)
# Broadcast to all replicas
broadcast_ops(state.document_id, ops)
{:reply, :ok, %{state | tree: tree}}
end
defp broadcast_ops(document_id, ops) do
for pid <- :pg.get_members(:grove_documents, document_id), pid != self() do
GenServer.cast(pid, {:ops, ops, Node.self()})
end
end
endAST Conversion
Convert between Grove trees and Elixir AST:
# Parse Elixir code into a Grove tree
{:ok, ast} = Code.string_to_quoted("def hello(name), do: name")
tree = Grove.from_elixir_ast(ast, replica_id: "node_1")
# Convert back to Elixir AST
elixir_ast = Grove.to_elixir_ast(tree)
# Pretty print
Grove.to_string(tree)
# => "def hello(name), do: name"Custom AST Schemas
Define custom node types with field-level CRDT semantics:
defmodule MyApp.FormSchema do
use Grove.Schema
node :form do
field :title, :lww_register # Last-writer-wins (default)
field :description, :lww_register
children :steps
end
node :dropdown do
field :label, :lww_register
field :options, :rga_list # Ordered list - concurrent inserts merge
field :tags, :or_set # Set - adds/removes merge
end
node :text_field do
field :label, :lww_register
field :placeholder, :lww_register
field :validators, :or_set # Set of validation rules
end
end
# Use with Grove
tree = Grove.new(replica_id: "node_1", schema: MyApp.FormSchema)Field CRDT Types:
| Type | Merge Behavior | Use Case |
|---|---|---|
:lww_register | Last writer wins | Simple values (default) |
:mv_register | Keep all concurrent values | Expose conflicts to user |
:or_set | Union of adds, intersection of removes | Tags, categories |
:rga_list | Interleave concurrent inserts | Ordered options |
Performance
Grove is optimized for real-time collaborative editing:
| Operation | Time Complexity | Space Complexity |
|---|---|---|
| Insert | O(1) amortized | O(1) |
| Delete | O(1) | O(1) (tombstone) |
| Move | O(log n) | O(1) |
| Apply Op | O(1) amortized | O(1) |
| Find Node | O(1) | — |
| Traverse | O(n) | O(depth) |
Benchmarks
Date: 2026-01-03
Elixir: 1.18.4, OTP: 25
Scenario ips Average Memory
Sequential (fast-path) 617 1.62 ms 1.43 MB
Collaborative (mixed) 144 6.94 ms 5.82 MB
High-concurrency (slow-path) 63 15.76 ms 21.84 MB
Local Operations ips Average
get_node 5.57 M 179 ns
put_node 1.13 M 888 ns
update_node 0.65 M 1537 nsKey insight: Eg-walker's fast-path is 10x faster than slow-path for the common case (sequential editing).
See benchmarks guide for detailed analysis.
Eg-walker Optimization
Grove implements the Eg-walker algorithm (EuroSys 2025 Best Paper) which optimizes for the common case: sequential editing.
- Fast-path: When edits are sequential (one user or turn-taking), operations apply directly with near-zero CRDT overhead
- Slow-path: When true concurrency occurs, full CRDT merge ensures correctness
Most collaborative editing is sequential (~85%), so this optimization provides significant real-world performance benefits.
Garbage Collection
Tombstones (markers for deleted nodes) are garbage collected when all replicas have observed the deletion:
# Configure GC
tree = Grove.new(
replica_id: "node_1",
gc: [
# Minimum age before tombstone can be collected
min_tombstone_age: :timer.hours(24),
# Run GC every N operations
gc_interval: 1000
]
)
# Manual GC with known replica states
tree = Grove.gc(tree, observed_by: ["node_1", "node_2", "node_3"])Persistence
Grove trees can be serialized for storage:
# Serialize to binary (efficient)
binary = Grove.serialize(tree)
tree = Grove.deserialize(binary)
# Serialize to JSON (interoperable)
json = Grove.to_json(tree)
tree = Grove.from_json(json)
# Ecto integration
defmodule MyApp.Document do
use Ecto.Schema
schema "documents" do
field :tree, Grove.Ecto.Type
timestamps()
end
endTesting
Grove provides comprehensive testing utilities for verifying CRDT correctness:
defmodule MyApp.EditorTest do
use ExUnit.Case
import Grove.Testing
test "concurrent edits converge" do
# Create isolated replicas from initial data
[tree_a, tree_b] = create_replicas(initial_data, count: 2)
# Simulate concurrent edits
tree_a = Grove.insert_node(tree_a, "n1", :expr, %{v: 1}, parent: :root)
tree_b = Grove.insert_node(tree_b, "n2", :expr, %{v: 2}, parent: :root)
# Sync all replicas
[tree_a, tree_b] = sync_all([tree_a, tree_b])
# Verify convergence
assert_convergent([tree_a, tree_b])
assert_no_cycles(tree_a)
end
test "network partition and heal" do
[a, b, c, d] = create_replicas(data, count: 4)
# Simulate partition: {a, b} and {c, d} can't communicate
{group1, group2} = partition([a, b, c, d], groups: [[0, 1], [2, 3]])
# Make edits in each partition
group1 = update_all(group1, fn t -> Grove.insert_node(t, "x", :text, %{}, parent: :root) end)
group2 = update_all(group2, fn t -> Grove.insert_node(t, "y", :text, %{}, parent: :root) end)
# Heal partition
all_trees = heal(group1, group2)
# All should converge
assert_convergent(all_trees)
end
test "fuzz test with random operations" do
[tree_a, tree_b, tree_c] = create_replicas(data, count: 3)
# Generate and apply random operations
trees = [tree_a, tree_b, tree_c]
|> Enum.map(fn t -> apply_random_operations(t, count: 100) end)
|> sync_all()
assert_convergent(trees)
Enum.each(trees, &assert_no_cycles/1)
end
endTesting Utilities Reference
| Function | Description |
|---|---|
create_replicas/2 | Create N isolated replicas from initial data |
sync/2 | Sync two replicas bidirectionally |
sync_all/1 | Sync all replicas in a list |
assert_convergent/1 | Assert all trees have same value |
assert_no_cycles/1 | Assert tree has no cycles |
partition/2 | Simulate network partition |
heal/2 | Heal partition, sync all |
random_operations/2 | Generate random ops for fuzzing |
Research Background
Grove is based on peer-reviewed research in distributed systems and CRDTs:
Core Papers
- A highly-available move operation for replicated trees — Kleppmann et al., 2021
- Eg-walker: Better, Faster, Smaller — Gentle & Kleppmann, EuroSys 2025 (Best Paper)
- COAST: A Conflict-free Replicated Abstract Syntax Tree — Munsters et al., 2022
- A coordination-free, convergent, and safe replicated tree — Nair et al., 2021
Foundational CRDT Research
- A comprehensive study of Convergent and Commutative Replicated Data Types — Shapiro et al., 2011
- Pure Operation-Based Replicated Data Types — Baquero et al., 2017
- Interleaving Anomalies in Collaborative Text Editors — Kleppmann et al., 2019
Undo/Redo
- A Generic Undo Support for State-Based CRDTs — Yu et al., 2019
- Undo and Redo Support for Replicated Registers — Stewen & Kleppmann, 2024
See the architecture guide for implementation details.
Comparison with Other Libraries
| Feature | Grove | Automerge | Yjs | Riak DT |
|---|---|---|---|---|
| Language | Elixir | Rust/JS | JS | Erlang |
| Tree CRDTs | ✅ First-class | ✅ JSON | ✅ XML | ❌ |
| Move operation | ✅ | ✅ | ✅ | ❌ |
| Eg-walker optimization | ✅ | ❌ | ❌ | ❌ |
| Undo/Redo | ✅ | ✅ | ✅ | ❌ |
| BEAM native | ✅ | ❌ | ❌ | ✅ |
| Phoenix integration | ✅ | ❌ | ❌ | ❌ |
Contributing
Contributions are welcome!
# Clone the repository
git clone https://gitlab.com/tristanperalta/grove.git
cd grove
# Install dependencies
mix deps.get
# Run tests
mix test
# Run benchmarks
mix benchLicense
Grove is released under the MIT License. See LICENSE for details.
Built with ❤️ for the Elixir community.