Grove

Conflict-free Replicated AST (Abstract Syntax Tree) data types for Elixir.

Grove is a specialized CRDT library for building collaborative code editors, structured document editors, and any application requiring real-time synchronization of tree-structured data across distributed nodes.

Features

Architecture Overview

flowchart TB
    subgraph Database
        JSON[(Plain JSON/AST)]
    end

    subgraph "Grove Session"
        SM[Session Manager]
        Tree[CRDT Tree]
    end

    subgraph "LiveView Replicas"
        LV1[LiveView A]
        LV2[LiveView B]
        LV3[LiveView C]
    end

    JSON -->|from_data| Tree
    Tree -->|to_data| JSON

    LV1 <-->|ops| SM
    LV2 <-->|ops| SM
    LV3 <-->|ops| SM

    SM --- Tree

Installation

Add grove to your list of dependencies in mix.exs:

def deps do
  [
    {:grove, "~> 0.1.0"}
  ]
end

Quick Start

# Create a new AST CRDT with a replica ID
tree = Grove.new(replica_id: "node_1")

# Insert nodes
tree = tree
  |> Grove.insert_node(:root, :function, %{name: "hello", arity: 1})
  |> Grove.insert_node("node_1_1", :param, %{name: "name"}, parent: :root)
  |> Grove.insert_node("node_1_2", :body, %{}, parent: :root)

# Move a node (safe, prevents cycles)
tree = Grove.move_node(tree, "node_1_1", new_parent: "node_1_2")

# Generate operations for sync
{tree, ops} = Grove.flush_operations(tree)

# On another replica, apply received operations
other_tree = Grove.apply_operations(other_tree, ops)

# Both trees converge to the same state
Grove.to_ast(tree) == Grove.to_ast(other_tree)  # => true

Core Concepts

Plain Data Conversion

Grove separates CRDT state from application data. Store plain JSON/maps in your database and only use Grove for real-time sync:

# Load plain data from database, convert to CRDT tree
plain_ast = MyRepo.get!(Document, id).data
tree = Grove.from_data(plain_ast, replica_id: socket.id)

# After editing, convert back to plain data for storage
plain_ast = Grove.to_data(tree)
MyRepo.update!(document, %{data: plain_ast})

# Round-trip guarantee
assert Grove.to_data(Grove.from_data(data, replica_id: "a")) == data

Replica IDs

Every node in a Grove cluster needs a unique replica ID. This is used to generate globally unique node identifiers and resolve concurrent operations deterministically.

tree = Grove.new(replica_id: System.get_env("NODE_ID") || UUID.uuid4())

Node Structure

Each node in the tree has:

Field Description
id Unique identifier (replica_id + lamport_timestamp)
type AST node type (atom)
value Node content (map) — uses LWW-Register semantics
parent_id Parent node reference
children Ordered list of child node IDs
deleted? Tombstone flag for deleted nodes

Operations

Grove uses operation-based CRDTs where each modification generates an operation that can be sent to other replicas:

# Operations are automatically generated
{tree, ops} = Grove.flush_operations(tree)

# ops might contain:
[
  %Grove.Op.Insert{id: "node_1_3", type: :expr, value: %{}, parent: "node_1_2", position: 0},
  %Grove.Op.Move{id: "node_1_1", new_parent: "node_1_2", position: 1},
  %Grove.Op.Update{id: "node_1_2", value: %{line: 5}}
]

Batch Operations

Group multiple operations into a single atomic unit for undo and broadcast:

# Adding a composite field (address) as one undoable action
tree = Grove.batch(tree, fn t ->
  t
  |> Grove.insert_node("addr", :group, %{label: "Address"}, parent: "form")
  |> Grove.insert_node("street", :text, %{label: "Street"}, parent: "addr")
  |> Grove.insert_node("city", :text, %{label: "City"}, parent: "addr")
  |> Grove.insert_node("zip", :text, %{label: "ZIP"}, parent: "addr")
end)

# Single undo reverts all 4 inserts
{tree, ops} = Grove.undo(tree)

Operation Metadata

Attach context to operations for audit trails and version history:

tree = Grove.update_node(tree, node_id, %{label: "New Label"},
  meta: %{
    user_id: current_user.id,
    user_name: current_user.name,
    timestamp: DateTime.utc_now()
  }
)

# Get operation history with metadata
Grove.history(tree)
|> Enum.map(fn op -> "#{op.meta.user_name} edited #{op.node_id}" end)

Query API

Find and traverse nodes in the tree:

# Find by ID
Grove.get_node(tree, "field_123")

# Find by type
Grove.find_nodes(tree, type: :dropdown)

# Find by attribute
Grove.find_nodes(tree, where: [required: true])

# Get path to node (for breadcrumbs)
Grove.path_to(tree, "field_123")
# => ["form_root", "step_1", "field_123"]

# Traversal
Grove.parent(tree, "field_123")
Grove.children(tree, "field_123")
Grove.siblings(tree, "field_123")
Grove.ancestors(tree, "field_123")
Grove.descendants(tree, "field_123")

Move Operation Semantics

The move operation is the most complex part of tree CRDTs. Grove implements the algorithm from "A highly-available move operation for replicated trees" (Kleppmann et al., 2021) to handle concurrent moves safely.

Cycle Prevention

When two replicas concurrently move nodes in conflicting ways, Grove prevents cycles:

graph LR
    subgraph "Before"
        A1[A] --> B1[B] --> C1[C]
    end

    subgraph "Concurrent Moves"
        R1[Replica 1: move B under C]
        R2[Replica 2: move C under B]
    end

    subgraph "After Sync"
        A2[A] --> B2[B] --> C2[C]
    end

    style R1 fill:#ffcccc
    style R2 fill:#ccccff
# Replica A: move X under Y
tree_a = Grove.move_node(tree_a, "X", new_parent: "Y")

# Replica B: move Y under X (concurrent)
tree_b = Grove.move_node(tree_b, "Y", new_parent: "X")

# After sync, one move "wins" deterministically based on Lamport timestamps
# No cycles are ever created

Move vs Delete Conflicts

When one replica moves a node while another deletes its destination:

# Replica A: move X under Y
# Replica B: delete Y (concurrent)

# Result: X is moved to Y&#39;s former parent (or root if Y was a root child)

Undo/Redo

Grove provides undo/redo support based on "A Generic Undo Support for State-Based CRDTs" (Yu et al., 2019).

# Enable undo tracking
tree = Grove.new(replica_id: "node_1", undo: true)

# Make some changes
tree = tree
  |> Grove.insert_node("n1", :expr, %{value: 1}, parent: :root)
  |> Grove.insert_node("n2", :expr, %{value: 2}, parent: :root)
  |> Grove.delete_node("n1")

# Undo the last operation
{tree, ops} = Grove.undo(tree)

# "n1" is restored, and ops are generated to sync the undo

# Redo
{tree, ops} = Grove.redo(tree)

Selective Undo

Undo a specific operation rather than just the most recent:

# Get operation history
history = Grove.history(tree)

# Undo a specific operation by ID
{tree, ops} = Grove.undo(tree, operation_id: "op_12345")

Phoenix LiveView Integration

Grove provides first-class LiveView integration with automatic subscription and operation handling.

sequenceDiagram
    participant User A
    participant LiveView A
    participant Session Manager
    participant LiveView B
    participant User B

    User A->>LiveView A: Edit node
    LiveView A->>Session Manager: apply_op(doc_id, op)
    Session Manager->>LiveView B: {:grove_ops, ops}
    LiveView B->>User B: Re-render
    Session Manager->>LiveView A: {:grove_ops, ops}
    LiveView A->>User A: Re-render

Using Grove.LiveView Hooks

defmodule MyAppWeb.EditorLive do
  use MyAppWeb, :live_view
  on_mount {Grove.LiveView, :subscribe}

  def mount(%{"id" => id}, _session, socket) do
    # Load plain data and convert to CRDT tree
    data = MyRepo.get!(Document, id).data
    tree = Grove.from_data(data, replica_id: socket.id)

    {:ok, assign(socket, doc_id: id, tree: tree)}
  end

  def handle_event("insert_node", params, socket) do
    tree = Grove.insert_node(
      socket.assigns.tree,
      params["id"],
      String.to_atom(params["type"]),
      params["value"],
      parent: params["parent_id"]
    )

    # Flush and broadcast
    {tree, ops} = Grove.flush_operations(tree)
    Grove.Session.apply_op(socket.assigns.doc_id, ops)

    {:noreply, assign(socket, tree: tree)}
  end

  def handle_info({:grove_ops, ops}, socket) do
    tree = Grove.apply_operations(socket.assigns.tree, ops)
    {:noreply, assign(socket, tree: tree)}
  end
end

Session Management

Grove manages document sessions with automatic lifecycle handling:

# Sessions are started automatically when first replica joins
Grove.Session.join(doc_id, self())

# Get current tree state
tree = Grove.Session.get_state(doc_id)

# Get session info
Grove.Session.info(doc_id)
# => %{
#   replicas: 3,
#   last_activity: ~U[2024-01-15 10:30:00Z],
#   pending_ops: 0
# }

# Leave session (automatic on process exit)
Grove.Session.leave(doc_id, self())

Session lifecycle:

Presence Tracking

Built-in cursor and selection tracking for collaborative UIs:

# Track what each user is focused on
tree = Grove.set_focus(tree, "node_123")
tree = Grove.set_selection(tree, ["node_123", "node_124"])

# Get all replica presence (with auto-assigned colors)
Grove.get_presence(tree)
# => %{
#   "replica_a" => %{focus: "node_1", selection: [], user: "Alice", color: "#4CAF50"},
#   "replica_b" => %{focus: "node_3", selection: ["node_3", "node_4"], user: "Bob", color: "#2196F3"}
# }
graph TD
    subgraph "Collaborative View"
        A[Form Root]
        B[Step 1 - Alice editing]
        C[Step 2 - Bob editing]
        D[Field 1]
        E[Field 2 - Selected by Bob]
    end

    A --> B
    A --> C
    B --> D
    C --> E

    style B fill:#4CAF50,color:white
    style C fill:#2196F3,color:white
    style E stroke:#2196F3,stroke-width:3px

LiveView Hooks for Editor Integration

// assets/js/hooks/grove_editor.js
export const GroveEditor = {
  mounted() {
    this.editor = createEditor(this.el);

    this.editor.on("change", (delta) => {
      this.pushEvent("editor_change", delta);
    });

    this.handleEvent("apply_ops", (ops) => {
      this.editor.applyOps(ops);
    });
  }
}

Distributed Erlang Integration

Grove works natively with Erlang distribution for clustering:

defmodule MyApp.DocumentCluster do
  use GenServer

  def start_link(document_id) do
    GenServer.start_link(__MODULE__, document_id, name: via(document_id))
  end

  def init(document_id) do
    # Join the cluster for this document
    :pg.join(:grove_documents, document_id, self())

    tree = Grove.new(replica_id: node_id())
    {:ok, %{document_id: document_id, tree: tree}}
  end

  def handle_cast({:ops, ops, from_node}, state) do
    tree = Grove.apply_operations(state.tree, ops)
    {:noreply, %{state | tree: tree}}
  end

  def handle_call({:mutate, fun}, _from, state) do
    tree = fun.(state.tree)
    {tree, ops} = Grove.flush_operations(tree)

    # Broadcast to all replicas
    broadcast_ops(state.document_id, ops)

    {:reply, :ok, %{state | tree: tree}}
  end

  defp broadcast_ops(document_id, ops) do
    for pid <- :pg.get_members(:grove_documents, document_id), pid != self() do
      GenServer.cast(pid, {:ops, ops, Node.self()})
    end
  end
end

AST Conversion

Convert between Grove trees and Elixir AST:

# Parse Elixir code into a Grove tree
{:ok, ast} = Code.string_to_quoted("def hello(name), do: name")
tree = Grove.from_elixir_ast(ast, replica_id: "node_1")

# Convert back to Elixir AST
elixir_ast = Grove.to_elixir_ast(tree)

# Pretty print
Grove.to_string(tree)
# => "def hello(name), do: name"

Custom AST Schemas

Define custom node types with field-level CRDT semantics:

defmodule MyApp.FormSchema do
  use Grove.Schema

  node :form do
    field :title, :lww_register        # Last-writer-wins (default)
    field :description, :lww_register
    children :steps
  end

  node :dropdown do
    field :label, :lww_register
    field :options, :rga_list          # Ordered list - concurrent inserts merge
    field :tags, :or_set               # Set - adds/removes merge
  end

  node :text_field do
    field :label, :lww_register
    field :placeholder, :lww_register
    field :validators, :or_set         # Set of validation rules
  end
end

# Use with Grove
tree = Grove.new(replica_id: "node_1", schema: MyApp.FormSchema)

Field CRDT Types:

Type Merge Behavior Use Case
:lww_register Last writer wins Simple values (default)
:mv_register Keep all concurrent values Expose conflicts to user
:or_set Union of adds, intersection of removes Tags, categories
:rga_list Interleave concurrent inserts Ordered options

Performance

Grove is optimized for real-time collaborative editing:

Operation Time Complexity Space Complexity
Insert O(1) amortized O(1)
Delete O(1) O(1) (tombstone)
Move O(log n) O(1)
Apply Op O(1) amortized O(1)
Find Node O(1)
Traverse O(n) O(depth)

Benchmarks

Date: 2026-01-03
Elixir: 1.18.4, OTP: 25

Scenario                          ips        Average    Memory
Sequential (fast-path)            617        1.62 ms    1.43 MB
Collaborative (mixed)             144        6.94 ms    5.82 MB
High-concurrency (slow-path)       63       15.76 ms   21.84 MB

Local Operations                  ips        Average
get_node                        5.57 M       179 ns
put_node                        1.13 M       888 ns
update_node                     0.65 M      1537 ns

Key insight: Eg-walker's fast-path is 10x faster than slow-path for the common case (sequential editing).

See benchmarks guide for detailed analysis.

Eg-walker Optimization

Grove implements the Eg-walker algorithm (EuroSys 2025 Best Paper) which optimizes for the common case: sequential editing.

Most collaborative editing is sequential (~85%), so this optimization provides significant real-world performance benefits.

Garbage Collection

Tombstones (markers for deleted nodes) are garbage collected when all replicas have observed the deletion:

# Configure GC
tree = Grove.new(
  replica_id: "node_1",
  gc: [
    # Minimum age before tombstone can be collected
    min_tombstone_age: :timer.hours(24),
    # Run GC every N operations
    gc_interval: 1000
  ]
)

# Manual GC with known replica states
tree = Grove.gc(tree, observed_by: ["node_1", "node_2", "node_3"])

Persistence

Grove trees can be serialized for storage:

# Serialize to binary (efficient)
binary = Grove.serialize(tree)
tree = Grove.deserialize(binary)

# Serialize to JSON (interoperable)
json = Grove.to_json(tree)
tree = Grove.from_json(json)

# Ecto integration
defmodule MyApp.Document do
  use Ecto.Schema

  schema "documents" do
    field :tree, Grove.Ecto.Type
    timestamps()
  end
end

Testing

Grove provides comprehensive testing utilities for verifying CRDT correctness:

defmodule MyApp.EditorTest do
  use ExUnit.Case
  import Grove.Testing

  test "concurrent edits converge" do
    # Create isolated replicas from initial data
    [tree_a, tree_b] = create_replicas(initial_data, count: 2)

    # Simulate concurrent edits
    tree_a = Grove.insert_node(tree_a, "n1", :expr, %{v: 1}, parent: :root)
    tree_b = Grove.insert_node(tree_b, "n2", :expr, %{v: 2}, parent: :root)

    # Sync all replicas
    [tree_a, tree_b] = sync_all([tree_a, tree_b])

    # Verify convergence
    assert_convergent([tree_a, tree_b])
    assert_no_cycles(tree_a)
  end

  test "network partition and heal" do
    [a, b, c, d] = create_replicas(data, count: 4)

    # Simulate partition: {a, b} and {c, d} can&#39;t communicate
    {group1, group2} = partition([a, b, c, d], groups: [[0, 1], [2, 3]])

    # Make edits in each partition
    group1 = update_all(group1, fn t -> Grove.insert_node(t, "x", :text, %{}, parent: :root) end)
    group2 = update_all(group2, fn t -> Grove.insert_node(t, "y", :text, %{}, parent: :root) end)

    # Heal partition
    all_trees = heal(group1, group2)

    # All should converge
    assert_convergent(all_trees)
  end

  test "fuzz test with random operations" do
    [tree_a, tree_b, tree_c] = create_replicas(data, count: 3)

    # Generate and apply random operations
    trees = [tree_a, tree_b, tree_c]
    |> Enum.map(fn t -> apply_random_operations(t, count: 100) end)
    |> sync_all()

    assert_convergent(trees)
    Enum.each(trees, &assert_no_cycles/1)
  end
end

Testing Utilities Reference

Function Description
create_replicas/2 Create N isolated replicas from initial data
sync/2 Sync two replicas bidirectionally
sync_all/1 Sync all replicas in a list
assert_convergent/1 Assert all trees have same value
assert_no_cycles/1 Assert tree has no cycles
partition/2 Simulate network partition
heal/2 Heal partition, sync all
random_operations/2 Generate random ops for fuzzing

Research Background

Grove is based on peer-reviewed research in distributed systems and CRDTs:

Core Papers

Foundational CRDT Research

Undo/Redo

See the architecture guide for implementation details.

Comparison with Other Libraries

Feature Grove Automerge Yjs Riak DT
Language Elixir Rust/JS JS Erlang
Tree CRDTs ✅ First-class ✅ JSON ✅ XML
Move operation
Eg-walker optimization
Undo/Redo
BEAM native
Phoenix integration

Contributing

Contributions are welcome!

# Clone the repository
git clone https://gitlab.com/tristanperalta/grove.git
cd grove

# Install dependencies
mix deps.get

# Run tests
mix test

# Run benchmarks
mix bench

License

Grove is released under the MIT License. See LICENSE for details.


Built with ❤️ for the Elixir community.