HyParView
A clean, BEAM-native Elixir implementation of the HyParView hybrid partial-view membership protocol from Leitão, Pereira, and Rodrigues (DSN 2007).
Why this exists
If your Elixir cluster is approaching the size where the full-mesh Erlang distribution that libcluster sets up starts to crack (≈50–100 nodes), HyParView is the alternative to reach for: it is the membership primitive that Riak Core and Partisan are built on.
Each node maintains:
- A small active view (typically log(N) + 1 peers, e.g. 5 for ~30 nodes) that is symmetric, TCP-connected, and used for messaging.
- A larger passive view (typically K · (log(N) + 1), default K=6) that is gossiped lazily and held in reserve to repair the active view on failure.
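As a rough illustration of the sizing rule above, the sketch below computes both view sizes for a given cluster size. The module name `ViewSizing`, the choice of the natural logarithm, and rounding up to a whole peer are assumptions made for this example (together they reproduce the 5-peer figure for ~30 nodes); the library itself takes explicit sizes through its configuration.

```elixir
defmodule ViewSizing do
  @moduledoc "Hypothetical helper, not part of HyParView: illustrates the sizing rule."

  # Active view: log(N) + 1, assuming a natural log rounded up to a whole peer.
  def active(n) when is_integer(n) and n >= 2 do
    ceil(:math.log(n)) + 1
  end

  # Passive view: K times the active view size, with K = 6 by default.
  def passive(n, k \\ 6) when is_integer(k) and k >= 1 do
    k * active(n)
  end
end

IO.inspect({ViewSizing.active(30), ViewSizing.passive(30)})
# => {5, 30}
```

Because the sizes grow logarithmically, a tenfold increase in cluster size adds only a couple of peers to each node's active view under these assumptions.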
The result: bounded per-node connection cost, fast failure detection (TCP), and high reliability under massive node churn — the paper demonstrates >90% delivery with 95% of nodes failed.
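The repair role of the passive view can be sketched as a pure function over two lists. This is a simplified illustration, not the library's code: the module `ViewRepair` and its `on_failure/4` function are hypothetical, and the sketch skips the NEIGHBOR handshake that the real protocol requires before a candidate is accepted into the active view.

```elixir
defmodule ViewRepair do
  @moduledoc "Hypothetical sketch, not the library's code: passive-to-active promotion."

  # Drops `failed` from the active view and, if the passive view is not empty,
  # promotes one passive peer chosen by `pick` (random by default).
  # Returns {new_active, new_passive, promoted_peer_or_nil}.
  def on_failure(active, passive, failed, pick \\ &Enum.random/1) do
    active = List.delete(active, failed)

    case passive do
      [] ->
        {active, passive, nil}

      _ ->
        candidate = pick.(passive)
        {[candidate | active], List.delete(passive, candidate), candidate}
    end
  end
end

# Deterministic pick (&hd/1) so the result is predictable in this example.
{active, passive, promoted} =
  ViewRepair.on_failure([:a, :b, :c], [:x, :y], :b, &hd/1)

IO.inspect({active, passive, promoted})
# => {[:x, :a, :c], [:y], :x}
```

Keeping the passive view several times larger than the active view is what gives each node enough candidates to try when many peers fail at once.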
Status
Pre-release. v0.1.0 is feature-complete and tested. Not yet on hex.pm.
Installation
Once published, add to mix.exs:
def deps do
  [{:hyparview, "~> 0.1"}]
end
Quick start
Built-in test transport (in-process, ideal for tests)
# Build the contact peer once: calling make_ref() again would produce a
# different address, and the joiner could never reach the contact.
contact = HyParView.Peer.new("contact", make_ref())
{:ok, contact_pid} = HyParView.start_link(
  peer: contact,
  transport: HyParView.Transport.Test
)
{:ok, joiner_pid} = HyParView.start_link(
  peer: HyParView.Peer.new("joiner", make_ref()),
  contacts: [contact],
  transport: HyParView.Transport.Test
)
HyParView.active_view(contact_pid)
# => [%HyParView.Peer{id: "joiner", ...}]
Real TCP transport
contact = HyParView.Peer.new("node-a", {{127, 0, 0, 1}, 4000})
joiner = HyParView.Peer.new("node-b", {{127, 0, 0, 1}, 4001})
{:ok, _} = HyParView.start_link(
  peer: contact,
  transport: HyParView.Transport.TCP
)
{:ok, joiner_pid} = HyParView.start_link(
  peer: joiner,
  contacts: [contact],
  transport: HyParView.Transport.TCP
)
HyParView.subscribe(joiner_pid)
receive do
  {:hyparview, {:peer_up, peer}} -> IO.puts("up: #{peer.id}")
after
  # Avoid blocking forever if no peer_up notification arrives.
  5_000 -> IO.puts("no peer_up within 5s")
end
Power-user mode: pure protocol core
Applications that want to drive the protocol from their own event loop
(no GenServer, no transport) can use HyParView.State directly. It is a pure
functional state machine:
state = HyParView.State.new(local_peer, HyParView.Config.new())
{state, actions} = HyParView.State.handle_message(state, message)
{state, actions} = HyParView.State.tick_shuffle(state)
{state, actions} = HyParView.State.connection_lost(state, peer)
# `actions` is a list; each element is one of:
# {:notify_up, peer}
# {:notify_down, peer}
# {:send, peer, message}
This is the canonical core; HyParView.Server is a thin wrapper around it
with timers, transport, and subscriber notifications.
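To show how a caller might consume the `actions` list, here is a minimal sketch of an action interpreter. Only the three action tuple shapes come from this README; the module `ActionRunner`, its callback arguments, and the `{:peer_down, peer}` event shape are assumptions made for illustration.

```elixir
defmodule ActionRunner do
  @moduledoc "Hypothetical driver step; only the action tuple shapes come from the README."

  # Executes each action in order. `send_fun` and `notify_fun` are supplied by
  # the caller, so the runner stays independent of any transport.
  def run(actions, send_fun, notify_fun) do
    Enum.each(actions, fn
      {:send, peer, message} -> send_fun.(peer, message)
      {:notify_up, peer} -> notify_fun.({:peer_up, peer})
      # The :peer_down event name is an assumption mirroring :peer_up.
      {:notify_down, peer} -> notify_fun.({:peer_down, peer})
    end)
  end
end

# Usage with stand-in callbacks that report back to the calling process.
me = self()

ActionRunner.run(
  [{:notify_up, :node_b}, {:send, :node_b, :hello}],
  fn peer, msg -> send(me, {:sent, peer, msg}) end,
  fn event -> send(me, {:event, event}) end
)
```

Because the state machine only returns data, the same interpreter can be pointed at a real socket in production and at the test process in unit tests.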
Comparison
| | libcluster | Horde | HyParView |
|---|---|---|---|
| Concern | Discovery + Node.connect | Distributed registry/supervisor | Membership |
| Topology | Full mesh | Full mesh (uses libcluster) | Partial mesh (active view) |
| Scale ceiling | ~50–100 nodes | ~50–100 nodes | Hundreds to thousands |
| Failure detection | BEAM net_kernel | Inherited | TCP, fast |
| Use this when | You want a small cluster | You want a registry/supervisor | You want membership at scale |
A future companion package, libcluster_hyparview, will combine HyParView for
membership with Node.connect calls scoped to the active view — giving Elixir
users partial-mesh BEAM distribution.
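Since that package does not exist yet, the following is only a sketch of the idea: translate membership events into `Node.connect/1` and `Node.disconnect/1` calls. The module `ScopedConnect`, the `{:peer_down, peer}` event, and the convention that a peer id is a valid node name are all assumptions. A real partial mesh would also likely need the runtime's automatic transitive connections turned off (the kernel `connect_all` setting).

```elixir
defmodule ScopedConnect do
  @moduledoc "Hypothetical sketch; libcluster_hyparview does not exist yet."

  # Translates one membership event into a distribution call. The connect and
  # disconnect functions are injectable so the logic can run without a cluster.
  def handle_event(event, connect \\ &Node.connect/1, disconnect \\ &Node.disconnect/1)

  def handle_event({:hyparview, {:peer_up, %{id: id}}}, connect, _disconnect) do
    # Assumes ids come from trusted configuration: String.to_atom/1 on
    # arbitrary input would grow the atom table without bound.
    connect.(String.to_atom(id))
  end

  def handle_event({:hyparview, {:peer_down, %{id: id}}}, _connect, disconnect) do
    disconnect.(String.to_atom(id))
  end
end

# Dry run with stub functions instead of real distribution calls.
event = {:hyparview, {:peer_up, %{id: "node-b@10.0.0.2"}}}
IO.inspect(ScopedConnect.handle_event(event, &{:would_connect, &1}, &{:would_disconnect, &1}))
# => {:would_connect, :"node-b@10.0.0.2"}
```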
Architecture
┌─────────────────────────────┐
│ HyParView.State (pure) │
│ views, transitions, actions│
└──────────────┬──────────────┘
│ %{peer, message}
│ %{action, ...}
┌──────────────┴──────────────┐
│ HyParView.Server (GenServer)│
│ timers, subscribers, │
│ telemetry, transport plumb. │
└──────────────┬──────────────┘
│
┌──────────────┴──────────────┐
│ HyParView.Transport │
│ (behaviour) │
└────┬───────────────┬─────────┘
│ │
┌─────────┴───┐ ┌────────┴─────────┐
│ Transport. │ │ Transport.TCP │
│ Test │ │ + Connection │
│ (in-proc) │ │ (gen_statem/peer) │
└─────────────┘ └───────────────────┘
Configuration
HyParView.start_link(
  peer: %Peer{...},
  transport: HyParView.Transport.TCP,
  contacts: [%Peer{...}, ...],
  config: [
    active_view_size: 5,       # paper default
    passive_view_size: 30,     # paper default
    arwl: 6,                   # active random-walk length
    prwl: 3,                   # passive random-walk length
    shuffle_active_count: 3,   # ka: active samples per shuffle
    shuffle_passive_count: 4,  # kp: passive samples per shuffle
    shuffle_interval: 30_000,  # ms between shuffle ticks
    shuffle_ttl: 6             # walk depth for SHUFFLE
  ]
)
Telemetry
All view changes emit events under the configured prefix (default
[:hyparview]):
# Logger.info/1 is a macro, so Logger must be required first.
require Logger
:telemetry.attach_many(
  "hyparview-handler",
  Enum.map(HyParView.Telemetry.event_paths(), &([:hyparview | &1])),
  fn event, _measurements, metadata, _config ->
    Logger.info("#{inspect(event)}: #{inspect(metadata)}")
  end,
  nil
)
See HyParView.Telemetry for the full event catalog.
Custom transports
The HyParView.Transport behaviour has three callbacks. A minimal
custom transport (e.g., over Erlang distribution) is around 30 lines.
See HyParView.Transport.Test for the reference implementation.
What this library is, and isn't
| Question | Answer |
|---|---|
| Membership? | Yes — JOIN, FORWARD_JOIN, NEIGHBOR, DISCONNECT, SHUFFLE. |
| Broadcast (Plumtree)? | No. A separate library — membership and broadcast are different concerns. |
| Distributed registry / supervisor? | No. That's Horde. |
| Replacement for libcluster? | Adjacent. See Comparison above. |
| TLS / auth? | Out of scope. Wrap the transport behaviour with your own. |
Contributing
See CONTRIBUTING.md. All commits require DCO sign-off.
License
Apache 2.0. See LICENSE.
Citation
João Leitão, José Pereira, Luís Rodrigues. HyParView: a membership protocol for reliable gossip-based broadcast. In Proc. of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN '07), Edinburgh, UK, June 2007.