PgRegistry

A distributed, metadata-aware process registry for Elixir. Works like Elixir's Registry but discovers processes across a cluster, with the same gossip-based eventually-consistent membership model as Erlang's :pg.

PgRegistry exposes a Registry-shaped API on top of a self-contained Elixir port of :pg that has been extended to carry per-entry metadata. You get cluster-wide process discovery, per-process values, listener notifications, and ETS-native match-spec queries — all without :global's coordination tax.

Installation

def deps do
  [
    {:pg_registry, "~> 0.3"}
  ]
end

Quick start

# In your supervision tree
children = [
  {PgRegistry, :my_registry}
]

# Register a process by name (via tuple)
GenServer.start_link(MyServer, arg,
  name: {:via, PgRegistry, {:my_registry, :my_key}})

# Or register the calling process directly
{:ok, _} = PgRegistry.register(:my_registry, :worker, %{role: :primary})

# Look up across the cluster
PgRegistry.lookup(:my_registry, :worker)
#=> [{#PID<0.123.0>, %{role: :primary}}]

PgRegistry.whereis_name({:my_registry, :my_key})
#=> #PID<0.456.0>

Why PgRegistry

|                    | Registry         | :global                       | PgRegistry                    |
|--------------------|------------------|-------------------------------|-------------------------------|
| Scope              | one node         | whole cluster                 | whole cluster                 |
| Cost per register  | µs               | ms (cluster-wide lock)        | µs (async gossip per peer)    |
| Convergence model  | n/a              | synchronous, lock-based       | eventual, gossip-based        |
| Net-split behavior | n/a              | collisions on heal resolved by user-supplied resolver, may kill processes | diverges silently, converges on heal without conflict |
| Per-process values | yes              | no                            | yes                           |
| Match-spec queries | yes (ETS-native) | no                            | yes (ETS-native)              |
| Listeners          | yes              | no                            | yes                           |
| Unique key mode    | yes (per-node)   | yes (cluster-wide, expensive) | yes (per-node only)           |

Registry is fast but local-only. :global is cluster-wide but synchronous and famously slow. PgRegistry sits in a third spot: cluster-wide, lock-free, fast, with no consensus tax — at the cost of allowing duplicate registrations (multiple processes per key, including across nodes).

API

Configuration

Three start-link forms are accepted:

# Bare scope name
{PgRegistry, :my_registry}

# Scope + opts
{PgRegistry, {:my_registry, listeners: [MyListener]}}

# Registry-shaped keyword form
{PgRegistry, name: :my_registry, listeners: [MyListener], keys: :duplicate}

Supported options:

- name: the scope name (atom). Required in the keyword form.
- listeners: a list of registered process names to notify on register/unregister.
- keys: :duplicate (the default) or :unique for per-node uniqueness (see Design notes).
- partitions: accepted for Registry compatibility; only 1 is allowed.

Registering processes

# via the :via tuple (compatible with GenServer/Agent/Task names)
{:via, PgRegistry, {scope, key}}                # value defaults to nil
{:via, PgRegistry, {scope, key, value}}         # 3-tuple attaches a value

# via the Registry-shaped API (registers self())
PgRegistry.register(scope, key, value)          #=> {:ok, self()}
PgRegistry.unregister(scope, key)               #=> :ok

# via the explicit-pid API
PgRegistry.register_name({scope, key}, pid)
PgRegistry.register_name({scope, key, value}, pid)
PgRegistry.unregister_name({scope, key})

A process may register multiple times under the same key (with possibly different values). Each registration is independent and must be unregistered separately. When a registered process exits, all of its entries are automatically removed and listeners/subscribers are notified.
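A sketch of the duplicate-registration semantics, assuming a :my_registry scope is already running (key and values are illustrative):

```elixir
# The same process registers twice under one key, with different values.
{:ok, _} = PgRegistry.register(:my_registry, :worker, %{slot: 1})
{:ok, _} = PgRegistry.register(:my_registry, :worker, %{slot: 2})

# lookup/2 now returns both entries for this pid.
PgRegistry.lookup(:my_registry, :worker)

# Per the semantics above, each registration is independent, so
# unregister/2 must be called once per registration to clear them all.
:ok = PgRegistry.unregister(:my_registry, :worker)
```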

Reading

PgRegistry.lookup(scope, key)              # [{pid, value}, ...]
PgRegistry.whereis_name({scope, key})      # pid | :undefined
PgRegistry.get_members(scope, key)         # [pid, ...] (cluster-wide)
PgRegistry.get_local_members(scope, key)   # [pid, ...] (local node only)
PgRegistry.values(scope, key, pid)         # [value, ...] for one pid
PgRegistry.keys(scope, pid)                # [key, ...] for one pid
PgRegistry.which_groups(scope)             # [key, ...]
PgRegistry.count(scope)                    # total entries

All read functions are lock-free: they run directly against ETS from the calling process and never block on the scope's GenServer.

Updating values

PgRegistry.update_value(scope, key, new_value)        # self()
PgRegistry.update_value(scope, key, pid, new_value)   # specific pid

Updates every entry under key whose pid matches. Returns :not_joined if there's nothing to update. Subscribers (see below) receive {ref, :update, key, [{pid, old, new}]} events; listeners do not — matching Registry's behavior.
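A sketch of observing an update through the runtime subscription API described under Subscriptions below (assumes the calling process is already registered under :worker in :my_registry):

```elixir
{ref, _snapshot} = PgRegistry.Pg.monitor_scope(:my_registry)

PgRegistry.update_value(:my_registry, :worker, %{status: :ready})

receive do
  {^ref, :update, :worker, changes} ->
    # changes :: [{pid, old_value, new_value}]
    changes
after
  1_000 -> :no_update_seen
end
```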

Match-spec queries

PgRegistry.match(scope, key, pattern)
PgRegistry.match(scope, key, pattern, guards)
PgRegistry.count_match(scope, key, pattern)
PgRegistry.count_match(scope, key, pattern, guards)
PgRegistry.unregister_match(scope, key, pattern)
PgRegistry.unregister_match(scope, key, pattern, guards)

PgRegistry.select(scope, match_spec)        # returns user-defined results
PgRegistry.count_select(scope, match_spec)

match/3,4 matches against the value position. select/2 takes a full ETS match-spec whose patterns are shaped as {key, pid, value} (matching Registry.select/2). All of these run as native ETS queries against the underlying table.
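For example, a spec over the documented {key, pid, value} shape (the scope, keys, and values here are illustrative):

```elixir
# All {key, pid} pairs whose value is :primary, across every key.
spec = [{{:"$1", :"$2", :primary}, [], [{{:"$1", :"$2"}}]}]
PgRegistry.select(:my_registry, spec)

# The same filter scoped to one key, matching on the value position only:
PgRegistry.match(:my_registry, :worker, :primary)
```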

Subscriptions and listeners

Two ways to react to scope changes:

Listeners are configured at scope start-up and receive raw messages. Same shape as Registry's listeners:

{PgRegistry, name: :my_registry, listeners: [MyListener]}

# MyListener receives:
{:register,   :my_registry, key, pid, value}
{:unregister, :my_registry, key, pid}

Addressed by registered name (atom), so a listener that crashes and restarts under the same name keeps receiving events. Listeners do not fire on update_value (matching Registry).
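A minimal listener sketch. It assumes only the documented message shapes; everything else is ordinary GenServer boilerplate:

```elixir
defmodule MyListener do
  use GenServer

  # Registered under a stable atom name, as listeners must be.
  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(state), do: {:ok, state}

  # Raw listener messages arrive as plain sends, so they land in handle_info/2.
  @impl true
  def handle_info({:register, scope, key, pid, value}, state) do
    IO.inspect({scope, key, pid, value}, label: "registered")
    {:noreply, state}
  end

  def handle_info({:unregister, scope, key, pid}, state) do
    IO.inspect({scope, key, pid}, label: "unregistered")
    {:noreply, state}
  end
end
```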

Runtime subscriptions are dynamic and use refs. They also fire :update events:

{ref, snapshot} = PgRegistry.Pg.monitor_scope(:my_registry)
# snapshot :: %{key => [{pid, value}, ...]}

# Subscriber receives:
{^ref, :join,   key, [{pid, value}, ...]}
{^ref, :leave,  key, [{pid, value}, ...]}
{^ref, :update, key, [{pid, old, new}, ...]}

PgRegistry.Pg.demonitor(:my_registry, ref)

Use listeners for fixed system-level integrations (logging, metrics, side effects on register/unregister). Use subscriptions for short-lived consumers that come and go.

Scope-level metadata

PgRegistry.put_meta(scope, :config, %{retries: 3})
PgRegistry.meta(scope, :config)        #=> {:ok, %{retries: 3}}
PgRegistry.delete_meta(scope, :config)

Local-only — scope metadata is not gossiped between nodes. Stored in a sibling ETS table for fast reads.

Dispatch

PgRegistry.dispatch(scope, key, fn members ->
  for pid <- members, do: send(pid, {:work, payload})
end)

Invokes the callback with the list of pids registered under key, or no-ops if there are none.

Design notes

Per-node uniqueness

PgRegistry supports keys: :unique, but uniqueness is enforced per-node, not cluster-wide. This is the same scope as Registry's :unique mode, lifted into a distributed setting:

{PgRegistry, name: :singletons, keys: :unique}

# On node A:
{:ok, pid_a} = PgRegistry.register(:singletons, :worker, :v)

# Same node, second call (from any process):
{:error, {:already_registered, ^pid_a}} =
  PgRegistry.register(:singletons, :worker, :v)

# Node B at the same time succeeds — its node has its own holder:
{:ok, _} = PgRegistry.register(:singletons, :worker, :v)

The :via tuple integrates correctly: register_name/2 returns :no on collision, so GenServer.start_link(name: {:via, PgRegistry, ...}) surfaces {:error, {:already_started, pid}} automatically.

When the holding process exits (or calls unregister/2), the key becomes available again on that node.

This is useful when you want a singleton per node — one connection pool per node, one cache per node, one supervisor per node — without paying :global's consensus tax. It is not what you want for cluster-wide singletons. If you need exactly-one across the cluster, use :global or a leader election library.
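A sketch of that per-node-singleton pattern, assuming a :singletons scope started with keys: :unique (the module and key names are illustrative):

```elixir
defmodule NodePool do
  use GenServer

  # Registers under the per-node-unique key :pool. A second start_link
  # on the same node returns {:error, {:already_started, pid}}.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts,
      name: {:via, PgRegistry, {:singletons, :pool}})
  end

  @impl true
  def init(opts), do: {:ok, opts}
end
```

If the holder exits, the key frees up on that node, so a supervisor restart of NodePool simply re-registers.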

Multi-pid joins (PgRegistry.Pg.join(scope, key, [p1, p2])) raise ArgumentError in unique mode because they have no meaningful semantics — at most one of the pids could ever succeed.

Partitions

PgRegistry uses a single ETS table per scope. Registry's partitions: option shards the local table to reduce write contention; for distributed workloads the dominant cost is gossip/convergence, not local writes, so partitioning a node helps much less. PgRegistry accepts partitions: 1 as a no-op and raises on anything else.

If you genuinely hit local write contention on a single scope, the recommended fix is to split the scope into several scopes (one per logical workload), not to partition a single one.
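That split can be expressed directly in the supervision tree (scope names here are illustrative):

```elixir
# Each workload gets its own scope, and therefore its own ETS table
# and gossip stream, instead of partitions within one scope.
children = [
  {PgRegistry, :jobs},
  {PgRegistry, :sessions},
  {PgRegistry, :presence}
]

Supervisor.start_link(children, strategy: :one_for_one)
```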

Convergence and net-splits

PgRegistry inherits :pg's eventually-consistent semantics. During a netsplit, each side of the partition continues to accept joins independently. When the cluster heals, both sides resync via gossip and converge without conflict — there's no "winner" to choose because duplicates are the normal state.

One known imperfection: on netsplit recovery, sync-driven membership changes update ETS correctly but do not fire :update notifications for entries whose metadata changed during the split. Subscribers see correct state at every read; only the notification stream during convergence is incomplete. See the comment on the private sync_one_group/4 in lib/pg_registry/pg.ex for the rationale (multi-join ambiguity makes the diff non-unique).

Storage layout

Each scope owns an ETS :duplicate_bag of rows shaped:

{key, pid, value, tag}

where tag is an opaque per-node monotonic integer that gives every entry its own identity. Tags are internal — callers never see them. They exist so that ref-counted multi-join semantics survive a flat-row layout, and so that cross-node leaves can identify a specific entry unambiguously even when (key, pid, value) is otherwise duplicated.

This layout is what makes match-spec queries (select/2, match/3) run natively against ETS — user-supplied match-specs against {key, pid, value} are translated to operate on the 4-tuple by appending :_ for the tag.
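The translation can be illustrated with plain :ets, independent of PgRegistry (table name, keys, and values are illustrative):

```elixir
# A duplicate_bag shaped like a scope table: {key, pid, value, tag}.
table = :ets.new(:demo_scope, [:duplicate_bag])
pid = self()
:ets.insert(table, {:worker, pid, :primary, 1})
:ets.insert(table, {:worker, pid, :replica, 2})

# A user-facing spec over {key, pid, value}...
user_spec = [{{:"$1", :"$2", :primary}, [], [:"$2"]}]

# ...is widened to the 4-tuple layout by appending :_ for the tag.
widened =
  for {pattern, guards, body} <- user_spec do
    {:erlang.append_element(pattern, :_), guards, body}
  end

:ets.select(table, widened)
```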

License

MIT — see LICENSE.