
SuperCache

Introduction

High-performance in-memory caching library for Elixir backed by partitioned ETS tables with experimental distributed cluster support. SuperCache provides transparent local and distributed modes with configurable consistency guarantees, batch operations, and multiple data structures.

Features

Architecture

SuperCache contains 34 modules organized into 7 layers:

┌─────────────────────────────────────────────────────────────┐
│                    Application Layer                          │
│  SuperCache │ KeyValue │ Queue │ Stack │ Struct               │
└─────────────────────────┬───────────────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────────────┐
│                    Routing Layer                              │
│  Partition Router (local) │ Cluster Router (distributed)     │
│  Cluster.DistributedStore (shared helpers)                    │
└─────────────────────────┬───────────────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────────────┐
│                  Replication Layer                            │
│  Replicator (async/sync) │ WAL (strong) │ ThreePhaseCommit   │
└─────────────────────────┬───────────────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────────────┐
│                   Storage Layer                               │
│  Storage (ETS wrapper) │ EtsHolder (table lifecycle)         │
│  Partition (hashing) │ Partition.Holder (registry)           │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                Cluster Infrastructure                         │
│  Manager │ NodeMonitor │ HealthMonitor │ Metrics │ Stats     │
│  TxnRegistry │ Router                                        │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                Buffer System (lazy_put)                       │
│  Buffer (scheduler-affine) → Internal.Queue → Internal.Stream│
└─────────────────────────────────────────────────────────────┘

Module Overview

Layer Modules Responsibility
API SuperCache, KeyValue, Queue, Stack, Struct Public interfaces for all data structures
Routing Partition, Cluster.Router, Cluster.DistributedStore Hash-based partition routing and distributed request routing
Replication Cluster.Replicator, Cluster.WAL, Cluster.ThreePhaseCommit Async/sync/strong replication engines
Storage Storage, EtsHolder, Partition.Holder ETS table management and lifecycle
Cluster Cluster.Manager, Cluster.NodeMonitor, Cluster.HealthMonitor Membership, discovery, and health monitoring
Observability Cluster.Metrics, Cluster.Stats, Cluster.TxnRegistry Counters, latency tracking, and transaction logs
Buffer Buffer, Internal.Queue, Internal.Stream Scheduler-affine write buffers for lazy_put/1
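
To illustrate what the routing layer does, hash-based partitioning can be sketched with Erlang's :erlang.phash2/2. This is an assumption about the general approach, not SuperCache's actual Partition module; the table-naming scheme below is hypothetical (only the "SuperCache.Storage.Ets" prefix appears in the configuration table).

```elixir
# Illustrative sketch only: map a partition key to one of num_partition
# ETS tables via Erlang's portable hash. The real Partition module may
# differ in detail.
defmodule PartitionSketch do
  @num_partition 4

  # Deterministic: the same key always lands in the same partition index.
  def partition_for(key), do: :erlang.phash2(key, @num_partition)

  # Hypothetical per-partition table name, built from the configured prefix.
  def table_for(key), do: :"SuperCache.Storage.Ets_#{partition_for(key)}"
end

PartitionSketch.partition_for({:user, 1})
# an integer in 0..3, stable across calls
```

Because the hash is deterministic, every node that shares the same num_partition resolves a given key to the same partition index, which is why the configuration must match cluster-wide.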

Installation

Requirements: Erlang/OTP 25 or later, Elixir 1.15 or later.

Add super_cache to your dependencies in mix.exs:

def deps do
  [
    {:super_cache, "~> 1.2"}
  ]
end

Quick Start

Local Mode

# Start with defaults (num_partition = schedulers, key_pos = 0, partition_pos = 0)
SuperCache.start!()

# Or with custom config
opts = [key_pos: 0, partition_pos: 1, table_type: :bag, num_partition: 4]
SuperCache.start!(opts)

# Basic tuple operations
SuperCache.put!({:user, 1, "Alice"})
SuperCache.get!({:user, 1})
# => [{:user, 1, "Alice"}]

SuperCache.delete!({:user, 1})

Key-Value API

alias SuperCache.KeyValue

KeyValue.add("session", :user_1, %{name: "Alice"})
KeyValue.get("session", :user_1)
# => %{name: "Alice"}

# Batch operations (10-100x faster than individual calls)
KeyValue.add_batch("session", [
  {:user_2, %{name: "Bob"}},
  {:user_3, %{name: "Charlie"}}
])

KeyValue.remove_batch("session", [:user_1, :user_2])

Queue & Stack

alias SuperCache.{Queue, Stack}

# FIFO Queue
Queue.add("jobs", "process_order_1")
Queue.add("jobs", "process_order_2")
Queue.out("jobs")
# => "process_order_1"

Queue.peak("jobs")
# => "process_order_2"

# LIFO Stack
Stack.push("history", "page_a")
Stack.push("history", "page_b")
Stack.pop("history")
# => "page_b"

Struct Storage

alias SuperCache.Struct

defmodule User do
  defstruct [:id, :name, :email]
end

Struct.init(%User{}, :id)
Struct.add(%User{id: 1, name: "Alice", email: "alice@example.com"})
{:ok, user} = Struct.get(%User{id: 1})
# => {:ok, %User{id: 1, name: "Alice", email: "alice@example.com"}}

Complete API Reference

SuperCache (Main API)

Primary entry point for tuple storage with transparent local/distributed mode support.

Lifecycle

SuperCache.start!()
SuperCache.start!(opts)
SuperCache.start()
SuperCache.start(opts)
SuperCache.started?()
SuperCache.stop()

Write Operations

SuperCache.put!(data)
SuperCache.put(data)
SuperCache.lazy_put(data)
SuperCache.put_batch!(data_list)

Read Operations

SuperCache.get!(data, opts \\ [])
SuperCache.get(data, opts \\ [])
SuperCache.get_by_key_partition!(key, partition_data, opts \\ [])
SuperCache.get_same_key_partition!(key, opts \\ [])
SuperCache.get_by_match!(partition_data, pattern, opts \\ [])
SuperCache.get_by_match!(pattern)
SuperCache.get_by_match_object!(partition_data, pattern, opts \\ [])
SuperCache.get_by_match_object!(pattern)
SuperCache.scan!(partition_data, fun, acc)
SuperCache.scan!(fun, acc)

Delete Operations

SuperCache.delete!(data)
SuperCache.delete(data)
SuperCache.delete_all()
SuperCache.delete_by_match!(partition_data, pattern)
SuperCache.delete_by_match!(pattern)
SuperCache.delete_by_key_partition!(key, partition_data)
SuperCache.delete_same_key_partition!(key)

Partition-Specific Operations

SuperCache.put_partition!(data, partition)
SuperCache.get_partition!(key, partition)
SuperCache.delete_partition!(key, partition)
SuperCache.put_partition_by_idx!(data, partition_idx)
SuperCache.get_partition_by_idx!(key, partition_idx)
SuperCache.delete_partition_by_idx!(key, partition_idx)

Statistics & Mode

SuperCache.stats()
SuperCache.cluster_stats()
SuperCache.distributed?()

KeyValue

In-memory key-value namespaces backed by ETS partitions. Multiple independent namespaces coexist using different kv_name values.

KeyValue.add(kv_name, key, value)
KeyValue.get(kv_name, key, default \\ nil, opts \\ [])
KeyValue.remove(kv_name, key)
KeyValue.remove_all(kv_name)

KeyValue.keys(kv_name, opts \\ [])
KeyValue.values(kv_name, opts \\ [])
KeyValue.count(kv_name, opts \\ [])
KeyValue.to_list(kv_name, opts \\ [])

KeyValue.add_batch(kv_name, pairs)
KeyValue.remove_batch(kv_name, keys)

Queue

Named FIFO queues backed by ETS partitions.

Queue.add(queue_name, value)
Queue.out(queue_name, default \\ nil)
Queue.peak(queue_name, default \\ nil, opts \\ [])
Queue.count(queue_name, opts \\ [])
Queue.get_all(queue_name)

Stack

Named LIFO stacks backed by ETS partitions.

Stack.push(stack_name, value)
Stack.pop(stack_name, default \\ nil)
Stack.count(stack_name, opts \\ [])
Stack.get_all(stack_name)

Struct

In-memory struct store backed by ETS partitions. Call init/2 once per struct type before using.

Struct.init(struct, key \\ :id)
Struct.add(struct)
Struct.get(struct, opts \\ [])
Struct.get_all(struct, opts \\ [])
Struct.remove(struct)
Struct.remove_all(struct)

Distributed Mode

SuperCache supports distributing data across a cluster of Erlang nodes with configurable consistency guarantees.

Configuration

All nodes must share identical partition configuration:

# config/config.exs
config :super_cache,
  auto_start:         true,
  key_pos:            0,
  partition_pos:      0,
  cluster:            :distributed,
  replication_mode:   :async,      # :async | :sync | :strong
  replication_factor: 2,           # primary + 1 replica
  table_type:         :set,
  num_partition:      8            # Must match across ALL nodes

# config/runtime.exs
config :super_cache,
  cluster_peers: [
    :"node1@10.0.0.1",
    :"node2@10.0.0.2",
    :"node3@10.0.0.3"
  ]

Replication Modes

Mode Guarantee Latency Use Case
:async Eventual consistency ~50-100µs High-throughput caches, session data
:sync Majority ack (adaptive quorum) ~100-300µs Balanced durability/performance
:strong WAL-based strong consistency ~200µs Critical data requiring durability

Async Mode: Fire-and-forget replication via Task.Supervisor worker pool. Returns immediately after local write.

Sync Mode: Adaptive quorum writes — returns :ok once a strict majority of replicas acknowledge, avoiding waits for slow stragglers.

Strong Mode: A Write-Ahead Log (WAL) replaces heavyweight three-phase commit. Writes are applied locally first, then replicated asynchronously with majority acknowledgment. ~7x faster than traditional 3PC.

Read Modes (Distributed)

# Local read (fastest, may be stale)
SuperCache.get!({:user, 1})

# Primary read (consistent with primary node)
SuperCache.get!({:user, 1}, read_mode: :primary)

# Quorum read (majority agreement, early termination)
SuperCache.get!({:user, 1}, read_mode: :quorum)

Quorum reads use early termination — they return as soon as a strict majority agrees, avoiding waits for slow replicas.
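
The early-termination idea can be sketched as follows. This is a minimal standalone illustration, not SuperCache's implementation: each replica read is stood in for by a zero-arity fun, where the real library fans out over :erpc.

```elixir
# Illustrative only: fan out reads and stop as soon as a strict majority
# of replicas return the same value.
defmodule QuorumSketch do
  def quorum_read(replica_reads) do
    majority = div(length(replica_reads), 2) + 1

    replica_reads
    |> Task.async_stream(& &1.(), ordered: false, timeout: 5_000)
    |> Enum.reduce_while(%{}, fn {:ok, value}, counts ->
      counts = Map.update(counts, value, 1, &(&1 + 1))

      if counts[value] >= majority do
        # Strict majority agrees: return without waiting for stragglers.
        {:halt, {:ok, value}}
      else
        {:cont, counts}
      end
    end)
    |> case do
      {:ok, value} -> {:ok, value}
      _counts -> {:error, :no_quorum}
    end
  end
end

QuorumSketch.quorum_read([fn -> 42 end, fn -> 42 end, fn -> :stale end])
# => {:ok, 42}
```

With three replicas the majority is two, so the call above returns as soon as two reads agree on 42, even if the stale replica is still in flight.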

Manual Bootstrap

SuperCache.Cluster.Bootstrap.start!(
  key_pos: 0,
  partition_pos: 0,
  cluster: :distributed,
  replication_mode: :strong,
  replication_factor: 2,
  num_partition: 8
)

Performance

Benchmarks (Local Mode, 4 partitions)

Operation Throughput Notes
put! ~1.2M ops/sec ~33% overhead vs raw ETS
get! ~2.1M ops/sec Near raw ETS speed
KeyValue.add_batch (10k) ~1.1M ops/sec Single ETS insert
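
The "Single ETS insert" note refers to :ets.insert/2 accepting a list of objects. A plain-ETS illustration (not SuperCache internals) of why batching wins:

```elixir
# One :ets.insert/2 call with a list is a single atomic operation,
# versus N separate traversals of the write path.
table = :ets.new(:batch_demo, [:set, :public])

# Per-item inserts: one call per row.
for i <- 1..3, do: :ets.insert(table, {i, "value_#{i}"})

# Batched: all rows inserted atomically in one call.
rows = for i <- 4..6, do: {i, "value_#{i}"}
:ets.insert(table, rows)

:ets.info(table, :size)
# => 6
```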

Distributed Latency

Operation Async Sync (Quorum) Strong (WAL)
Write ~50-100µs ~100-300µs ~200µs
Read (local) ~10µs ~10µs ~10µs
Read (quorum) ~100-200µs ~100-200µs ~100-200µs

Performance Optimizations

  1. Compile-time log elimination — Debug macros expand to :ok when disabled (zero overhead)
  2. Partition resolution inlining — Single function call with @compile {:inline}
  3. Batch ETS operations — :ets.insert/2 with lists instead of per-item calls
  4. Async replication worker pool — Task.Supervisor eliminates per-operation spawn/1 overhead
  5. Adaptive quorum writes — Returns on majority ack, not all replicas
  6. Quorum read early termination — Stops waiting once majority is reached
  7. WAL-based strong consistency — Replaces 3PC with fast local write + async replication + majority ack
  8. Persistent-term config — Hot-path config keys served from :persistent_term for O(1) access
  9. Scheduler-affine buffers — lazy_put/1 routes to buffer on same scheduler
  10. Protected ETS tables — Partition.Holder uses :protected ETS for lock-free reads
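
The persistent-term pattern in item 8 looks like the sketch below. The key shape is hypothetical — SuperCache's actual internal keys are not documented here — but the trade-off is standard: :persistent_term reads are O(1) with no copying, while writes trigger a global GC and should happen only at startup.

```elixir
# Illustrative pattern only; the key below is hypothetical.

# Written once at startup (writes are expensive: they trigger a global GC):
:persistent_term.put({:super_cache_demo, :key_pos}, 0)

# Read on every hot-path operation: O(1), no ETS lookup, no process hop:
key_pos = :persistent_term.get({:super_cache_demo, :key_pos})
# => 0
```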

WAL Configuration

config :super_cache, :wal,
  majority_timeout: 2_000,  # ms to wait for majority ack
  cleanup_interval: 5_000,  # ms between WAL cleanup cycles
  max_pending: 10_000       # max uncommitted entries

Examples

The examples/ directory contains runnable examples.

Run examples with:

mix run examples/local_mode_example.exs
mix run examples/distributed_mode_example.exs

Configuration Options

Option Type Default Description
key_pos integer 0 Tuple index for ETS key lookup
partition_pos integer 0 Tuple index for partition hashing
num_partition integer schedulers Number of ETS partitions
table_type atom :set ETS table type (:set, :bag, :ordered_set, :duplicate_bag)
table_prefix string "SuperCache.Storage.Ets" Prefix for ETS table atom names
cluster atom :local :local or :distributed
replication_mode atom :async :async, :sync, or :strong
replication_factor integer 2 Total copies (primary + replicas)
cluster_peers list [] List of peer node atoms
auto_start boolean false Auto-start on application boot
debug_log boolean false Enable debug logging (compile-time)

Health Monitoring

SuperCache includes a built-in health monitor that continuously tracks node health, replication lag, and partition balance.

Access health data:

SuperCache.Cluster.HealthMonitor.cluster_health()
SuperCache.Cluster.HealthMonitor.node_health(node)
SuperCache.Cluster.HealthMonitor.replication_lag(partition_idx)
SuperCache.Cluster.HealthMonitor.partition_balance()
SuperCache.Cluster.HealthMonitor.force_check()

Health data is also emitted via :telemetry events.
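
A typical handler attachment is sketched below, assuming the :telemetry package is available. The event name [:super_cache, :health, :check] is hypothetical — this excerpt does not list the library's actual event names, so check the module docs for the real ones.

```elixir
# :telemetry.attach/4 is the standard telemetry API; only the event
# name here is a placeholder.
:telemetry.attach(
  "supercache-health-logger",
  [:super_cache, :health, :check],
  fn _event, measurements, metadata, _config ->
    IO.inspect({measurements, metadata}, label: "supercache health")
  end,
  nil
)
```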

Debug Logging

Enable at compile time (zero overhead in production):

# config/config.exs
config :super_cache, debug_log: true

Or toggle at runtime:

SuperCache.Log.enable(true)
SuperCache.Log.enable(false)

Troubleshooting

Common Issues

"tuple size is lower than key_pos" — Ensure your tuples have enough elements for the configured key_pos.

"Partition count mismatch" — All nodes in a cluster must have the same num_partition value.

"Replication lag increasing" — Check network connectivity between nodes. Use HealthMonitor.cluster_health() to diagnose.

"Quorum reads timing out" — Ensure majority of nodes are reachable, check :erpc connectivity.

Performance Tips

  1. Use put_batch!/1 for bulk inserts (10-100x faster)
  2. Use KeyValue.add_batch/2 for key-value bulk operations
  3. Prefer :async replication mode for high-throughput caches
  4. Use read_mode: :local when eventual consistency is acceptable
  5. Keep debug_log: false (the default) in production so debug macros compile to no-ops
  6. Monitor health metrics and wire telemetry to Prometheus/Datadog

Guides

Testing

# All tests (includes cluster tests)
mix test

# Unit tests only — no distribution needed
mix test --exclude cluster

# Cluster tests only
mix test.cluster

# Specific test file
mix test test/kv_test.exs

# With warnings as errors
mix test --warnings-as-errors

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Code Style

License

MIT License. See LICENSE for details.

Changelog

v1.2.1

v1.1.0

v1.0.0