# SuperCache

## Introduction

High-performance in-memory caching library for Elixir backed by ETS tables, with experimental distributed cluster support. SuperCache provides transparent local and distributed modes with configurable consistency guarantees, batch operations, and horizontal scalability.
## Features

- Partitioned ETS Storage — Reduces contention by splitting data across multiple ETS tables
- Multiple Data Structures — Tuples, key-value namespaces, queues, stacks, and struct storage
- Distributed Clustering — Automatic node discovery, partition assignment, and replication
- Configurable Consistency — Choose between async, sync (quorum), or strong (WAL) replication
- Batch Operations — High-throughput bulk writes with `put_batch!/1`, `add_batch/2`, and `remove_batch/2`
- Performance Optimized — Compile-time log elimination, partition resolution inlining, worker pools, and early-termination quorum reads
## Design

```
Client → API → Partition Router → Storage (ETS)
                      ↓
          Distributed Router (optional)
                      ↓
           Replicator → Remote Nodes
```

### Architecture Components

- API Layer — Public interface (`SuperCache`, `KeyValue`, `Queue`, `Stack`, `Struct`)
- Partition Layer — Hash-based routing to ETS tables (`Partition`, `Partition.Holder`); see the sketch after this list
- Storage Layer — ETS table management (`Storage`, `EtsHolder`)
- Cluster Layer — Distributed coordination (`Manager`, `Replicator`, `WAL`, `Router`)
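To make the routing step concrete, here is a minimal sketch of hash-based partition selection. The module name, partition count, and table-naming scheme (`SuperCache.Storage.Ets_<n>`, taken from the call-flow diagram below) are illustrative assumptions, not SuperCache internals:

```elixir
# Illustrative sketch of hash-based partition routing (assumed naming;
# not SuperCache's actual implementation).
defmodule PartitionRoutingSketch do
  @num_partition 4

  # Hash the partition key into one of @num_partition buckets and
  # map the bucket index to an ETS table name.
  def table_for(partition_key) do
    index = :erlang.phash2(partition_key, @num_partition)
    :"SuperCache.Storage.Ets_#{index}"
  end
end

PartitionRoutingSketch.table_for(1)
# => e.g. :"SuperCache.Storage.Ets_2" (the exact index depends on the hash)
```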
### Call Flow (Local Mode)

```mermaid
sequenceDiagram
    participant Client
    participant Api
    participant Partition
    participant Storage
    Client->>Api: put!({:user, 1, "Alice"})
    Api->>Partition: get_partition(1)
    Partition->>Api: :"SuperCache.Storage.Ets_2"
    Api->>Storage: put({:user, 1, "Alice"}, partition)
    Storage->>Api: true
    Api->>Client: true
    Client->>Api: get!({:user, 1})
    Api->>Partition: get_partition(1)
    Partition->>Api: :"SuperCache.Storage.Ets_2"
    Api->>Storage: get({:user, 1}, partition)
    Storage->>Api: [{:user, 1, "Alice"}]
    Api->>Client: [{:user, 1, "Alice"}]
```

### Call Flow (Distributed Mode)
```mermaid
sequenceDiagram
    participant Client
    participant Api
    participant Router
    participant Primary
    participant Replicas
    participant WAL
    Client->>Api: put!({:user, 1, "Alice"})
    Api->>Router: route_put!
    Router->>Primary: local_put (if primary)
    Primary->>WAL: commit(ops)
    WAL->>Primary: apply_local
    WAL->>Replicas: async replicate_and_ack
    Replicas->>WAL: ack
    WAL->>Primary: majority reached
    Primary->>Router: true
    Router->>Api: true
    Api->>Client: true
```

## Installation
Requirements: Erlang/OTP 25 or later, Elixir 1.14 or later.

Add `super_cache` to your dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:super_cache, "~> 1.2"}
  ]
end
```

## Quick Start
### Local Mode

```elixir
# Start with defaults (num_partition = schedulers, key_pos = 0, partition_pos = 0)
SuperCache.start!()

# Or with custom config
opts = [key_pos: 0, partition_pos: 1, table_type: :bag, num_partition: 4]
SuperCache.start!(opts)

# Basic operations
SuperCache.put!({:user, 1, "Alice"})
SuperCache.get!({:user, 1})
# => [{:user, 1, "Alice"}]

SuperCache.delete!({:user, 1})
```

### Key-Value API
```elixir
alias SuperCache.KeyValue

KeyValue.add("session", :user_1, %{name: "Alice"})
KeyValue.get("session", :user_1)
# => %{name: "Alice"}

# Batch operations (10-100x faster than individual calls)
KeyValue.add_batch("session", [
  {:user_2, %{name: "Bob"}},
  {:user_3, %{name: "Charlie"}}
])

KeyValue.remove_batch("session", [:user_1, :user_2])
```

### Queue & Stack
```elixir
alias SuperCache.{Queue, Stack}

# FIFO Queue
Queue.add("jobs", "process_order_1")
Queue.add("jobs", "process_order_2")
Queue.out("jobs")
# => "process_order_1"

# LIFO Stack
Stack.push("history", "page_a")
Stack.push("history", "page_b")
Stack.pop("history")
# => "page_b"
```

### Struct Storage
```elixir
alias SuperCache.Struct

defmodule User do
  defstruct [:id, :name, :email]
end

Struct.init(%User{}, :id)
Struct.add(%User{id: 1, name: "Alice", email: "alice@example.com"})
{:ok, user} = Struct.get(%User{id: 1})
```

## Distributed Mode
### Configuration

All nodes must share identical partition configuration:

```elixir
# config/config.exs
config :super_cache,
  auto_start: true,
  key_pos: 0,
  partition_pos: 0,
  cluster: :distributed,
  replication_mode: :async,   # :async | :sync | :strong
  replication_factor: 2,      # primary + 1 replica
  table_type: :set,
  num_partition: 8            # Must match across all nodes

# config/runtime.exs
config :super_cache,
  cluster_peers: [
    :"node1@10.0.0.1",
    :"node2@10.0.0.2",
    :"node3@10.0.0.3"
  ]
```

### Replication Modes
| Mode | Guarantee | Latency | Use Case |
|---|---|---|---|
| `:async` | Eventual consistency | ~50-100µs | High-throughput caches, session data |
| `:sync` | Majority ack (adaptive quorum) | ~100-300µs | Balanced durability/performance |
| `:strong` | WAL-based strong consistency | ~200µs | Critical data requiring durability |

Async Mode: Fire-and-forget replication via a worker pool. Returns immediately after the local write.

Sync Mode: Adaptive quorum writes — returns `:ok` once a strict majority of replicas acknowledge, avoiding waits for slow stragglers.

Strong Mode: A Write-Ahead Log (WAL) replaces heavy three-phase commit (3PC). Writes locally first, then replicates asynchronously with majority acknowledgment. ~7x faster than traditional 3PC.
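As a rough illustration of the quorum idea behind the sync and strong modes, the sketch below counts acknowledgments and returns as soon as a strict majority is reached. It is a simplified model under an assumed message shape (`{:ack, node}`), not SuperCache's actual replicator:

```elixir
# Sketch: wait for a strict majority of replica acks, with a timeout.
# The {:ack, node} message shape is an assumption for illustration.
defmodule QuorumWriteSketch do
  def await_majority(replica_count, timeout \\ 2_000) do
    needed = div(replica_count, 2) + 1  # strict majority
    collect(needed, replica_count, timeout)
  end

  # Majority reached: return without waiting for stragglers.
  defp collect(0, _outstanding, _timeout), do: :ok

  # Not enough replicas left to ever reach a majority.
  defp collect(needed, outstanding, _timeout) when needed > outstanding,
    do: {:error, :quorum_unreachable}

  defp collect(needed, outstanding, timeout) do
    receive do
      {:ack, _node} -> collect(needed - 1, outstanding - 1, timeout)
    after
      timeout -> {:error, :timeout}
    end
  end
end
```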
### Read Modes (Distributed)

```elixir
# Local read (fastest, may be stale)
SuperCache.get!({:user, 1})

# Primary read (consistent with the primary node)
SuperCache.get!({:user, 1}, read_mode: :primary)

# Quorum read (majority agreement, early termination)
SuperCache.get!({:user, 1}, read_mode: :quorum)
```

Quorum reads use early termination: they return as soon as a strict majority agrees, avoiding waits for slow replicas.
Manual Bootstrap
SuperCache.Cluster.Bootstrap.start!(
key_pos: 0,
partition_pos: 0,
cluster: :distributed,
replication_mode: :strong,
replication_factor: 2,
num_partition: 8
)Performance
### Benchmarks (Local Mode, 4 partitions)

| Operation | Throughput | Notes |
|---|---|---|
| `put!` | ~1.2M ops/sec | ~33% overhead vs raw ETS |
| `get!` | ~2.1M ops/sec | Near raw ETS speed |
| `KeyValue.add_batch` (10k items) | ~1.1M ops/sec | Single ETS insert |
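Numbers like these vary by hardware and BEAM version; a quick way to sanity-check them locally is a Benchee run (assumes `{:benchee, "~> 1.0", only: :dev}` in your deps):

```elixir
# Rough local benchmark sketch using Benchee (dev-only dependency;
# results will differ from the table above depending on hardware).
SuperCache.start!()
SuperCache.put!({:user, 1, "Alice"})

Benchee.run(%{
  "put!" => fn -> SuperCache.put!({:user, 1, "Alice"}) end,
  "get!" => fn -> SuperCache.get!({:user, 1}) end
})
```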
### Distributed Latency

| Operation | Async | Sync (Quorum) | Strong (WAL) |
|---|---|---|---|
| Write | ~50-100µs | ~100-300µs | ~200µs |
| Read (local) | ~10µs | ~10µs | ~10µs |
| Read (quorum) | ~100-200µs | ~100-200µs | ~100-200µs |
### Performance Optimizations

- Compile-time log elimination — Debug macros expand to `:ok` when disabled (zero overhead)
- Partition resolution inlining — Single function call marked with `@compile {:inline, ...}`
- Batch ETS operations — `:ets.insert/2` with lists instead of per-item calls
- Async replication worker pool — `Task.Supervisor` eliminates per-operation `spawn/1` overhead
- Adaptive quorum writes — Return on majority ack, not all replicas
- Quorum read early termination — Stops waiting once a majority is reached
- WAL-based strong consistency — Replaces 3PC with a fast local write, async replication, and majority ack
### WAL Configuration

```elixir
config :super_cache, :wal,
  majority_timeout: 2_000,   # ms to wait for majority ack
  cleanup_interval: 5_000    # ms between WAL cleanup cycles
```

## API Reference
### SuperCache (Main API)

- `start!/1`, `start/1` — Start the cache with options
- `put!/1`, `put/1` — Insert a tuple (bang returns `true`, safe returns `{:ok, true}`)
- `put_batch!/1` — Batch insert (10-100x faster for bulk writes)
- `get!/2`, `get/2` — Retrieve by key
- `delete!/1`, `delete/1` — Remove by key
- `delete_all/0` — Clear all partitions
- `get_by_match!/3`, `get_by_match_object!/3` — Pattern matching
- `scan!/3` — Fold over partition records
- `stats/0` — Get cache statistics
- `distributed?/0` — Check if running in distributed mode
### KeyValue

- `add/3`, `get/4`, `remove/2` — Basic operations
- `add_batch/2`, `remove_batch/2` — Batch operations
- `keys/2`, `values/2`, `count/2`, `to_list/2` — Collection operations
- `remove_all/1` — Clear a namespace
### Queue

- `add/2`, `out/1`, `peak/1` — FIFO operations
- `count/1`, `get_all/1` — Inspection
### Stack

- `push/2`, `pop/1`, `peak/1` — LIFO operations
- `count/1`, `get_all/1` — Inspection
### Struct

- `init/2` — Initialize a struct type with its key field
- `add/1`, `get/2`, `remove/1` — CRUD operations
- `get_all/2`, `remove_all/1` — Bulk operations
## Configuration Options

| Option | Type | Default | Description |
|---|---|---|---|
| `key_pos` | integer | 0 | Tuple index for the ETS key lookup |
| `partition_pos` | integer | 0 | Tuple index for partition hashing |
| `num_partition` | integer | schedulers | Number of ETS partitions |
| `table_type` | atom | `:set` | ETS table type (`:set`, `:bag`, `:ordered_set`) |
| `cluster` | atom | `:local` | `:local` or `:distributed` |
| `replication_mode` | atom | `:async` | `:async`, `:sync`, or `:strong` |
| `replication_factor` | integer | 2 | Total copies (primary + replicas) |
| `cluster_peers` | list | `[]` | List of peer node atoms |
| `auto_start` | boolean | false | Auto-start on application boot |
| `debug_log` | boolean | false | Enable debug logging (compile-time) |
## Debug Logging

Enable at compile time (zero overhead in production):

```elixir
# config/config.exs
config :super_cache, debug_log: true
```

Or toggle at runtime:

```elixir
SuperCache.Log.enable(true)
SuperCache.Log.enable(false)
```
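For readers curious how compile-time elimination works, here is a generic sketch of the technique (the module and calls are illustrative, not SuperCache's actual `Log` module): when `:debug_log` is false at compile time, the macro expands to the literal `:ok`, so disabled call sites cost nothing at runtime.

```elixir
# Generic sketch of compile-time log elimination (not SuperCache's code).
defmodule LogEliminationSketch do
  @debug Application.compile_env(:super_cache, :debug_log, false)

  if @debug do
    # Enabled: expand to a real Logger call at the call site.
    defmacro debug(msg) do
      quote do
        require Logger
        Logger.debug(unquote(msg))
      end
    end
  else
    # Disabled: expand to the literal :ok, so there is zero runtime overhead.
    defmacro debug(_msg), do: :ok
  end
end
```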
## Health Monitoring

SuperCache includes a built-in health monitor that tracks:

- Node connectivity (RTT via `:erpc`)
- Replication lag (probe-based measurement)
- Partition balance (size variance tracking)
- Operation success rates

Access health data:

```elixir
SuperCache.Cluster.HealthMonitor.health()
SuperCache.Cluster.HealthMonitor.metrics()
```

## Troubleshooting
### Common Issues

"tuple size is lower than key_pos" — Ensure your tuples have enough elements for the configured `key_pos`.
Partition count mismatch — All nodes in a cluster must use the same `num_partition` value.

Replication lag — Check network connectivity between nodes. Use `HealthMonitor.metrics()` to diagnose.

High memory usage — Monitor partition sizes with `SuperCache.stats()`. Consider increasing `num_partition` or implementing a TTL eviction policy.
### Performance Tips

- Use `put_batch!/1` for bulk inserts (10-100x faster); see the example below
- Use `KeyValue.add_batch/2` for key-value bulk operations
- Prefer `:async` replication mode for high-throughput caches
- Use `read_mode: :local` when eventual consistency is acceptable
- Keep `debug_log: false` in production (the default) to get compile-time log elimination
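As an example of the first tip, a bulk insert can be issued in one call. This assumes `put_batch!/1` accepts a list of the same tuples `put!/1` accepts, mirroring the Quick Start shape:

```elixir
# Build 10,000 user tuples and insert them with a single batch call
# instead of 10,000 individual put!/1 calls.
users = for i <- 1..10_000, do: {:user, i, "user_#{i}"}
SuperCache.put_batch!(users)
```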
## License
MIT License. See LICENSE for details.
## Contributing

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Run tests:

```bash
mix test
mix test --exclude cluster   # Skip flaky cluster tests
mix test --warnings-as-errors
```

## Changelog
### v1.1.0

- WAL-based strong consistency — Replaces 3PC with ~7x faster writes (~200µs vs ~1500µs)
- Adaptive quorum writes — Sync mode returns on majority ack, not all replicas
- Replication worker pool — Eliminates per-operation `spawn/1` overhead
- Batch API optimizations — `add_batch/2` uses a single ETS insert
- Quorum read early termination — Stops waiting once a majority is reached
- Compile-time log elimination — Zero overhead when debug logging is disabled
- Partition resolution inlining — Faster hot-path lookups
### v1.0.0

- Initial release with ETS-backed caching
- Distributed mode with 3PC consistency
- Queue, Stack, KeyValue, and Struct APIs