Note: This library is under active development and the API may change.
AshScylla
An Ash Framework data layer for ScyllaDB/Apache Cassandra
Overview
AshScylla enables you to use ScyllaDB or Apache Cassandra as a persistence layer for your Ash Framework resources. It implements the Ash.DataLayer behaviour using Xandra (a native Elixir CQL driver) to communicate via CQL (Cassandra Query Language).
Key Benefits
- Seamless Ash Integration: Use familiar Ash resources, actions, and queries
- ScyllaDB Performance: Leverage ScyllaDB's high-performance, low-latency architecture
- Cassandra Compatibility: Works with Apache Cassandra and ScyllaDB
- Rich Feature Set: TTL, consistency levels, secondary indexes, materialized views, batch operations
Quick Start
Prerequisites
- Elixir 1.17+
- Running ScyllaDB or Cassandra instance
- Basic knowledge of Ash Framework
Installation
Add ash_scylla to your dependencies in mix.exs:
def deps do
[
{:ash_scylla, "~> 0.7.0"}
]
end
Minimal Setup
1. Configure a Repo:
# lib/my_app/repo.ex
defmodule MyApp.Repo do
use AshScylla.Repo,
otp_app: :my_app
end
2. Configure the Repo in config/config.exs:
config :my_app, MyApp.Repo,
nodes: ["127.0.0.1:9042"],
keyspace: "my_app_dev",
pool_size: 10
3. Add the Repo to your supervision tree:
# lib/my_app/application.ex
children = [
MyApp.Repo,
# ...
]
4. Generate a Resource:
mix ash_scylla.gen User name:string, email:string
This creates lib/my_app/resources/user.ex with a starter template. Or define it manually:
# lib/my_app/resources/user.ex
defmodule MyApp.User do
use Ash.Resource,
data_layer: AshScylla.DataLayer,
repo: MyApp.Repo
attributes do
uuid_primary_key :id
attribute :name, :string
attribute :email, :string
end
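actions do
# In Ash 3, default create/update actions accept no attributes unless they are listed
defaults [:read, :destroy, create: [:name, :email], update: [:name, :email]]
end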
end
5. Create a Domain:
# lib/my_app/domain.ex
defmodule MyApp.Domain do
use Ash.Domain
resources do
resource MyApp.User
end
end
6. Create Keyspace and Tables:
# Create keyspace (using the mix task)
mix ash_scylla.setup
# Or programmatically
MyApp.Repo.create_keyspace()
# Run migrations
AshScylla.Migrator.run!(MyApp.Repo.nodes(), [
AshScylla.Migration.create_table_cql(MyApp.User),
"CREATE INDEX IF NOT EXISTS idx_users_email ON users (email)"
])
7. Start Using It:
# Create
{:ok, user} = Ash.create(MyApp.User, %{name: "John", email: "john@example.com"})
# Read (Ash.Query.filter/2 is a macro, so require Ash.Query first)
require Ash.Query
users = MyApp.User
|> Ash.Query.filter(email == "john@example.com")
|> Ash.read!()
# Update
{:ok, updated} = user
|> Ash.Changeset.for_update(:update, %{name: "John Doe"})
|> Ash.update()
# Delete
:ok = Ash.destroy(user)
Or through the domain, provided it defines a code interface with functions such as create_user and read_users:
# Create via domain
{:ok, user} = MyApp.Domain.create_user(%{name: "John", email: "john@example.com"})
# Read via domain
users = MyApp.Domain.read_users!()
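Functions such as `create_user/1` and `read_users!/0` are not generated automatically; in Ash they normally come from a code interface declared on the domain. The following is a minimal sketch, assuming the default `:create` and `:read` actions of the resource defined earlier. The interface names are chosen only to match the calls shown above.

```elixir
# lib/my_app/domain.ex
defmodule MyApp.Domain do
  use Ash.Domain

  resources do
    resource MyApp.User do
      # Generates MyApp.Domain.create_user/1,2 and create_user!/1,2
      define :create_user, action: :create
      # Generates MyApp.Domain.read_users/0,1 and read_users!/0,1
      define :read_users, action: :read
    end
  end
end
```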
Features
Core Ash Features ✅
| Feature | Status | Description |
|---|---|---|
| Create | ✅ | Insert records with TTL support |
| Read | ✅ | Query with filtering and sorting |
| Update | ✅ | Update existing records |
| Destroy | ✅ | Delete records |
| Filter | ✅ | Powerful filter syntax with CQL WHERE conversion |
| Sort | ⚠️ | ORDER BY on clustering columns only (within a partition) |
| Keyset pagination | ✅ | Token-based pagination via paging_state (preferred over OFFSET) |
| Limit | ✅ | LIMIT is natively supported |
| Offset | ⚠️ | Not natively supported in ScyllaDB; the data layer silently drops OFFSET. Use keyset pagination instead. |
| Select | ✅ | Select specific fields |
| Multitenancy | ✅ | Keyspace-based multitenancy |
| Bulk Create | ✅ | Batch INSERT operations |
ScyllaDB-Specific Features 🚀
TTL (Time To Live)
Automatically expire data after a specified time:
defmodule MyApp.Session do
use Ash.Resource,
data_layer: AshScylla.DataLayer
ash_scylla do
ttl 3600 # Expire after 1 hour
end
end
Consistency Levels
Configure read/write consistency per resource:
ash_scylla do
consistency :quorum # :any, :one, :two, :three, :quorum, :all, :local_quorum
end
Secondary Indexes
Query non-primary key columns efficiently:
ash_scylla do
secondary_index :email # Single column
secondary_index [:name, :age] # Composite index
end
Materialized Views
Create alternative query patterns with automatic view maintenance:
ash_scylla do
materialized_view :users_by_email,
primary_key: [:email, :id],
include_columns: [:name, :age]
end
Batch Operations
Reduce network round-trips with BATCH statements:
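# Bulk create (uses BATCH internally); Ash.bulk_create/4 returns an %Ash.BulkResult{}
%Ash.BulkResult{status: :success, records: users} =
Ash.bulk_create(user_data_list, MyApp.User, :create, return_records?: true)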
# Async partition-aware batching for large datasets
AshScylla.DataLayer.Batch.batch_insert_async(repo, statements, resource: MyApp.User, max_concurrency: 8)
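To illustrate what "partition-aware" batching means, the sketch below groups rows by their partition key and then splits each group into bounded chunks, so that every chunk touches a single partition. This is an illustration in plain Elixir, not the library's implementation; the module name, the `:user_id` key, and the chunk size are all hypothetical.

```elixir
defmodule PartitionBatchSketch do
  @doc """
  Groups rows by partition key, then splits each group into chunks of at most
  `max_batch_size` rows, so every chunk targets exactly one partition.
  """
  def partition_batches(rows, partition_key, max_batch_size \\ 50) do
    rows
    |> Enum.group_by(&Map.fetch!(&1, partition_key))
    |> Enum.flat_map(fn {_key, group} -> Enum.chunk_every(group, max_batch_size) end)
  end
end

rows = [
  %{user_id: 1, event: "login"},
  %{user_id: 2, event: "login"},
  %{user_id: 1, event: "click"},
  %{user_id: 1, event: "logout"}
]

# Three batches: two for user 1 (sizes 2 and 1) and one for user 2
batches = PartitionBatchSketch.partition_batches(rows, :user_id, 2)
IO.inspect(length(batches))
```

Single-partition batches are generally cheaper for the cluster than batches spanning many partitions, which is presumably why the async variant groups statements this way.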
Token-Based Pagination
Efficient pagination without OFFSET:
ash_scylla do
pagination :token # Use token-based pagination instead of OFFSET
end
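The idea behind token-based pagination can be sketched independently of the library: each page comes back with an opaque token, and the next request passes that token instead of an offset. The module below is a hypothetical helper, not part of AshScylla; `fetch_page` stands in for whatever function performs the query and is assumed to return `{rows, next_token}`, with `nil` marking the last page.

```elixir
defmodule TokenPagerSketch do
  @doc """
  Lazily streams all rows by calling `fetch_page.(token)` repeatedly.
  `fetch_page` must return `{rows, next_token}`; a `nil` token ends the stream.
  """
  def stream(fetch_page) do
    Stream.resource(
      fn -> {:next, nil} end,
      fn
        :done ->
          {:halt, :done}

        {:next, token} ->
          {rows, next_token} = fetch_page.(token)
          next_state = if is_nil(next_token), do: :done, else: {:next, next_token}
          {rows, next_state}
      end,
      fn _state -> :ok end
    )
  end
end

# A fake data source with three pages, keyed by the token that requests them
pages = %{nil => {[1, 2], "a"}, "a" => {[3, 4], "b"}, "b" => {[5], nil}}

TokenPagerSketch.stream(&Map.fetch!(pages, &1))
|> Enum.to_list()
|> IO.inspect()
# prints [1, 2, 3, 4, 5]
```

Because the server resumes from the token rather than skipping rows, the cost of fetching page N does not grow with N.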
Per-Action Consistency
Configure consistency levels per action:
ash_scylla do
consistency :quorum # Default consistency
per_action_consistency read: :one, create: :quorum # Per-action overrides
end
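One way to read the configuration above is that a per-action override wins when present and the resource default applies otherwise. The sketch below expresses that fallback rule in plain Elixir. It is an assumption about the intended semantics, not the library's code, and the module name is hypothetical.

```elixir
defmodule ConsistencySketch do
  @doc """
  Returns the override for `action_type` if one is configured,
  otherwise the resource-wide default consistency level.
  """
  def resolve(action_type, default, overrides \\ []) do
    Keyword.get(overrides, action_type, default)
  end
end

ConsistencySketch.resolve(:read, :quorum, read: :one, create: :quorum)
# => :one
ConsistencySketch.resolve(:destroy, :quorum, read: :one, create: :quorum)
# => :quorum
```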
Data Modeling Best Practices
ScyllaDB is a wide-column store optimized for specific query patterns. Follow these principles:
1. Query-First Design 🎯
Design your tables around your queries, not the other way around:
# Good: Partition key supports your main query
defmodule MyApp.User do
attributes do
attribute :email, :string, primary_key?: true # Partition key
attribute :name, :string
end
end
# Query by partition key (efficient)
require Ash.Query
MyApp.User
|> Ash.Query.filter(email == "user@example.com")
|> Ash.read_one()
2. Denormalization is Normal 📦
Duplicate data across tables to support different query patterns:
# Table for querying posts by author
defmodule MyApp.PostByAuthor do
attributes do
attribute :author_id, :uuid, primary_key?: true
attribute :post_id, :uuid, primary_key?: true
attribute :title, :string
attribute :content, :string
end
end
# Table for querying posts by date
defmodule MyApp.PostByDate do
attributes do
attribute :date, :date, primary_key?: true
attribute :post_id, :uuid, primary_key?: true
attribute :title, :string
attribute :author_name, :string # Denormalized
end
end
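With denormalized tables, every write has to fan out to each table that serves a query. The sketch below builds the two attribute maps for the resources above from a single post and its author. The module name and the input shapes (`post.published_on`, `author.name`) are assumptions made for illustration.

```elixir
defmodule PostFanoutSketch do
  @doc """
  Builds the attribute maps for MyApp.PostByAuthor and MyApp.PostByDate
  from one post and its author.
  """
  def rows(post, author) do
    by_author = %{
      author_id: author.id,
      post_id: post.id,
      title: post.title,
      content: post.content
    }

    by_date = %{
      date: post.published_on,
      post_id: post.id,
      title: post.title,
      # Denormalized copy of the author's name
      author_name: author.name
    }

    {by_author, by_date}
  end
end
```

Each map can then be passed to its own create call, for example `Ash.create(MyApp.PostByAuthor, by_author)` and `Ash.create(MyApp.PostByDate, by_date)`. The two writes are not atomic, so the application needs a strategy for retrying a failed second write.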
3. Choose Partition Keys Wisely 🔑
- High cardinality: Distribute data evenly across nodes
- Query patterns: Support your most common queries
- Avoid hotspots: Don't use low-cardinality partition keys
# Good: User ID has high cardinality
attribute :user_id, :uuid, primary_key?: true
# Avoid: Status has low cardinality (creates hotspots)
attribute :status, :string, primary_key?: true # Don't do this
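A quick way to compare candidate partition keys is to look at a sample of real data and measure how many distinct values each key has and how much of the data the most common value would hold. The helper below is a hypothetical sketch in plain Elixir; it expects a non-empty list of maps.

```elixir
defmodule PartitionKeyCheck do
  @doc """
  Returns the number of distinct values for `key` and the share of rows
  held by the most common value. `rows` must be non-empty.
  """
  def stats(rows, key) do
    counts = Enum.frequencies_by(rows, &Map.fetch!(&1, key))
    largest = counts |> Map.values() |> Enum.max()
    %{distinct: map_size(counts), largest_share: largest / length(rows)}
  end
end

sample = [
  %{user_id: 1, status: "active"},
  %{user_id: 2, status: "active"},
  %{user_id: 3, status: "active"},
  %{user_id: 4, status: "banned"}
]

PartitionKeyCheck.stats(sample, :user_id)
# => %{distinct: 4, largest_share: 0.25}
PartitionKeyCheck.stats(sample, :status)
# => %{distinct: 2, largest_share: 0.75}
```

A key whose largest share is close to 1.0 would concentrate most of the data in one partition, which is the hotspot pattern described above.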
Configuration
Resource Configuration
defmodule MyApp.User do
use Ash.Resource,
data_layer: AshScylla.DataLayer
ash_scylla do
table "users" # Override table name
keyspace "custom_keyspace" # Override keyspace
consistency :quorum # Consistency level
ttl 3600 # Default TTL (seconds)
# Secondary indexes
secondary_index :email
secondary_index [:name, :age]
# Materialized views
materialized_view :users_by_email,
primary_key: [:email, :id],
include_columns: [:name, :age]
end
end
Repo Configuration
config :my_app, MyApp.Repo,
nodes: ["scylla-1:9042", "scylla-2:9042"], # Cluster nodes
keyspace: "my_app_prod",
pool_size: 50, # Connections per node
request_timeout: 300_000, # Query timeout (ms)
connect_timeout: 10_000
Pool Size Guidelines:
- Development: 5-10
- Production: 25-100 (based on concurrent queries)
ScyllaDB works best with a connections-per-shard approach:
pool_size = num_nodes * num_cores_per_node
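As a worked example of the formula above, a cluster of 3 nodes with 16 cores each gives 3 × 16 = 48 connections. The helper below is a hypothetical sketch of the same arithmetic; how the resulting value should be read (per node or in total) depends on how the Repo interprets pool_size.

```elixir
defmodule PoolSizeSketch do
  @doc "Applies pool_size = num_nodes * num_cores_per_node."
  def pool_size(num_nodes, cores_per_node)
      when is_integer(num_nodes) and num_nodes > 0 and
             is_integer(cores_per_node) and cores_per_node > 0 do
    num_nodes * cores_per_node
  end
end

PoolSizeSketch.pool_size(3, 16)
# => 48
```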
Limitations
Since ScyllaDB/Cassandra is a NoSQL wide-column store, some features are not supported:
| Limitation | Reason | Workaround |
|---|---|---|
| No JOINs | No relational joins | Denormalize or application-side joins |
| No complex aggregations | No GROUP BY, COUNT across partitions | Materialized views or custom aggregation |
| No ACID transactions | Only lightweight transactions (LWT) | Use LWT for single-partition operations |
| Limited WHERE clauses | Without indexes, only PK queries are efficient; filtering on non-indexed columns raises errors | Create secondary indexes or materialized views for non-PK query patterns |
| No OR conditions | CQL limitation | Multiple queries or UNION-like patterns |
| No foreign keys | No relational integrity | Application-level validation |
| OFFSET not supported | ScyllaDB has no native OFFSET; it would require full table scan | Use keyset pagination with pagination :token. The data layer silently drops OFFSET to prevent performance disasters. |
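The "multiple queries" workaround for OR conditions can be sketched as follows: run one query per branch of the OR, concatenate the results, and drop duplicates by primary key. The module below is a hypothetical helper in plain Elixir; each element of `query_funs` is a zero-arity function that returns a list of records.

```elixir
defmodule OrUnionSketch do
  @doc """
  Runs each query function and merges the results, keeping the first
  occurrence of every record as identified by `key`.
  """
  def union(query_funs, key \\ :id) do
    query_funs
    |> Enum.flat_map(fn fun -> fun.() end)
    |> Enum.uniq_by(&Map.fetch!(&1, key))
  end
end

# Stand-ins for two reads, e.g. one filtered by email and one by name
by_email = fn -> [%{id: 1, name: "John"}] end
by_name = fn -> [%{id: 1, name: "John"}, %{id: 2, name: "John"}] end

OrUnionSketch.union([by_email, by_name])
# => [%{id: 1, name: "John"}, %{id: 2, name: "John"}]
```

In an application, each function would wrap a real read such as `MyApp.User |> Ash.Query.filter(email == ^email) |> Ash.read!()`. Sorting and limits then have to be applied to the merged list in application code.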
Observability
Telemetry
AshScylla emits standard :telemetry events for all query and batch operations,
enabling integration with LiveDashboard, Datadog, OpenTelemetry, and other
observability tools.
Query events:
- [:ash_scylla, :query, :start] - Query begins execution
- [:ash_scylla, :query, :stop] - Query finishes successfully
- [:ash_scylla, :query, :exception] - Query raises an error
Batch events:
- [:ash_scylla, :batch, :start] - Batch operation begins
- [:ash_scylla, :batch, :stop] - Batch operation finishes
Attaching a handler:
:telemetry.attach(
"ash_scylla-logger",
[:ash_scylla, :query, :stop],
&MyApp.Telemetry.handle_event/4,
nil
)
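The handler referenced above, `MyApp.Telemetry.handle_event/4`, has to be defined by the application. The following is a minimal sketch that logs each event. It assumes the `:stop` measurements include a `:duration` in native time units, which is the usual convention for :telemetry span events; the exact measurement and metadata keys AshScylla emits should be checked against the library.

```elixir
defmodule MyApp.Telemetry do
  require Logger

  @doc "Telemetry handler: logs one line per event."
  def handle_event(event, measurements, metadata, _config) do
    Logger.info(format_event(event, measurements, metadata))
  end

  @doc "Builds the log line; kept separate so it can be tested without Logger."
  def format_event(event, measurements, metadata) do
    name = Enum.join(event, ".")

    # Assumption: :duration, when present, is expressed in native time units
    duration_ms =
      case measurements do
        %{duration: native} -> System.convert_time_unit(native, :native, :millisecond)
        _ -> nil
      end

    "#{name} duration_ms=#{inspect(duration_ms)} metadata_keys=#{inspect(Map.keys(metadata))}"
  end
end
```

Only the metadata keys are logged rather than the values, since query metadata may contain parameters that should not end up in logs.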
Prepared Statement Caching
For high-throughput workloads, enable the prepared statement cache to eliminate repeated query parsing overhead on ScyllaDB:
# In your supervision tree
children = [
AshScylla.PreparedStatementCache,
# ... other children
]
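The general idea behind a prepared statement cache is to prepare each distinct CQL string once and reuse the result afterwards. The sketch below shows that idea with a plain Agent. It is not the implementation of AshScylla.PreparedStatementCache; the module name is hypothetical and `prepare_fun` stands in for whatever function actually prepares the statement.

```elixir
defmodule StatementCacheSketch do
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  @doc """
  Returns the cached prepared statement for `cql`, calling
  `prepare_fun.(cql)` only the first time a given string is seen.
  """
  def fetch(cql, prepare_fun) do
    Agent.get_and_update(__MODULE__, fn cache ->
      case cache do
        %{^cql => prepared} ->
          {prepared, cache}

        _ ->
          prepared = prepare_fun.(cql)
          {prepared, Map.put(cache, cql, prepared)}
      end
    end)
  end
end
```

With Xandra, `prepare_fun` could be something like `fn cql -> Xandra.prepare!(conn, cql) end`. Preparing inside the Agent serializes all callers, which keeps the sketch short; a production cache would more likely use ETS for concurrent reads.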
Documentation
For detailed documentation, see:
- Usage Guide - Comprehensive guide with examples
- Development Guide - Dev container setup and development workflow
- Production Guide - Multi-node cluster deployment and operations
- Implementation Summary - Technical details
- Error Handling - Error types and handling strategies
- API Documentation - Module documentation (when published)
Quick Links
- Secondary Indexes
- Materialized Views
- Batch Operations
- Consistency Levels
- TTL Support
- Performance Optimization
Testing
Run the test suite:
# All tests (unit + integration; requires Podman/Docker for testcontainers)
mix test
# Unit tests only (no ScyllaDB required)
mix test --exclude integration
# Integration tests only (requires Podman/Docker)
mix test test/scylla_integration_test.exs --only integration
# CI pipeline (unit tests + credo)
mix test.ci
Test Structure
| File | Description |
|---|---|
| test/ash_scylla_test.exs | Core DataLayer and DSL unit tests |
| test/data_layer_crud_test.exs | CRUD operations with FakeRepo (create, update, destroy, upsert, bulk_create, run_query, aggregates) |
| test/data_layer_callbacks_test.exs | DataLayer callbacks (transform_query, set_tenant, set_context, filter, sort, limit, offset, select, lock, combination_of, calculate, add_aggregate, add_aggregates, distinct) |
| test/data_layer_pipeline_test.exs | Full pipeline DSL → DataLayer → QueryBuilder → CQL generation and execution |
| test/data_layer_comprehensive_test.exs | Comprehensive gap coverage: run_query edge cases, filter OR rewriting, sort edge cases, bulk_create scenarios, source/repo edge cases, upsert delegation, aggregates, distinct, calculate, handle_scylla_result, sanitize_identifier, struct defaults, exhaustive can?/2 |
| test/edge_cases_test.exs | Edge cases for QueryBuilder, Batch, Pagination, MaterializedView, Migration |
| test/error_edge_cases_test.exs | Comprehensive error handling edge cases |
| test/dsl_resource_test.exs | DSL compilation, public API, secondary_index parsing, materialized_view |
| test/integration_test.exs | Integration test placeholder |
| test/scylla_integration_test.exs | Full integration tests with testcontainers |
Integration tests use testcontainer_ex to spin up a ScyllaDB instance automatically via Podman or Docker.
Contributing
Contributions are welcome! Here's how to get started:
- Fork the repository
- Clone your fork: git clone https://github.com/your-username/ash_scylla.git
- Create a feature branch: git checkout -b feature/my-feature
- Make your changes
- Run tests: mix test
- Commit your changes: git commit -am 'Add some feature'
- Push to the branch: git push origin feature/my-feature
- Create a Pull Request
Development Setup
# Install dependencies
mix deps.get
# Start ScyllaDB via Podman Compose (includes health checks)
podman-compose up -d
# Or via Docker Compose
docker compose up -d
# Or start ScyllaDB manually with Podman
podman run -p 9042:9042 scylladb/scylla:latest
# Or with Docker
docker run -p 9042:9042 scylladb/scylla:latest
# Run tests
mix test
Dev Container
A .devcontainer/devcontainer.json is provided for VS Code Dev Containers.
It brings up both Elixir and ScyllaDB together via Podman Compose or Docker Compose.
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Acknowledgments
- Ash Framework - The Elixir framework this data layer integrates with
- Xandra - Native Elixir CQL driver for ScyllaDB/Cassandra
- ScyllaDB - High-performance NoSQL database
Made with ❤️ for the Elixir and Ash communities