QuackDB

Hex.pmHexDocs

DuckDB for Elixir applications, over DuckDB's experimental Quack protocol.

QuackDB gives Elixir applications an OTP-supervised DuckDB process, a DBConnection client, an Ecto adapter and query DSL for analytical DuckDB workflows, native append APIs, Explorer dataframe writes, Geo/WKB spatial integration, Table.Reader results, telemetry, and a managed DuckDB binary installer.

Warning

QuackDB targets DuckDB's experimental Quack protocol and is not production-ready yet. Public APIs, result shapes, Ecto adapter behavior, and supported protocol coverage may still change as DuckDB and QuackDB evolve. Validate behavior against your DuckDB version before relying on it for critical workloads.

defmodule MyApp.Analytics do
use QuackDB.Ecto
alias QuackDB.Source
def category_latency do
source = Source.parquet("s3://bucket/events/*.parquet", hive_partitioning: true)
from event in source,
group_by: event.category,
select: %{
category: event.category,
p95: quantile_cont(event.duration_ms, 0.95),
median: median(event.duration_ms),
events: count()
}
end
end

Why QuackDB?

DuckDB is already excellent at analytical SQL. QuackDB focuses on the Elixir side:

Elixir integrations

Elixir layerQuackDB integration
OTPsupervised local DuckDB server, managed binary, restartable child specs
DBConnectionpooled Quack sessions, queries, streams, transactions
Ectoadapter, query DSL, analytical helpers, migrations, writes
Explorerdataframe append and dataframe-friendly results
Geo%Geo.*{} params and WKB/GeoJSON workflows
Table.ReaderLivebook/dataframe-friendly result consumption
Telemetryquery, append, and fetch spans
Mixquackdb.install task for managed DuckDB binaries

Installation

Add QuackDB to your dependencies:

def deps do
[
{:quackdb, "~> 0.3.0"}
]
end

Optional integrations are enabled when their packages are present:

def deps do
[
{:quackdb, "~> 0.3.0"},
{:ecto_sql, "~> 3.13"},
{:explorer, "~> 0.11"},
{:geo, "~> 4.1"}
]
end

DuckDB's Quack protocol is experimental. For local development, use DuckDB 1.5.3 or newer with the quack extension.

Supervised DuckDB

For development, tests, examples, and notebooks, QuackDB can supervise DuckDB's CLI process and start a matching client pool. child_specs/1 generates one random token and injects it into both children.

children =
QuackDB.Server.child_specs(
server: [
name: MyApp.DuckDB,
duckdb: :managed,
endpoint: "quack:localhost:9494"
],
client: [
name: MyApp.QuackDB,
pool_size: System.schedulers_online()
]
)

duckdb: :managed downloads DuckDB's official CLI binary on first use, verifies known checksums for QuackDB's pinned DuckDB version, and caches it. QuackDB never downloads DuckDB during dependency compilation. Use QUACKDB_BINARY_PATH, QUACKDB_BINARY_CACHE_DIR, duckdb: "/path/to/duckdb", or run the quackdb.install Mix task when you want explicit control. See the managed DuckDB guide.

You can also start DuckDB manually:

duckdb -interactive -init /dev/null \
-cmd "LOAD quack; CALL quack_serve('quack:localhost', token='super_secret');"

quack:localhost often binds on IPv6 localhost, so examples use http://[::1]:9494.

DBConnection client

QuackDB can be used directly through its DBConnection-backed client.

{:ok, conn} =
QuackDB.start_link(
uri: "http://[::1]:9494",
token: "super_secret"
)
{:ok, result} = QuackDB.query(conn, "SELECT ? AS name, ? AS n", ["duck", 42])
result.columns
#=> ["name", "n"]
result.rows
#=> [["duck", 42]]

Use QuackDB.stream/4 for large result sets, or QuackDB.columns/4 when a column-oriented shape is more convenient for analytics tooling.

DuckDB workflows as Ecto queries

QuackDB exposes common DuckDB analytical workflows as Ecto-compatible helpers so they compose with normal queries.

Analytical aggregates

defmodule MyApp.Analytics do
use QuackDB.Ecto
def category_scores do
from event in "events",
group_by: event.category,
select: %{
category: event.category,
p95: quantile_cont(event.duration_ms, 0.95),
median: median(event.duration_ms),
precise_sum: fsum(event.duration_ms),
mode: mode(event.duration_ms),
weighted_average: weighted_avg(event.duration_ms, event.weight),
values: list(event.duration_ms, order_by: [desc_nulls_last: event.duration_ms]),
slow_events: filter(count(event.id), event.duration_ms > 1_000),
distinct_users: count(event.user_id, :distinct),
average_duration: coalesce(avg(event.duration_ms), 0),
events: count()
}
end
end

Date and timestamp series

use QuackDB.Ecto
from day in series(Date.range(~D[2024-01-01], ~D[2024-01-31])),
left_join: event in "events",
on: event.occurred_on == day.value,
group_by: day.value,
order_by: day.value,
select: %{
day: day.value,
events: count(event.id)
}

Timestamp series use Duration steps:

from bucket in series(
~N[2024-01-01 00:00:00],
~N[2024-01-02 00:00:00],
step: Duration.new!(hour: 1)
),
select: bucket.value

Source scans

DuckDB can query data where it already lives. QuackDB source helpers can be used directly as Ecto sources.

use QuackDB.Ecto
alias QuackDB.Source
source = Source.parquet("s3://bucket/events/*.parquet", hive_partitioning: true)
from event in source,
group_by: event.category,
select: %{
category: event.category,
events: count(),
avg_score: avg(event.score)
}

QuackDB does not upload local files for you. The DuckDB server must be able to see the path, URL, object store, or lakehouse catalog. See the sources guide.

External data can be materialized with CREATE TABLE AS, indexed with DuckDB FTS, and queried with BM25 from Ecto.

use QuackDB.Ecto
alias QuackDB.{DDL, FTS, Source}
query =
from doc in Source.parquet("s3://bucket/docs/*.parquet"),
select: %{
id: doc.id,
title: doc.title,
body: doc.body
}
MyApp.AnalyticsRepo.query!(DDL.create_table("docs", as: query, temporary: true))
MyApp.AnalyticsRepo.query!(FTS.create_index("docs", :id, [:title, :body], overwrite: true))
schema = FTS.schema_name("main.docs")
search = "duckdb analytics"
from doc in "docs",
where: bm25(^schema, doc.id, ^search) > 0,
order_by: [desc: bm25(^schema, doc.id, ^search)],
limit: 10,
select: %{
id: doc.id,
title: doc.title,
score: bm25(^schema, doc.id, ^search)
}

See the full-text search guide.

Text and regex predicates

DuckDB text and RE2 regular-expression helpers compose with Ecto filters and aggregate FILTER clauses. Shared contains/2 dispatches obvious string calls to DuckDB contains and spatial helper expressions to ST_Contains; ambiguous calls raise so contains_text/2 and st_contains/2 are available when you want to be explicit.

use QuackDB.Ecto
from event in "events",
where: contains(event.name, "duck") and regexp_matches(event.name, ~r/^duck/i),
select: %{
slug: regexp_replace(event.name, ~r/\s+/, "-", "g"),
parts: string_split(event.tags, ",")
}

DuckDB regexes use RE2, so ~r literals are intended for the syntax subset shared with Elixir regexes.

List predicates

DuckDB LIST/ARRAY helpers map directly to list_contains, list_has_any, list_has_all, and unnest. use QuackDB.Ecto imports non-conflicting list helpers by default; use contains_list/2 to avoid ambiguity with text/spatial contains/2.

use QuackDB.Ecto
from fragment in "fragments",
where: contains_list(fragment.terms, ^term_id) and has_any(fragment.terms, ^optional_term_ids),
select: %{id: fragment.id, term: unnest(fragment.terms)}

Spatial queries

DuckDB Spatial works with Ecto queries and %Geo.*{} structs when the optional :geo package is installed.

use QuackDB.Ecto
import QuackDB.Ecto.Spatial
alias QuackDB.Spatial
MyApp.AnalyticsRepo.query!(Spatial.load())
point = %Geo.Point{coordinates: {13.405, 52.52}, srid: nil}
from place in "places",
where: intersects(place.geom, ^point) and distance(place.geom, ^point) < 1_000,
select: %{
id: place.id,
name: place.name,
wkt: as_text(place.geom)
}

GEOMETRY values decode as WKB-compatible bytes for tested DuckDB Spatial values. QuackDB.Geometry can convert to/from %Geo.*{} structs when the optional :geo package is installed. See the spatial guide and the Spatial WMS example.

Writes and dataframes

QuackDB can write through DuckDB's native append protocol instead of generating huge INSERT VALUES statements.

QuackDB.insert_rows!(conn, "events", [
[id: 1, name: "duck", tags: ["bird", "wetland"]],
[id: 2, name: "goose", tags: ["bird", "loud"]]
])
QuackDB.insert_columns!(conn, "measurements", [
id: [1, 2, 3],
temperature: [12.5, 13.0, 12.8]
])

Explicit MAP columns accept ordinary Elixir maps while plain map inference stays STRUCT-shaped:

QuackDB.insert_rows!(conn, "events", [[id: 1, labels: %{env: "prod", region: "eu"}]],
columns: [id: :integer, labels: {:map, :varchar, :varchar}]
)

When Explorer is installed, dataframes can be appended directly:

alias Explorer.DataFrame
alias QuackDB.Explorer, as: QuackExplorer
frame = DataFrame.new(id: [1, 2], name: ["duck", "goose"])
QuackExplorer.insert_dataframe!(conn, "events", frame)

Enumerable rows can be streamed into native append batches:

File.stream!("events.ndjson")
|> Stream.map(&Jason.decode!/1)
|> QuackDB.insert_stream!(conn, "events", chunk_every: 10_000)

Any Table.Reader-compatible data can be appended through the same column append path:

QuackDB.insert_table!(conn, "events", %{id: [1, 2], name: ["duck", "goose"]})

Append supports explicit types, batching, scalar DuckDB values, and nested LIST, STRUCT, ARRAY, and MAP values. See the type support guide and the Explorer guide.

Results, Livebook, and telemetry

QuackDB.Result and QuackDB.Columns implement Table.Reader, so they can be consumed by Livebook and other Table-aware tooling. When Explorer is installed, query results can be turned into dataframes:

result = QuackDB.query!(conn, "SELECT * FROM events")
Explorer.DataFrame.new(result)

QuackDB emits telemetry spans for query, append, and fetch operations:

Metadata includes connection/session information, command details, append batch counts, and client query IDs. Params are not included unless you opt in with telemetry_params: true. See the telemetry guide.

Ecto coverage

QuackDB includes an optional Ecto SQL adapter for applications that want Ecto query composition, schema-based reads/writes, migrations, and raw SQL through Repo.query/3.

defmodule MyApp.AnalyticsRepo do
use Ecto.Repo,
otp_app: :my_app,
adapter: Ecto.Adapters.QuackDB
end

The adapter currently covers:

DuckDB-specific SQL that Ecto cannot model cleanly should still use Repo.query/3. See the Ecto coverage matrix.

Examples

The repository includes runnable scripts, a Livebook notebook, and a small WMS app:

Run scripts from outside the Mix project so Mix.install/2 can load the local package:

cd /tmp
elixir /path/to/quackdb/examples/dataframe_analytics.exs

Current boundaries

QuackDB is intentionally focused on DuckDB analytics over Quack:

Documentation

Development

mix deps.get
mix ci

See CONTRIBUTING.md for local checks, example smoke tests, package audit steps, and release dry-run notes.

Integration tests are skipped by default. To run them against a Quack server:

QUACKDB_TEST_URI='http://[::1]:9494' \
QUACKDB_TEST_TOKEN=super_secret \
mix test --include integration