SquidMesh—Durable workflows for Elixir apps


License: Apache 2.0

Squid Mesh is an embedded durable workflow runtime for Elixir applications. It is for teams that want business workflows to live inside an existing Phoenix or OTP app, share that app's repo and deployment model, and still have durable run history, retries, approvals, replay, cancellation, and operator inspection.

It sits between a job backend and a standalone workflow service: more structured and inspectable than a job queue, but still embedded in the host app instead of running as a separate platform. Jido, Runic, and Spark are foundation layers in the current architecture; Reactor, Ash Reactor, Sage, and FlowStone solve adjacent orchestration problems at different abstraction layers.

Getting Started

Start with the manual and the example host apps:

  1. Read the Squid Mesh Manual and the Learning Path.
  2. Use the Minimal Host App to see manual approval, dependency recovery, saga compensation, local repo transactions, cron delivery, restart resilience, and bounded soak coverage.
  3. Use the Bedrock Minimal Host App to see backend-owned delivery, leases, delayed visibility, retry requeue, dead-letter handling, and cron payload mapping.

What It Does

Companion Dashboard

SquidSonar is the optional read-only Phoenix LiveView dashboard for Squid Mesh. Mount it inside a Phoenix host app to inspect recent workflow runs, filter by status, search runtime metadata, and view run detail pages with diagnosis, history counts, last error information, and workflow graph visualization.

Example Apps

When To Use It

Use Squid Mesh when a Phoenix or OTP app needs a durable workflow run as the main abstraction, not just a background job. It fits flows where:

For the full runtime direction and comparison with adjacent projects, see the Positioning guide.

If you are new to the project, start with the Learning Path. It teaches the model in order: install, write one workflow, drain journal attempts, inspect the run, then add retries, manual gates, cron, and Bedrock-backed leases when those pieces are needed.

Warning

Squid Mesh is still in early development. The runtime is suitable for evaluation, local development, and integration work, but it is not yet documented as production-ready. See Production Readiness for the current checklist and remaining bar.

Runtime Shape

Execution Boundary

The journal-backed runtime is Jido-native. Squid Mesh records workflow facts in Jido journals while host-owned workers provide process supervision and capacity by calling SquidMesh.execute_next/1. External schedulers may enqueue cron activation payloads, but step delivery now runs through the journal-backed worker loop.

   +-----------------------------------------------------+
   |                     Squid Mesh                      |
   +-----------------------------------------------------+
   | Public API: start_run / inspect_run / explain_run   |
   +-----------------------------------------------------+
                              |
                              v
   +-----------------------------------------------------+
   |                 Squid Mesh Runtime                  |
   +-----------------------------------------------------+
   | Plans work, applies results, retries, pauses,       |
   | cancels, completes, inspects, and explains          |
   +-----------------------------------------------------+
                              |
                              v
   +-----------------------------------------------------+
   |                    Jido Journals                    |
   +-----------------------------------------------------+
   | Durable workflow facts: runs, attempts, claims,     |
   | heartbeats, completions, failures, terminal state   |
   +-----------------------------------------------------+
               |                              ^
               v                              |
+----------------------------+  +----------------------------+
| SquidMesh.execute_next/1   |  | SquidMesh.Executor.Leases  |
+----------------------------+  +----------------------------+
| claim and run journal work |  | claim / heartbeat / finish |
+----------------------------+  +----------------------------+
               |                              |
               +--------------+---------------+
                              |
                              v
   +-----------------------------------------------------+
   |                   Backend Adapter                   |
   +-----------------------------------------------------+
   | Queue: enqueue, delay, cron delivery                |
   | Lease: claim, heartbeat, expiry, complete/fail      |
   +-----------------------------------------------------+
                              |
                              v
   +-----------------------------------------------------+
   |                   Backend Storage                   |
   +-----------------------------------------------------+
   | Jobs, leases, worker liveness, delivery metadata    |
   +-----------------------------------------------------+

For example, a Bedrock adapter could use Bedrock/FDB for job delivery, lease extension, stale-worker recovery, and delivery metadata. A Postgres or Oban adapter could keep using relational storage for delivery. The key boundary is that Squid Mesh owns workflow decisions and journaled facts, while adapters own the concrete queue and lease mechanics required by their backend.

Quick Start

Requirements:

1. Install from Hex.pm

defp deps do
  [
    {:squid_mesh, "~> 0.1.0-beta.1"}
  ]
end

For the common authoring path, define custom steps with use SquidMesh.Step. Raw Jido.Action modules remain supported as an explicit interop path; if the host app defines raw Jido actions directly, add :jido explicitly as well:

defp deps do
  [
    {:jido, "~> 2.0"},
    {:squid_mesh, "~> 0.1.0-beta.1"}
  ]
end

2. Configure Squid Mesh

config :squid_mesh,
  repo: MiddleEarth.Repo,
  queue: "default"

Start one supervised worker loop that calls SquidMesh.execute_next/1. See Host App Integration for a minimal worker shape.
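
As a rough illustration of what "one supervised worker loop" could look like, here is a minimal GenServer sketch. The module name `HostApp.WorkflowWorker`, the `:execute` and `:interval` options, and the fixed polling interval are all assumptions made for this example. The README does not specify the argument or return value of SquidMesh.execute_next/1, so the sketch takes the call as an injected zero-arity function instead of guessing at it.

```elixir
defmodule HostApp.WorkflowWorker do
  # Hypothetical host-app module; it is not part of Squid Mesh.
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: Keyword.get(opts, :name, __MODULE__))
  end

  @impl true
  def init(opts) do
    state = %{
      # Zero-arity function that performs one unit of work. In a real host
      # app this would wrap SquidMesh.execute_next/1 (argument not assumed here).
      execute: Keyword.fetch!(opts, :execute),
      # Assumed polling interval in milliseconds.
      interval: Keyword.get(opts, :interval, 1_000)
    }

    # Run the first tick right after init without blocking the supervisor.
    {:ok, state, {:continue, :tick}}
  end

  @impl true
  def handle_continue(:tick, state), do: run_once(state)

  @impl true
  def handle_info(:tick, state), do: run_once(state)

  defp run_once(state) do
    # If this raises, the process crashes and its supervisor restarts it.
    state.execute.()
    Process.send_after(self(), :tick, state.interval)
    {:noreply, state}
  end
end
```

A host app would add something like `{HostApp.WorkflowWorker, execute: fn -> ... end}` to its supervision tree. See Host App Integration for the worker shape the project actually recommends.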

3. Install migrations

mix deps.get
mix squid_mesh.install
mix ecto.migrate

mix squid_mesh.install creates one current-schema Squid Mesh migration in the host app's priv/repo/migrations. The host app still owns migrations for its chosen job system.

4. Import formatter rules

To keep workflow modules formatted as DSL-style calls, import Squid Mesh's formatter configuration from the host app:

# .formatter.exs
[
  import_deps: [:squid_mesh],
  inputs: ["{mix,.formatter}.exs", "{config,lib,test}/**/*.{ex,exs}"]
]

Example: The Ring Errand

Before the longer example, here is the workflow API in small pieces.

Manual triggers declare an entrypoint and a payload contract. Payload fields are validated before Squid Mesh persists the run, and defaults are resolved at run creation time:

defmodule MiddleEarth.Workflows.RingErrand do
  use SquidMesh.Workflow

  workflow do
    trigger :leave_shire do
      manual()

      payload do
        field :bearer, :string, default: "Frodo"
        field :ring_id, :string
        field :snack_count, :integer, default: 11
        field :panic_level, :float, required: false
        field :eagle_backup?, :boolean, default: false
        field :fellowship, :list, default: ["Sam"]
        field :map_marks, :map, default: %{}
        field :mood, :atom, default: :peckish
        field :started_on, :string, default: {:today, :iso8601}
      end
    end

    step :pack_lembas, Hobbiton.Steps.PackLembas,
      input: [:snack_count],
      output: :provisions,
      transaction: :repo

    step :announce_departure, :log,
      message: "Leaving the Shire with suspicious jewelry",
      level: :info

    step :wait_for_gandalf, :wait, duration: 5_000
    step :hide_at_prancing_pony, :pause
    approval_step :council_vote, output: :council

    step :cross_moria, Fellowship.Steps.CrossMoria,
      input: [:bearer, :provisions, :council],
      output: :moria,
      retry: [
        max_attempts: 3,
        backoff: [type: :exponential, min: 1_000, max: 10_000]
      ]

    step :reserve_eagle, Eagles.Steps.ReserveRide,
      compensate: Eagles.Steps.CancelRide

    step :insult_sauron, Gondor.Steps.InsultSauron,
      compensatable: false

    step :toss_ring, Mordor.Steps.TossRing,
      irreversible: true

    step :walk_home_awkwardly, Hobbiton.Steps.WalkHomeAwkwardly

    transition :pack_lembas, on: :ok, to: :announce_departure
    transition :announce_departure, on: :ok, to: :wait_for_gandalf
    transition :wait_for_gandalf, on: :ok, to: :hide_at_prancing_pony
    transition :hide_at_prancing_pony, on: :ok, to: :council_vote
    transition :council_vote, on: :ok, to: :cross_moria
    transition :council_vote, on: :error, to: :walk_home_awkwardly
    transition :cross_moria, on: :ok, to: :reserve_eagle
    transition :cross_moria, on: :error, to: :walk_home_awkwardly, recovery: :undo
    transition :reserve_eagle, on: :ok, to: :insult_sauron
    transition :insult_sauron, on: :ok, to: :toss_ring
    transition :toss_ring, on: :ok, to: :complete
    transition :walk_home_awkwardly, on: :ok, to: :complete
  end
end
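
To make "validated before persisting, defaults resolved at run creation time" concrete, the following sketch models the idea in plain Elixir. It is not Squid Mesh's implementation: `PayloadSketch`, the `{name, type, opts}` field tuples, and the error tuples are hypothetical, and it assumes a field is required unless it has a default or `required: false`.

```elixir
defmodule PayloadSketch do
  @moduledoc "Hypothetical sketch; not Squid Mesh's actual implementation."

  # `fields` is a list of {name, type, opts} tuples mirroring `field` calls.
  # `today` is injectable so the dynamic default is deterministic in tests.
  def resolve(fields, payload, today \\ Date.utc_today()) do
    Enum.reduce_while(fields, {:ok, %{}}, fn {name, type, opts}, {:ok, acc} ->
      case fetch(payload, name, opts, today) do
        {:ok, value} ->
          if valid?(type, value) do
            {:cont, {:ok, Map.put(acc, name, value)}}
          else
            {:halt, {:error, {:invalid, name}}}
          end

        :missing ->
          # Assumption: fields are required unless `required: false` is given.
          if Keyword.get(opts, :required, true) do
            {:halt, {:error, {:missing, name}}}
          else
            {:cont, {:ok, acc}}
          end
      end
    end)
  end

  # Caller-supplied values win; otherwise fall back to the declared default.
  defp fetch(payload, name, opts, today) do
    cond do
      Map.has_key?(payload, name) -> {:ok, Map.fetch!(payload, name)}
      Keyword.has_key?(opts, :default) -> {:ok, default(opts[:default], today)}
      true -> :missing
    end
  end

  # Dynamic default taken from the :started_on field in the example.
  defp default({:today, :iso8601}, today), do: Date.to_iso8601(today)
  defp default(value, _today), do: value

  defp valid?(:string, value), do: is_binary(value)
  defp valid?(:integer, value), do: is_integer(value)
  defp valid?(:float, value), do: is_float(value)
  defp valid?(:boolean, value), do: is_boolean(value)
  defp valid?(:list, value), do: is_list(value)
  defp valid?(:map, value), do: is_map(value)
  defp valid?(:atom, value), do: is_atom(value)
end
```

Under these assumptions, starting the Ring Errand with only `ring_id` would yield a payload where `bearer`, `snack_count`, and `started_on` are filled from defaults and `panic_level` is simply absent.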

Cron triggers use the same workflow shape, but the host app owns recurring scheduling and activation:

defmodule Gondor.Workflows.BeaconWatch do
  use SquidMesh.Workflow

  workflow do
    trigger :nightly_beacon_check do
      cron "0 21 * * *", timezone: "Etc/UTC"

      payload do
        field :steward_mood, :string, default: "dramatic"
        field :orc_count, :integer, default: 9001
      end
    end

    step :inspect_hilltops, Gondor.Steps.InspectHilltops,
      retry: [max_attempts: 5]

    step :light_first_beacon, Gondor.Steps.LightBeacon,
      compensate: Gondor.Steps.ExtinguishBeacon

    step :log_call_for_aid, :log,
      message: "Gondor calls for aid",
      level: :info

    transition :inspect_hilltops, on: :ok, to: :light_first_beacon
    transition :light_first_beacon, on: :ok, to: :log_call_for_aid
    transition :log_call_for_aid, on: :ok, to: :complete
  end
end

Dependency-based workflows use after: [...] instead of transitions. A step is runnable only after all of its declared dependencies complete:

defmodule Mordor.Workflows.FinalDistraction do
  use SquidMesh.Workflow

  workflow do
    trigger :start_distraction do
      manual()

      payload do
        field :speech, :string, default: "For Frodo."
      end
    end

    step :march_to_gate, Gondor.Steps.MarchToGate
    step :look_very_brave, Gondor.Steps.LookBrave
    step :sneak_up_volcano, Hobbiton.Steps.SneakUpVolcano

    step :declare_victory, Gondor.Steps.DeclareVictory,
      after: [:march_to_gate, :look_very_brave, :sneak_up_volcano],
      irreversible: true
  end
end
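
The "runnable only after all declared dependencies complete" rule can be pictured with a small standalone function. This is an illustrative model, not the runtime's planner: `DependencySketch`, the dependency map, and the `MapSet` of completed steps are assumptions made for the example.

```elixir
defmodule DependencySketch do
  # Hypothetical model of dependency gating; not Squid Mesh's planner.
  # `deps` maps each step to the steps listed in its `after:` option.
  # `completed` is a MapSet of steps that have already finished.
  def runnable(deps, completed) do
    deps
    |> Enum.filter(fn {step, needs} ->
      not MapSet.member?(completed, step) and
        Enum.all?(needs, &MapSet.member?(completed, &1))
    end)
    |> Enum.map(fn {step, _needs} -> step end)
    # Sort only to make the result deterministic for the example.
    |> Enum.sort()
  end
end
```

For the FinalDistraction workflow, the three steps without `after:` are runnable immediately, while `:declare_victory` stays blocked until all three are in the completed set.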

Step modules implement domain work. Squid Mesh records durable journal state, makes runnable attempts visible to SquidMesh.execute_next/1, applies retry policy, routes failures after retry exhaustion, and exposes run inspection.
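
The retry options in the Ring Errand example (`type: :exponential, min: 1_000, max: 10_000`) suggest a delay that grows per attempt and is capped. The README does not state the exact formula, so the sketch below assumes plain doubling from `min` with no jitter. `BackoffSketch` is a hypothetical module and the real schedule may differ.

```elixir
defmodule BackoffSketch do
  # Hypothetical: assumes delay = min * 2^(attempt - 1), capped at max.
  # Squid Mesh's real backoff formula is not documented in this README.
  def delay_ms(attempt, opts) when is_integer(attempt) and attempt >= 1 do
    min_ms = Keyword.fetch!(opts, :min)
    max_ms = Keyword.fetch!(opts, :max)

    min(max_ms, min_ms * Integer.pow(2, attempt - 1))
  end
end
```

Under that assumption, `min: 1_000, max: 10_000` gives 1, 2, 4, and 8 seconds for the first four attempts and stays at 10 seconds from the fifth onward.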

For approval or manual-review gates, use approval_step/2 in transition-based workflows and resume the paused run through SquidMesh.approve_run/3 or SquidMesh.reject_run/3. Approval steps persist their resolved :ok and :error targets plus output-mapping metadata, so already-paused review runs keep the same decision semantics across restarts and deploys. Generic SquidMesh.unblock_run/2 remains available for lower-level :pause steps when you need manual intervention without an explicit approve/reject contract.

When a step needs a narrower contract than the whole payload plus accumulated context, use input: [...] to select keys and output: :key to namespace the returned map for downstream steps.
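
One way to think about `input:` and `output:` is as a key selection before the step runs and a namespaced merge afterwards. The sketch below models that with plain maps. `StepIOSketch` and its function names are hypothetical, and the fallback behavior when neither option is set (pass everything, merge at the top level) is an assumption.

```elixir
defmodule StepIOSketch do
  # Hypothetical model of input selection and output namespacing.

  # Without `input:`, assume the step sees the whole context.
  def select_input(context, nil), do: context
  # With `input: [...]`, the step sees only the listed keys.
  def select_input(context, keys) when is_list(keys), do: Map.take(context, keys)

  # Without `output:`, assume the returned map merges into the context.
  def merge_output(context, result, nil), do: Map.merge(context, result)
  # With `output: :key`, the returned map is stored under that key.
  def merge_output(context, result, key) when is_atom(key), do: Map.put(context, key, result)
end
```

For `:pack_lembas`, this would mean the step receives only `%{snack_count: 11}` and its result is visible to later steps under `:provisions`.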

When a custom step needs several local repo writes to commit or roll back together, declare transaction: :repo. This wraps only that action callback in the configured Ecto repo transaction; workflow durability, successor dispatch, external side effects, and saga compensation remain explicit Squid Mesh boundaries.

For external side effects that cannot be honestly undone, mark the step with irreversible: true or compensatable: false. Squid Mesh exposes that recovery policy in inspection and blocks replay by default after such a step completes; operators can still replay with allow_irreversible: true after reviewing the side effect.
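
The replay guard described above can be modeled as a check over the completed steps. This is a sketch of the policy, not the library's code: `ReplayGuardSketch`, the step maps, and the `{:error, {:irreversible_steps, names}}` shape are hypothetical. Only the `allow_irreversible: true` option comes from the README.

```elixir
defmodule ReplayGuardSketch do
  # Hypothetical model of the replay policy; the error shape is invented.
  # `completed_steps` is a list of maps such as
  # %{name: :toss_ring, irreversible: true}.
  def check(completed_steps, opts \\ []) do
    blocking =
      completed_steps
      |> Enum.filter(fn step ->
        Map.get(step, :irreversible, false) or
          Map.get(step, :compensatable, true) == false
      end)
      |> Enum.map(fn step -> step.name end)

    cond do
      blocking == [] -> :ok
      # Explicit override after a human has reviewed the side effects.
      Keyword.get(opts, :allow_irreversible, false) -> :ok
      true -> {:error, {:irreversible_steps, blocking}}
    end
  end
end
```

In the Ring Errand, a run that completed `:insult_sauron` or `:toss_ring` would be refused by default and accepted only with the explicit override.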

In the Ring Errand example, the :error transition on :cross_moria is a same-step fallback after retries are exhausted. The compensation callback is different: it is used only if :reserve_eagle completes, stores reversible reservation output, and a later step causes the run to fail.

For other reversible saga steps, declare compensation callbacks the same way:

step :borrow_elven_rope, Lothlorien.Steps.BorrowRope,
  compensate: Lothlorien.Steps.ReturnRope

step :reserve_eagle, Eagles.Steps.ReserveRide,
  compensate: Eagles.Steps.CancelRide

step :cross_moria, Fellowship.Steps.CrossMoria,
  retry: [max_attempts: 2]

transition :borrow_elven_rope, on: :ok, to: :reserve_eagle
transition :reserve_eagle, on: :ok, to: :cross_moria
transition :cross_moria, on: :ok, to: :complete

When a downstream step fails after retries and the workflow has no forward :error path, Squid Mesh runs the compensation callbacks of the completed steps in reverse completion order. In the example above, a failed :cross_moria step cancels the eagle reservation before returning the rope, and each result is persisted under the original step's recovery.compensation history.
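
Reverse completion order is easy to see in a toy model. The sketch below is not the runtime's saga engine: `CompensationSketch` is hypothetical, and compensation callbacks are represented as zero-arity functions rather than step modules.

```elixir
defmodule CompensationSketch do
  # Hypothetical model of saga compensation ordering.
  # `completed` lists {step, compensate_fun_or_nil} in completion order.
  def compensate(completed) do
    completed
    # Undo the most recently completed step first.
    |> Enum.reverse()
    # Steps without a compensation callback are skipped.
    |> Enum.filter(fn {_step, fun} -> is_function(fun, 0) end)
    # Keep each result next to the step it belongs to.
    |> Enum.map(fn {step, fun} -> {step, fun.()} end)
  end
end
```

With the rope borrowed first and the eagle reserved second, the result list starts with the eagle cancellation and ends with the rope return, which matches the order described above.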

Start the workflow through the public API and inspect the result with history:

{:ok, run} =
  SquidMesh.start_run(MiddleEarth.Workflows.RingErrand, :leave_shire, %{
    ring_id: "one-ring"
  })

SquidMesh.inspect_run(run.id, include_history: true)

With history enabled, the inspected run includes chronological step_runs, declared steps state, and durable audit_events for pause, resume, approval, and rejection actions.

For workflows paused at a generic :pause step, resume with unblock_run/2. For approval steps, resume through the explicit decision APIs:

{:ok, paused_run} = SquidMesh.inspect_run(run.id, include_history: true)

{:ok, resumed_run} =
  SquidMesh.unblock_run(paused_run.id, %{
    actor: "strider",
    reason: "pipeweed restocked"
  })

# Once the run pauses at an approval step, choose one path:
{:ok, approved_run} =
  SquidMesh.approve_run(resumed_run.id, %{
    actor: "elrond",
    note: "approved by council"
  })

# Or reject it instead:
{:ok, rejected_run} =
  SquidMesh.reject_run(resumed_run.id, %{
    actor: "elrond",
    note: "too much singing"
  })

Runs can also be listed, cancelled, or replayed. Replay requires an explicit override after irreversible or non-compensatable steps:

{:ok, running_runs} = SquidMesh.list_runs(status: :running)
{:ok, cancelling_run} = SquidMesh.cancel_run(run.id)
{:ok, replayed_run} = SquidMesh.replay_run(run.id)
{:ok, reviewed_replay} = SquidMesh.replay_run(run.id, allow_irreversible: true)

Use SquidMesh.explain_run/2 when a host app needs operator-facing diagnostics:

{:ok, explanation} = SquidMesh.explain_run(run.id)
explanation.reason
#=> :waiting_for_retry

inspect_run/2 returns the persisted runtime facts. explain_run/2 summarizes the current reason, valid next actions, and evidence in a structured shape that dashboards and CLIs can render themselves.
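
As a sketch of how a dashboard or CLI might render such a structure itself, the function below turns an explanation into a single line of text. Only the `reason` field and the `:waiting_for_retry` value appear in this README. The `:next_actions` key, the `:cancel_run` action, the wording, and the `ExplainRenderSketch` module are assumptions for illustration.

```elixir
defmodule ExplainRenderSketch do
  # Hypothetical renderer. `:next_actions` is an assumed key name; check
  # the real explanation struct before relying on it.
  def render(%{reason: reason} = explanation) do
    actions =
      explanation
      |> Map.get(:next_actions, [])
      |> Enum.map_join(", ", &to_string/1)

    suffix = if actions == "", do: "none", else: actions
    "#{headline(reason)} Next actions: #{suffix}"
  end

  defp headline(:waiting_for_retry), do: "Waiting for the next retry attempt."
  # Fallback so unknown reasons still render something readable.
  defp headline(other), do: "Run state: #{other}."
end
```

Keeping the rendering in the host app means the same explanation can feed a LiveView page, a CLI, or a log line without Squid Mesh dictating the wording.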

Documentation

Use the docs index for setup, workflow authoring, operations, and architecture:

Contributing