Forge

Domain-agnostic sample factory for building repeatable data pipelines in Elixir.

Forge helps you generate samples, apply staged transformations, compute measurements, and persist results for dataset creation, evaluation harnesses, enrichment jobs, and analytics workflows.

Highlights

Core Building Blocks

Runners

Quick Start

Define a pipeline:

defmodule MyApp.Pipelines do
  use Forge.Pipeline

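  pipeline :narratives do
    # Source: generate 3 samples; the generator receives each sample's index.
    source Forge.Source.Generator,
      count: 3,
      generator: fn idx -> %{id: idx, text: "narrative-#{idx}"} end

    # Staged transformation applied to the samples.
    stage MyApp.NormalizeStage
    # Measurement computed for the samples.
    measurement MyApp.Measurements.Length
    # Persist results in an ETS table.
    storage Forge.Storage.ETS, table: :narrative_samples
  end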
end
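
To make the source, stage, measurement, and storage flow concrete, here is a conceptual sketch in plain Elixir that uses only the standard library and ETS. It is not Forge's implementation and calls none of its APIs. The module `PipelineSketch`, its function names, the table name, the uppercase "normalization", and the 1-based index are all assumptions made for illustration.

```elixir
# Conceptual sketch only: plain Elixir, no Forge calls.
defmodule PipelineSketch do
  # Source: build `count` samples from a generator function.
  # Indexing from 1 is an assumption for this sketch.
  def generate(count, generator), do: Enum.map(1..count, generator)

  # Stage: a hypothetical normalization step (uppercases the text).
  def normalize(sample), do: %{sample | text: String.upcase(sample.text)}

  # Measurement: attach the text length to the sample.
  def measure(sample), do: Map.put(sample, :length, String.length(sample.text))

  # Storage: persist each sample in an ETS table keyed by its id.
  def store(samples, table) do
    Enum.each(samples, fn sample -> :ets.insert(table, {sample.id, sample}) end)
    samples
  end
end

# Hypothetical table name, chosen to avoid clashing with a real pipeline's table.
table = :ets.new(:narrative_samples_sketch, [:set, :public])

samples =
  3
  |> PipelineSketch.generate(fn idx -> %{id: idx, text: "narrative-#{idx}"} end)
  |> Enum.map(&PipelineSketch.normalize/1)
  |> Enum.map(&PipelineSketch.measure/1)
  |> PipelineSketch.store(table)
```

In the pipeline DSL above, each of these steps is a declared module rather than an inline function, which is what makes a pipeline repeatable and reusable across runners.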

Run it with the GenServer runner:

{:ok, runner} =
  Forge.Runner.start_link(pipeline_module: MyApp.Pipelines, pipeline_name: :narratives)

samples = Forge.Runner.run(runner)
Forge.Runner.stop(runner)

Stream it with backpressure instead:

stream =
  MyApp.Pipelines.__pipeline__(:narratives)
  |> Forge.Runner.Streaming.run(concurrency: 8)

stream |> Enum.take(10)
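
For readers unfamiliar with bounded-concurrency streaming, the general idea can be illustrated with the standard library's `Task.async_stream/3`. This is a sketch of the concept, not a description of how `Forge.Runner.Streaming` is built. The module `BackpressureSketch` and its `normalize/1` function are hypothetical stand-ins.

```elixir
# Illustrative only: plain Elixir, no Forge calls.
defmodule BackpressureSketch do
  # Hypothetical stand-in for a pipeline stage.
  def normalize(%{text: text} = sample) do
    %{sample | text: text |> String.trim() |> String.downcase()}
  end

  # Returns a lazy stream. At most `concurrency` samples are processed at once,
  # and no work starts until the stream is consumed.
  def stream(samples, concurrency) do
    samples
    |> Task.async_stream(&normalize/1, max_concurrency: concurrency, ordered: true)
    |> Stream.map(fn {:ok, sample} -> sample end)
  end
end

# A lazy source of 1,000 samples; only a handful are ever generated below.
source = Stream.map(1..1_000, fn idx -> %{id: idx, text: "  Narrative-#{idx} "} end)

# Taking two results halts the stream early; the rest of the source is never pulled.
first_two = source |> BackpressureSketch.stream(8) |> Enum.take(2)
```

Because the consumer pulls results on demand, a slow consumer naturally limits how much of the source is read, which is the property "backpressure" refers to here.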

Measurements & Orchestration

{:ok, :computed, value} =
  Forge.Measurement.Orchestrator.measure_sample(sample_id, MyApp.Measurements.Length, [])

Persistence & Artifacts

Observability

Anvil Integration

Installation

Add the dependency to mix.exs:

def deps do
  [
    {:forge_ex, "~> 0.1.1"}
  ]
end

For Postgres-backed features (storage, measurement orchestrator), configure Forge.Repo and run the provided migrations. By default Forge connects to Postgres on localhost with username postgres and password postgres; override these through environment variables or application config.
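
An override might look like the following config sketch. It assumes that the OTP application is named `:forge_ex` (inferred from the dependency name) and that `Forge.Repo` accepts the standard Ecto repo options. The database name and the environment variable names are placeholders, so check the Forge documentation for the keys it actually reads.

```elixir
# config/dev.exs (sketch; app name, env vars, and database name are assumptions)
import Config

config :forge_ex, Forge.Repo,
  username: System.get_env("PGUSER", "postgres"),
  password: System.get_env("PGPASSWORD", "postgres"),
  hostname: System.get_env("PGHOST", "localhost"),
  # Hypothetical database name for illustration.
  database: "forge_dev"
```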

Development

mix deps.get            # Install dependencies
mix forge.setup         # Create & migrate Postgres schemas (if using Repo-backed features)
mix test                # Run the test suite (sets up DB via alias)
mix docs                # Generate ExDoc documentation

License

MIT License © 2024-2025 North-Shore-AI