LlmCore

Provider-agnostic LLM orchestration for Elixir. Route to any model, run agentic loops, extract structured output, and connect to Hindsight semantic memory — all through composable ALF pipelines with hot-reload TOML configuration.

LlmCore is the shared LLM substrate that powers the Fosferon ecosystem. It handles the messy parts of working with LLMs — provider routing, CLI wrapping, structured extraction, tool-calling loops, and Hindsight semantic memory integration — so your application code stays clean.

Why LlmCore?

Installation

Add llm_core to your dependencies in mix.exs:

def deps do
  [
    {:llm_core, "~> 0.3"}
  ]
end

Then fetch dependencies:

mix deps.get

Quick Start

Send a prompt through the router

# Routes automatically based on [routing.tasks] config
{:ok, response} = LlmCore.send("Explain pattern matching in Elixir", :reasoning)
IO.puts(response.content)
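
LlmCore.send/2 returns a tagged tuple, so failures (missing API key, provider timeout) can be handled with an ordinary case. The error term below is an assumption; check the provider docs for the real shapes.

case LlmCore.send("Explain pattern matching in Elixir", :reasoning) do
  {:ok, response} ->
    IO.puts(response.content)

  {:error, reason} ->
    # The error term is provider-specific; inspect/1 keeps the sketch generic
    IO.puts("request failed: #{inspect(reason)}")
end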

Stream a response

{:ok, stream} = LlmCore.stream("Write a GenServer example", :coding)
Enum.each(stream, fn chunk -> IO.write(chunk) end)
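
If you want the full text rather than incremental output, collect the chunks once the stream completes. This assumes each chunk is a plain binary, as the IO.write/1 call above implies.

{:ok, stream} = LlmCore.stream("Write a GenServer example", :coding)
full_text = Enum.join(stream)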

Extract structured output

schema = %{
  type: "object",
  properties: %{
    name: %{type: "string"},
    confidence: %{type: "number"}
  },
  required: ["name"]
}

{:ok, response} = LlmCore.send("Analyze this code", :reasoning,
  response_format: {:json_schema, schema}
)

response.structured
#=> %{"name" => "authenticate/2", "confidence" => 0.92}

Run an agentic tool-calling loop

alias LlmCore.Agent.Loop

tools = MyApp.Tools.available()
resolve = &MyApp.Tools.resolve/1

llm_send = fn messages, opts ->
  LlmCore.LLM.Provider.dispatch(LlmCore.LLM.Anthropic, messages, opts)
end

{:ok, response, messages} =
  Loop.run(
    [%{role: :user, content: "Research Elixir ALF"}],
    llm_send,
    tools: tools,
    resolve_tool: resolve,
    max_iterations: 10
  )
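
The example above assumes a MyApp.Tools module. A hypothetical sketch of its shape follows; the tool-spec format (Anthropic-style name/description/input_schema maps) and the resolve return value are assumptions, so check the Agent Loop docs for the exact contract.

defmodule MyApp.Tools do
  # Hypothetical tool definitions; the spec format expected by Loop.run/3
  # is assumed here to be Anthropic-style maps.
  def available do
    [
      %{
        name: "web_search",
        description: "Search the web and return the top results",
        input_schema: %{
          type: "object",
          properties: %{query: %{type: "string"}},
          required: ["query"]
        }
      }
    ]
  end

  # Resolves a tool call to its result. The argument and return shapes are
  # assumptions; consult the Loop documentation for the actual contract.
  def resolve(%{name: "web_search", input: %{"query" => query}}) do
    # Call your real search backend here; a stub keeps the sketch self-contained
    {:ok, "Top results for: " <> query}
  end
end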

Semantic memory (via Hindsight)

LlmCore ships a resilient client for Hindsight, a standalone semantic memory server. The client handles caching, circuit breaking, retry with backoff, and write buffering so your application code doesn't have to.

# Store a fact (async, buffered)
:ok = LlmCore.retain("Schema-per-tenant isolation pattern", %{context: "architecture"})

# Recall by meaning
{:ok, results} = LlmCore.recall("how does multi-tenancy work?", bank_id: "my-bank")

# Synthesize an insight
{:ok, insight} = LlmCore.reflect("What patterns are most effective?", bank_id: "my-bank")
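
Recall composes naturally with the router for retrieval-augmented prompts. The sketch below assumes each recalled result exposes a content field; adjust it to the actual Hindsight client structs.

{:ok, results} = LlmCore.recall("how does multi-tenancy work?", bank_id: "my-bank")

context =
  results
  |> Enum.map(& &1.content)   # field name is an assumption
  |> Enum.join("\n")

{:ok, response} =
  LlmCore.send("Using this context, explain our tenancy model:\n" <> context, :reasoning)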

Query available providers

# All configured providers
providers = LlmCore.Provider.Registry.all()

# Only available ones (API keys present, binaries in PATH)
available = LlmCore.Provider.Registry.available()

# Find by alias
{:ok, provider} = LlmCore.Provider.Registry.lookup_alias("claude")

# Fuzzy suggestions (Jaro distance)
LlmCore.Provider.Registry.suggest_alias("claud")
#=> ["claude"]

# Capable providers for requirements
LlmCore.Provider.Registry.suggest_capable(%{streaming: true, tool_use: true})
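
These calls compose into a simple startup check. The sketch only counts entries and falls back to fuzzy suggestions, since the shape of a registry entry isn't shown here.

all = LlmCore.Provider.Registry.all()
available = LlmCore.Provider.Registry.available()
IO.puts("#{length(available)}/#{length(all)} providers ready")

# Warn when a preferred alias is missing
case LlmCore.Provider.Registry.lookup_alias("claude") do
  {:ok, _provider} ->
    :ok

  _error ->
    suggestions = LlmCore.Provider.Registry.suggest_alias("claude")
    IO.puts("claude not configured, did you mean: #{inspect(suggestions)}")
end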

CLI provider discovery

# List all CLI providers (built-in + configured)
entries = LlmCore.CLIProvider.Registry.list()

# Only those with binary in PATH
available = LlmCore.CLIProvider.Registry.available()

# Resolve by id or alias
{:ok, provider} = LlmCore.CLIProvider.Registry.resolve(:droid)

# Check capabilities
{:ok, caps} = LlmCore.CLIProvider.Registry.capabilities(:codex_cli)
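
A quick way to surface missing binaries at startup, assuming entries compare by equality across list/0 and available/0:

entries = LlmCore.CLIProvider.Registry.list()
available = LlmCore.CLIProvider.Registry.available()

# Anything configured but not installed (binary not in PATH)
missing = entries -- available
IO.puts("#{length(missing)} CLI provider(s) not installed")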

Configuration

LlmCore uses layered TOML configuration. Later sources override earlier ones:

1. Compiled defaults    (priv/config/llm_core.toml)
2. Global override      (~/.llm_core/config/llm_core.toml)
3. Project override     (<project>/.llm_core/llm_core.toml)
4. Environment variable (LLM_CORE_CONFIG=path)
5. Custom path          (explicit :path option)
6. Runtime overrides    (ETS, via mix tasks or API)

Minimal configuration

[routing]
default = "claude"

[providers.anthropic]
module = "LlmCore.LLM.Anthropic"
aliases = ["claude"]

[providers.anthropic.auth]
api_key_env = "ANTHROPIC_API_KEY"

Task-based routing

[routing]
default = "claude"

[routing.tasks.coding]
alias = "openai"
mode = "passthrough"
capabilities = { structured_output = true, tool_use = true }

[routing.tasks.planning]
alias = "claude"
mode = "abstracted"
capabilities = { reasoning = true }
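
With this routing in place, the task atom passed to LlmCore.send selects the rule, and tasks without a rule fall back to the default alias (emitting the [:llm_core, :router, :fallback] telemetry event listed below). The :translation atom here is just an example of an unconfigured task.

# Matches [routing.tasks.coding] and routes to the "openai" alias
{:ok, response} = LlmCore.send("Refactor this GenServer", :coding)

# No matching rule, so the router falls back to routing.default ("claude")
{:ok, response} = LlmCore.send("Translate this changelog", :translation)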

Add a CLI provider (no code needed)

[providers.my_tool]
type = "cli"
enabled = true
aliases = ["my-tool", "mt"]

[providers.my_tool.cli]
binary = "my-tool"
default_model = "v2"
default_timeout = 60000
prompt_position = "last"
install_hint = "pip install my-tool"
auto_approve_args = ["--yes"]

[providers.my_tool.cli.flags]
model = "--model"
temperature = "--temp"

[providers.my_tool.cli.preflight]
help_args = ["--help"]
expect_in_help = ["--model"]
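
Once this TOML is in place, the provider resolves through the CLI registry like the built-ins. Whether string aliases also resolve is an assumption; the id form matches the examples above.

{:ok, provider} = LlmCore.CLIProvider.Registry.resolve(:my_tool)

# Appears in available/0 only once the my-tool binary is on PATH
provider in LlmCore.CLIProvider.Registry.available()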

Mix task helpers

# Inspect configuration
mix llm_core.config.show
mix llm_core.config.show --section providers --json

# Edit configuration
mix llm_core.config.set --path routing.default.alias --value claude
mix llm_core.config.set --path telemetry.sample_rate --value 0.25 --type float

# Validate configuration
mix llm_core.config.validate

See the Configuration Guide for the full TOML schema, environment variable interpolation, and agent registration rules.

Architecture

LlmCore is built on ALF (Antonmi's Flow-based Framework) for composable, observable data pipelines:

┌─────────────────────────────────────────────────────────────┐
│                       LlmCore                                │
│                                                              │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────────┐ │
│  │  Inference   │  │   Routing    │  │   Hindsight        │ │
│  │  Pipeline    │  │   Pipeline   │  │   Memory Client    │ │
│  └──────────────┘  └──────────────┘  └────────────────────┘ │
│                                                              │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────────┐ │
│  │  Agent Loop  │  │   Config     │  │   Telemetry        │ │
│  │  (Tool Use)  │  │   (Hot TOML) │  │   (Observable)     │ │
│  └──────────────┘  └──────────────┘  └────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

Three ALF pipelines handle the core flows.

See the Architecture Guide for pipeline internals, provider behaviour contracts, and the agent loop design.

Telemetry Events

# Provider dispatch
[:llm_core, :provider, :send, :start | :stop | :exception]
[:llm_core, :provider, :stream, :start | :chunk | :stop]

# Router decisions
[:llm_core, :router, :resolve, :start | :stop]
[:llm_core, :router, :fallback]

# Agent loop
[:llm_core, :agent, :complete]

# Memory operations
[:llm_core, :hindsight, :retain | :recall | :reflect]
[:llm_core, :hindsight, :circuit_breaker, :state_change]

# Configuration
[:llm_core, :config, :reload]
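
Handlers attach with the standard :telemetry API. Below is a minimal sketch that logs provider latency and router fallbacks; the measurement and metadata keys are assumptions, so inspect real events to confirm them.

defmodule MyApp.LlmTelemetry do
  require Logger

  def attach do
    :telemetry.attach_many(
      "my-app-llm-core",
      [
        [:llm_core, :provider, :send, :stop],
        [:llm_core, :router, :fallback]
      ],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  def handle_event([:llm_core, :provider, :send, :stop], measurements, metadata, _config) do
    # :duration is an assumed measurement key
    Logger.info("llm send finished in #{inspect(measurements[:duration])} #{inspect(metadata)}")
  end

  def handle_event([:llm_core, :router, :fallback], _measurements, metadata, _config) do
    Logger.warning("router fell back to the default provider: #{inspect(metadata)}")
  end
end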

Built-in Providers

Provider       Type    Module                   Key Capabilities
Anthropic      API     LlmCore.LLM.Anthropic    Streaming, tool use, vision, structured output
OpenAI         API     LlmCore.LLM.OpenAI       Streaming, tool use, vision, structured output
Ollama         Local   LlmCore.LLM.Ollama       Streaming, JSON mode, local models
Appliance      Local   LlmCore.LLM.Appliance    OpenAI-compatible local endpoints
Native         API     LlmCore.LLM.Native       In-process agentic loop with cascade fallback
Claude Code    CLI     Config-driven            --print, system prompt file, auto-approve
Droid          CLI     Config-driven            exec subcommand, --auto, --cwd
Pi CLI         CLI     Config-driven            --print, --provider, --thinking
Kimi CLI       CLI     Config-driven            Agent-file YAML transform, final-message capture
Codex CLI      CLI     Config-driven            --full-auto, file capture, sandbox bypass
Gemini CLI     CLI     Config-driven            Model selection

Documentation

License

MIT — see the LICENSE file.