Arcanum
Provider-agnostic AI inference library for Elixir.
Overview
Arcanum provides a unified interface for chat completion, streaming, embeddings, tool use, and media generation across multiple AI providers. Model capabilities are declared upfront via profiles — no runtime detection or error-code fallbacks.
Supported Providers
| Provider | API Format | Features |
|---|---|---|
| OpenAI | OpenAI | Chat, stream, tools, vision, image generation |
| Anthropic | Anthropic | Chat, stream, tools, vision |
| DeepSeek | OpenAI | Chat, stream, tools |
| GitHub Copilot | OpenAI | Chat, stream, tools, vision (OAuth device flow) |
| OpenRouter | OpenAI | Chat, stream, tools |
| ZAI / Zhipu | OpenAI | Chat, stream, tools |
| LM Studio | OpenAI | Chat, stream, tools (auto model loading) |
Installation
def deps do
  [
    {:arcanum, "~> 0.1.1"}
  ]
end
Usage
All inference goes through Arcanum.Gateway. Callers never touch adapters directly.
Provider Map
Every Gateway function takes a provider map describing the endpoint:
provider = %{
  base_url: "https://api.openai.com",
  api_key: "sk-...",
  kind: "openai",
  api_format: :openai,
  type: :cloud
}
| Key | Type | Description |
|---|---|---|
| base_url | String.t() | Required. Provider API base URL. |
| api_key | String.t() \| nil | API key. Not needed for local providers or Copilot. |
| api_format | :openai \| :anthropic \| :custom | Determines which adapter handles the request. |
| kind | String.t() | Provider ID (e.g. "openai", "anthropic", "ollama", "github-copilot"). Used for profile resolution and provider-specific behavior. |
| type | :cloud \| :local | Used by Arcanum.Probe to skip TCP checks for cloud providers. |
| extra_headers | [{String.t(), String.t()}] \| nil | Additional HTTP headers (injected automatically for Copilot). |
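For comparison with the cloud example, a local endpoint might be described as in the sketch below. The values are assumptions rather than confirmed settings: port 1234 is LM Studio's usual default, and the kind "lmstudio" is taken from the registry provider list in this README.

```elixir
# Hypothetical provider map for a local LM Studio server.
# Assumed: default port 1234 and "lmstudio" as the provider ID.
local_provider = %{
  base_url: "http://localhost:1234",
  api_key: nil,         # local providers do not need a key
  kind: "lmstudio",
  api_format: :openai,  # LM Studio is listed with the OpenAI API format
  type: :local          # local endpoints get a TCP check from Arcanum.Probe
}
```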
Chat Completion
alias Arcanum.{Gateway, Intent}
intent = %Intent{
  model: "gpt-4o",
  messages: [
    %{role: :system, content: Intent.text("You are a helpful assistant.")},
    %{role: :user, content: Intent.text("What is Elixir?")}
  ],
  temperature: 0.7,
  max_tokens: 1024
}
{:ok, %Arcanum.Response{content: content}} = Gateway.chat(provider, intent)
Streaming
{:ok, stream} = Gateway.stream(provider, intent)
Enum.each(stream, fn
  {:data, %Arcanum.Response{content: chunk}} -> IO.write(chunk || "")
  :done -> IO.puts("\n--- done ---")
  {:error, reason} -> IO.puts("Error: #{inspect(reason)}")
end)
Tool Use
Pass tools in the intent. Arcanum handles native, XML-text, and JSON-text tool call formats transparently based on the model profile.
intent = %Intent{
  model: "gpt-4o",
  messages: [%{role: :user, content: Intent.text("What is the weather in Berlin?")}],
  tools: [
    %{
      type: "function",
      function: %{
        name: "get_weather",
        description: "Get current weather for a location",
        parameters: %{
          "type" => "object",
          "properties" => %{
            "location" => %{"type" => "string", "description" => "City name"}
          },
          "required" => ["location"]
        }
      }
    }
  ]
}
{:ok, %Arcanum.Response{tool_calls: tool_calls}} = Gateway.chat(provider, intent)
# tool_calls is a list of:
# %{id: "call_abc", function: %{name: "get_weather", arguments: "{\"location\":\"Berlin\"}"}}
Models that don't support native tool calls (e.g. some Ollama models) automatically get XML-text or JSON-text extraction based on their profile's tool_call_format.
Vision (Multimodal)
intent = %Intent{
  model: "gpt-4o",
  messages: [
    %{role: :user, content: [
      %{type: :text, text: "What's in this image?"},
      %{type: :image_url, url: "https://example.com/photo.jpg"}
    ]}
  ]
}
# Or with base64:
%{type: :image_base64, media_type: "image/png", data: "iVBOR..."}
Embeddings
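# Use an embedding model here; chat models such as "gpt-4o" do not produce embeddings.
{:ok, embeddings} = Gateway.embed(provider, "text-embedding-3-small", "Hello world")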
# embeddings is a list of floats
Supported by OpenAI and Ollama adapters. Returns {:error, :not_supported} for adapters that don't override the default.
Image Generation
alias Arcanum.MediaIntent
media_intent = %MediaIntent{
  model: "gpt-image-1",
  prompt: "A cat wearing a wizard hat",
  size: "1024x1024",
  quality: "auto",
  n: 1,
  format: "png"
}
{:ok, %Arcanum.MediaResponse{items: items}} = Gateway.generate_image(provider, media_intent)
# Each item: %{data: binary(), url: nil, revised_prompt: "...", content_type: "image/png"}
Video Generation
{:ok, %Arcanum.MediaResponse{items: items}} = Gateway.generate_video(provider, media_intent)
Both generate_image/3 and generate_video/3 return {:error, :not_supported} for adapters that don't override the default implementation.
List Models
{:ok, models} = Gateway.list_models(provider)
# ["gpt-4o", "gpt-4o-mini", "gpt-4.1", ...]
Probe Availability
Arcanum.Probe.probe_provider(provider)
# :online | :offline
Cloud providers always return :online. Local providers get a TCP connect check (2s timeout).
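The behaviour described above can be approximated with a bounded :gen_tcp connect. The following is a minimal sketch, not Arcanum.Probe's actual implementation; the ProbeSketch module and its probe/1 function are hypothetical names used only for illustration.

```elixir
defmodule ProbeSketch do
  @moduledoc "Illustrative only; not Arcanum.Probe's actual implementation."

  # Connect timeout in milliseconds, matching the 2s figure above.
  @timeout_ms 2_000

  # Cloud providers are assumed reachable, so no socket is opened.
  def probe(%{type: :cloud}), do: :online

  # Local providers get a bounded TCP connect to the base URL's host and port.
  def probe(%{type: :local, base_url: base_url}) do
    case URI.parse(base_url) do
      %URI{host: host, port: port} when is_binary(host) and is_integer(port) ->
        connect(String.to_charlist(host), port)

      # A URL without a usable host or port is treated as unreachable.
      _ ->
        :offline
    end
  end

  defp connect(host, port) do
    case :gen_tcp.connect(host, port, [:binary, active: false], @timeout_ms) do
      {:ok, socket} ->
        :gen_tcp.close(socket)
        :online

      {:error, _reason} ->
        :offline
    end
  end
end
```

Skipping the socket for cloud providers keeps the check cheap and avoids false negatives from firewalls that block raw TCP probes.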
Ensure Model Loaded (LM Studio)
:ok = Arcanum.EnsureModel.ensure_loaded(provider, "qwen2.5-coder", context_length: 32_768)
Pre-loads a model on LM Studio with the specified context length. No-op for all other providers.
GitHub Copilot Authentication
alias Arcanum.Auth.Copilot
# 1. Start device flow
{:ok, flow} = Copilot.start_device_flow()
# flow.verification_uri -> "https://github.com/login/device"
# flow.user_code -> "ABCD-1234"
# 2. User visits URL and enters code, then:
{:ok, access_token} = Copilot.poll_for_token(flow)
# 3. Use the token as the provider's api_key
provider = %{
  base_url: Copilot.base_url(),
  api_key: access_token,
  kind: "github-copilot",
  api_format: :openai,
  type: :cloud,
  extra_headers: Copilot.copilot_headers(access_token)
}
For non-blocking flows, use Copilot.poll_once/1 for single-attempt polling (e.g. from an Oban job).
Configuration
Application Config
# Required for GitHub Copilot OAuth
config :arcanum, copilot_client_id: "your-github-oauth-client-id"
# Optional: override HTTP client (defaults to Req)
config :arcanum, http_client: MyCustomClient
Model Profile System
Every model gets a ModelProfile that declares its capabilities upfront. Profiles drive serialization, normalization, and feature gating — the adapter never guesses.
%Arcanum.ModelProfile{
  supports_system_role: true, # can the model accept system messages?
  supports_tools: true, # native tool call support?
  supports_vision: false, # multimodal image input?
  supports_image_generation: false, # image generation capability?
  supports_video_generation: false, # video generation capability?
  tool_call_format: :native, # :native | :xml_text
  reasoning_field: nil, # atom naming the field where the model puts thinking (e.g. :reasoning_content)
  thinking_param: nil, # map sent to provider to enable thinking (e.g. %{type: "enabled"})
  preserve_reasoning: false, # keep thinking content in response?
  max_context: 131_072, # maximum context window
  max_images_per_message: 4, # vision: max images per message
  max_outputs_per_request: 4, # media generation: max outputs
  supported_sizes: [], # media generation: allowed dimensions
  supported_formats: [], # media generation: allowed formats
  provider_routing: nil # provider-specific routing metadata
}
Profile Resolution
Profiles are resolved automatically by Gateway via Arcanum.ModelProfile.Resolver. Resolution follows a strict priority chain:
1. User overrides (highest: caller-provided fields)
2. Overlay (provider/model-specific, from priv/overlays.json)
3. Registry (models.dev cache, the single source of truth)
4. Provider default (fallback for local providers not in models.dev)
5. Global default (lowest: assumes weakest capabilities)
Registry (models.dev)
The Arcanum.ModelProfile.Registry GenServer fetches model capabilities from models.dev and caches them in ETS. Refreshes hourly. Falls back gracefully if the fetch fails.
Default providers fetched: openai, anthropic, deepseek, openrouter, xai, zai, zhipuai, github-copilot, lmstudio.
# Lookup a cached profile (returns nil if not found)
Arcanum.ModelProfile.Registry.lookup("openai", "gpt-4o")
# List all cached provider IDs
Arcanum.ModelProfile.Registry.cached_providers()
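To make the caching pattern concrete, here is a minimal sketch of an ETS-backed GenServer that refreshes hourly and keeps its old contents when a fetch fails. It is not Arcanum's Registry: RegistrySketch and the injected fetch_fun (a stand-in for the models.dev request) are hypothetical, and the cache key shape is an assumption.

```elixir
defmodule RegistrySketch do
  @moduledoc "Illustrative ETS-backed cache; not Arcanum's actual Registry."
  use GenServer

  @table :registry_sketch_profiles
  @refresh_ms :timer.hours(1)

  # `fetch_fun` is a hypothetical zero-arity function standing in for the
  # models.dev request. It returns {:ok, %{{provider, model} => profile}}
  # or {:error, reason}.
  def start_link(fetch_fun) when is_function(fetch_fun, 0) do
    GenServer.start_link(__MODULE__, fetch_fun, name: __MODULE__)
  end

  # Reads go straight to ETS, so lookups do not wait on the GenServer.
  def lookup(provider, model) do
    case :ets.lookup(@table, {provider, model}) do
      [{_key, profile}] -> profile
      [] -> nil
    end
  end

  @impl true
  def init(fetch_fun) do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    # Defer the first fetch so start_link/1 returns without waiting on it.
    {:ok, fetch_fun, {:continue, :refresh}}
  end

  @impl true
  def handle_continue(:refresh, fetch_fun), do: {:noreply, refresh(fetch_fun)}

  @impl true
  def handle_info(:refresh, fetch_fun), do: {:noreply, refresh(fetch_fun)}

  defp refresh(fetch_fun) do
    case fetch_fun.() do
      {:ok, profiles} -> :ets.insert(@table, Map.to_list(profiles))
      # On failure the previously cached entries stay in place.
      {:error, _reason} -> :ok
    end

    # Schedule the next refresh regardless of the outcome.
    Process.send_after(self(), :refresh, @refresh_ms)
    fetch_fun
  end
end
```

Because the first fetch runs after init/1 returns, a lookup issued immediately after startup can return nil until that fetch completes.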
Overlays (priv/overlays.json)
Overlays patch capabilities that models.dev doesn't track (vision, image generation, reasoning params). They are compiled into the Resolver at build time.
{
  "overlays": {
    "openai": {
      "gpt-4o": { "supports_vision": true },
      "gpt-image-1": {
        "supports_image_generation": true,
        "supported_sizes": ["1024x1024", "1024x1536", "1536x1024", "auto"],
        "supported_formats": ["png", "webp", "jpeg"],
        "max_outputs_per_request": 4
      }
    },
    "deepseek": {
      "deepseek-r1": { "preserve_reasoning": true }
    }
  },
  "provider_defaults": {
    "ollama": {
      "supports_system_role": true,
      "supports_tools": false,
      "tool_call_format": "xml_text",
      "max_context": 32768
    }
  }
}
Provider Defaults
For local providers not in models.dev (Ollama, LM Studio, vLLM), provider defaults from priv/overlays.json are used as the base profile. These assume conservative capabilities.
Profile Overrides
Callers can override any profile field at call time via the :profile_overrides option. Overrides take the highest priority in the resolution chain.
# Force a model to use XML text tool calls
Gateway.chat(provider, intent, profile_overrides: %{tool_call_format: :xml_text})
# Override context window for a specific call
Gateway.chat(provider, intent, profile_overrides: %{max_context: 65_536})
# Enable vision for a model not in the registry
Gateway.chat(provider, intent, profile_overrides: %{supports_vision: true})
# Multiple overrides
Gateway.chat(provider, intent,
  profile_overrides: %{
    supports_tools: false,
    tool_call_format: :xml_text,
    max_context: 16_384
  }
)
Any field from ModelProfile can be overridden. The override map is merged on top of the resolved profile, so you only need to specify the fields you want to change.
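The layered merge behind the resolution chain can be pictured as a fold over maps, lowest priority first. This is a sketch under assumptions, not the Resolver's real code: ResolverSketch, the baseline values, and the sample layers are all made up for illustration.

```elixir
defmodule ResolverSketch do
  @moduledoc "Illustrative layered merge; not Arcanum's actual Resolver."

  # Weakest-capability baseline (values assumed for illustration).
  @global_default %{
    supports_tools: false,
    supports_vision: false,
    tool_call_format: :xml_text,
    max_context: 8_192
  }

  # Layers are given lowest priority first. Each later layer overwrites
  # only the keys it sets; a missing layer is passed as nil and skipped.
  def resolve(layers) do
    layers
    |> Enum.reject(&is_nil/1)
    |> Enum.reduce(@global_default, fn layer, acc -> Map.merge(acc, layer) end)
  end
end

# Hypothetical layers for one model.
provider_default = %{max_context: 32_768}
registry = %{supports_tools: true, tool_call_format: :native, max_context: 128_000}
overlay = %{supports_vision: true}
overrides = %{max_context: 65_536}

profile = ResolverSketch.resolve([provider_default, registry, overlay, overrides])
# profile.max_context is 65_536: the caller's override wins over every other layer.
```

Merging whole maps in priority order means each layer only needs to state what it knows, which is why a single-field override is enough.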
Gateway Options
All Gateway.chat/3 and Gateway.stream/3 calls accept an opts keyword list:
| Option | Type | Description |
|---|---|---|
| :profile_overrides | map() | Override any ModelProfile fields for this call. |
| :adapter | module() | Override the adapter module (useful for testing). |
Architecture
Gateway (single public entry point)
  -> Auth resolution (API key, Copilot OAuth headers)
  -> Profile resolution (Resolver: overrides > overlay > registry > provider default > global default)
  -> Adapter dispatch (OpenAI, Anthropic, Ollama)
  -> Response normalization (Normalizer: content fallback, think-tag stripping, tool-call extraction)
Core Modules
| Module | Purpose |
|---|---|
| Arcanum.Gateway | Single entry point for all inference calls. |
| Arcanum.Intent | Canonical request struct. Content is always [content_block()]. |
| Arcanum.Response | Canonical response struct (content, thinking, tool_calls, usage). |
| Arcanum.MediaIntent | Request struct for image/video generation. |
| Arcanum.MediaResponse | Response struct for generated media (items with data/url). |
| Arcanum.ModelProfile | Declares model capabilities (tools, vision, reasoning, context). |
| Arcanum.ModelProfile.Resolver | Multi-layer profile resolution with override support. |
| Arcanum.ModelProfile.Registry | ETS cache backed by models.dev, refreshed hourly. |
| Arcanum.Response.Normalizer | Profile-driven post-processing (XML/JSON tool extraction, think tags). |
| Arcanum.Provider | Behaviour + macro (use Arcanum.Provider) with defoverridable defaults. |
| Arcanum.Probe | TCP availability check for local providers. |
| Arcanum.EnsureModel | Pre-loads models on LM Studio before inference. |
| Arcanum.Auth.Copilot | GitHub Copilot OAuth device code flow (RFC 8628). |
Adapters
| Adapter | Behaviour Callbacks |
|---|---|
| Arcanum.Adapters.OpenAI | chat, stream, list_models, embed, generate_image |
| Arcanum.Adapters.Anthropic | chat, stream, list_models |
| Arcanum.Adapters.Ollama | chat, stream, list_models, embed |
Error Handling
All Gateway functions return {:ok, result} or {:error, reason}. Error shapes:
| Error | Meaning |
|---|---|
| {:error, {:api_error, status, body}} | HTTP error from the provider. |
| {:error, :context_overflow} | Input exceeded the model's context window. |
| {:error, :not_supported} | Adapter doesn't implement the requested callback. |
| {:error, :copilot_auth_required} | Copilot provider needs OAuth authentication. |
| {:error, term()} | Network or other transient errors. |
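Callers can pattern-match on these shapes directly. The sketch below maps each documented shape to a message; ErrorSketch and its wording are hypothetical and not part of Arcanum. The catch-all clause goes last so the specific shapes match first.

```elixir
defmodule ErrorSketch do
  @moduledoc "Illustrative mapping from the documented result shapes to messages."

  def describe({:ok, _result}), do: "ok"

  def describe({:error, {:api_error, status, _body}}),
    do: "provider returned HTTP #{status}"

  def describe({:error, :context_overflow}),
    do: "input too long for the model's context window"

  def describe({:error, :not_supported}),
    do: "adapter does not implement this operation"

  def describe({:error, :copilot_auth_required}),
    do: "run the Copilot device flow first"

  # Catch-all: network failures and anything else.
  def describe({:error, reason}),
    do: "unexpected error: #{inspect(reason)}"
end

# Usage, assuming `provider` and `intent` are defined as in the earlier sections:
#   provider |> Arcanum.Gateway.chat(intent) |> ErrorSketch.describe()
```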
Transient HTTP errors (429, 502, 503, 529) are retried automatically up to 3 times by the adapters.
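A bounded retry of this kind can be sketched as a small recursive function. This is an illustration, not the adapters' code: RetrySketch and request_fun are hypothetical, and backoff between attempts is left out for brevity.

```elixir
defmodule RetrySketch do
  @moduledoc "Illustrative bounded retry; not Arcanum's adapter code."

  @retryable [429, 502, 503, 529]
  @max_retries 3

  # `request_fun` is a hypothetical zero-arity function performing one HTTP call
  # and returning {:ok, result} or {:error, reason}.
  def with_retry(request_fun, retries_left \\ @max_retries) do
    case request_fun.() do
      # Retry only the transient statuses, and only while budget remains.
      {:error, {:api_error, status, _body}} when status in @retryable and retries_left > 0 ->
        with_retry(request_fun, retries_left - 1)

      # Successes, non-retryable errors, and exhausted budgets pass through unchanged.
      other ->
        other
    end
  end
end
```

With three retries the function makes at most four attempts in total, after which the last error is returned to the caller.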
Design Principles
- Profile-driven. Model capabilities are declared upfront, never discovered via error codes.
- Everything has a limit. Retries, timeouts, model counts, poll attempts — all bounded.
- Callers never touch adapters directly. Gateway is the only public interface.
- Two-layer separation. Adapters handle wire protocol faithfully. Normalizer handles model-specific post-processing.
License
MIT