Arcanum

Provider-agnostic AI inference library for Elixir.

Overview

Arcanum provides a unified interface for chat completion, streaming, embeddings, tool use, and media generation across multiple AI providers. Model capabilities are declared upfront via profiles — no runtime detection or error-code fallbacks.

Supported Providers

Provider        API Format  Features
OpenAI          OpenAI      Chat, stream, tools, vision, image generation
Anthropic       Anthropic   Chat, stream, tools, vision
Ollama          Ollama      Chat, stream, tools, vision, embeddings
DeepSeek        OpenAI      Chat, stream, tools
GitHub Copilot  OpenAI      Chat, stream, tools, vision (OAuth device flow)
OpenRouter      OpenAI      Chat, stream, tools
xAI (Grok)      OpenAI      Chat, stream, tools, image generation
ZAI / Zhipu     OpenAI      Chat, stream, tools

Installation

def deps do
  [
    {:arcanum, "~> 0.1.2"}
  ]
end

Usage

All inference goes through Arcanum.Gateway. Callers never touch adapters directly.

Provider Map

Every Gateway function takes a provider map describing the endpoint:

provider = %{
  base_url: "https://api.openai.com",
  api_key: "sk-...",
  kind: "openai",
  api_format: :openai,
  type: :cloud
}

Key            Type                              Description
base_url       String.t()                        Required. Provider API base URL.
api_key        String.t() | nil                  API key. Not needed for local providers or Copilot.
api_format     :openai | :anthropic | :custom    Determines which adapter handles the request.
kind           String.t()                        Provider ID (e.g. "openai", "anthropic", "ollama", "github-copilot"). Used for profile resolution and provider-specific behavior.
type           :cloud | :local                   Used by Arcanum.Probe to skip TCP checks for cloud providers.
extra_headers  [{String.t(), String.t()}] | nil  Additional HTTP headers (injected automatically for Copilot).

Chat Completion

alias Arcanum.{Gateway, Intent}

intent = %Intent{
  model: "gpt-4o",
  messages: [
    %{role: :system, content: Intent.text("You are a helpful assistant.")},
    %{role: :user, content: Intent.text("What is Elixir?")}
  ],
  temperature: 0.7,
  max_tokens: 1024
}

{:ok, response} = Gateway.chat(provider, intent)
text = Arcanum.Response.text(response)

Streaming

{:ok, stream} = Gateway.stream(provider, intent)

Enum.each(stream, fn
  {:data, %Arcanum.Response{} = response} ->
    IO.write(Arcanum.Response.text(response) || "")
  :done -> IO.puts("\n--- done ---")
  {:error, reason} -> IO.puts("Error: #{inspect(reason)}")
end)
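
To collect the streamed text instead of printing it, the events can be folded into a single result. The following is a minimal sketch, assuming the event shapes shown above ({:data, response}, :done, {:error, reason}). StreamSketch is a hypothetical helper, not part of Arcanum, and plain maps stand in for %Arcanum.Response{} so the example runs without the library.

```elixir
defmodule StreamSketch do
  # Hypothetical helper, not part of Arcanum. Folds stream events into
  # {:ok, text} or {:error, reason}. `text_fun` extracts the text of one
  # chunk (with Arcanum this would presumably be &Arcanum.Response.text/1).
  def collect(events, text_fun) do
    result =
      Enum.reduce_while(events, {:ok, []}, fn
        {:data, chunk}, {:ok, acc} ->
          # Chunks without text (nil) contribute an empty string.
          {:cont, {:ok, [acc, text_fun.(chunk) || ""]}}

        :done, acc ->
          {:halt, acc}

        {:error, reason}, _acc ->
          {:halt, {:error, reason}}
      end)

    case result do
      {:ok, iodata} -> {:ok, IO.iodata_to_binary(iodata)}
      error -> error
    end
  end
end

# Simulated events; plain maps stand in for %Arcanum.Response{}.
events = [{:data, %{text: "Hel"}}, {:data, %{text: nil}}, {:data, %{text: "lo"}}, :done]
{:ok, text} = StreamSketch.collect(events, & &1.text)
IO.puts(text)
```

Accumulating into iodata and converting once at the end avoids repeated binary concatenation on long streams.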

Tool Use

Pass tools in the intent. Arcanum handles native, XML-text, and JSON-text tool call formats transparently based on the model profile.

intent = %Intent{
  model: "gpt-4o",
  messages: [%{role: :user, content: Intent.text("What is the weather in Berlin?")}],
  tools: [
    %{
      type: "function",
      function: %{
        name: "get_weather",
        description: "Get current weather for a location",
        parameters: %{
          "type" => "object",
          "properties" => %{
            "location" => %{"type" => "string", "description" => "City name"}
          },
          "required" => ["location"]
        }
      }
    }
  ]
}

{:ok, %Arcanum.Response{tool_calls: tool_calls}} = Gateway.chat(provider, intent)

# tool_calls is a list of:
# %{id: "call_abc", function: %{name: "get_weather", arguments: "{\"location\":\"Berlin\"}"}}

Models that don't support native tool calls (e.g. some Ollama models) automatically get XML-text or JSON-text extraction based on their profile's tool_call_format.

Vision (Multimodal)

intent = %Intent{
  model: "gpt-4o",
  messages: [
    %{role: :user, content: [
      %{type: :text, text: "What's in this image?"},
      %{type: :image_url, url: "https://example.com/photo.jpg"}
    ]}
  ]
}

# Or with base64:
%{type: :image_base64, media_type: "image/png", data: "iVBOR..."}

Embeddings

{:ok, embeddings} = Gateway.embed(provider, "text-embedding-3-small", "Hello world")
# embeddings is a list of floats

Supported by OpenAI and Ollama adapters. Returns {:error, :not_supported} for adapters that don't override the default.

Image Generation

alias Arcanum.MediaIntent

media_intent = %MediaIntent{
  model: "gpt-image-1",
  prompt: "A cat wearing a wizard hat",
  size: "1024x1024",
  quality: "auto",
  n: 1,
  format: "png"
}

{:ok, %Arcanum.MediaResponse{items: items}} = Gateway.generate_image(provider, media_intent)

# Each item: %{data: binary(), url: nil, revised_prompt: "...", content_type: "image/png"}

Video Generation

{:ok, %Arcanum.MediaResponse{items: items}} = Gateway.generate_video(provider, media_intent)

Both generate_image/3 and generate_video/3 return {:error, :not_supported} for adapters that don't override the default implementation.

List Models

{:ok, models} = Gateway.list_models(provider)
# ["gpt-4o", "gpt-4o-mini", "gpt-4.1", ...]

Probe Availability

Arcanum.Probe.probe_provider(provider)
# :online | :offline

Cloud providers always return :online. Local providers get a TCP connect check (2s timeout).
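
The behaviour described above can be approximated with a few lines of standard-library code. This is a sketch only: ProbeSketch is a hypothetical stand-in, not the source of Arcanum.Probe, and it assumes base_url always carries a scheme and host (for example "http://127.0.0.1:11434").

```elixir
defmodule ProbeSketch do
  # Hypothetical stand-in for the behaviour described above; it is not
  # Arcanum.Probe's actual source.
  @timeout_ms 2_000

  # Cloud providers skip the TCP check entirely.
  def probe(%{type: :cloud}), do: :online

  # Local providers: try to open a TCP connection to the base_url host/port.
  # Assumes base_url includes a scheme and host; URI.parse/1 fills in the
  # default port for http/https when none is given.
  def probe(%{type: :local, base_url: base_url}) do
    %URI{host: host, port: port} = URI.parse(base_url)
    opts = [:binary, active: false]

    case :gen_tcp.connect(String.to_charlist(host), port, opts, @timeout_ms) do
      {:ok, socket} ->
        :gen_tcp.close(socket)
        :online

      {:error, _reason} ->
        :offline
    end
  end
end

IO.inspect(ProbeSketch.probe(%{type: :cloud, base_url: "https://api.openai.com"}))
```

A successful connect only shows that something is listening on the port; it does not confirm that the listener is a healthy inference server.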

GitHub Copilot Authentication

alias Arcanum.Auth.Copilot

# 1. Start device flow
{:ok, flow} = Copilot.start_device_flow()
# flow.verification_uri -> "https://github.com/login/device"
# flow.user_code -> "ABCD-1234"

# 2. User visits URL and enters code, then:
{:ok, access_token} = Copilot.poll_for_token(flow)

# 3. Use the token as the provider's api_key
provider = %{
  base_url: Copilot.base_url(),
  api_key: access_token,
  kind: "github-copilot",
  api_format: :openai,
  type: :cloud,
  extra_headers: Copilot.copilot_headers(access_token)
}

For non-blocking flows, use Copilot.poll_once/1 for single-attempt polling (e.g. from an Oban job).

Configuration

Application Config

# Required for GitHub Copilot OAuth
config :arcanum, copilot_client_id: "your-github-oauth-client-id"

# Optional: override HTTP client (defaults to Req)
config :arcanum, http_client: MyCustomClient

Model Profile System

Every model gets a ModelProfile that declares its capabilities upfront. Profiles drive serialization, normalization, and feature gating — the adapter never guesses.

%Arcanum.ModelProfile{
  supports_system_role:      true,       # can the model accept system messages?
  supports_tools:            true,       # native tool call support?
  supports_vision:           false,      # multimodal image input?
  supports_image_generation: false,      # image generation capability?
  supports_video_generation: false,      # video generation capability?
  tool_call_format:          :native,    # :native | :xml_text
  reasoning_field:           nil,        # atom — where the model puts thinking (e.g. :reasoning_content)
  thinking_param:            nil,        # map sent to provider to enable thinking (e.g. %{type: "enabled"})
  preserve_reasoning:        false,      # keep thinking content in response?
  uses_max_completion_tokens: false,     # use max_completion_tokens instead of max_tokens?
  max_context:               131_072,    # maximum context window
  max_images_per_message:    4,          # vision: max images per message
  max_outputs_per_request:   4,          # media generation: max outputs
  supported_sizes:           [],         # media generation: allowed dimensions
  supported_formats:         [],         # media generation: allowed formats
  provider_routing:          nil         # provider-specific routing metadata
}
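
As an illustration of how declared capabilities could drive serialization, the sketch below picks request parameters from profile flags alone. SerializeSketch and its functions are hypothetical, not Arcanum's adapter code, and the profile is a plain map carrying a few of the fields listed above. The parameter names "max_tokens" and "max_completion_tokens" follow the OpenAI API.

```elixir
defmodule SerializeSketch do
  # Hypothetical helper, not Arcanum's adapter code. `profile` is a plain map
  # carrying a few of the ModelProfile fields listed above.

  # Choose the token-limit key from the profile flag.
  def token_limit_param(profile, max_tokens) do
    key =
      if profile.uses_max_completion_tokens,
        do: "max_completion_tokens",
        else: "max_tokens"

    %{key => max_tokens}
  end

  # Send tool definitions only when the profile declares native tool support.
  # Non-native formats would need the tools described in the prompt text
  # instead, which this sketch leaves out.
  def tools_param(%{supports_tools: true, tool_call_format: :native}, tools) when tools != [],
    do: %{"tools" => tools}

  def tools_param(_profile, _tools), do: %{}
end

profile = %{uses_max_completion_tokens: true, supports_tools: false, tool_call_format: :xml_text}
tools = [%{"type" => "function"}]

body =
  Map.merge(
    SerializeSketch.token_limit_param(profile, 1024),
    SerializeSketch.tools_param(profile, tools)
  )

IO.inspect(body)
```

Because every branch keys off a declared flag, no request has to be sent and rejected before the right parameter shape is known.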

Profile Resolution

Profiles are resolved automatically by Gateway via Arcanum.ModelProfile.Resolver. Resolution follows a strict priority chain:

1. User overrides     (highest — caller-provided fields)
2. Overlay            (provider/model-specific, from priv/overlays.json)
3. Registry           (models.dev cache — single source of truth)
4. Provider default   (fallback for local providers not in models.dev)
5. Global default     (lowest — assumes weakest capabilities)
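
One way to picture the priority chain is as a series of map merges, applied from lowest to highest priority so that later layers win. The sketch below assumes every layer is merged field by field; the real Resolver may instead select a single base layer before applying overlays and overrides. ResolverSketch, the layer keys, and the sample values are all illustrative.

```elixir
defmodule ResolverSketch do
  # Hypothetical illustration of the priority chain; not the Resolver's source.
  # `layers` maps a layer name to a plain map of profile fields. Layers that
  # are absent (or nil) are skipped.
  @order [:global_default, :provider_default, :registry, :overlay, :overrides]

  def resolve(layers) do
    @order
    |> Enum.map(&Map.get(layers, &1))
    |> Enum.reject(&is_nil/1)
    # Lowest priority first, so each later merge overwrites earlier values.
    |> Enum.reduce(%{}, fn layer, acc -> Map.merge(acc, layer) end)
  end
end

profile =
  ResolverSketch.resolve(%{
    global_default: %{supports_tools: false, supports_vision: false, max_context: 8_192},
    registry: %{supports_tools: true, max_context: 128_000},
    overlay: %{supports_vision: true},
    overrides: %{max_context: 65_536}
  })

IO.inspect(profile)
```

In this run supports_tools comes from the registry layer, supports_vision from the overlay, and max_context from the caller's override.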

Registry (models.dev)

The Arcanum.ModelProfile.Registry GenServer fetches model capabilities from models.dev and caches them in ETS. The cache is refreshed hourly, and the registry falls back gracefully if a fetch fails.

Default providers fetched: openai, anthropic, deepseek, openrouter, xai, zai, zhipuai, github-copilot.

# Lookup a cached profile (returns nil if not found)
Arcanum.ModelProfile.Registry.lookup("openai", "gpt-4o")

# List all cached provider IDs
Arcanum.ModelProfile.Registry.cached_providers()

Overlays (priv/overlays.json)

Overlays patch capabilities that models.dev doesn't track (vision, image generation, reasoning params). They are compiled into the Resolver at build time.

{
  "overlays": {
    "openai": {
      "gpt-4o": { "supports_vision": true },
      "gpt-image-1": {
        "supports_image_generation": true,
        "supported_sizes": ["1024x1024", "1024x1536", "1536x1024", "auto"],
        "supported_formats": ["png", "webp", "jpeg"],
        "max_outputs_per_request": 4
      }
    },
    "deepseek": {
      "deepseek-r1": { "preserve_reasoning": true }
    }
  },
  "provider_defaults": {
    "ollama": {
      "supports_system_role": true,
      "supports_tools": false,
      "tool_call_format": "xml_text",
      "max_context": 32768
    }
  }
}

Provider Defaults

For local providers not in models.dev (Ollama), provider defaults from priv/overlays.json are used as the base profile. These assume conservative capabilities.

Profile Overrides

Callers can override any profile field at call time via the :profile_overrides option. Overrides take the highest priority in the resolution chain.

# Force a model to use XML text tool calls
Gateway.chat(provider, intent, profile_overrides: %{tool_call_format: :xml_text})

# Override context window for a specific call
Gateway.chat(provider, intent, profile_overrides: %{max_context: 65_536})

# Enable vision for a model not in the registry
Gateway.chat(provider, intent, profile_overrides: %{supports_vision: true})

# Multiple overrides
Gateway.chat(provider, intent,
  profile_overrides: %{
    supports_tools: false,
    tool_call_format: :xml_text,
    max_context: 16_384
  }
)

Any field from ModelProfile can be overridden. The override map is merged on top of the resolved profile, so you only need to specify the fields you want to change.

Gateway Options

All Gateway.chat/3 and Gateway.stream/3 calls accept an opts keyword list:

Option              Type      Description
:profile_overrides  map()     Override any ModelProfile fields for this call.
:adapter            module()  Override the adapter module (useful for testing).

Architecture

Gateway (single public entry point)
  -> Auth resolution (API key, Copilot OAuth headers)
  -> Profile resolution (Resolver: overrides > overlay > registry > provider default > global default)
  -> Adapter dispatch (OpenAI, Anthropic, Ollama)
  -> Response normalization (Normalizer: content fallback, think-tag stripping, tool-call extraction)
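
The think-tag stripping step in the last stage can be sketched as a small text transformation. This is an illustration under assumptions, not the Normalizer's source: the tag name <think> is assumed (the text above does not name it), and ThinkTagSketch is a hypothetical module.

```elixir
defmodule ThinkTagSketch do
  # Hypothetical illustration; the tag name <think> is an assumption.
  # The `s` modifier lets `.` match newlines inside a thinking block.
  defp think_regex, do: ~r/<think>(.*?)<\/think>/s

  # Returns {visible_content, thinking}; thinking is nil when no tag is present.
  def split(text) do
    thinking =
      think_regex()
      |> Regex.scan(text, capture: :all_but_first)
      |> List.flatten()
      |> Enum.map(&String.trim/1)
      |> Enum.join("\n")

    content = think_regex() |> Regex.replace(text, "") |> String.trim()

    {content, if(thinking == "", do: nil, else: thinking)}
  end
end

{content, thinking} = ThinkTagSketch.split("<think>2 + 2 = 4</think>The answer is 4.")
IO.inspect({content, thinking})
```

Returning the thinking text separately mirrors the split between the content and thinking fields of the response struct, and leaves it to the caller (or a flag such as preserve_reasoning) to decide whether to keep it.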

Core Modules

Module                         Purpose
Arcanum.Gateway                Single entry point for all inference calls.
Arcanum.Intent                 Canonical request struct. Content is always [content_block()].
Arcanum.Response               Canonical response struct (content, thinking, tool_calls, usage).
Arcanum.MediaIntent            Request struct for image/video generation.
Arcanum.MediaResponse          Response struct for generated media (items with data/url).
Arcanum.ModelProfile           Declares model capabilities (tools, vision, reasoning, context).
Arcanum.ModelProfile.Resolver  Multi-layer profile resolution with override support.
Arcanum.ModelProfile.Registry  ETS cache backed by models.dev, refreshed hourly.
Arcanum.Response.Normalizer    Profile-driven post-processing (XML/JSON tool extraction, think tags).
Arcanum.Provider               Behaviour + macro (use Arcanum.Provider) with defoverridable defaults.
Arcanum.Probe                  TCP availability check for local providers.
Arcanum.Auth.Copilot           GitHub Copilot OAuth device code flow (RFC 8628).

Adapters

Adapter                     Behaviour Callbacks
Arcanum.Adapters.OpenAI     chat, stream, list_models, embed, generate_image
Arcanum.Adapters.Anthropic  chat, stream, list_models
Arcanum.Adapters.Ollama     chat, stream, list_models, embed

Error Handling

All Gateway functions return {:ok, result} or {:error, reason}. Error shapes:

Error                                 Meaning
{:error, {:api_error, status, body}}  HTTP error from the provider.
{:error, :context_overflow}           Input exceeded the model's context window.
{:error, :not_supported}              Adapter doesn't implement the requested callback.
{:error, :copilot_auth_required}      Copilot provider needs OAuth authentication.
{:error, term()}                      Network or other transient errors.

Transient HTTP errors (429, 502, 503, 529) are retried automatically up to 3 times by the adapters.
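
The retry policy can be sketched as a small wrapper around any function that returns the error shapes listed above. RetrySketch is hypothetical and not the adapters' code. It reads "up to 3 times" as three retries after the initial attempt, and it omits any backoff delay between attempts, which a real client would normally add.

```elixir
defmodule RetrySketch do
  # Hypothetical illustration of the retry policy described above.
  @transient [429, 502, 503, 529]
  @max_retries 3

  # Calls `fun` and retries while it returns a transient API error and
  # retries remain. Any other result is returned unchanged.
  def with_retry(fun, retries_left \\ @max_retries) do
    case fun.() do
      {:error, {:api_error, status, _body}} when status in @transient and retries_left > 0 ->
        with_retry(fun, retries_left - 1)

      other ->
        other
    end
  end
end

# Simulated request that fails twice with 503 before succeeding.
{:ok, counter} = Agent.start_link(fn -> 0 end)

flaky = fn ->
  attempt = Agent.get_and_update(counter, fn n -> {n + 1, n + 1} end)
  if attempt < 3, do: {:error, {:api_error, 503, "busy"}}, else: {:ok, "done"}
end

IO.inspect(RetrySketch.with_retry(flaky))
```

Non-transient statuses such as 400 pass straight through, so callers still see {:error, {:api_error, status, body}} for errors that retrying cannot fix.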

Design Principles

Contributing

See CONTRIBUTING.md for how to add new providers and models.

License

MIT