Arcanum

Provider-agnostic AI inference library for Elixir.

Overview

Arcanum provides a unified interface for chat completion, streaming, embeddings, tool use, and media generation across multiple AI providers. Model capabilities are declared upfront via profiles — no runtime detection or error-code fallbacks.

Supported Providers

Provider	API Format	Features
OpenAI	OpenAI	Chat, stream, tools, vision, image generation
Anthropic	Anthropic	Chat, stream, tools, vision
DeepSeek	OpenAI	Chat, stream, tools
GitHub Copilot	OpenAI	Chat, stream, tools, vision (OAuth device flow)
OpenRouter	OpenAI	Chat, stream, tools
ZAI / Zhipu	OpenAI	Chat, stream, tools
LM Studio	OpenAI	Chat, stream, tools (auto model loading)

Installation

def deps do
  [
    {:arcanum, "~> 0.1.1"}
  ]
end

Usage

All inference goes through Arcanum.Gateway. Callers never touch adapters directly.

Provider Map

Every Gateway function takes a provider map describing the endpoint:

provider = %{
  base_url: "https://api.openai.com",
  api_key: "sk-...",
  kind: "openai",
  api_format: :openai,
  type: :cloud
}

Key	Type	Description
`base_url`	`String.t()`	Required. Provider API base URL.
`api_key`	`String.t() | nil`	API key. Not needed for local providers or Copilot.
`api_format`	`:openai | :anthropic | :custom`	Determines which adapter handles the request.
`kind`	`String.t()`	Provider ID (e.g. `"openai"`, `"anthropic"`, `"ollama"`, `"github-copilot"`). Used for profile resolution and provider-specific behavior.
`type`	`:cloud | :local`	Used by `Arcanum.Probe` to skip TCP checks for cloud providers.
`extra_headers`	`[{String.t(), String.t()}] | nil`	Additional HTTP headers (injected automatically for Copilot).

Chat Completion

alias Arcanum.{Gateway, Intent}

intent = %Intent{
  model: "gpt-4o",
  messages: [
    %{role: :system, content: Intent.text("You are a helpful assistant.")},
    %{role: :user, content: Intent.text("What is Elixir?")}
  ],
  temperature: 0.7,
  max_tokens: 1024
}

{:ok, %Arcanum.Response{content: content}} = Gateway.chat(provider, intent)

Streaming

{:ok, stream} = Gateway.stream(provider, intent)

Enum.each(stream, fn
  {:data, %Arcanum.Response{content: chunk}} -> IO.write(chunk || "")
  :done -> IO.puts("\n--- done ---")
  {:error, reason} -> IO.puts("Error: #{inspect(reason)}")
end)
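
When the full text is needed rather than incremental output, the same events can be folded into a single string. The sketch below is a hypothetical helper, not part of Arcanum: it assumes only the three event shapes shown above and uses plain maps in place of `%Arcanum.Response{}` so it runs without the library. Stopping at the first error is a choice made for this sketch.

```elixir
defmodule StreamSketch do
  # Hypothetical helper: collects streamed chunks into one binary.
  # Returns {:ok, text} once :done is seen (or the stream ends),
  # or {:error, reason} at the first error event.
  def collect(stream) do
    result =
      Enum.reduce_while(stream, {:ok, []}, fn
        {:data, %{content: chunk}}, {:ok, acc} ->
          # A chunk may be nil (the example above guards against it too).
          {:cont, {:ok, [(chunk || "") | acc]}}

        :done, {:ok, acc} ->
          {:halt, {:ok, acc}}

        {:error, reason}, _acc ->
          {:halt, {:error, reason}}
      end)

    case result do
      {:ok, chunks} -> {:ok, chunks |> Enum.reverse() |> IO.iodata_to_binary()}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

With the real library this would be called as `StreamSketch.collect(stream)` on the value returned by `Gateway.stream/3`.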

Tool Use

Pass tools in the intent. Arcanum handles native, XML-text, and JSON-text tool call formats transparently based on the model profile.

intent = %Intent{
  model: "gpt-4o",
  messages: [%{role: :user, content: Intent.text("What is the weather in Berlin?")}],
  tools: [
    %{
      type: "function",
      function: %{
        name: "get_weather",
        description: "Get current weather for a location",
        parameters: %{
          "type" => "object",
          "properties" => %{
            "location" => %{"type" => "string", "description" => "City name"}
          },
          "required" => ["location"]
        }
      }
    }
  ]
}

{:ok, %Arcanum.Response{tool_calls: tool_calls}} = Gateway.chat(provider, intent)

# tool_calls is a list of:
# %{id: "call_abc", function: %{name: "get_weather", arguments: "{\"location\":\"Berlin\"}"}}

Models that don't support native tool calls (e.g. some Ollama models) automatically get XML-text or JSON-text extraction based on their profile's tool_call_format.
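
Once tool calls come back, the caller still has to run them. A minimal sketch of that step follows, assuming the `tool_calls` shape shown in the comment above. `ToolDispatchSketch`, the `handlers` map, and the `{:unknown_tool, name}` error are hypothetical names for illustration. The JSON decoder is passed in as a function so the sketch has no dependencies; with Jason installed, `&Jason.decode!/1` would be a natural choice.

```elixir
defmodule ToolDispatchSketch do
  # Hypothetical helper, not part of Arcanum.
  # handlers:    %{"tool_name" => fn decoded_args -> result end}
  # decode_json: function turning the JSON arguments string into a map
  def run(tool_calls, handlers, decode_json) do
    Enum.map(tool_calls, fn %{id: id, function: %{name: name, arguments: raw}} ->
      case Map.fetch(handlers, name) do
        {:ok, handler} ->
          # Decode only when a handler exists, then call it.
          {id, {:ok, handler.(decode_json.(raw))}}

        :error ->
          {id, {:error, {:unknown_tool, name}}}
      end
    end)
  end
end
```

Each result is keyed by the call `id`, so it can be matched back to the originating tool call when building the follow-up message.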

Vision (Multimodal)

intent = %Intent{
  model: "gpt-4o",
  messages: [
    %{role: :user, content: [
      %{type: :text, text: "What's in this image?"},
      %{type: :image_url, url: "https://example.com/photo.jpg"}
    ]}
  ]
}

# Or pass the image inline as a base64 content block instead of a URL:
%{type: :image_base64, media_type: "image/png", data: "iVBOR..."}

Embeddings

{:ok, embeddings} = Gateway.embed(provider, "text-embedding-3-small", "Hello world")
# embeddings is a list of floats

Supported by the OpenAI and Ollama adapters. Every other adapter keeps the default implementation, which returns {:error, :not_supported}.
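
A common next step is comparing two embeddings. The helper below is a hypothetical illustration, not an Arcanum function: it assumes only that each embedding is a list of floats, as noted above, and computes cosine similarity. It raises if either vector is all zeros.

```elixir
defmodule EmbeddingSketch do
  # Hypothetical helper: cosine similarity of two equal-length float lists.
  def cosine(a, b) when length(a) == length(b) do
    dot =
      a
      |> Enum.zip(b)
      |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)

    dot / (norm(a) * norm(b))
  end

  # Euclidean length of a vector.
  defp norm(v) do
    v
    |> Enum.reduce(0.0, fn x, acc -> acc + x * x end)
    |> :math.sqrt()
  end
end
```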

Image Generation

alias Arcanum.MediaIntent

media_intent = %MediaIntent{
  model: "gpt-image-1",
  prompt: "A cat wearing a wizard hat",
  size: "1024x1024",
  quality: "auto",
  n: 1,
  format: "png"
}

{:ok, %Arcanum.MediaResponse{items: items}} = Gateway.generate_image(provider, media_intent)

# Each item: %{data: binary(), url: nil, revised_prompt: "...", content_type: "image/png"}

Video Generation

{:ok, %Arcanum.MediaResponse{items: items}} = Gateway.generate_video(provider, media_intent)

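Both generate_image/3 and generate_video/3 return {:error, :not_supported} for adapters that don't override the default implementation. The Adapters table lists generate_image only for the OpenAI adapter and lists no adapter for generate_video, so handle the error tuple rather than matching on {:ok, _} alone.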

List Models

{:ok, models} = Gateway.list_models(provider)
# ["gpt-4o", "gpt-4o-mini", "gpt-4.1", ...]

Probe Availability

Arcanum.Probe.probe_provider(provider)
# :online | :offline

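Cloud providers always return :online because the TCP check is skipped for them, so :online does not prove the remote API is reachable. Local providers get a TCP connect check with a 2-second timeout.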
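
The behaviour described above can be sketched with `:gen_tcp` from the Erlang standard library. This is an illustration of the idea, not the source of `Arcanum.Probe`: `ProbeSketch` is a hypothetical module, and it assumes `base_url` parses to a host and port.

```elixir
defmodule ProbeSketch do
  # Illustrative only. Mirrors the documented behaviour:
  # cloud -> :online without a check, local -> TCP connect with a 2s timeout.
  @timeout_ms 2_000

  def probe(%{type: :cloud}), do: :online

  def probe(%{type: :local, base_url: base_url}) do
    # URI.parse/1 fills in the default port (80/443) when none is given.
    %URI{host: host, port: port} = URI.parse(base_url)

    case :gen_tcp.connect(String.to_charlist(host), port, [:binary, active: false], @timeout_ms) do
      {:ok, socket} ->
        :gen_tcp.close(socket)
        :online

      {:error, _reason} ->
        :offline
    end
  end
end
```

A successful connect only shows that something is listening on the port. It says nothing about whether a model is loaded.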

Ensure Model Loaded (LM Studio)

:ok = Arcanum.EnsureModel.ensure_loaded(provider, "qwen2.5-coder", context_length: 32_768)

Pre-loads a model on LM Studio with the specified context length. No-op for all other providers.

GitHub Copilot Authentication

alias Arcanum.Auth.Copilot

# 1. Start device flow
{:ok, flow} = Copilot.start_device_flow()
# flow.verification_uri -> "https://github.com/login/device"
# flow.user_code -> "ABCD-1234"

# 2. User visits URL and enters code, then:
{:ok, access_token} = Copilot.poll_for_token(flow)

# 3. Use the token as the provider's api_key
provider = %{
  base_url: Copilot.base_url(),
  api_key: access_token,
  kind: "github-copilot",
  api_format: :openai,
  type: :cloud,
  extra_headers: Copilot.copilot_headers(access_token)
}

For non-blocking flows, use Copilot.poll_once/1 for single-attempt polling (e.g. from an Oban job).

Configuration

Application Config

# Required for GitHub Copilot OAuth
config :arcanum, copilot_client_id: "your-github-oauth-client-id"

# Optional: override HTTP client (defaults to Req)
config :arcanum, http_client: MyCustomClient

Model Profile System

Every model gets a ModelProfile that declares its capabilities upfront. Profiles drive serialization, normalization, and feature gating — the adapter never guesses.

%Arcanum.ModelProfile{
  supports_system_role:      true,       # can the model accept system messages?
  supports_tools:            true,       # native tool call support?
  supports_vision:           false,      # multimodal image input?
  supports_image_generation: false,      # image generation capability?
  supports_video_generation: false,      # video generation capability?
  tool_call_format:          :native,    # :native | :xml_text
  reasoning_field:           nil,        # atom — where the model puts thinking (e.g. :reasoning_content)
  thinking_param:            nil,        # map sent to provider to enable thinking (e.g. %{type: "enabled"})
  preserve_reasoning:        false,      # keep thinking content in response?
  max_context:               131_072,    # maximum context window
  max_images_per_message:    4,          # vision: max images per message
  max_outputs_per_request:   4,          # media generation: max outputs
  supported_sizes:           [],         # media generation: allowed dimensions
  supported_formats:         [],         # media generation: allowed formats
  provider_routing:          nil         # provider-specific routing metadata
}
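
To make "feature gating" concrete, here is a hedged sketch of a pre-flight check that uses two of the fields above, `supports_vision` and `max_images_per_message`. It is not Arcanum's implementation: `GateSketch` and both error atoms are hypothetical, and the content-block shapes are taken from the Vision example.

```elixir
defmodule GateSketch do
  # Hypothetical pre-flight check driven by a profile map.
  @image_types [:image_url, :image_base64]

  def check_vision(messages, profile) do
    counts = Enum.map(messages, &count_images/1)

    cond do
      # No images anywhere: nothing to gate.
      Enum.all?(counts, &(&1 == 0)) -> :ok
      !profile.supports_vision -> {:error, :vision_not_supported}
      Enum.any?(counts, &(&1 > profile.max_images_per_message)) -> {:error, :too_many_images}
      true -> :ok
    end
  end

  # Counts image blocks in one message; non-list content counts as zero.
  defp count_images(%{content: blocks}) when is_list(blocks) do
    Enum.count(blocks, &(Map.get(&1, :type) in @image_types))
  end

  defp count_images(_message), do: 0
end
```

The point of the design is that the request is rejected before it reaches the provider, instead of being sent and failing with a provider-specific error.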

Profile Resolution

Profiles are resolved automatically by Gateway via Arcanum.ModelProfile.Resolver. Resolution follows a strict priority chain:

1. User overrides     (highest — caller-provided fields)
2. Overlay            (provider/model-specific, from priv/overlays.json)
3. Registry           (models.dev cache — single source of truth)
4. Provider default   (fallback for local providers not in models.dev)
5. Global default     (lowest — assumes weakest capabilities)
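
One way to picture the chain is as a field-by-field merge in which higher-priority layers overwrite lower ones. The sketch below assumes that model. `ResolverSketch` is a hypothetical stand-in, not `Arcanum.ModelProfile.Resolver`, and the field values in the usage example are made up.

```elixir
defmodule ResolverSketch do
  # Hypothetical stand-in for the priority chain.
  # Layers are merged lowest priority first, so later maps win per field.
  # A missing layer may be passed as nil.
  def resolve(global_default, provider_default, registry, overlay, overrides) do
    [global_default, provider_default, registry, overlay, overrides]
    |> Enum.map(&(&1 || %{}))
    |> Enum.reduce(%{}, fn layer, acc -> Map.merge(acc, layer) end)
  end
end

# Usage with illustrative values: the registry upgrades the weak global
# default, the overlay adds vision, and the caller override wins on max_context.
profile =
  ResolverSketch.resolve(
    %{supports_tools: false, tool_call_format: :xml_text, max_context: 8_192},
    nil,
    %{supports_tools: true, tool_call_format: :native, max_context: 128_000},
    %{supports_vision: true},
    %{max_context: 65_536}
  )
```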

Registry (models.dev)

The Arcanum.ModelProfile.Registry GenServer fetches model capabilities from models.dev and caches them in ETS. Refreshes hourly. Falls back gracefully if the fetch fails.

Default providers fetched: openai, anthropic, deepseek, openrouter, xai, zai, zhipuai, github-copilot, lmstudio.

# Lookup a cached profile (returns nil if not found)
Arcanum.ModelProfile.Registry.lookup("openai", "gpt-4o")

# List all cached provider IDs
Arcanum.ModelProfile.Registry.cached_providers()

Overlays (`priv/overlays.json`)

Overlays patch capabilities that models.dev doesn't track (vision, image generation, reasoning params). They are compiled into the Resolver at build time.

{
  "overlays": {
    "openai": {
      "gpt-4o": { "supports_vision": true },
      "gpt-image-1": {
        "supports_image_generation": true,
        "supported_sizes": ["1024x1024", "1024x1536", "1536x1024", "auto"],
        "supported_formats": ["png", "webp", "jpeg"],
        "max_outputs_per_request": 4
      }
    },
    "deepseek": {
      "deepseek-r1": { "preserve_reasoning": true }
    }
  },
  "provider_defaults": {
    "ollama": {
      "supports_system_role": true,
      "supports_tools": false,
      "tool_call_format": "xml_text",
      "max_context": 32768
    }
  }
}

Provider Defaults

For local providers not in models.dev (Ollama, LM Studio, vLLM), provider defaults from priv/overlays.json are used as the base profile. These assume conservative capabilities.

Profile Overrides

Callers can override any profile field at call time via the :profile_overrides option. Overrides take the highest priority in the resolution chain.

# Force a model to use XML text tool calls
Gateway.chat(provider, intent, profile_overrides: %{tool_call_format: :xml_text})

# Override context window for a specific call
Gateway.chat(provider, intent, profile_overrides: %{max_context: 65_536})

# Enable vision for a model not in the registry
Gateway.chat(provider, intent, profile_overrides: %{supports_vision: true})

# Multiple overrides
Gateway.chat(provider, intent,
  profile_overrides: %{
    supports_tools: false,
    tool_call_format: :xml_text,
    max_context: 16_384
  }
)

Any field from ModelProfile can be overridden. The override map is merged on top of the resolved profile, so you only need to specify the fields you want to change.

Gateway Options

Gateway.chat/3 and Gateway.stream/3 accept an optional keyword list as their third argument:

Option	Type	Description
`:profile_overrides`	`map()`	Override any `ModelProfile` fields for this call.
`:adapter`	`module()`	Override the adapter module (useful for testing).

Architecture

Gateway (single public entry point)
  -> Auth resolution (API key, Copilot OAuth headers)
  -> Profile resolution (Resolver: overrides > overlay > registry > provider default > global default)
  -> Adapter dispatch (OpenAI, Anthropic, Ollama)
  -> Response normalization (Normalizer: content fallback, think-tag stripping, tool-call extraction)

Core Modules

Module	Purpose
`Arcanum.Gateway`	Single entry point for all inference calls.
`Arcanum.Intent`	Canonical request struct. Content is always `[content_block()]`.
`Arcanum.Response`	Canonical response struct (content, thinking, tool_calls, usage).
`Arcanum.MediaIntent`	Request struct for image/video generation.
`Arcanum.MediaResponse`	Response struct for generated media (items with data/url).
`Arcanum.ModelProfile`	Declares model capabilities (tools, vision, reasoning, context).
`Arcanum.ModelProfile.Resolver`	Multi-layer profile resolution with override support.
`Arcanum.ModelProfile.Registry`	ETS cache backed by models.dev, refreshed hourly.
`Arcanum.Response.Normalizer`	Profile-driven post-processing (XML/JSON tool extraction, think tags).
`Arcanum.Provider`	Behaviour + macro (`use Arcanum.Provider`) with defoverridable defaults.
`Arcanum.Probe`	TCP availability check for local providers.
`Arcanum.EnsureModel`	Pre-loads models on LM Studio before inference.
`Arcanum.Auth.Copilot`	GitHub Copilot OAuth device code flow (RFC 8628).

Adapters

Adapter	Behaviour Callbacks
`Arcanum.Adapters.OpenAI`	`chat`, `stream`, `list_models`, `embed`, `generate_image`
`Arcanum.Adapters.Anthropic`	`chat`, `stream`, `list_models`
`Arcanum.Adapters.Ollama`	`chat`, `stream`, `list_models`, `embed`

Error Handling

All Gateway functions return {:ok, result} or {:error, reason}. Error shapes:

Error	Meaning
`{:error, {:api_error, status, body}}`	HTTP error from the provider.
`{:error, :context_overflow}`	Input exceeded the model's context window.
`{:error, :not_supported}`	Adapter doesn't implement the requested callback.
`{:error, :copilot_auth_required}`	Copilot provider needs OAuth authentication.
`{:error, term()}`	Network or other transient errors.
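
On the caller side, these shapes lend themselves to a single pattern match. The sketch below is hypothetical: `ErrorSketch` and the action atoms it returns are illustrative names, and only the error tuples themselves come from the table above.

```elixir
defmodule ErrorSketch do
  # Hypothetical helper: maps a Gateway result to a suggested next action.
  # Clause order matters: specific shapes come before the catch-all.
  def classify({:ok, _result}), do: :ok
  def classify({:error, :context_overflow}), do: :trim_input
  def classify({:error, :not_supported}), do: :pick_other_provider
  def classify({:error, :copilot_auth_required}), do: :start_device_flow
  def classify({:error, {:api_error, 429, _body}}), do: :back_off
  def classify({:error, {:api_error, status, _body}}) when status in 400..499, do: :fix_request
  def classify({:error, {:api_error, _status, _body}}), do: :provider_failure
  def classify({:error, _other}), do: :transient
end
```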

Transient HTTP errors (429, 502, 503, 529) are retried automatically up to 3 times by the adapters.
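
For readers curious what a bounded retry of this kind looks like, here is a minimal sketch. It is not the adapters' code: `RetrySketch` is hypothetical, it reads "up to 3 times" as three retries after the first attempt, and it omits any delay between attempts, which a real implementation would likely add.

```elixir
defmodule RetrySketch do
  # Illustrative bounded retry. Assumption: 3 retries after the initial attempt.
  @retryable [429, 502, 503, 529]
  @max_retries 3

  # request_fun is a zero-arity function returning {:ok, _} or {:error, _}.
  def with_retry(request_fun, retries_left \\ @max_retries) do
    case request_fun.() do
      {:error, {:api_error, status, _body}}
      when status in @retryable and retries_left > 0 ->
        with_retry(request_fun, retries_left - 1)

      # Success, a non-retryable error, or retries exhausted: return as-is.
      other ->
        other
    end
  end
end
```

Because the retry count is fixed, the worst case is a known number of requests, in line with the "everything has a limit" principle stated under Design Principles.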

Design Principles

Profile-driven. Model capabilities are declared upfront, never discovered via error codes.
Everything has a limit. Retries, timeouts, model counts, poll attempts — all bounded.
Callers never touch adapters directly. Gateway is the only public interface.
Two-layer separation. Adapters handle wire protocol faithfully. Normalizer handles model-specific post-processing.

License

MIT