# Arcanum
Provider-agnostic AI inference library for Elixir.
## Overview
Arcanum provides a unified interface for chat completion, streaming, embeddings, and tool use across multiple AI providers. Model capabilities are declared upfront via profiles — no runtime detection or error-code fallbacks.
## Supported Providers
| Provider | API Format | Features |
|---|---|---|
| OpenAI | OpenAI | Chat, stream, tools, embeddings |
| Anthropic | Anthropic | Chat, stream, tools |
| DeepSeek | OpenAI | Chat, stream, tools |
| GitHub Copilot | OpenAI | Chat, stream, tools (OAuth device flow) |
| OpenRouter | OpenAI | Chat, stream, tools |
| xAI (Grok) | OpenAI | Chat, stream, tools |
| ZAI / Zhipu | OpenAI | Chat, stream, tools |
| Ollama | Native | Chat, stream, tools, embeddings |
| LM Studio | OpenAI | Chat, stream, tools (auto model loading) |
| vLLM | OpenAI | Chat, stream, tools |
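
Providers are described as plain maps (see Usage below). As an illustration, a local Ollama provider might look like this sketch; the `:ollama` format atom and the `nil` API key are assumptions about Arcanum's conventions, not confirmed values:

```elixir
# Hypothetical sketch: a local Ollama provider map, using the field names
# from the Usage example below. The :ollama api_format atom and the nil
# api_key are assumptions, not documented Arcanum values.
ollama = %{
  base_url: "http://localhost:11434",
  api_key: nil,
  kind: "ollama",
  api_format: :ollama
}
```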
## Installation

```elixir
def deps do
  [
    {:arcanum, "~> 0.1.0-rc.1"}
  ]
end
```
## Usage

All inference goes through `Arcanum.Gateway`:
```elixir
provider = %{
  base_url: "https://api.openai.com",
  api_key: "sk-...",
  kind: "openai",
  api_format: :openai
}

intent = %Arcanum.Intent{
  model: "gpt-4o",
  messages: [%{role: :user, content: "Hello"}]
}

# Synchronous
{:ok, %Arcanum.Response{content: content}} = Arcanum.Gateway.chat(provider, intent)

# Streaming
{:ok, stream} = Arcanum.Gateway.stream(provider, intent)

# List models
{:ok, models} = Arcanum.Gateway.list_models(provider)

# Embeddings (OpenAI, Ollama)
{:ok, %Arcanum.Response{}} = Arcanum.Gateway.embed(provider, intent)
```
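
What the stream yields isn't specified here; the sketch below assumes it is an Enumerable of chunks carrying a `content` delta, which is an assumption rather than documented API:

```elixir
# Hedged sketch: consuming the stream. That each chunk is a map with a
# :content delta is an assumption, not documented Arcanum behavior.
{:ok, stream} = Arcanum.Gateway.stream(provider, intent)

stream
|> Stream.each(fn chunk -> IO.write(chunk.content || "") end)
|> Stream.run()
```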
## Tool Use

Pass tools in the intent. Arcanum handles native, XML-text, and JSON-text tool call formats transparently based on the model profile.
```elixir
intent = %Arcanum.Intent{
  model: "gpt-4o",
  messages: [%{role: :user, content: "What is the weather in Berlin?"}],
  tools: [
    %{
      type: "function",
      function: %{
        name: "get_weather",
        description: "Get current weather",
        parameters: %{
          type: "object",
          properties: %{location: %{type: "string"}},
          required: ["location"]
        }
      }
    }
  ]
}

{:ok, %Arcanum.Response{tool_calls: tool_calls}} = Arcanum.Gateway.chat(provider, intent)
```
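
A typical follow-up is to execute the returned call and feed the result back. The sketch below assumes OpenAI-style tool call fields (`id`, `arguments`) and a `:tool` message role; none of these shapes are confirmed by this README, and Jason is assumed for JSON encoding:

```elixir
# Hedged sketch of the tool round-trip. The tool_call field names and the
# :tool role are assumptions modeled on the OpenAI wire format.
{:ok, %Arcanum.Response{tool_calls: [call | _]}} = Arcanum.Gateway.chat(provider, intent)

result = %{temperature_c: 21}  # run your own get_weather/1 here

followup = %Arcanum.Intent{
  model: "gpt-4o",
  messages:
    intent.messages ++
      [
        %{role: :assistant, tool_calls: [call]},
        # Jason (a common JSON library) is assumed here.
        %{role: :tool, tool_call_id: call.id, content: Jason.encode!(result)}
      ]
}

{:ok, %Arcanum.Response{content: answer}} = Arcanum.Gateway.chat(provider, followup)
```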
## Architecture

```
Gateway (public entry point)
  -> Auth resolution (API key, Copilot OAuth)
  -> Adapter dispatch (OpenAI, Anthropic, Ollama)
  -> Response normalization (profile-driven post-processing)
```

- `Arcanum.Gateway` — single entry point for all inference calls
- `Arcanum.Intent` — canonical request struct
- `Arcanum.Response` — canonical response struct
- `Arcanum.ModelProfile` — declares model capabilities (tools, system role, reasoning, context length); see the sketch below
- `Arcanum.ModelProfile.Registry` — ETS cache backed by models.dev, refreshed hourly
- `Arcanum.Response.Normalizer` — profile-driven post-processing (content fallback, think-tag stripping, tool call extraction)
- `Arcanum.Probe` — TCP availability check for local providers
- `Arcanum.EnsureModel` — pre-loads models on LM Studio before inference
- `Arcanum.Auth.Copilot` — GitHub Copilot OAuth device code flow (RFC 8628)
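
The exact struct fields of `Arcanum.ModelProfile` aren't shown in this README; the map below is an illustrative stand-in for the capabilities named above (tools, system role, reasoning, context length):

```elixir
# Illustrative only: field names are hypothetical, chosen to mirror the
# capabilities listed above. The real Arcanum.ModelProfile may differ.
profile = %{
  model: "gpt-4o",
  supports_tools: true,
  supports_system_role: true,
  reasoning: false,
  context_length: 128_000,
  tool_call_format: :native  # vs. :xml_text or :json_text (see Tool Use)
}
```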
## Configuration
```elixir
# Required for GitHub Copilot
config :arcanum, copilot_client_id: "your-client-id"

# Optional: custom HTTP client (defaults to Req)
config :arcanum, http_client: MyCustomClient
```
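
The `http_client` contract isn't documented in this README. As a purely hypothetical example, a test stub might mirror `Req.request/1` and return a canned `Req.Response`:

```elixir
# Purely hypothetical: the http_client contract is not documented here.
# This stub assumes Arcanum calls a Req-compatible request/1.
defmodule MyStubClient do
  def request(_opts) do
    # Return a canned response instead of hitting the network.
    {:ok, %Req.Response{status: 200, body: %{"choices" => []}}}
  end
end
```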
## Design Principles

- Profile-driven. Model capabilities are declared upfront, never discovered via error codes.
- Everything has a limit. Retries, timeouts, model counts, poll attempts — all bounded.
- Callers never touch adapters directly. Gateway is the only public interface.
- Two-layer separation. Adapters handle wire protocol. Normalizer handles model-specific post-processing.
## License
MIT