# OpenResponses
An Elixir/Phoenix implementation of the Open Responses specification — a unified, provider-agnostic API for LLM interactions with first-class streaming, multi-turn conversation, tool dispatch, and agentic loops.
## What it does
OpenResponses acts as a multi-provider LLM proxy. Your application makes a single, spec-compliant API call; OpenResponses routes it to the right provider (OpenAI, Anthropic, Gemini, Ollama, or your own adapter), streams the response back, and manages the full agentic loop — including server-side tool execution — without any client-side orchestration.
```shell
curl http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5-20251001",
    "input": [{"role": "user", "content": "Hello!"}]
  }'
```

## Features
- Multi-provider routing — OpenAI, Anthropic (Claude), Google Gemini, Ollama, and any custom adapter
- Streaming SSE — spec-compliant server-sent events with sequence numbers
- Agentic loop — automatic tool dispatch with hosted (server-side) tool execution
- Multi-turn conversations — `previous_response_id` reconstructs full conversation context
- Middleware pipeline — intercept the loop for logging, token budgets, rate limiting, content filtering
- MCP integration — connect Model Context Protocol servers for external tool execution
- Observability — structured telemetry events and Prometheus metrics via PromEx
- BEAM-native — each request runs in an isolated GenServer process; thousands of concurrent loops on a single node
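Several of these features are plain request fields rather than separate APIs. As a sketch of the multi-turn flow: a follow-up call passes the previous response's id, and the server reconstructs the earlier context (the `resp_abc123` id here is illustrative, standing in for whatever id the first call returned):

```shell
# Follow-up turn: only the new user message is sent; the server rebuilds
# the conversation from the referenced earlier response.
curl http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5-20251001",
    "previous_response_id": "resp_abc123",
    "input": [{"role": "user", "content": "Now translate that to French."}]
  }'
```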
## Supported providers
| Provider | Model pattern | Adapter |
|---|---|---|
| OpenAI | `gpt-*`, `o1*` | `OpenResponses.Adapters.OpenAI` |
| Anthropic / z.ai | `claude-*` | `OpenResponses.Adapters.Anthropic` |
| Google Gemini | `gemini-*` | `OpenResponses.Adapters.Gemini` |
| Ollama (local) | `llama*`, `mistral*`, `phi*`, `qwen*` | `OpenResponses.Adapters.Ollama` |
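Local models route exactly like hosted ones. A sketch of routing entries for the Ollama adapter, mirroring the glob patterns in the table above as regexes (the specific patterns are illustrative; adjust them to the models you actually run):

```elixir
# config/runtime.exs — send local-model names to the Ollama adapter.
config :open_responses, :routing, %{
  ~r/^llama/ => OpenResponses.Adapters.Ollama,
  ~r/^mistral/ => OpenResponses.Adapters.Ollama,
  ~r/^gpt-/ => OpenResponses.Adapters.OpenAI,
  "default" => OpenResponses.Adapters.OpenAI
}
```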
## Installation
Add to your `mix.exs`:

```elixir
def deps do
  [
    {:open_responses, "~> 0.1"}
  ]
end
```

Then run the Igniter installer:

```shell
mix open_responses.install
```

This adds the router scope, supervision tree entries, and a config block with placeholder API keys. See the Installation guide for manual setup.
## Configuration
```elixir
# config/runtime.exs
config :open_responses, :provider_config, %{
  openai: [api_key: System.fetch_env!("OPENAI_API_KEY")],
  anthropic: [api_key: System.fetch_env!("ANTHROPIC_API_KEY")],
  gemini: [api_key: System.fetch_env!("GEMINI_API_KEY")]
}

config :open_responses, :routing, %{
  ~r/^gpt-/ => OpenResponses.Adapters.OpenAI,
  ~r/^claude-/ => OpenResponses.Adapters.Anthropic,
  ~r/^gemini-/ => OpenResponses.Adapters.Gemini,
  "default" => OpenResponses.Adapters.OpenAI
}
```

## Streaming example
```javascript
const response = await fetch("http://localhost:4000/v1/responses", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "claude-haiku-4-5-20251001",
    stream: true,
    input: [{ role: "user", content: "Write a haiku about the BEAM." }]
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop();
  for (const line of lines) {
    if (!line.startsWith("data: ") || line.includes("[DONE]")) continue;
    const event = JSON.parse(line.slice(6));
    if (event.type === "response.output_text.delta") {
      process.stdout.write(event.delta);
    }
  }
}
```
Each SSE frame arrives as `event: <type>\ndata: <json>\n\n`. Raw body chunks don't align to frame boundaries, so the buffer accumulates bytes until a complete `\n`-terminated line is available before parsing.
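Concretely, a streamed response interleaves frames like the following. The event name and sequence numbers match the snippet above; the exact payload fields shown here are abbreviated and illustrative:

```
event: response.output_text.delta
data: {"type":"response.output_text.delta","sequence_number":3,"delta":"Hot "}

event: response.output_text.delta
data: {"type":"response.output_text.delta","sequence_number":4,"delta":"schedulers hum"}

data: [DONE]
```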
## Server-side tool execution
Register a tool module and it executes inside the agentic loop — no client round-trip needed:
```elixir
defmodule MyApp.Tools.Weather do
  @behaviour OpenResponses.Tool

  @impl OpenResponses.Tool
  def execute(%{"location" => location}, _context) do
    {:ok, "72F and sunny in #{location}"}
  end
end
```

```elixir
config :open_responses, :hosted_tools, %{
  "get_weather" => MyApp.Tools.Weather
}
```

## Documentation
- Overview
- Getting Started
- Providers
- Streaming
- Tool Dispatch
- Conversation History
- Middleware
- MCP Integration
- Observability
- Configuration Reference
- Scaling
- Deploying
## License
MIT