# OpenResponses
An Elixir/Phoenix implementation of the Open Responses specification — a unified, provider-agnostic API for LLM interactions with first-class streaming, multi-turn conversation, tool dispatch, and agentic loops.
## What it does
OpenResponses acts as a multi-provider LLM proxy. Your application makes a single, spec-compliant API call; OpenResponses routes it to the right provider (OpenAI, Anthropic, Gemini, Ollama, or your own adapter), streams the response back, and manages the full agentic loop — including server-side tool execution — without any client-side orchestration.
```shell
curl http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5-20251001",
    "input": [{"role": "user", "content": "Hello!"}]
  }'
```

## Features
- Multi-provider routing — OpenAI, Anthropic (Claude), Google Gemini, Ollama, and any custom adapter
- Streaming SSE — spec-compliant server-sent events with sequence numbers
- Agentic loop — automatic tool dispatch with hosted (server-side) tool execution
- Multi-turn conversations — `previous_response_id` reconstructs full conversation context
- Middleware pipeline — intercept the loop for logging, token budgets, rate limiting, content filtering
- MCP integration — connect Model Context Protocol servers for external tool execution
- Observability — structured telemetry events and Prometheus metrics via PromEx
- BEAM-native — each request runs in an isolated GenServer process; thousands of concurrent loops on a single node
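Several of these features are plain request fields rather than separate APIs. As a sketch of the multi-turn flow: a follow-up call passes the previous response's id, and the server reconstructs the earlier context (the `resp_abc123` id here is illustrative, standing in for whatever id the first call returned):

```shell
# Follow-up turn: only the new user message is sent; the server rebuilds
# the conversation from the referenced earlier response.
curl http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5-20251001",
    "previous_response_id": "resp_abc123",
    "input": [{"role": "user", "content": "Now translate that to French."}]
  }'
```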
## Supported providers
| Provider | Model pattern | Adapter |
|---|---|---|
| OpenAI | `gpt-*`, `o1*` | `OpenResponses.Adapters.OpenAI` |
| Anthropic / z.ai | `claude-*` | `OpenResponses.Adapters.Anthropic` |
| Google Gemini | `gemini-*` | `OpenResponses.Adapters.Gemini` |
| Ollama (local) | `llama*`, `mistral*`, `phi*`, `qwen*` | `OpenResponses.Adapters.Ollama` |
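Local models route exactly like hosted ones. A sketch of routing entries for the Ollama adapter, mirroring the glob patterns in the table above as regexes (the specific patterns are illustrative; adjust them to the models you actually run):

```elixir
# config/runtime.exs — send local-model names to the Ollama adapter.
config :open_responses, :routing, %{
  ~r/^llama/ => OpenResponses.Adapters.Ollama,
  ~r/^mistral/ => OpenResponses.Adapters.Ollama,
  ~r/^gpt-/ => OpenResponses.Adapters.OpenAI,
  "default" => OpenResponses.Adapters.OpenAI
}
```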
## Installation
Add to your `mix.exs`:

```elixir
def deps do
  [
    {:open_responses, "~> 0.1"}
  ]
end
```

Then run the Igniter installer:

```shell
mix open_responses.install
```

This adds the router scope, supervision tree entries, and a config block with placeholder API keys. See the Installation guide for manual setup.
## Configuration
```elixir
# config/runtime.exs
config :open_responses, :provider_config, %{
  openai: [api_key: System.fetch_env!("OPENAI_API_KEY")],
  anthropic: [api_key: System.fetch_env!("ANTHROPIC_API_KEY")],
  gemini: [api_key: System.fetch_env!("GEMINI_API_KEY")]
}

config :open_responses, :routing, %{
  ~r/^gpt-/ => OpenResponses.Adapters.OpenAI,
  ~r/^claude-/ => OpenResponses.Adapters.Anthropic,
  ~r/^gemini-/ => OpenResponses.Adapters.Gemini,
  "default" => OpenResponses.Adapters.OpenAI
}
```

## Streaming example
```javascript
const response = await fetch("http://localhost:4000/v1/responses", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "claude-haiku-4-5-20251001",
    stream: true,
    input: [{ role: "user", content: "Write a haiku about the BEAM." }]
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop();
  for (const line of lines) {
    if (!line.startsWith("data: ") || line.includes("[DONE]")) continue;
    const event = JSON.parse(line.slice(6));
    if (event.type === "response.output_text.delta") {
      process.stdout.write(event.delta);
    }
  }
}
```
Each SSE frame arrives as `event: <type>\ndata: <json>\n\n`. Raw body chunks don't align to frame boundaries, so the buffer accumulates bytes until a complete `\n`-terminated line is available before parsing.
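Concretely, a streamed response interleaves frames like the following. The event name and sequence numbers match the snippet above; the exact payload fields shown here are abbreviated and illustrative:

```
event: response.output_text.delta
data: {"type":"response.output_text.delta","sequence_number":3,"delta":"Hot "}

event: response.output_text.delta
data: {"type":"response.output_text.delta","sequence_number":4,"delta":"schedulers hum"}

data: [DONE]
```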
## Server-side tool execution
Register a tool module and it executes inside the agentic loop — no client round-trip needed:
```elixir
defmodule MyApp.Tools.Weather do
  @behaviour OpenResponses.Tool

  @impl OpenResponses.Tool
  def execute(%{"location" => location}, _context) do
    {:ok, "72F and sunny in #{location}"}
  end
end
```

```elixir
config :open_responses, :hosted_tools, %{
  "get_weather" => MyApp.Tools.Weather
}
```

## Documentation
- Overview
- Getting Started
- Providers
- Streaming
- Tool Dispatch
- Conversation History
- Middleware
- MCP Integration
- Observability
- Configuration Reference
- Scaling
- Deploying
## License
MIT