GenAgentCodex
Codex backend for GenAgent, built on top of codex_wrapper.
Provides GenAgent.Backends.Codex, which wraps the codex CLI and
translates its NDJSON event output into the normalized GenAgent.Event
values the state machine consumes.
Prerequisites
The codex CLI must be installed and on your PATH. See the
Codex docs for install instructions.
Installation
def deps do
  [
    {:gen_agent, "~> 0.2.0"},
    {:gen_agent_codex, "~> 0.1.0"}
  ]
end
Quick start
defmodule MyApp.Coder do
  use GenAgent

  defmodule State do
    defstruct [:path, responses: []]
  end

  @impl true
  def init_agent(opts) do
    path = Keyword.fetch!(opts, :cwd)

    backend_opts = [
      cwd: path,
      sandbox: :read_only,
      skip_git_repo_check: true
    ]

    {:ok, backend_opts, %State{path: path}}
  end

  @impl true
  def handle_response(_ref, response, state) do
    {:noreply, %{state | responses: state.responses ++ [response.text]}}
  end
end

{:ok, _pid} = GenAgent.start_agent(MyApp.Coder,
  name: "my-coder",
  backend: GenAgent.Backends.Codex,
  cwd: "/path/to/project"
)

{:ok, response} = GenAgent.ask("my-coder", "What does lib/foo.ex do?")
IO.puts(response.text)
Session continuation
Codex tracks conversation state via a server-side thread_id. The backend
captures it from the first thread.started event of a turn and threads it
through codex exec resume on subsequent turns -- transparently, no caller
code required.
{:ok, r1} = GenAgent.ask("my-coder", "Remember the number 42")
{:ok, r2} = GenAgent.ask("my-coder", "What number did I ask you to remember?")
# r2.text == "42"
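The capture-then-resume flow described above can be sketched roughly as follows. This is an illustration only, not the backend's actual internals: the `SessionSketch` module, its function names, and the event shape (decoded NDJSON maps with string keys) are all assumptions made for the example.

```elixir
defmodule SessionSketch do
  # Hypothetical illustration of thread_id handling; names and shapes
  # here are assumptions, not GenAgent.Backends.Codex internals.

  # A fresh session has no thread_id yet.
  def new, do: %{thread_id: nil}

  # Remember the thread_id announced by the first thread.started event.
  # If the turn carries no such event, the session is left unchanged.
  def capture(session, events) do
    case Enum.find(events, &(&1["type"] == "thread.started")) do
      %{"thread_id" => id} -> %{session | thread_id: id}
      _ -> session
    end
  end

  # First turn: plain exec. Later turns: resume the captured thread.
  def dispatch(%{thread_id: nil}, prompt), do: {:exec, prompt}
  def dispatch(%{thread_id: id}, prompt), do: {:resume, id, prompt}
end
```

Under these assumptions, the first `dispatch/2` call yields `{:exec, prompt}`, and every call after a successful `capture/2` yields `{:resume, thread_id, prompt}`, which is why the caller never has to handle the id itself.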
Why this backend uses exec_json instead of streaming
CodexWrapper.Exec.stream/2 and CodexWrapper.ExecResume.stream/2 were
historically broken against codex-cli >= 0.118 due to a Port+stdin hang
(see codex_wrapper#37,
fixed in codex_wrapper 0.2.2). Even after the fix, this backend still
uses the non-streaming Exec.execute_json/2 path because:
- GenAgent's prompt task blocks on the whole turn anyway -- the caller waits for a full GenAgent.Response regardless.
- handle_stream_event/2 still fires for every event in arrival order, just all at once when exec_json returns instead of progressively.
- The path is simpler and has fewer moving parts.
If you need real-time streaming events before the turn completes, you
can provide your own :exec_fn that calls Exec.stream/2 (which now
works) and wrap it in something that yields events over time.
Backend options
Config:
:binary, :working_dir (aliased as :cwd), :env, :timeout, :verbose
Exec:
:model, :sandbox, :approval_policy, :full_auto, :dangerously_bypass_approvals_and_sandbox, :skip_git_repo_check, :ephemeral, :cd, :add_dirs, :search, :output_schema, :config_overrides, :enabled_features, :disabled_features, :images
Backend-only:
:exec_fn -- a 2-arity function (prompt, session) -> {:ok, [events]} | {:error, term()} that replaces the default Exec/ExecResume dispatch. Intended for tests.
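As a rough sketch of how :exec_fn might be used in a test, the helper below builds a 2-arity function that returns canned events without invoking the CLI. The `CannedExec` module is hypothetical, and the event shape (plain maps with string keys mirroring Codex's NDJSON) is an assumption; the real backend may expect the event values produced by codex_wrapper, so check the module docs before relying on it.

```elixir
defmodule CannedExec do
  # Hypothetical test helper. The event maps below are an ASSUMED shape,
  # not a documented contract of GenAgent.Backends.Codex.

  # Returns a 2-arity function matching the :exec_fn signature
  # (prompt, session) -> {:ok, [events]}, always replying with `reply`.
  def build(reply) when is_binary(reply) do
    fn _prompt, _session ->
      {:ok,
       [
         %{"type" => "thread.started", "thread_id" => "thread-test"},
         %{
           "type" => "item.completed",
           "item" => %{"type" => "agent_message", "text" => reply}
         },
         %{"type" => "turn.completed", "usage" => %{}}
       ]}
    end
  end

  # Returns an :exec_fn that simulates a failed turn.
  def failing(reason) do
    fn _prompt, _session -> {:error, reason} end
  end
end
```

A test agent could then include `exec_fn: CannedExec.build("42")` in the backend options it returns from `init_agent/1`, keeping unit tests free of real CLI calls.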
Codex has no equivalent of Claude's --system-prompt; if you need
system-level instructions, pass them via AGENTS.md in the working
directory or through Codex's configuration layer.
See GenAgent.Backends.Codex for the full module docs.
Event translation
Codex CLI's NDJSON output is translated into GenAgent.Event values by
GenAgent.Backends.Codex.EventTranslator:
| Codex event | GenAgent event |
|---|---|
| thread.started | captured for thread_id, then filtered |
| turn.started | filtered |
| item.completed (agent_message) | :text |
| item.completed (tool_call) | :tool_use |
| item.completed (tool_result) | :tool_result |
| turn.completed | :usage + terminal :result (with captured thread_id as session_id) |
| turn.failed / error | terminal :error |
| anything else | filtered |
Unlike Claude, Codex emits thread_id in the first event of a turn,
not the terminal one. The translator does a first pass to extract it and
injects it into the :result event emitted at the end.
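The two-pass approach can be illustrated with a minimal sketch. It is not the actual EventTranslator: the `TranslatorSketch` module, the input shape (decoded NDJSON maps with string keys), and the output shape (tagged tuples rather than GenAgent.Event values) are all assumptions chosen to keep the example self-contained.

```elixir
defmodule TranslatorSketch do
  # Hypothetical stand-in for the translation table above.
  # Input: decoded NDJSON maps (assumed shape). Output: tagged tuples.

  def translate(events) do
    # Pass 1: find the thread_id announced at the start of the turn.
    thread_id =
      Enum.find_value(events, fn
        %{"type" => "thread.started", "thread_id" => id} -> id
        _ -> nil
      end)

    # Pass 2: map each event; filtered events contribute nothing.
    Enum.flat_map(events, &translate_one(&1, thread_id))
  end

  defp translate_one(%{"type" => "item.completed", "item" => item}, _id) do
    case item do
      %{"type" => "agent_message", "text" => text} -> [{:text, text}]
      %{"type" => "tool_call"} -> [{:tool_use, item}]
      %{"type" => "tool_result"} -> [{:tool_result, item}]
      _ -> []
    end
  end

  # The terminal result carries the thread_id captured in pass 1.
  defp translate_one(%{"type" => "turn.completed"} = event, id) do
    [{:usage, Map.get(event, "usage", %{})}, {:result, %{session_id: id}}]
  end

  defp translate_one(%{"type" => type} = event, _id)
       when type in ["turn.failed", "error"] do
    [{:error, event}]
  end

  # thread.started, turn.started, and anything else are filtered.
  defp translate_one(_other, _id), do: []
end
```

The first pass exists only because the id arrives before the event that needs it; a single pass with an accumulator would work equally well and is a matter of taste.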
Testing
# Unit tests only (default, no CLI invocation)
mix test
# Include live integration tests that actually call the codex CLI
mix test --only integration
Integration tests are tagged :integration so they do not run by
default. They burn real tokens -- keep them cheap.
License
MIT. See LICENSE.