Jido Browser

Browser automation for Jido AI agents.

Overview

Jido.Browser is organized around three simple lanes:

agent-browser remains the default adapter. Web also supports warm pools when you want browser-backed sessions with lower cold-start overhead. Vibium remains available without warm-pool support.

The Hex package and OTP app remain jido_browser, while the public Elixir namespace is Jido.Browser.*.

Installation

Add the dependency:

def deps do
  [
    {:jido_browser, "~> 2.0"}
  ]
end

Install the default browser backend:

mix jido_browser.install

That installs the pinned agent-browser binary for the current platform and runs agent-browser install to provision the browser runtime.

To skip the install when the backend is already present, pass --if-missing in your Mix aliases:

defp aliases do
  [
    setup: ["deps.get", "jido_browser.install --if-missing"],
    test: ["jido_browser.install --if-missing", "test"]
  ]
end

Installing Specific Backends

mix jido_browser.install agent_browser
mix jido_browser.install vibium
mix jido_browser.install web

Quick Start

{:ok, session} = Jido.Browser.start_session()
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
{:ok, session, snapshot} = Jido.Browser.snapshot(session)
snapshot["snapshot"] || snapshot[:snapshot]
{:ok, session, _} = Jido.Browser.click(session, "@e1")
{:ok, _session, %{content: markdown}} = Jido.Browser.extract_content(session, format: :markdown)
:ok = Jido.Browser.end_session(session)

Selectors remain supported, but ref-based interaction is the preferred 2.0 flow:

  1. snapshot
  2. act on @eN refs
  3. re-snapshot
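
The three steps above can be sketched as one helper. This is a minimal sketch, not part of jido_browser: the module name RefFlowSketch, the injectable browser argument, and the {:error, reason} failure shape are all assumptions for illustration. Only snapshot/1 and click/2 from the Quick Start are called.

```elixir
defmodule RefFlowSketch do
  # Hypothetical helper (not part of jido_browser).
  # `browser` defaults to Jido.Browser; any module exposing snapshot/1 and
  # click/2 with the same return shapes can be passed in, e.g. a test stub.
  def click_and_resnapshot(session, ref, browser \\ Jido.Browser) do
    with {:ok, session, _before} <- browser.snapshot(session),
         {:ok, session, _result} <- browser.click(session, ref),
         # Re-snapshot so the next action uses refs from the current page.
         {:ok, session, snapshot} <- browser.snapshot(session) do
      {:ok, session, snapshot_text(snapshot)}
    end
  end

  # Reads the snapshot text under a string or atom key, as in the Quick Start.
  def snapshot_text(snapshot) do
    snapshot["snapshot"] || snapshot[:snapshot]
  end
end
```

Any step that does not return an {:ok, ...} tuple falls out of the with expression unchanged, so the caller sees the first failure as-is.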

Stateless Web Fetch

{:ok, result} =
  Jido.Browser.web_fetch(
    "https://example.com/docs",
    format: :markdown,
    allowed_domains: ["example.com"],
    focus_terms: ["API", "authentication"],
    citations: true
  )
result.content
result.passages
result.metadata # present when extraction returns document metadata

web_fetch/2 keeps HTML handling native for selector extraction and markdown conversion, and uses extractous_ex for fetched binary documents such as PDFs, Word, Excel, PowerPoint, OpenDocument, EPUB, and common email formats. Binary document responses may also include result.metadata when extraction returns document metadata.
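
Because result.metadata is only present for some responses, callers may want to read it defensively. The helper below is a rough sketch under assumptions: FetchResultSketch is a hypothetical module, the result is assumed to be a map or struct with atom keys, and passages is assumed to be a list. None of this is confirmed by the library.

```elixir
defmodule FetchResultSketch do
  # Hypothetical helper (not part of jido_browser).
  # Builds a small summary without raising when optional fields are absent.
  def summarize(result) do
    %{
      content: Map.get(result, :content),
      # Assumption: passages is a list when present; nil becomes [].
      passage_count: result |> Map.get(:passages) |> List.wrap() |> length(),
      # Metadata falls back to an empty map when extraction returned none.
      metadata: Map.get(result, :metadata) || %{}
    }
  end
end
```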

State Persistence

state_path = Path.expand("tmp/browser-state.json")
File.mkdir_p!(Path.dirname(state_path))
{:ok, session} = Jido.Browser.start_session()
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
{:ok, session, _} = Jido.Browser.save_state(session, state_path)
:ok = Jido.Browser.end_session(session)
{:ok, restored} = Jido.Browser.start_session()
{:ok, restored, _} = Jido.Browser.load_state(restored, state_path)

Tab Workflow

{:ok, session} = Jido.Browser.start_session()
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
{:ok, session, _} = Jido.Browser.new_tab(session, "https://example.org")
{:ok, session, tabs} = Jido.Browser.list_tabs(session)
{:ok, session, _} = Jido.Browser.switch_tab(session, 1)
{:ok, session, _} = Jido.Browser.close_tab(session, 1)

Warm Session Pools

Warm pools are explicit and optional. They speed up browser-backed workflows, while web_fetch/2 stays stateless and never uses pools.

For OTP applications, prefer adding a named pool to your supervision tree:

defmodule MyApp.Application do
  use Application
  def start(_type, _args) do
    children = [
      {Jido.Browser.Pool,
       name: :default,
       size: 2,
       headless: true,
       startup_timeout: 60_000}
    ]
    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end

Then check out pooled sessions by name:

{:ok, session} =
  Jido.Browser.start_session(
    pool: :default,
    checkout_timeout: 5_000
  )
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
:ok = Jido.Browser.end_session(session)

Use start_pool/1 for scripts, tests, or ad hoc startup:

{:ok, _pool} =
  Jido.Browser.start_pool(
    name: :default,
    size: 2,
    headless: true
  )
{:ok, session} =
  Jido.Browser.start_session(
    pool: :default,
    checkout_timeout: 5_000
  )
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
:ok = Jido.Browser.end_session(session)

Warm pools are currently supported by Jido.Browser.Adapters.AgentBrowser and Jido.Browser.Adapters.Web.

For the Web adapter, pooled sessions are still browser sessions, not HTTP fetches. Use web_fetch/2 when you want the simplest request/response API without browser state.
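
Pooled examples in this section always call end_session/1 when they finish. One way to make that hard to forget is a try/after wrapper, sketched below. PooledSessionSketch and its browser argument are hypothetical, and the sketch assumes that ending the session handle returned by start_session/1 is sufficient. If your code must end the most recent handle instead, adapt the wrapper accordingly.

```elixir
defmodule PooledSessionSketch do
  # Hypothetical helper (not part of jido_browser).
  # Starts a session with `opts`, runs `fun`, and always ends the session,
  # even when `fun` raises. `browser` is injectable so tests can stub it.
  def with_session(opts, fun, browser \\ Jido.Browser) when is_function(fun, 1) do
    case browser.start_session(opts) do
      {:ok, session} ->
        try do
          fun.(session)
        after
          # Assumption: ending the originally returned handle is enough.
          browser.end_session(session)
        end

      other ->
        # Pass any non-ok result (for example a checkout failure) to the caller.
        other
    end
  end
end
```

A call might look like PooledSessionSketch.with_session([pool: :default, checkout_timeout: 5_000], fn session -> Jido.Browser.navigate(session, "https://example.com") end).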

Plugin Setup

defmodule MyBrowsingAgent do
  use Jido.Agent,
    name: "browser_agent",
    plugins: [
      {Jido.Browser.Plugin,
       [
         adapter: Jido.Browser.Adapters.AgentBrowser,
         pool: :default,
         checkout_timeout: 5_000,
         headless: true,
         timeout: 30_000
       ]}
    ]
end

Configuration

config :jido_browser,
  adapter: Jido.Browser.Adapters.AgentBrowser
config :jido_browser, :agent_browser,
  binary_path: "/usr/local/bin/agent-browser",
  headed: false

Other adapters can still be configured explicitly:

config :jido_browser, :vibium,
  binary_path: "/path/to/vibium"
config :jido_browser, :web,
  binary_path: "/usr/local/bin/web",
  profile: "default"

Optional web fetch settings:

config :jido_browser, :web_fetch,
  cache_ttl_ms: 300_000,
  extractous: [
    pdf: [extract_annotation_text: true],
    office: [include_headers_and_footers: true]
  ]

Configured extractous options are merged with any per-call extractous: keyword options passed to Jido.Browser.web_fetch/2.
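
To picture what such a merge could look like, here is a sketch in plain Elixir. It is not the library's implementation. The precedence (per-call values win) and the depth (one nested level) are assumptions, and ExtractousMergeSketch is a hypothetical module name.

```elixir
defmodule ExtractousMergeSketch do
  # Hypothetical illustration (not jido_browser's actual merge logic).
  # Merges per-call options over configured ones, one nested level deep.
  def merge(configured, per_call) do
    Keyword.merge(configured, per_call, fn _key, conf_value, call_value ->
      if Keyword.keyword?(conf_value) and Keyword.keyword?(call_value) do
        # Both sides are keyword lists (e.g. pdf: [...]): merge their options.
        Keyword.merge(conf_value, call_value)
      else
        # Otherwise the per-call value replaces the configured one.
        call_value
      end
    end)
  end
end
```

Under these assumptions, a per-call extractous: [pdf: [extract_annotation_text: false]] would override only that PDF option and leave the configured office options in place.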

Backends

AgentBrowser (Default)

Vibium (Legacy)

Web (Legacy)

Public API

Core operations:

Agent-browser-native operations:

Available Actions

Session

Navigation

Interaction

Waiting and Queries

Content and Diagnostics

Tabs

Advanced and Composite

Using With Jido Agents

defmodule MyBrowsingAgent do
  use Jido.Agent,
    name: "web_browser",
    description: "An agent that can browse the web",
    plugins: [{Jido.Browser.Plugin, [headless: true]}]
end

Jido.Browser.Plugin now exposes 37 browser actions, including snapshot/refs workflows, browser state actions, diagnostics, and tab management.

License

Apache-2.0 - See LICENSE for details.