
Gemini Elixir Client


A comprehensive Elixir client for Google's Gemini AI API with dual authentication support, advanced streaming capabilities, type safety, and built-in telemetry.

Features

ALTAR Integration: The Path to Production

gemini_ex is the first project to integrate with the ALTAR Productivity Platform, a system designed to bridge the gap between local AI development and enterprise-grade production deployment.

We've adopted ALTAR's LATER protocol to provide a best-in-class local tool-calling experience. This is the first step in a long-term vision to offer a seamless "promotion path" for your AI tools, from local testing to a secure, scalable, and governed production environment via ALTAR's GRID protocol.

Learn the full story behind our integration in ALTAR_INTEGRATION.md

Installation

Add gemini_ex to your list of dependencies in mix.exs:

def deps do
  [
    {:gemini_ex, "~> 0.13.0"}
  ]
end

Quick Start

Basic Configuration

Configure your API key in config/runtime.exs:

import Config

config :gemini_ex,
  api_key: System.get_env("GEMINI_API_KEY")

Or set the environment variable:

export GEMINI_API_KEY="your_api_key_here"

For default Gemini auth resolution, config :gemini_ex, api_key: ... now takes precedence over GEMINI_API_KEY. Narrower overrides still win: pass api_key: directly on a request or on Gemini.Live.Session.start_link/1 for session-scoped credentials.

Simple Content Generation

# Basic text generation
{:ok, response} = Gemini.generate("Tell me about Elixir programming")
{:ok, text} = Gemini.extract_text(response)
IO.puts(text)

# With options
{:ok, response} = Gemini.generate("Explain quantum computing", [
  model: "gemini-flash-lite-latest",
  temperature: 0.7,
  max_output_tokens: 1000
])

# Advanced generation config with structured output
{:ok, response} = Gemini.generate("Analyze this topic and provide a summary", [
  response_json_schema: %{
    "type" => "object",
    "properties" => %{
      "summary" => %{"type" => "string"},
      "key_points" => %{"type" => "array", "items" => %{"type" => "string"}},
      "confidence" => %{"type" => "number"}
    }
  },
  response_mime_type: "application/json",
  temperature: 0.3
])

System Instructions

Set persistent guardrails that apply across an entire call or chat session without bloating your message history:

{:ok, response} =
  Gemini.generate("List three tips for interviewing junior engineers",
    system_instruction: "Be concise, avoid markdown, and keep answers under 40 words."
  )

{:ok, text} = Gemini.extract_text(response)
# Works the same with `Gemini.create_chat_session/1` and streaming calls via the `system_instruction:` option.

Simple Tool Calling

# Define a simple tool
defmodule WeatherTool do
  def get_weather(%{"location" => location}) do
    %{location: location, temperature: 22, condition: "sunny"}
  end
end

# Create and register the tool
{:ok, weather_declaration} = Altar.ADM.new_function_declaration(%{
  name: "get_weather",
  description: "Gets weather for a location",
  parameters: %{
    type: "object",
    properties: %{location: %{type: "string", description: "City name"}},
    required: ["location"]
  }
})

Gemini.Tools.register(weather_declaration, &WeatherTool.get_weather/1)

# Use the tool automatically - the model will call it as needed
{:ok, response} = Gemini.generate_content_with_auto_tools(
  "What's the weather like in Tokyo?",
  tools: [weather_declaration]
)

{:ok, text} = Gemini.extract_text(response)
IO.puts(text) # "The weather in Tokyo is sunny with a temperature of 22°C."

Advanced Streaming

# Start a streaming session
{:ok, stream_id} = Gemini.stream_generate("Write a long story about AI", [
  on_chunk: fn chunk -> IO.write(chunk) end,
  on_complete: fn -> IO.puts("\nStream complete!") end,
  on_error: fn error -> IO.puts("Error: #{inspect(error)}") end
])

# Stream management
Gemini.Streaming.pause_stream(stream_id)
Gemini.Streaming.resume_stream(stream_id)
Gemini.Streaming.stop_stream(stream_id)

Streaming knobs: pass timeout: (per attempt, default config :gemini_ex, :timeout = 120_000), stream_timeout: (collect timeout, default 60_000; orphaned streams are cleaned up on timeout), max_retries: (default 3), max_backoff_ms: (default 10_000), and connect_timeout: (default 5_000). Manager cleanup delay can be tuned via config :gemini_ex, :streaming, cleanup_delay_ms: ....
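These knobs can be combined on a single call. A sketch, spelling out the documented defaults explicitly (combining them with the callback options from the example above is assumed to work the same way):

```elixir
# Sketch: passing the streaming knobs described above on one call.
# The values shown are the documented defaults.
{:ok, stream_id} = Gemini.stream_generate("Summarize a long document", [
  timeout: 120_000,         # per-attempt HTTP timeout
  stream_timeout: 60_000,   # collect timeout; orphaned streams are cleaned up on expiry
  max_retries: 3,
  max_backoff_ms: 10_000,
  connect_timeout: 5_000,
  on_chunk: &IO.write/1
])

# In config/runtime.exs, the manager cleanup delay can be tuned:
# config :gemini_ex, :streaming, cleanup_delay_ms: 5_000
```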

Interactions Quick Start

alias Gemini.APIs.Interactions
alias Gemini.Types.Interactions.Events.ContentDelta
alias Gemini.Types.Interactions.DeltaTextDelta

{:ok, stream} =
  Interactions.create("Write a short poem about Elixir",
    model: "gemini-2.5-flash",
    stream: true
  )

for event <- stream do
  case event do
    %ContentDelta{delta: %DeltaTextDelta{text: text}} when is_binary(text) ->
      IO.write(text)

    _ ->
      :ok
  end
end

See guides/interactions.md for CRUD, resumption (last_event_id), and background/cancel/delete examples.

Live API (WebSocket)

Real-time bidirectional streaming for voice, video, and text interactions. For Gemini Live connections, v1beta is the default API version, while v1alpha is available for advanced native-audio features. Vertex Live connections use the Vertex v1 WebSocket endpoint.

Gemini Live and Vertex Live do not emit identical usage metadata fields. gemini_ex normalizes both backends into Gemini.Types.Live.UsageMetadata.candidates_token_count / candidates_tokens_details, keeps response_* aliases for backwards compatibility, and exposes Gemini.Types.Live.UsageMetadata.output_token_count/1 and output_tokens_details/1 as backend-agnostic helpers. Vertex Live may also populate server_content.turn_complete_reason.
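For example, the backend-agnostic helpers can be applied uniformly to usage metadata from either backend (a sketch; exactly where the UsageMetadata struct appears in your on_message payloads is assumed):

```elixir
alias Gemini.Types.Live.UsageMetadata

# Sketch: report token usage the same way for Gemini Live and Vertex Live.
report_usage = fn %UsageMetadata{} = usage ->
  # Backend-agnostic helpers from the note above
  tokens  = UsageMetadata.output_token_count(usage)
  details = UsageMetadata.output_tokens_details(usage)
  IO.puts("output tokens: #{inspect(tokens)}, details: #{inspect(details)}")
end
```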

Model Resolution

Live API model availability varies by API key and regional rollout. Gemini.Live.Models.resolve/1 uses the model registry plus runtime list_models results to select a compatible model:

alias Gemini.Live.Models

# Resolve best available model for your key
audio_model = Models.resolve(:audio) # Best currently available audio-capable Live model

Basic Usage

alias Gemini.Live.{Models, Session}

{:ok, session} = Session.start_link(
  model: Models.resolve(:audio),
  auth: :gemini,
  generation_config: %{response_modalities: ["AUDIO"]},
  output_audio_transcription: %{},
  on_message: fn msg -> IO.inspect(msg) end
)

:ok = Session.connect(session)
:ok = Session.send_text(session, "Hello!")

# Messages delivered via on_message callback
# Close when done
Session.close(session)

Audio Streaming with Native Audio Features

alias Gemini.Live.{Models, Session}

{:ok, session} = Session.start_link(
  model: Models.resolve(:audio),
  auth: :gemini,
  api_version: "v1alpha",  # Required for affective dialog and proactivity
  generation_config: %{response_modalities: ["AUDIO"]},
  input_audio_transcription: %{},
  output_audio_transcription: %{},
  enable_affective_dialog: true,  # Emotion-aware responses
  proactivity: %{proactive_audio: true},  # Model can choose not to respond
  on_message: fn msg -> handle_audio_response(msg) end
)

# Send audio chunks (16-bit PCM, 16kHz, mono)
Session.send_realtime_input(session, audio: %{
  data: pcm_data,
  mime_type: "audio/pcm;rate=16000"
})

Function Calling

alias Gemini.Live.{Models, Session}

tools = [
  %{function_declarations: [
    %{name: "get_weather", description: "Get weather", parameters: %{...}}
  ]}
]

{:ok, session} = Session.start_link(
  model: Models.resolve(:audio),
  tools: tools,
  generation_config: %{response_modalities: ["AUDIO"]},
  output_audio_transcription: %{},
  on_tool_call: fn %{function_calls: calls} ->
    responses = Enum.map(calls, &execute_function/1)
    {:tool_response, responses}  # Return to send automatically
  end
)

See the Live API Guide for complete documentation including voice activity detection, session resumption, thinking budgets, and context window compression.

Rate Limiting & Concurrency (built-in)

Model aliases: resolve the built-in use-case aliases via Gemini.Config.model_for_use_case/2 (e.g., :cache_context, :report_section, :fast_path) to avoid scattering raw model strings and to respect the recommended token minima for each use case.
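A sketch of resolving an alias instead of hard-coding a model string (the second argument to model_for_use_case/2 is not documented here; an empty option list is assumed as a placeholder):

```elixir
# Sketch: resolve a use-case alias, then generate with the resolved model.
model = Gemini.Config.model_for_use_case(:fast_path, [])

{:ok, response} = Gemini.generate("Classify this short snippet", model: model)
```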

Timeouts (HTTP & Streaming)

Advanced Generation Configuration

# Using GenerationConfig struct for complex configurations
config = %Gemini.Types.GenerationConfig{
  temperature: 0.7,
  max_output_tokens: 2000,
  response_json_schema: %{
    "type" => "object",
    "properties" => %{
      "analysis" => %{"type" => "string"},
      "recommendations" => %{"type" => "array", "items" => %{"type" => "string"}}
    }
  },
  response_mime_type: "application/json",
  stop_sequences: ["END", "COMPLETE"],
  presence_penalty: 0.5,
  frequency_penalty: 0.3
}

{:ok, response} = Gemini.generate("Analyze market trends", generation_config: config)

# All generation config options are supported:
{:ok, response} = Gemini.generate("Creative writing task", [
  temperature: 0.9,           # Creativity level
  top_p: 0.8,                # Nucleus sampling
  top_k: 40,                 # Top-k sampling
  candidate_count: 3,        # Multiple responses
  response_logprobs: true,   # Include probabilities
  logprobs: 5               # Token probabilities
])

Structured JSON Outputs

Generate responses that guarantee adherence to a specific JSON Schema:

# Define your schema
schema = %{
  "type" => "object",
  "properties" => %{
    "answer" => %{"type" => "string"},
    "confidence" => %{
      "type" => "number",
      "minimum" => 0.0,
      "maximum" => 1.0
    }
  }
}

# Use the convenient helper
config = Gemini.Types.GenerationConfig.structured_json(schema)

{:ok, response} = Gemini.generate(
  "What is the capital of France?",
  model: "gemini-2.5-flash",
  generation_config: config
)

{:ok, text} = Gemini.extract_text(response)
{:ok, data} = Jason.decode(text)
# => %{"answer" => "Paris", "confidence" => 0.99}

GenerationConfig.structured_json/2 uses response_json_schema (standard JSON Schema) by default. If you need Gemini's internal schema format, pass schema_type: :response_schema:

config =
  GenerationConfig.structured_json(%{"type" => "OBJECT"}, schema_type: :response_schema)

New Features (November 2025):

For Gemini 2.0 models, add explicit property ordering:

config =
  GenerationConfig.structured_json(schema)
  |> GenerationConfig.property_ordering(["answer", "confidence"])

See Structured Outputs Guide for details.

Context Caching (New in v0.6.0!)

Cache large prompts/contexts once and reuse the cached content by name to avoid resending bytes:

alias Gemini.Types.Content

# Create a cache from your content (supports system_instruction, tools, fileUri)
{:ok, cache} =
  Gemini.create_cache(
    [
      Content.text("long document or conversation history"),
      %Content{role: "user", parts: [%{file_uri: "gs://cloud-samples-data/generative-ai/pdf/scene.pdf"}]}
    ],
    display_name: "My Cache",
    model: "gemini-2.5-flash",  # Use models that support caching
    system_instruction: "Answer in one concise paragraph."
  )

# Use cached content by name (e.g., "cachedContents/123")
{:ok, response} =
  Gemini.generate("Summarize the cached content",
    cached_content: cache.name,
    model: "gemini-2.5-flash"
  )

TTL defaults: The default cache TTL is configurable via config :gemini_ex, :context_cache, default_ttl_seconds: ... (defaults to 3_600). You can also override per call with default_ttl_seconds: or pass :ttl/:expire_time explicitly.
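A sketch of the config-level override (the per-call form is shown as a comment; both option names come from the note above):

```elixir
# In config/runtime.exs: raise the default cache TTL to two hours.
config :gemini_ex, :context_cache, default_ttl_seconds: 7_200

# Or per call:
# Gemini.create_cache(contents, model: "gemini-2.5-flash", default_ttl_seconds: 7_200)
```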

Models that support explicit caching:

You can list, get, update TTL, and delete caches via the top-level Gemini.*cache* helpers or Gemini.APIs.ContextCache.*. Vertex AI names are auto-expanded when auth: :vertex_ai or configured credentials are present.
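A sketch of the lifecycle helpers (function names and arities are assumed from the Gemini.APIs.ContextCache.* pattern named above; "cachedContents/123" and the "3600s" duration format are placeholders):

```elixir
alias Gemini.APIs.ContextCache

{:ok, caches} = ContextCache.list()
{:ok, cache}  = ContextCache.get("cachedContents/123")
{:ok, _cache} = ContextCache.update("cachedContents/123", ttl: "3600s")
:ok           = ContextCache.delete("cachedContents/123")
```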

Files API (New in v0.7.0!)

Upload and manage files for use with Gemini models. Perfect for multimodal content generation with images, videos, audio, and documents.

alias Gemini.APIs.Files

# Upload a file (Gemini Developer API only)
{:ok, file} = Files.upload("path/to/image.png", auth: :gemini)

# Use the File struct directly in content generation
{:ok, response} = Gemini.generate([file, "What's in this image?"])

# Wait for processing only when the file is still processing
{:ok, video} = Files.upload("path/to/video.mp4", auth: :gemini)
{:ok, ready_video} = Files.wait_for_processing(video.name, auth: :gemini)
{:ok, video_response} = Gemini.generate([ready_video, "Describe this video clip"])

# List all files
{:ok, files} = Files.list_all(auth: :gemini)

# Clean up
:ok = Files.delete(file.name, auth: :gemini)
:ok = Files.delete(video.name, auth: :gemini)

Key Features:

Note: The Files API is available only on the Gemini Developer API. It is not supported on Vertex AI.

See Files API Guide for complete documentation.

File Search Stores (New in v0.8.x!)

Create semantic search stores for RAG and ground model responses with your own data (Vertex AI only).

alias Gemini.APIs.FileSearchStores
alias Gemini.Types.CreateFileSearchStoreConfig

# Create and activate a store
config = %CreateFileSearchStoreConfig{display_name: "Support KB"}
{:ok, store} = FileSearchStores.create(config, auth: :vertex_ai)
{:ok, active_store} = FileSearchStores.wait_for_active(store.name)

# Upload documents directly to the store and wait for indexing
{:ok, doc} =
  FileSearchStores.upload_to_store(active_store.name, "docs/faq.pdf",
    display_name: "Support FAQ",
    auth: :vertex_ai
  )

{:ok, _} = FileSearchStores.wait_for_document(doc.name, auth: :vertex_ai)

# Use the store to ground a generation request
{:ok, response} =
  Gemini.generate_content(
    "What is the warranty policy for the Pro model?",
    tools: [%{file_search_stores: [active_store.name]}],
    auth: :vertex_ai
  )

Key features: automatic chunking/indexing, upload/import existing Files API uploads or GCS URIs, list/delete stores, and helpers to wait for readiness.

Documents API (New in v0.7.0!)

Manage the documents inside your File Search Stores.

alias Gemini.APIs.Documents

# List and inspect documents
{:ok, page} = Documents.list("ragStores/support-kb", auth: :vertex_ai)
{:ok, doc} = Documents.get("ragStores/support-kb/documents/doc123", auth: :vertex_ai)

# Wait for processing and clean up
{:ok, ready_doc} = Documents.wait_for_processing(doc.name, on_status: &IO.inspect/1, auth: :vertex_ai)
:ok = Documents.delete(ready_doc.name, auth: :vertex_ai)

List helpers (list_all/2) collapse pagination, and wait helpers make it easy to block until documents are indexed.
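For example, gathering every document without manual pagination (a sketch; list_all/2 is assumed to mirror the list/2 signature shown above):

```elixir
{:ok, docs} = Documents.list_all("ragStores/support-kb", auth: :vertex_ai)
Enum.each(docs, &IO.puts(&1.name))
```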

Batches API (New in v0.7.0!)

Submit large batches of requests with 50% cost savings. Ideal for bulk processing, overnight jobs, and high-volume workloads.

alias Gemini.APIs.{Files, Batches}
alias Gemini.Types.BatchJob

# 1. Upload input file (JSONL format)
{:ok, input} = Files.upload("requests.jsonl")

# 2. Create batch job
{:ok, batch} = Batches.create("gemini-2.5-flash",
  file_name: input.name,
  display_name: "My Batch"
)

# 3. Wait for completion with progress
{:ok, completed} = Batches.wait(batch.name,
  on_progress: fn b ->
    if progress = BatchJob.get_progress(b) do
      IO.puts("Progress: #{Float.round(progress, 1)}%")
    end
  end
)

# 4. Check results
if BatchJob.succeeded?(completed) do
  IO.puts("Completed #{completed.completion_stats.success_count} requests")
end

Key Features:

See Batches API Guide for complete documentation.

Operations API (New in v0.7.0!)

Track and manage long-running operations like video generation, file imports, and model tuning.

alias Gemini.APIs.Operations
alias Gemini.Types.Operation

# Check operation status
{:ok, op} = Operations.get("operations/abc123")
IO.puts("Done: #{op.done}")

# Wait for completion with exponential backoff
{:ok, completed} = Operations.wait_with_backoff("operations/abc123",
  initial_delay: 1_000,
  max_delay: 60_000,
  timeout: 3_600_000,
  on_progress: fn op ->
    if progress = Operation.get_progress(op) do
      IO.puts("Progress: #{progress}%")
    end
  end
)

if Operation.succeeded?(completed) do
  IO.inspect(completed.response)
end

Key Features:

See Operations API Guide for complete documentation.

Tunings API (New in v0.8.x!)

Fine-tune base models with supervised datasets (Vertex AI).

alias Gemini.APIs.Tunings
alias Gemini.Types.Tuning.CreateTuningJobConfig

config = %CreateTuningJobConfig{
  base_model: "gemini-2.5-flash-001",
  tuned_model_display_name: "support-bot",
  training_dataset_uri: "gs://bucket/train.jsonl",
  validation_dataset_uri: "gs://bucket/val.jsonl",
  epoch_count: 3,
  learning_rate_multiplier: 1.0,
  adapter_size: "x1"
}

{:ok, job} = Tunings.tune(config, auth: :vertex_ai)
{:ok, completed} = Tunings.wait_for_completion(job.name, auth: :vertex_ai)
IO.puts("Tuned model: #{completed.tuned_model}")

You also get list/1, list_all/1, get/2, and cancel/2 helpers plus polling and progress callbacks.
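A sketch of those helpers (the job name is a placeholder; option shapes are assumed to match the arities listed above):

```elixir
{:ok, jobs} = Tunings.list_all(auth: :vertex_ai)
{:ok, job}  = Tunings.get("tuningJobs/123", auth: :vertex_ai)
:ok         = Tunings.cancel("tuningJobs/123", auth: :vertex_ai)
```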

RegisterFiles API (Gemini API only)

Register existing GCS files with the Gemini API without uploading. Ideal for large files already in Cloud Storage.

alias Gemini.APIs.Files

# Fetch an OAuth token with read access to the bucket
{:ok, token} = Goth.fetch(MyApp.Goth)

# Register GCS files
{:ok, response} = Files.register_files(
  ["gs://my-bucket/documents/report.pdf", "gs://my-bucket/images/photo.jpg"],
  credentials: %{token: token.token},
  auth: :gemini
)

# Use registered files in generation
Enum.each(response.files, fn file ->
  IO.puts("Registered: #{file.name} - #{file.uri}")
end)

file = hd(response.files)
{:ok, response} = Gemini.generate([
  file,
  "Summarize this document"
])

Note: This feature is only available in the Gemini Developer API, not Vertex AI. Pass either %{token: "..."} or a Goth.Token-style struct, and ensure the token can read the referenced GCS objects.

Model Armor (Vertex AI only)

Enterprise content filtering with centralized policy management. Apply Model Armor templates to filter prompt and response content.

alias Gemini.Types.ModelArmorConfig

config = %ModelArmorConfig{
  prompt_template_name: "projects/my-project/locations/us-central1/templates/prompt-filter",
  response_template_name: "projects/my-project/locations/us-central1/templates/response-filter"
}

# Use in generate request (Vertex AI only)
{:ok, response} = Gemini.generate("Hello world",
  auth: :vertex_ai,
  model_armor_config: config
)

Important:

Multi-turn Conversations

# Create a chat session
{:ok, session} = Gemini.create_chat_session([
  model: "gemini-flash-lite-latest",
  system_instruction: "You are a helpful programming assistant."
])

# Send messages
{:ok, response1} = Gemini.send_message(session, "What is functional programming?")
{:ok, response2} = Gemini.send_message(session, "Show me an example in Elixir")

# Get conversation history
history = Gemini.get_conversation_history(session)

Tool Calling (Function Calling)

Tool calling enables the Gemini model to interact with external functions and APIs, making it possible to build powerful agents that can perform actions, retrieve real-time data, and integrate with your systems. This transforms the model from a text generator into an intelligent agent capable of complex workflows.

Automatic Execution (Recommended)

The automatic tool calling system provides the easiest and most robust way to use tools. It handles the entire multi-turn conversation loop automatically, executing tool calls and managing the conversation state behind the scenes.

Step 1: Define & Register Your Tools

# Define your tool functions
defmodule DemoTools do
  def get_weather(%{"location" => location}) do
    # Your weather API integration here
    %{
      location: location,
      temperature: 22,
      condition: "sunny",
      humidity: 65
    }
  end

  def calculate(%{"operation" => op, "a" => a, "b" => b}) do
    result = case op do
      "add" -> a + b
      "multiply" -> a * b
      "divide" when b != 0 -> a / b
      _ -> {:error, "Invalid operation"}
    end
    
    %{operation: op, result: result}
  end
end

# Create function declarations
{:ok, weather_declaration} = Altar.ADM.new_function_declaration(%{
  name: "get_weather",
  description: "Gets current weather information for a specified location",
  parameters: %{
    type: "object",
    properties: %{
      location: %{
        type: "string",
        description: "The location to get weather for (e.g., 'San Francisco')"
      }
    },
    required: ["location"]
  }
})

{:ok, calc_declaration} = Altar.ADM.new_function_declaration(%{
  name: "calculate",
  description: "Performs basic mathematical calculations",
  parameters: %{
    type: "object",
    properties: %{
      operation: %{type: "string", enum: ["add", "multiply", "divide"]},
      a: %{type: "number", description: "First operand"},
      b: %{type: "number", description: "Second operand"}
    },
    required: ["operation", "a", "b"]
  }
})

# Register the tools
Gemini.Tools.register(weather_declaration, &DemoTools.get_weather/1)
Gemini.Tools.register(calc_declaration, &DemoTools.calculate/1)

Step 2: Call the Model

# Single call with automatic tool execution
{:ok, response} = Gemini.generate_content_with_auto_tools(
  "What's the weather like in Tokyo? Also calculate 15 * 23.",
  tools: [weather_declaration, calc_declaration],
  model: "gemini-flash-lite-latest",
  temperature: 0.1
)

Step 3: Get the Final Result

# Extract the final text response
{:ok, text} = Gemini.extract_text(response)
IO.puts(text)
# Output: "The weather in Tokyo is sunny with 22°C and 65% humidity. 
#          The calculation of 15 * 23 equals 345."

The model automatically:

Streaming with Automatic Execution

For real-time responses with tool calling:

# Start streaming with automatic tool execution
{:ok, stream_id} = Gemini.stream_generate_with_auto_tools(
  "Check the weather in London and calculate the tip for a $50 meal",
  tools: [weather_declaration, calc_declaration],
  model: "gemini-flash-lite-latest"
)

# Subscribe to the stream
:ok = Gemini.subscribe_stream(stream_id)

# The subscriber will only receive the final text chunks
# All tool execution happens automatically in the background
receive do
  {:stream_event, ^stream_id, event} -> 
    case Gemini.extract_text(event) do
      {:ok, text} -> IO.write(text)
      _ -> :ok
    end
  {:stream_complete, ^stream_id} -> IO.puts("\n✅ Complete!")
end

Built-in Tools (Gemini 3)

Gemini 3 models can call built-in tools for Google Search, URL Context, and Code Execution. Enable them in tools: and optionally combine with structured outputs:

{:ok, response} =
  Gemini.generate(
    "Find the latest Elixir release notes and summarize the key changes.",
    model: "gemini-3-flash-preview",
    tools: [:google_search, :url_context],
    response_mime_type: "application/json",
    response_json_schema: %{
      "type" => "object",
      "properties" => %{
        "summary" => %{"type" => "string"},
        "sources" => %{"type" => "array", "items" => %{"type" => "string"}}
      },
      "required" => ["summary"]
    }
  )

Built-in tools can be mixed with your own function declarations in the same tools: list.
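For example, reusing the weather_declaration registered in the tool-calling section alongside a built-in tool (a sketch):

```elixir
{:ok, response} =
  Gemini.generate(
    "Check today's weather in Tokyo and find one related news story.",
    model: "gemini-3-flash-preview",
    tools: [:google_search, weather_declaration]
  )
```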

Manual Execution (Advanced)

For advanced use cases requiring full control over the conversation loop, custom state management, or detailed logging of tool executions:

# Step 1: Generate content with tool declarations
{:ok, response} = Gemini.generate_content(
  "What's the weather in Paris?",
  tools: [weather_declaration],
  model: "gemini-flash-lite-latest"
)

# Step 2: Check for function calls in the response
case response.candidates do
  [%{content: %{parts: parts}}] ->
    function_calls = Enum.filter(parts, &match?(%{function_call: _}, &1))
    
    if function_calls != [] do
      # Step 3: Execute the function calls
      {:ok, tool_results} = Gemini.Tools.execute_calls(function_calls)
      
      # Step 4: Create content from tool results
      tool_content = Gemini.Types.Content.from_tool_results(tool_results)
      
      # Step 5: Continue the conversation with results
      conversation_history = [
        %{role: "user", parts: [%{text: "What's the weather in Paris?"}]},
        response.candidates |> hd() |> Map.get(:content),
        tool_content
      ]
      
      {:ok, final_response} = Gemini.generate_content(
        conversation_history,
        model: "gemini-flash-lite-latest"
      )
      
      {:ok, text} = Gemini.extract_text(final_response)
      IO.puts(text)
    end
end

This manual approach gives you complete visibility and control over each step of the tool calling process, which can be valuable for debugging, logging, or implementing custom conversation management logic.

Embeddings (New in v0.3.0!)

Generate semantic embeddings for text to power RAG systems, semantic search, classification, and more.

Quick Start

# Generate an embedding
{:ok, response} = Gemini.embed_content("Hello, world!")
values = response.embedding.values  # [0.123, -0.456, ...]

# Compute similarity
alias Gemini.Types.Response.ContentEmbedding

{:ok, resp1} = Gemini.embed_content("The cat sat on the mat")
{:ok, resp2} = Gemini.embed_content("A feline rested on the rug")

# Normalize for accurate similarity (required for non-3072 dimensions)
norm1 = ContentEmbedding.normalize(resp1.embedding)
norm2 = ContentEmbedding.normalize(resp2.embedding)

similarity = ContentEmbedding.cosine_similarity(norm1, norm2)
# => 0.85 (high similarity)

MRL (Matryoshka Representation Learning)

The gemini-embedding-001 model supports flexible dimensions (128-3072) with minimal quality loss:

# 768 dimensions - RECOMMENDED (25% storage, 0.26% quality loss)
{:ok, response} = Gemini.embed_content(
  "Your text",
  model: "gemini-embedding-001",
  output_dimensionality: 768
)

# 1536 dimensions - High quality (50% storage, same MTEB score as 3072!)
{:ok, response} = Gemini.embed_content(
  "Your text",
  output_dimensionality: 1536
)

MTEB Benchmark Scores:

Task Types for Better Quality

Optimize embeddings for your specific use case:

# For knowledge base documents
{:ok, doc_emb} = Gemini.embed_content(
  document_text,
  task_type: "RETRIEVAL_DOCUMENT",
  title: "Document Title"  # Improves quality!
)

# For search queries
{:ok, query_emb} = Gemini.embed_content(
  user_query,
  task_type: "RETRIEVAL_QUERY"
)

# For classification
{:ok, emb} = Gemini.embed_content(
  text,
  task_type: "CLASSIFICATION"
)

Distance Metrics

alias Gemini.Types.Response.ContentEmbedding

# Cosine similarity (higher = more similar, -1 to 1)
similarity = ContentEmbedding.cosine_similarity(emb1, emb2)

# Euclidean distance (lower = more similar, 0 to ∞)
distance = ContentEmbedding.euclidean_distance(emb1, emb2)

# Dot product (equals cosine for normalized embeddings)
dot = ContentEmbedding.dot_product(emb1, emb2)

# L2 norm (should be ~1.0 after normalization)
norm = ContentEmbedding.norm(embedding)
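The dot-product/cosine relationship can be verified with plain list arithmetic, independent of the library's ContentEmbedding struct (a standalone sketch):

```elixir
defmodule EmbeddingMath do
  # Plain-list versions of the operations above, for illustration only.
  def dot(a, b), do: Enum.zip(a, b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
  def norm(a), do: :math.sqrt(dot(a, a))

  def normalize(a) do
    n = norm(a)
    Enum.map(a, &(&1 / n))
  end

  def cosine(a, b), do: dot(a, b) / (norm(a) * norm(b))
end

a = EmbeddingMath.normalize([3.0, 4.0])
b = EmbeddingMath.normalize([4.0, 3.0])

EmbeddingMath.cosine(a, b)   # ≈ 0.96
EmbeddingMath.dot(a, b)      # ≈ 0.96 (equals cosine once inputs are normalized)
EmbeddingMath.norm(a)        # ≈ 1.0 after normalization
```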

Batch Embedding

Efficient for multiple texts:

texts = ["Text 1", "Text 2", "Text 3"]
{:ok, response} = Gemini.batch_embed_contents(
  "gemini-embedding-001",
  texts,
  task_type: "RETRIEVAL_DOCUMENT"
)

# Access embeddings
embeddings = response.embeddings  # List of ContentEmbedding structs

Advanced Use Cases

Complete production-ready examples in examples/use_cases/:

See examples/EMBEDDINGS.md for comprehensive documentation.

Critical: Normalization

IMPORTANT: Only 3072-dimensional embeddings are pre-normalized. All other dimensions MUST be normalized before computing similarity:

# WRONG - Produces incorrect similarity scores
similarity = ContentEmbedding.cosine_similarity(emb1, emb2)

# CORRECT - Normalize first for non-3072 dimensions
norm1 = ContentEmbedding.normalize(emb1)
norm2 = ContentEmbedding.normalize(emb2)
similarity = ContentEmbedding.cosine_similarity(norm1, norm2)

Async Batch Embedding (New in v0.3.1!)

For production-scale embedding generation with 50% cost savings:

# Submit large batch asynchronously
{:ok, batch} = Gemini.async_batch_embed_contents(
  texts,
  display_name: "Knowledge Base Index",
  task_type: :retrieval_document,
  output_dimensionality: 768
)

# Poll for completion with progress tracking
{:ok, completed_batch} = Gemini.await_batch_completion(
  batch.name,
  poll_interval: 10_000,  # 10 seconds
  timeout: 30 * 60 * 1000,  # 30 minutes
  on_progress: fn b ->
    progress = b.batch_stats.successful_request_count / b.batch_stats.request_count * 100
    IO.puts("Progress: #{Float.round(progress, 1)}%")
  end
)

# Retrieve embeddings
{:ok, embeddings} = Gemini.get_batch_embeddings(completed_batch)

When to use:

Live Examples:

mix run examples/async_batch_embedding_demo.exs
mix run examples/async_batch_production_demo.exs

See examples/ASYNC_BATCH_EMBEDDINGS.md for complete guide.

Examples

The repository includes comprehensive examples demonstrating all library features. All examples are ready to run and include proper error handling.

Running Examples

All examples use the same execution method:

mix run examples/[example_name].exs

Available Examples

1. demo.exs - Comprehensive Feature Showcase

The main library demonstration covering all core features.

mix run examples/demo.exs

Features demonstrated:

Requirements: GEMINI_API_KEY environment variable


2. streaming_demo.exs - Real-time Streaming

Live demonstration of Server-Sent Events streaming with progressive text delivery.

mix run examples/streaming_demo.exs

Features demonstrated:

Requirements: GEMINI_API_KEY or Vertex AI credentials


3. demo_unified.exs - Multi-Auth Architecture

Showcases the unified architecture supporting multiple authentication methods.

mix run examples/demo_unified.exs

Features demonstrated:

Requirements: None (works with or without credentials)


4. multi_auth_demo.exs - Concurrent Authentication

Demonstrates concurrent usage of multiple authentication strategies.

mix run examples/multi_auth_demo.exs

Features demonstrated:

Requirements: GEMINI_API_KEY recommended (demonstrates Vertex AI auth failure)


5. telemetry_showcase.exs - Comprehensive Telemetry System

Complete demonstration of the built-in telemetry and observability features.

mix run examples/telemetry_showcase.exs

Features demonstrated:

Requirements: GEMINI_API_KEY for live telemetry (works without for utilities demo)


6. auto_tool_calling_demo.exs - Automatic Tool Execution (Recommended)

Demonstrates the powerful automatic tool calling system for building intelligent agents.

mix run examples/auto_tool_calling_demo.exs

Features demonstrated:

Requirements: GEMINI_API_KEY for live tool execution


7. tool_calling_demo.exs - Manual Tool Execution

Shows manual control over the tool calling conversation loop for advanced use cases.

mix run examples/tool_calling_demo.exs

Features demonstrated:

Requirements: GEMINI_API_KEY for live tool execution


8. manual_tool_calling_demo.exs - Advanced Manual Tool Control

Comprehensive manual tool calling patterns for complex agent workflows.

mix run examples/manual_tool_calling_demo.exs

Features demonstrated:

Requirements: GEMINI_API_KEY for live tool execution


9. live_auto_tool_test.exs - Live End-to-End Tool Calling Test LIVE EXAMPLE

A comprehensive live test demonstrating real automatic tool execution with the Gemini API.

mix run examples/live_auto_tool_test.exs

Features demonstrated:

What makes this special:

Requirements: GEMINI_API_KEY environment variable (this is a live API test)

Example output:

SUCCESS! Final Response from Gemini:
The `Enum` module in Elixir is a powerful tool for working with collections...
Based on the information retrieved using `get_elixir_module_info`, here's a breakdown:
1. Main Purpose: Provides consistent iteration over enumerables (lists, maps, ranges)
2. Common Functions: map/2, filter/2, reduce/3, sum/1, sort/1...
3. Usefulness: Unified interface, functional programming, high performance...

10. live_api_test.exs - API Testing and Validation

Comprehensive testing utility for validating both authentication methods.

mix run examples/live_api_test.exs

Features demonstrated:

Requirements: GEMINI_API_KEY and/or Vertex AI credentials


11. 11_live_text_chat.exs - Live API Multi-Turn Text Chat

Text-oriented chat UX over a Live audio session with output transcription.

mix run examples/11_live_text_chat.exs

Features demonstrated:

Requirements: GEMINI_API_KEY environment variable


12. 12_live_audio_streaming.exs - Live API Audio Streaming

Audio input/output streaming for voice applications.

mix run examples/12_live_audio_streaming.exs

Features demonstrated:

Requirements: GEMINI_API_KEY environment variable


13. 13_live_session_resumption.exs - Live API Session Resumption

Reconnect to sessions while preserving conversation context.

mix run examples/13_live_session_resumption.exs

Features demonstrated:

Requirements: GEMINI_API_KEY environment variable


14. 14_live_function_calling.exs - Live API Function Calling

Tool/function calling with full telemetry observability.

mix run examples/14_live_function_calling.exs

Features demonstrated:

Requirements: GEMINI_API_KEY environment variable

Example Output

Each example provides detailed output with:

Setting Up Authentication

For the examples to work with live API calls, set up authentication:

# For Gemini API (recommended for examples)
export GEMINI_API_KEY="your_gemini_api_key"

# For Vertex AI (optional, for multi-auth demos)
export VERTEX_JSON_FILE="/path/to/service-account.json"
# Alternative (standard ADC path)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
export VERTEX_PROJECT_ID="your-gcp-project-id"

Example Development Pattern

The examples follow a consistent pattern:

Authentication

Gemini API Key (Recommended for Development)

Default Gemini API key resolution, from lowest to highest precedence:

export GEMINI_API_KEY="your_api_key"
config :gemini_ex, api_key: "your_api_key"

Gemini.generate("Hello", api_key: "specific_key")

{:ok, session} =
  Gemini.Live.Session.start_link(
    model: Gemini.Live.Models.resolve(:audio),
    api_key: "session_specific_key"
  )

Vertex AI (Recommended for Production)

# Service Account JSON file
export VERTEX_SERVICE_ACCOUNT="/path/to/service-account.json"
export VERTEX_PROJECT_ID="your-gcp-project"
export VERTEX_LOCATION="us-central1"

# Application config
config :gemini_ex, :auth,
  type: :vertex_ai,
  credentials: %{
    service_account_key: System.get_env("VERTEX_SERVICE_ACCOUNT"),
    project_id: System.get_env("VERTEX_PROJECT_ID"),
    location: "us-central1"
  }

Application Default Credentials (ADC)

Zero-config GCP authentication with automatic credential discovery and token refresh.

# Configure ADC explicitly (optional)
export GOOGLE_APPLICATION_CREDENTIALS_JSON='{"type":"service_account",...}'  # gemini_ex extension
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service_account.json"
# Works on GCE/Cloud Run/GKE with no extra setup
{:ok, response} = Gemini.generate("Hello from Vertex AI", auth: :vertex_ai)

# Also works with either env var above
{:ok, response} = Gemini.generate("Hello", auth: :vertex_ai)

The client checks GOOGLE_APPLICATION_CREDENTIALS_JSON (gemini_ex extension), GOOGLE_APPLICATION_CREDENTIALS (standard ADC), gcloud user credentials, and metadata server endpoints, caching access tokens for you via ETS. Official Google ADC order starts with GOOGLE_APPLICATION_CREDENTIALS; JSON-content env support is a gemini_ex convenience.
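The resolution order described above can be sketched as a `cond` (for illustration only; the library performs this check internally, and the gcloud credentials path shown is the standard ADC location):

```elixir
# Illustrative only: mirrors the credential resolution order described
# above. gemini_ex does this for you, including token caching in ETS.
credential_source =
  cond do
    System.get_env("GOOGLE_APPLICATION_CREDENTIALS_JSON") -> :env_json
    System.get_env("GOOGLE_APPLICATION_CREDENTIALS") -> :env_file
    File.exists?(Path.expand("~/.config/gcloud/application_default_credentials.json")) -> :gcloud_user
    true -> :metadata_server
  end
```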

Model Configuration System

The library includes an intelligent model registry that handles the differences between Gemini API (AI Studio) and Vertex AI.

Auth-Aware Model Defaults

Default models are automatically selected based on detected authentication:

# With GEMINI_API_KEY set:
Gemini.Config.default_model()        #=> "gemini-flash-lite-latest"
Gemini.Config.default_embedding_model()  #=> "gemini-embedding-001"

# With VERTEX_PROJECT_ID set (no GEMINI_API_KEY):
Gemini.Config.default_model()        #=> "gemini-2.5-flash-lite"
Gemini.Config.default_embedding_model()  #=> "embeddinggemma"

Model Compatibility

Models are organized by API compatibility:

| Category | Example Models | Gemini API | Vertex AI |
|---|---|---|---|
| Universal | `gemini-2.5-flash`, `gemini-3-flash-preview` | ✓ | ✓ |
| AI Studio Only | `gemini-flash-lite-latest`, `gemini-pro-latest` | ✓ | ✗ |
| Vertex AI Only | `embeddinggemma`, `embeddinggemma-300m` | ✗ | ✓ |
# Check model availability
Gemini.Config.model_available?(:flash_2_5, :vertex_ai)     #=> true
Gemini.Config.model_available?(:flash_lite_latest, :vertex_ai) #=> false

# Get models for a specific API
Gemini.Config.models_for(:vertex_ai)  # All Vertex-compatible models
Gemini.Config.models_for(:both)       # Only universal models

# Get model by key with validation
Gemini.Config.get_model(:flash_2_5)  #=> "gemini-2.5-flash"
Gemini.Config.get_model(:flash_2_5, api: :vertex_ai)  # Validates compatibility

# Registry metadata helpers
Gemini.Config.model_info(:pro_3_1_preview)
Gemini.Config.model_supports?(:pro_3_1_preview, :thinking)     #=> true
Gemini.Config.models_with_capability(:live_api, :supported)

Embedding Model Differences

Embedding models differ significantly between APIs:

| Model | API | Default Dims | Task Type Handling |
|---|---|---|---|
| `gemini-embedding-001` | Gemini API | 3072 | `taskType` parameter |
| `embeddinggemma` | Vertex AI | 768 | Prompt prefixes |
# Gemini API - uses taskType parameter
{:ok, emb} = Gemini.embed_content("Search query",
  task_type: :retrieval_query  # Sent as API parameter
)

# Vertex AI with EmbeddingGemma - task embedded in prompt
{:ok, emb} = Gemini.embed_content("Search query",
  task_type: :retrieval_query  # Becomes: "task: search result | query: Search query"
)

The library handles this automatically based on detected authentication.

Custom Model Configuration

Override defaults in your application config:

config :gemini_ex,
  default_model: "gemini-2.5-flash",
  default_embedding_model: "gemini-embedding-001"

Or specify per-request:

Gemini.generate("Hello", model: "gemini-3.1-pro-preview")
Gemini.embed_content("Text", model: "gemini-embedding-001")

Documentation

Architecture

The library features a modular, layered architecture:

Telemetry

The library emits telemetry events for observability and monitoring. Attach handlers to observe API calls and Live API sessions.

Live API Telemetry Events

# Attach a handler for Live API messages
:telemetry.attach(
  "my-live-handler",
  [:gemini, :live, :session, :message, :received],
  fn event, measurements, metadata, _config ->
    IO.inspect({event, measurements, metadata})
  end,
  nil
)

Available Live API events:

See examples/telemetry_showcase.exs for a complete telemetry integration example.

Advanced Usage

Complete Generation Configuration Support

All generation config options are fully supported across all API entry points:

# Structured output with JSON schema
{:ok, response} = Gemini.generate("Analyze this data", [
  response_json_schema: %{
    "type" => "object",
    "properties" => %{
      "summary" => %{"type" => "string"},
      "insights" => %{"type" => "array", "items" => %{"type" => "string"}}
    }
  },
  response_mime_type: "application/json"
])

# Creative writing with advanced controls
{:ok, response} = Gemini.generate("Write a story", [
  temperature: 0.9,
  top_p: 0.8,
  top_k: 40,
  presence_penalty: 0.6,
  frequency_penalty: 0.4,
  stop_sequences: ["THE END", "EPILOGUE"]
])

Custom Model Configuration

# List available models
{:ok, models} = Gemini.list_models()

# Get model details
{:ok, model_info} = Gemini.get_model("gemini-flash-lite-latest")

# Count tokens
{:ok, token_count} = Gemini.count_tokens("Your text here", model: "gemini-flash-lite-latest")

Model quick picks

Multimodal Content (New in v0.2.2!)

The library now accepts multiple intuitive input formats for images and text:

# Anthropic-style format (flexible and intuitive)
content = [
  %{type: "text", text: "What's in this image?"},
  %{type: "image", source: %{type: "base64", data: base64_image}}
]

{:ok, response} = Gemini.generate(content)

# Automatic MIME type detection from image data
{:ok, image_data} = File.read("photo.png")
content = [
  %{type: "text", text: "Describe this photo"},
  %{type: "image", source: %{type: "base64", data: Base.encode64(image_data)}}
  # No mime_type needed - auto-detected as image/png!
]

# Or use the original Content struct format
alias Gemini.Types.{Content, Part}

content = [
  Content.text("What is this?"),
  Content.image("path/to/image.png")
]

{:ok, response} = Gemini.generate(content)

# Mix and match formats in a single request
content = [
  "Describe this image:",                    # Simple string
  %{type: "image", source: %{...}},          # Anthropic-style
  %Content{role: "user", parts: [...]}       # Content struct
]

Supported image formats: PNG, JPEG, GIF, WebP (auto-detected from magic bytes)

Image Generation API (Imagen)

Use the dedicated Imagen endpoints for text-to-image, editing, and upscaling. As of v0.10.0, auth: :vertex_ai is set automatically on all Images API calls.

alias Gemini.APIs.Images
alias Gemini.Types.Generation.Image.{ImageGenerationConfig, EditImageConfig, UpscaleImageConfig}

# Text-to-image
{:ok, images} =
  Images.generate(
    "An isometric illustration of a futuristic Elixir server farm",
    %ImageGenerationConfig{
      number_of_images: 2,
      aspect_ratio: "16:9",
      safety_filter_level: :standard
    },
    auth: :vertex_ai
  )

# Inpainting / editing with masks
{:ok, edited} =
  Images.edit(
    "Remove the logos and brighten the lighting",
    File.read!("assets/sample.png"),
    File.read!("assets/mask.png"),
    %EditImageConfig{edit_mode: :inpainting},
    auth: :vertex_ai
  )

# Upscale existing images (2x or 4x)
{:ok, sharp} =
  Images.upscale(
    File.read!("assets/sample.png"),
    %UpscaleImageConfig{upscale_factor: :x4},
    auth: :vertex_ai
  )

The API returns base64 image data plus metadata; you can also pull from GCS/HTTP URIs and control person_generation, safety filters, and aspect ratios.

Video Generation API (Veo, New in v0.8.x!)

Generate short-form videos with Veo via Vertex AI.

alias Gemini.APIs.Videos
alias Gemini.Types.Generation.Video.VideoGenerationConfig

{:ok, op} =
  Videos.generate(
    "A cinematic drone shot over misty mountains at sunrise",
    %VideoGenerationConfig{duration_seconds: 8, aspect_ratio: "16:9"},
    auth: :vertex_ai
  )

{:ok, completed} = Videos.wait_for_completion(op.name, auth: :vertex_ai)
IO.inspect(completed.response)

Use get_operation/2 or list_operations/1 to poll or enumerate jobs, and cancel/2 to stop a run mid-flight.
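If you prefer manual polling over wait_for_completion/2, a loop over get_operation/2 might look like this (the `:done` field is an assumption about the long-running-operation shape; check the Videos docs for the actual struct):

```elixir
defmodule VideoPoller do
  alias Gemini.APIs.Videos

  # Poll a long-running Veo operation until it reports completion.
  # Assumes the operation result exposes a :done flag.
  def poll(operation_name, interval_ms \\ 10_000) do
    case Videos.get_operation(operation_name, auth: :vertex_ai) do
      {:ok, %{done: true} = op} ->
        {:ok, op}

      {:ok, _still_running} ->
        Process.sleep(interval_ms)
        poll(operation_name, interval_ms)

      {:error, _} = error ->
        error
    end
  end
end
```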

Inline Image Generation (Gemini 3 models)

Generate images with aspect ratio and resolution control:

config = Gemini.Types.GenerationConfig.image_config(
  aspect_ratio: "16:9",
  image_size: "4K"
)

{:ok, response} =
  Gemini.generate("A sunrise over the mountains",
    model: "gemini-3-pro-image-preview",
    generation_config: config
  )

images = Gemini.Types.Response.Image.extract_base64(response)
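Assuming extract_base64/1 returns a list of base64-encoded image payloads (a reading of the call above), the results can be written to disk:

```elixir
# Write each generated image to a file. Assumes extract_base64/1
# returns a list of base64-encoded PNG payloads.
images
|> Enum.with_index()
|> Enum.each(fn {b64, i} ->
  File.write!("sunrise_#{i}.png", Base.decode64!(b64))
end)
```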

Cost Optimization with Thinking Budgets (New in v0.2.2!)

Gemini 2.5 series models use internal "thinking" for complex reasoning. Control thinking token usage to optimize costs:

# Disable thinking for simple tasks (save costs)
{:ok, response} = Gemini.generate(
  "What is 2 + 2?",
  model: "gemini-2.5-flash",
  thinking_config: %{thinking_budget: 0}
)
# Result: No thinking tokens charged!

# Set fixed budget (balance cost and quality)
{:ok, response} = Gemini.generate(
  "Write a Python function to sort a list",
  model: "gemini-2.5-flash",
  thinking_config: %{thinking_budget: 1024}
)

# Dynamic thinking (model decides - default behavior)
{:ok, response} = Gemini.generate(
  "Solve this complex problem...",
  model: "gemini-2.5-flash",
  thinking_config: %{thinking_budget: -1}
)

# Get thought summaries (see model's reasoning)
{:ok, response} = Gemini.generate(
  "Explain your reasoning step by step",
  model: "gemini-2.5-flash",
  thinking_config: %{
    thinking_budget: 2048,
    include_thoughts: true
  }
)

# Using GenerationConfig struct
alias Gemini.Types.GenerationConfig

config = GenerationConfig.new()
|> GenerationConfig.thinking_budget(1024)
|> GenerationConfig.include_thoughts(true)
|> GenerationConfig.temperature(0.7)

{:ok, response} = Gemini.generate("prompt", generation_config: config)

Budget ranges by model:

Special values: 0 disables thinking entirely (where the model supports it); -1 enables dynamic thinking, letting the model decide how much to think (the default).

Rate Limiting and Retries (Default ON)

Configure in config :gemini_ex, :rate_limiter:

config :gemini_ex, :rate_limiter,
  max_concurrency_per_model: 4,
  max_attempts: 3,
  base_backoff_ms: 1000,
  jitter_factor: 0.25,
  adaptive_concurrency: false,
  adaptive_ceiling: 8
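As a rough sketch of how max_attempts, base_backoff_ms, and jitter_factor typically interact in an exponential-backoff scheme (an illustration of the general technique, not the library's actual implementation):

```elixir
# Hypothetical backoff calculation: delay doubles per attempt, plus
# random jitter proportional to jitter_factor. Not gemini_ex internals.
defmodule BackoffSketch do
  def delay_ms(attempt, base_backoff_ms \\ 1_000, jitter_factor \\ 0.25) do
    backoff = base_backoff_ms * Integer.pow(2, attempt - 1)
    jitter = :rand.uniform() * jitter_factor * backoff
    round(backoff + jitter)
  end
end
```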

Per-call overrides:

Error Handling

case Gemini.generate("Hello world") do
  {:ok, response} -> 
    # Handle success
    {:ok, text} = Gemini.extract_text(response)
    
  {:error, %Gemini.Error{type: :rate_limit} = error} -> 
    # Handle rate limiting
    IO.puts("Rate limited. Retry after: #{error.retry_after}")
    
  {:error, %Gemini.Error{type: :authentication} = error} -> 
    # Handle auth errors
    IO.puts("Auth error: #{error.message}")
    
  {:error, error} -> 
    # Handle other errors
    IO.puts("Unexpected error: #{inspect(error)}")
end
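Since the library already retries rate-limited requests by default (see "Rate Limiting and Retries"), an application-level wrapper is rarely needed; but if you want your own policy on top, a minimal sketch might look like:

```elixir
defmodule RetryHelper do
  # Retry only on rate-limit errors, up to `attempts` total tries.
  def generate_with_retry(prompt, opts \\ [], attempts \\ 3)

  def generate_with_retry(prompt, opts, 1), do: Gemini.generate(prompt, opts)

  def generate_with_retry(prompt, opts, attempts) do
    case Gemini.generate(prompt, opts) do
      {:ok, response} ->
        {:ok, response}

      {:error, %Gemini.Error{type: :rate_limit}} ->
        Process.sleep(1_000)
        generate_with_retry(prompt, opts, attempts - 1)

      {:error, _} = error ->
        error
    end
  end
end
```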

Testing

# Run all tests
mix test

# Run with coverage
mix test --cover

# Run integration tests (requires API key)
GEMINI_API_KEY="your_key" mix test --only integration

# Run Gemini Live session tests when GEMINI_API_KEY is already exported
mix test --only live_gemini test/gemini/live/session_live_test.exs

# Run Gemini Live feature tests when GEMINI_API_KEY is already exported
mix test --only live_gemini test/gemini/live/features_live_test.exs

# Run billed Vertex Live tests when Vertex credentials are already exported
RUN_BILLED_VERTEX_LIVE_TESTS=1 mix test --only live_vertex_ai test/gemini/live/session_vertex_live_test.exs

If your Vertex project does not expose a compatible Live audio model, the Vertex session tests will skip instead of failing.

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments