Candil
LLM inference and model management for Elixir. Run local models through llama.cpp's llama-server, or remote models through OpenAI, Anthropic, Ollama, and OpenAI-compatible APIs.
Installation
Add candil to the dependency list in mix.exs:

def deps do
  [
    {:candil, "~> 1.0"}
  ]
end
Dependencies
Candil requires:
- :apero - System utilities (included automatically via path in dev)
- :arrea - Parallel execution (included automatically via path in dev)
- :jason - JSON encoding/decoding
- :req - HTTP client
Configuration
Engines
An engine represents a local llama-server binary that serves one model at a time.
# Option 1: precompiled binary (auto-downloaded)
engine = %Candil.Engine{
  alias: :llama_server,
  use_precompiled: true,
  precompiled_version: :latest,
  host: "127.0.0.1",
  port: 8080,
  start_args: ["--n-gpu-layers", "35"]
}

# Option 2: custom binary path
engine = %Candil.Engine{
  alias: :llama_server,
  binary_dir: "/usr/local/bin",
  use_precompiled: false,
  host: "127.0.0.1",
  port: 8080
}

# Register whichever definition you chose
Candil.Config.register_engine(engine)
Models
Local Model
model = %Candil.Model{
  alias: :llama3,
  type: :local,
  model_dir: "/models",
  filename: "llama-3-8b-q4_k_m.gguf",
  download_url: "https://huggingface.co/.../llama-3-8b-q4_k_m.gguf",
  context_size: 8192,
  engine: :llama_server,
  usage: [:chat, :completion],
  model_args: ["--n-gpu-layers", "35"]
}

Candil.Config.register_model(model)
Remote Model
model = %Candil.Model{
  alias: :gpt4o,
  type: :remote,
  name: "gpt-4o",
  context_size: 128_000,
  provider: :openai,
  usage: [:chat, :completion, :embeddings]
}

Candil.Config.register_model(model)
Providers
OpenAI
provider = %Candil.Provider{
  alias: :openai,
  type: :openai,
  base_url: "https://api.openai.com/v1",
  api_key: System.get_env("OPENAI_API_KEY")
}

Candil.Config.register_provider(provider)
Anthropic
provider = %Candil.Provider{
  alias: :anthropic,
  type: :anthropic,
  base_url: "https://api.anthropic.com",
  api_key: System.get_env("ANTHROPIC_API_KEY")
}

Candil.Config.register_provider(provider)
Ollama
provider = %Candil.Provider{
  alias: :ollama,
  type: :ollama,
  base_url: "http://localhost:11434"
}

Candil.Config.register_provider(provider)
OpenAI-Compatible (Groq, LM Studio, etc.)
provider = %Candil.Provider{
  alias: :groq,
  type: :openai_compatible,
  base_url: "https://api.groq.com/openai",
  api_key: System.get_env("GROQ_API_KEY")
}

Candil.Config.register_provider(provider)
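System.get_env/1 returns nil when the variable is unset, so a provider built as in the examples above can carry api_key: nil until the first request fails. A minimal sketch of checking the key up front follows. ProviderEnv and fetch_api_key/1 are hypothetical names used for illustration and are not part of Candil.

```elixir
defmodule ProviderEnv do
  @moduledoc "Hypothetical helper (not part of Candil) for reading API keys."

  # Returns {:ok, key} when the variable is set to a non-empty value,
  # and {:error, {:missing_env, name}} when it is unset or empty.
  def fetch_api_key(name) when is_binary(name) do
    case System.get_env(name) do
      nil -> {:error, {:missing_env, name}}
      "" -> {:error, {:missing_env, name}}
      key -> {:ok, key}
    end
  end
end
```

The returned key can then be passed as api_key: when building the provider struct, and the error tuple handled before anything is registered.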
Usage
Local Model (llama.cpp)
# Download the llama-server binary (happens automatically when use_precompiled: true)
:ok = Candil.download_engine(engine)

# Download the model file
{:ok, _path} = Candil.download_model(model)

# Start the engine with the model loaded
{:ok, _pid} = Candil.start_engine(engine, model)

# Run inference
{:ok, response} =
  Candil.chat(:llama3, [
    %{role: "user", content: "Hello!"}
  ])

IO.puts(response.content)

# Stop when done
:ok = Candil.stop_engine(:llama3)
Remote Model (OpenAI)
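# Run inference directly; no local engine is needed.
# model and provider are the :gpt4o and :openai structs defined under Configuration.
{:ok, response} =
  Candil.chat(model, provider, [
    %{role: "user", content: "Hello!"}
  ])

IO.puts(response.content)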
Streaming Responses
# Local streaming
Candil.stream(
  :llama3,
  [%{role: "user", content: "Write a story"}],
  fn chunk -> IO.write(chunk.content) end
)

# Remote streaming
Candil.stream(
  model,
  provider,
  [%{role: "user", content: "Write a story"}],
  fn chunk -> IO.write(chunk.content) end
)
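The callbacks above only print each chunk. When the full text is also needed afterwards, the callback can accumulate chunks as they arrive. The following is a sketch under the assumption that each chunk is a map with a :content string, as in the examples above. ChunkBuffer is a hypothetical helper, and the Enum.each call stands in for a real streaming call.

```elixir
defmodule ChunkBuffer do
  @moduledoc "Hypothetical helper (not part of Candil) that accumulates streamed chunks."

  # Starts an Agent holding the chunk contents received so far (newest first).
  def start, do: Agent.start_link(fn -> [] end)

  # Returns a one-argument callback that can be passed as the stream handler.
  # A nil content is stored as an empty string.
  def callback(buffer) do
    fn chunk ->
      Agent.update(buffer, fn acc -> [(chunk.content || "") | acc] end)
    end
  end

  # Joins the collected contents, in arrival order, into one string.
  def text(buffer) do
    buffer |> Agent.get(& &1) |> Enum.reverse() |> IO.iodata_to_binary()
  end
end

# Stand-in for a streaming call: invokes the callback once per chunk.
{:ok, buffer} = ChunkBuffer.start()
on_chunk = ChunkBuffer.callback(buffer)
Enum.each([%{content: "Once "}, %{content: "upon "}, %{content: "a time"}], on_chunk)
full_text = ChunkBuffer.text(buffer)
```

An Agent is used because the callback may run in a different process from the caller. If it is known to run in the calling process, a simpler accumulator would do.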
Embeddings
# Local embeddings: the engine must be running and the model's usage list must
# include :embeddings (the :llama3 definition above lists only [:chat, :completion])
{:ok, embeddings} = Candil.embed(:llama3, ["Hello world", "How are you?"])
# Remote embeddings
{:ok, embeddings} = Candil.embed(model, provider, ["Hello world", "How are you?"])
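Embedding vectors are typically compared with cosine similarity. The sketch below assumes each embedding is a plain list of floats; the shape of the value Candil.embed returns is not specified here, so adapt the input accordingly. Similarity is an illustrative module, not part of Candil.

```elixir
defmodule Similarity do
  @moduledoc "Illustrative helper (not part of Candil) for comparing embedding vectors."

  # Cosine similarity of two equal-length, non-zero numeric lists.
  def cosine(a, b) when length(a) == length(b) do
    dot = a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
    dot / (norm(a) * norm(b))
  end

  # Euclidean length of a vector.
  defp norm(v) do
    v |> Enum.reduce(0.0, fn x, acc -> acc + x * x end) |> :math.sqrt()
  end
end

# Toy vectors standing in for real embeddings.
same_direction = Similarity.cosine([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = Similarity.cosine([1.0, 0.0], [0.0, 1.0])
```

Values near 1.0 indicate vectors pointing in the same direction, and values near 0.0 indicate unrelated ones.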
Conversation Management
conv =
  Candil.Conversation.new(
    model: :llama3,
    system: "You are a helpful assistant.",
    max_context_tokens: 4096
  )

{:ok, conv, response} = Candil.Conversation.chat(conv, "What is Elixir?")
IO.puts(response.content)

{:ok, conv, response} = Candil.Conversation.chat(conv, "Give me a code example.")
IO.puts(response.content)
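The max_context_tokens option implies that older messages are dropped once the history grows too large. How Candil.Conversation does this is not documented here. The sketch below shows one common approach, keeping the newest messages that fit a token budget, with a rough four-characters-per-token estimate. ContextTrim and both of its functions are hypothetical, and a system prompt would normally be kept outside the trimmed list.

```elixir
defmodule ContextTrim do
  @moduledoc "Illustrative sketch (not Candil's implementation) of context-window trimming."

  # Rough heuristic: about four characters per token. Real tokenizers differ.
  def estimate_tokens(%{content: content}), do: div(String.length(content), 4) + 1

  # Keeps the most recent messages whose estimated total fits in max_tokens.
  # Messages are assumed to be ordered oldest first; the result keeps that order.
  def trim(messages, max_tokens) do
    {kept, _used} =
      messages
      |> Enum.reverse()
      |> Enum.reduce_while({[], 0}, fn msg, {acc, used} ->
        cost = estimate_tokens(msg)

        if used + cost <= max_tokens do
          {:cont, {[msg | acc], used + cost}}
        else
          {:halt, {acc, used}}
        end
      end)

    kept
  end
end

history = [
  %{role: "user", content: String.duplicate("a", 40)},
  %{role: "assistant", content: String.duplicate("b", 40)},
  %{role: "user", content: "Give me a code example."}
]

# With a budget of 20 estimated tokens, the oldest message no longer fits.
trimmed = ContextTrim.trim(history, 20)
```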
Architecture
- Candil.Llm - Main entry point for all LLM operations
- Candil.Engine - Manages local llama-server processes
- Candil.Engine.Server - GenServer wrapping the llama-server OS process
- Candil.Inference - Handles chat completions and embeddings
- Candil.Stream - SSE streaming support
- Candil.Provider - Remote API provider abstraction (OpenAI, Anthropic, Ollama, and OpenAI-compatible services)
- Candil.Model - Model definitions (local or remote)
- Candil.Config - ETS-based registry for engines, models, and providers
- Candil.Detector - OS/GPU detection for binary selection
- Candil.Installer - Download and extraction utilities
- Candil.Conversation - Conversation history with context window management
- Candil.RequestBuilder - Request body builders for all provider APIs
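
For readers unfamiliar with ETS-based registries such as the one Candil.Config is described as providing, the following sketch shows the general pattern: a named table keyed by kind and alias. It is not Candil.Config's actual code; RegistrySketch, its table name, and its return values are assumptions made for illustration.

```elixir
defmodule RegistrySketch do
  @moduledoc "Illustrative ETS registry (not Candil.Config's actual code)."

  @table :registry_sketch

  # Creates the named table once; later calls are no-ops.
  # The table is owned by the calling process and is deleted when it exits.
  def init do
    if :ets.whereis(@table) == :undefined do
      :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    end

    :ok
  end

  # Stores an item under {kind, alias}, for example {:model, :llama3}.
  def register(kind, %{alias: name} = item) do
    true = :ets.insert(@table, {{kind, name}, item})
    :ok
  end

  # Looks an item up by kind and alias.
  def fetch(kind, name) do
    case :ets.lookup(@table, {kind, name}) do
      [{_key, item}] -> {:ok, item}
      [] -> {:error, :not_found}
    end
  end
end

RegistrySketch.init()
RegistrySketch.register(:model, %{alias: :llama3, type: :local})
found = RegistrySketch.fetch(:model, :llama3)
missing = RegistrySketch.fetch(:provider, :llama3)
```

Keying on {kind, alias} lets an engine, a model, and a provider share the same alias without colliding.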
License
MIT