OmnivoiceEx

Elixir wrapper for OmniVoice — a unified speech generation model from K2-FSA.

Voice Cloning · Voice Design · Multilingual TTS · Deterministic Generation · 24kHz Output

Features

Requirements

Installation

Add to your mix.exs:

def deps do
  [
    {:omnivoice_ex, "~> 0.2.0"}
  ]
end

Then install Python dependencies:

mix omnivoice_ex.setup

Quick Start

# Start the model server
{:ok, pid} = OmnivoiceEx.start_link(device: "cuda")
# Wait for model to load
:ok = OmnivoiceEx.await_ready(pid)
# Generate speech
{:ok, audio} = OmnivoiceEx.generate(pid, "Hello, world!")
# Save to file
:ok = OmnivoiceEx.save(audio, "output.wav")
# Clean shutdown
OmnivoiceEx.stop(pid)

Voice Design

Describe a voice in natural language and OmniVoice generates it:

{:ok, audio} = OmnivoiceEx.generate(pid,
  "Welcome to our luxury resort.",
  instruct: "A warm, professional female concierge with a British accent"
)

Voice Cloning

Clone a voice from a reference audio file:

{:ok, audio} = OmnivoiceEx.generate(pid,
  "This is a cloned voice speaking English.",
  ref_audio: "/path/to/reference.wav",
  ref_text: "Transcript of the reference audio" # optional, improves quality
)

Deterministic / Reproducible Generation (v0.2.0+)

OmnivoiceEx now supports fully deterministic generation for stable outputs across runs. This is useful for:

Key options:

Example:

{:ok, audio} = OmnivoiceEx.generate(pid,
  "This output is fully reproducible.",
  seed: 12345,
  position_temperature: 0.0,
  class_temperature: 0.0
)
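
To confirm that a given set of options really is reproducible in your environment, you can generate the same text twice and compare the results. The sketch below is a hypothetical helper, not part of OmnivoiceEx: the `ReproCheck` module and its function names are invented for illustration, and it assumes the generator returns `{:ok, audio}` where `audio` compares equal across identical runs.

```elixir
defmodule ReproCheck do
  @moduledoc "Hypothetical helper; not part of OmnivoiceEx."

  # Short hex fingerprint of a binary, handy for logging or golden files.
  # MD5 is used only as a cheap checksum here, not for security.
  def fingerprint(bytes) when is_binary(bytes) do
    bytes |> :erlang.md5() |> Base.encode16(case: :lower)
  end

  # Calls `generate_fun` twice and reports whether both results are equal.
  # Any error from either call counts as "not reproducible".
  def reproducible?(generate_fun) when is_function(generate_fun, 0) do
    with {:ok, first} <- generate_fun.(),
         {:ok, second} <- generate_fun.() do
      first == second
    else
      _error -> false
    end
  end
end
```

In practice the function passed in would wrap the call shown above, for example `fn -> OmnivoiceEx.generate(pid, text, seed: 12345, position_temperature: 0.0, class_temperature: 0.0) end`.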

Under the hood (v0.2.0 fix):

Language Selection

Common IDs: zh (Chinese), en (English), ja (Japanese), ko (Korean), yue (Cantonese), fr (French), de (German), es (Spanish), ru (Russian), pt (Portuguese), it (Italian), th (Thai), vi (Vietnamese), hi (Hindi), ar (Arabic), nl (Dutch), pl (Polish), sv (Swedish), tr (Turkish).

Full list of 646 languages: OmniVoice docs/languages.md

Generation Options

Architecture

Elixir (GenServer) ←→ Erlang Port ←→ Python Bridge ←→ OmniVoice Model
                      (stdin/stdout)    (msgpack framed)

The bridge uses MessagePack binary framing over an Erlang Port. Audio travels as raw WAV bytes inside the msgpack payload, which avoids the roughly 33% size overhead that base64 encoding adds in JSON-based solutions.
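
The size argument can be checked with the standard library alone. The following is a minimal sketch, not the bridge's actual wire format: the `FramingSketch` module is hypothetical, and the 4-byte big-endian length prefix is an assumption borrowed from the layout Erlang ports use with the `{:packet, 4}` option. It frames a binary payload and compares the result with the size of the same bytes after base64 encoding.

```elixir
defmodule FramingSketch do
  @moduledoc "Hypothetical helper; not part of OmnivoiceEx."

  # Prefix the payload with its size as a 4-byte big-endian integer
  # (ASSUMPTION: mirrors the {:packet, 4} port layout).
  def frame(payload) when is_binary(payload) do
    <<byte_size(payload)::32, payload::binary>>
  end

  # Split one complete frame off the front of a buffer.
  # Returns {:ok, payload, rest}, or :incomplete if more bytes are needed.
  def unframe(<<size::32, payload::binary-size(size), rest::binary>>) do
    {:ok, payload, rest}
  end

  def unframe(_buffer), do: :incomplete

  # Size of the same bytes once base64-encoded, as a JSON transport needs.
  def base64_size(payload), do: payload |> Base.encode64() |> byte_size()
end

audio = :binary.copy(<<0>>, 3_000)                    # stand-in for WAV bytes
IO.inspect(byte_size(FramingSketch.frame(audio)))     # 3004
IO.inspect(FramingSketch.base64_size(audio))          # 4000
```

For this 3,000-byte payload the binary frame adds 4 bytes, while base64 adds 1,000.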

Changelog

v0.2.0

v0.1.0

Production & Engineering

This section provides practical guidance for using OmnivoiceEx in real systems: concurrency, reliability, monitoring, and common pitfalls.

Concurrency and Request Handling

Example: named server in supervision tree

defmodule MyApp.OmniVoiceSupervisor do
  use Supervisor

  def start_link(opts) do
    Supervisor.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(_opts) do
    children = [
      {OmnivoiceEx,
       name: OmniVoiceServer,
       device: System.get_env("OMNIVOICE_DEVICE") || "cuda",
       model: System.get_env("OMNIVOICE_MODEL") || "k2-fsa/OmniVoice"}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end

# Usage elsewhere:
{:ok, audio} = OmnivoiceEx.generate(OmniVoiceServer, "Hello!", seed: 1)

Timeouts and Backpressure

Example:

case OmnivoiceEx.generate(OmniVoiceServer, text, opts, timeout: 60_000) do
  {:ok, audio} ->
    # handle audio
    {:ok, audio}

  {:error, :timeout} ->
    # fallback / retry / user message
    {:error, :timeout}

  {:error, reason} ->
    # log and handle
    {:error, reason}
end
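
The "fallback / retry" branch above can be factored into a small wrapper. This is a sketch under stated assumptions, not an OmnivoiceEx feature: `RetrySketch` is a hypothetical module, and the attempt count and backoff values are illustrative defaults. It retries only on `{:error, :timeout}` and passes every other result straight through.

```elixir
defmodule RetrySketch do
  @moduledoc "Hypothetical helper; not part of OmnivoiceEx."

  # Calls `fun` up to `attempts` times, retrying only on {:error, :timeout}.
  # Sleeps `backoff_ms` before each retry and doubles it every time.
  def with_retry(fun, attempts \\ 3, backoff_ms \\ 500)

  def with_retry(fun, attempts, backoff_ms) when attempts > 1 do
    case fun.() do
      {:error, :timeout} ->
        Process.sleep(backoff_ms)
        with_retry(fun, attempts - 1, backoff_ms * 2)

      other ->
        other
    end
  end

  # Last attempt: return whatever the call produces, including a timeout.
  def with_retry(fun, _attempts, _backoff_ms), do: fun.()
end
```

A caller would wrap the request in a zero-arity function, for example `RetrySketch.with_retry(fn -> OmnivoiceEx.generate(OmniVoiceServer, text, opts) end)`. Retrying multiplies the worst-case latency, so keep the attempt count small for interactive requests.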

If your system is under heavy load:
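
One way to keep a batch job from flooding a single model server is to cap the number of in-flight requests on the caller side. The sketch below uses `Task.async_stream/3` from the standard library. `BoundedTts` is a hypothetical module, and the default limits are illustrative rather than recommended values.

```elixir
defmodule BoundedTts do
  @moduledoc "Hypothetical helper; not part of OmnivoiceEx."

  # Runs `generate_fun` over `texts` with at most `max` calls in flight.
  # Results come back in input order. A call that exceeds `timeout_ms`
  # is killed and reported as {:error, :timeout}.
  def generate_all(texts, generate_fun, max \\ 2, timeout_ms \\ 60_000) do
    texts
    |> Task.async_stream(generate_fun,
      max_concurrency: max,
      timeout: timeout_ms,
      on_timeout: :kill_task
    )
    |> Enum.map(fn
      {:ok, result} -> result
      {:exit, :timeout} -> {:error, :timeout}
    end)
  end
end
```

Here `generate_fun` would be something like `fn text -> OmnivoiceEx.generate(OmniVoiceServer, text, opts) end`. Killing the task only stops the caller from waiting; whether the server abandons the in-progress generation is not something this sketch can guarantee.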

Error Handling

OmnivoiceEx can return errors from:

General pattern:

require Logger

case OmnivoiceEx.generate(OmniVoiceServer, text, opts) do
  {:ok, audio} ->
    # success
    {:ok, audio}

  {:error, :timeout} ->
    Logger.warning("TTS request timed out")

  {:error, msg} when is_binary(msg) ->
    Logger.error("TTS bridge error: #{msg}")

  {:error, other} ->
    Logger.error("TTS unexpected error: #{inspect(other)}")
end

If the Python bridge process exits unexpectedly:
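
How OmnivoiceEx reacts internally is not reproduced here. From the caller's side, a process monitor is the standard OTP way to learn that a server went away. The sketch below assumes, without confirmation from the library, that the OmnivoiceEx server process terminates when its bridge dies; `DownWatcher` is a hypothetical module that works with any pid.

```elixir
defmodule DownWatcher do
  @moduledoc "Hypothetical helper; not part of OmnivoiceEx."

  # Monitors `pid` and blocks until it exits or `timeout_ms` elapses.
  # Returns {:down, reason} or :still_alive.
  def await_exit(pid, timeout_ms) when is_pid(pid) do
    ref = Process.monitor(pid)

    receive do
      {:DOWN, ^ref, :process, ^pid, reason} -> {:down, reason}
    after
      timeout_ms ->
        # Drop the monitor and any late :DOWN message for it.
        Process.demonitor(ref, [:flush])
        :still_alive
    end
  end
end
```

Under a supervisor with the `:one_for_one` strategy shown earlier, a crashed child is restarted automatically, so a monitor like this is mainly useful for alerting or for failing pending work quickly.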

Telemetry and Monitoring

OmnivoiceEx emits telemetry events you can use for observability:

Example: attach a handler in your application:

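defmodule MyApp do
  use Application
  require Logger

  @impl true
  def start(_type, _args) do
    # Attach handlers before the supervisor starts so early events are not missed.
    :telemetry.attach_many(
      "omnivoice-ex-logger",
      [
        [:omnivoice_ex, :generate],
        [:omnivoice_ex, :await_ready]
      ],
      &__MODULE__.handle_event/4,
      nil
    )

    children = [
      MyApp.OmniVoiceSupervisor,
      # telemetry_poller is an Erlang module; list periodic measurements here.
      {:telemetry_poller, measurements: [], period: 10_000}
    ]

    # start/2 must return the supervisor result, e.g. {:ok, pid}.
    Supervisor.start_link(children, strategy: :one_for_one)
  end

  def handle_event(event, measurements, _meta, _config) do
    Logger.debug(
      "OmnivoiceEx #{inspect(event)} duration_ms=#{measurements.duration_ms}"
    )
  end
end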

You can plug this into Prometheus, Grafana, or your internal metrics stack.
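
Before wiring up a full metrics stack, a handler can simply aggregate durations in memory. The sketch below is a hypothetical `TtsStats` module, not part of OmnivoiceEx. It assumes the events carry a `duration_ms` measurement, as the logging handler above does, and keeps a running count and total per event name.

```elixir
defmodule TtsStats do
  @moduledoc "Hypothetical aggregator; not part of OmnivoiceEx."
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  # Same 4-arity shape that :telemetry.attach_many/4 expects of a handler.
  def handle_event(event, %{duration_ms: ms}, _meta, _config) do
    Agent.update(__MODULE__, fn stats ->
      Map.update(stats, event, %{count: 1, total_ms: ms}, fn s ->
        %{s | count: s.count + 1, total_ms: s.total_ms + ms}
      end)
    end)
  end

  # Events without a duration_ms measurement are ignored.
  def handle_event(_event, _measurements, _meta, _config), do: :ok

  # Mean duration per event name, e.g. for a periodic export.
  def averages do
    Agent.get(__MODULE__, fn stats ->
      Map.new(stats, fn {event, s} -> {event, s.total_ms / s.count} end)
    end)
  end
end
```

Telemetry handlers run in the process that emits the event, so a synchronous Agent call adds a little latency to each request. For high request rates an ETS table or the `:counters` module is usually a better fit.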

Deployment Notes

Common Pitfalls (FAQ-style)

License

Apache 2.0 — see LICENSE.