langelic_epub


EPUB read and write for Elixir, backed by a Rustler NIF. Parses EPUB 2 and EPUB 3 documents into structured Elixir data and generates EPUB 3 documents (with a backward-compatible toc.ncx) from the same structures.

Installation

Add to mix.exs:

def deps do
  [
    {:langelic_epub, "~> 0.1"}
  ]
end

Precompiled NIFs are published for macOS (aarch64, x86_64) and Linux (aarch64-gnu, x86_64-gnu, x86_64-musl). Users on those platforms do not need a Rust toolchain. The artifacts target NIF ABI 2.16, which loads on current OTP releases (tested through OTP 29). Users on other platforms can build from source — see Building from source.

Quick start

# Read
{:ok, doc} = LangelicEpub.parse(File.read!("book.epub"))
doc.title           # => "The Hobbit"
doc.language        # => "en"
length(doc.spine)   # => 23

# Modify a chapter (translate/1 stands for a function you supply)
[first | rest] = doc.spine
translated = %LangelicEpub.Chapter{first | data: translate(first.data)}
modified = %LangelicEpub.Document{doc | spine: [translated | rest]}

# Write
{:ok, bytes} = LangelicEpub.build(modified)
File.write!("translated.epub", bytes)
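
The same update pattern extends to every chapter in the spine. The sketch below is illustrative only: it uses plain maps as stand-ins for the Document and Chapter structs so that it runs without the library, it assumes chapter data is a binary, and both the SpineDemo module and the use of String.upcase/1 as a "translation" are hypothetical placeholders. Only the :spine and :data field names come from the example above.

```elixir
defmodule SpineDemo do
  # Hypothetical helper: applies `fun` to the :data of every chapter.
  # Map-update syntax (%{x | key: v}) works on structs as well as plain
  # maps, so the same code should accept real Document/Chapter structs.
  def map_chapters(doc, fun) when is_function(fun, 1) do
    spine =
      Enum.map(doc.spine, fn chapter ->
        %{chapter | data: fun.(chapter.data)}
      end)

    %{doc | spine: spine}
  end
end

# Stand-in document; a real one would come from LangelicEpub.parse/1.
doc = %{title: "Demo", spine: [%{data: "one"}, %{data: "two"}]}

# String.upcase/1 is a placeholder for a real translation function.
modified = SpineDemo.map_chapters(doc, &String.upcase/1)
IO.inspect(Enum.map(modified.spine, & &1.data))
```

Because only the spine is rebuilt, every other field of the document passes through unchanged before it is handed to the build step.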

Why this library exists

There was a gap on Hex. bupe (the only EPUB-focused Elixir library) was last updated nine years ago and is minimal; other packages are single-purpose or metadata-only. The Rust ecosystem has mature EPUB tooling, whereas format handling reimplemented in pure Elixir would accumulate bugs over time (EPUB 2/3 metadata variants, NCX vs. nav.xhtml, embedded fonts, refines metadata, and OPF schema quirks). This package therefore wraps two mature Rust crates through a Rustler NIF: iepub on the read side and epub-builder on the write side.

A small OPF re-parse layer (quick-xml) fills in the fields iepub drops (<dc:language>, <dc:rights>, multiple <dc:creator> entries). A post-processing pass rewrites the generated OPF to preserve identifiers verbatim and inject DC elements epub-builder doesn't emit natively (<dc:publisher>, <dc:date>, <dc:rights>).

Supported features

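| Feature               | Read                    | Write                            |
|-----------------------|-------------------------|----------------------------------|
| EPUB 2 input          | yes                     | n/a                              |
| EPUB 3 input          | yes                     | yes (always emitted)             |
| Multiple creators     | yes                     | yes                              |
| NCX TOC               | yes                     | yes (emitted for EPUB 2 readers) |
| nav.xhtml TOC         | yes                     | yes                              |
| Embedded fonts        | yes                     | yes                              |
| Embedded images       | yes                     | yes                              |
| Embedded CSS          | yes                     | yes                              |
| Cover image           | yes                     | yes                              |
| DRM-encrypted content | detected, not decrypted | n/a                              |
| MOBI                  | no                      | no                               |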

Limitations and known issues

Error model

Every public function returns {:ok, term} | {:error, %LangelicEpub.Error{}}. The :kind field is a well-documented atom (:invalid_zip, :missing_container, :malformed_opf, :io, :missing_required_field, :invalid_chapter, :duplicate_id, :panic). The full list is in the moduledoc of LangelicEpub.Error. Panics on the Rust side are caught and converted to {:error, %Error{kind: :panic}} so a malformed EPUB cannot crash the BEAM scheduler thread.
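
One way to consume this return shape is to pattern-match on :kind. The sketch below is a hedged illustration rather than part of the library: ErrorKindDemo is a hypothetical module, the message strings are invented, and plain maps stand in for %LangelicEpub.Error{} so that the snippet runs on its own. The atoms are taken from the list above.

```elixir
defmodule ErrorKindDemo do
  # Hypothetical helper: turns a parse/build result into a short message.
  # The pattern %{kind: kind} also matches any struct with a :kind field,
  # so it would accept error structs without naming the struct here.
  def describe({:ok, _value}), do: "ok"

  def describe({:error, %{kind: :invalid_zip}}),
    do: "input is not a readable ZIP archive"

  def describe({:error, %{kind: :panic}}),
    do: "native code panicked and the error was returned as a value"

  # Fallback for every other documented kind.
  def describe({:error, %{kind: kind}}) when is_atom(kind),
    do: "failed: #{kind}"
end

# Plain maps stand in for the library's error struct in this sketch.
IO.puts(ErrorKindDemo.describe({:error, %{kind: :malformed_opf}}))
IO.puts(ErrorKindDemo.describe({:ok, :some_document}))
```

Keeping a catch-all clause last means that a kind added in a later release still produces a message instead of a FunctionClauseError.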

Architecture

langelic_epub is an Elixir wrapper around a Rustler NIF. The native code lives in native/langelic_epub/ and is compiled as a cdylib. Both NIF functions run on the DirtyCpu scheduler because parsing or building a 5 MB EPUB takes 50–200 ms, well past the 1 ms guideline.

lib/langelic_epub/            # Public API, struct modules, error module
lib/langelic_epub/native.ex   # RustlerPrecompiled binding
native/langelic_epub/src/     # Rust NIF (reader, writer, opf, types, error)

Building from source

Required:

Set LANGELIC_EPUB_BUILD=true to force a source build instead of downloading the precompiled NIF:

LANGELIC_EPUB_BUILD=true mix deps.get && mix compile

Contributing

Issues and pull requests are welcome. Before submitting a PR:

License

MIT. See LICENSE.

This package wraps two Rust crates under separate licenses; see NOTICE for attribution.