ExPdfium
Elixir bindings for pdfium, Google's Chromium PDF engine, via the Rust pdfium-render crate, shipped as a precompiled NIF with rustler_precompiled.
No Rust toolchain. No separately-installed pdfium. Add the dep and go.
A read & extract toolkit. Open documents and count pages, render pages to bitmaps, extract and search text, read metadata, page geometry and permissions, walk structure (bookmarks, links, attachments), and read forms and annotations. It does not create, edit, or save PDFs.
Why
Elixir has a gap in native PDF rendering: Vix/Image (libvips) ships without PDF
support, so rasterizing a PDF normally means building libvips from source with
poppler/pdfium. ExPdfium fills that gap with a precompiled pdfium binding that
covers rendering, plus the text extraction and metadata that libvips alone can't
give you.
This is a ground-up Rust rewrite of the older
gmile/pdfium C++ NIF, adopting the
rustler_precompiled release model so every supported OTP (27/28/29+) gets a
precompiled binary from one build matrix.
Installation
def deps do
[{:ex_pdfium, "~> 0.1"}]
end
Usage
{:ok, doc} = ExPdfium.open("file.pdf") # or open(<<"%PDF...">> = bytes)
{:ok, n} = ExPdfium.page_count(doc)
:ok = ExPdfium.close(doc)
# Encrypted documents:
{:ok, doc} = ExPdfium.open("secret.pdf", password: "hunter2")
Documents are closed automatically on garbage collection; call
ExPdfium.close/1 to release pdfium memory early. open/2 returns
{:error, reason} for problems like :enoent, :invalid_pdf, or
:password_error.
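Because early release depends on calling ExPdfium.close/1, it can be convenient to wrap open and close around a function. The following is a minimal sketch, not part of ExPdfium: the PdfHelper module and with_document are hypothetical names, and it assumes open/2 accepts an empty option list.

```elixir
defmodule PdfHelper do
  # Hypothetical helper, not shipped with ExPdfium.
  # Opens a document, runs `fun` with it, and always closes it afterwards,
  # even if `fun` raises. Open errors are passed through unchanged.
  def with_document(source, opts \\ [], fun) when is_function(fun, 1) do
    # Assumption: open/2 accepts an empty keyword list.
    case ExPdfium.open(source, opts) do
      {:ok, doc} ->
        try do
          fun.(doc)
        after
          ExPdfium.close(doc)
        end

      {:error, reason} ->
        # e.g. :enoent, :invalid_pdf, or :password_error
        {:error, reason}
    end
  end
end
```

With this in place, PdfHelper.with_document("file.pdf", &ExPdfium.page_count/1) returns whatever page_count/1 returns, or the {:error, reason} tuple from open/2.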
Rendering
{:ok, %ExPdfium.Bitmap{data: data, width: w, height: h}} =
ExPdfium.render_page(doc, 0, dpi: 300) # or scale:, or width:/height:
# Hand the raw RGBA buffer straight to Vix/Image:
{:ok, image} = Vix.Vips.Image.new_from_binary(data, w, h, 4, :VIPS_FORMAT_UCHAR)
Image.write(image, "page.png")
render_page/3 takes :dpi / :scale / :width / :height for sizing,
format: :rgba | :bgra, and background: :white | :transparent. The bitmap is an
uncompressed 4-channel buffer (width * height * 4 bytes).
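The buffer size grows quickly with DPI, so it can help to estimate it before rendering. The sketch below is plain arithmetic and does not call ExPdfium. BitmapMath is a hypothetical module, the rounding may differ from what render_page/3 does internally, and pixel_at assumes tightly packed rows with pixels stored row by row from the top-left.

```elixir
defmodule BitmapMath do
  # Hypothetical helper module; pure arithmetic, no ExPdfium calls.
  # One PDF point is 1/72 inch.
  @points_per_inch 72

  # Pixel length of a dimension given in PDF points at the requested DPI.
  # Assumption: simple rounding; the renderer may round differently.
  def points_to_pixels(points, dpi), do: round(points * dpi / @points_per_inch)

  # Size in bytes of an uncompressed 4-channel bitmap.
  def buffer_bytes(width_px, height_px), do: width_px * height_px * 4

  # Reads the four channel bytes of the pixel at {x, y}.
  # Assumption: rows are tightly packed (no padding) and stored top to bottom.
  # Channel order follows the requested format (:rgba or :bgra).
  def pixel_at(data, width_px, x, y) do
    offset = (y * width_px + x) * 4
    <<_::binary-size(offset), c0, c1, c2, c3, _::binary>> = data
    {c0, c1, c2, c3}
  end
end
```

For a US Letter page (612 x 792 points) at 300 DPI this gives 2550 x 3300 pixels, or 33,660,000 bytes per page.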
Text & search
{:ok, text} = ExPdfium.extract_text(doc, 0) # one page
{:ok, text} = ExPdfium.extract_text(doc) # whole document
# Text runs with bounding boxes (PDF points, origin bottom-left):
{:ok, segments} = ExPdfium.text_segments(doc, 0)
# => [%{text: "Hello", bounds: %{left: 41.9, bottom: 115.2, right: 89.0, top: 137.5}}, ...]
# Search a page (case-insensitive by default):
{:ok, matches} = ExPdfium.search_text(doc, 0, "invoice", match_case: false)
# => [%{text: "Invoice", rects: [%{left: ..., bottom: ..., right: ..., top: ...}]}, ...]
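Bounds come back in PDF points with the origin at the bottom-left, while a rendered bitmap is indexed in pixels from the top-left. To draw a highlight over a match, the rectangle has to be scaled and flipped vertically. A minimal sketch of that conversion follows. BoundsToPixels is a hypothetical module, and it assumes an unrotated page whose visible area starts at (0, 0) and a render made with the same DPI.

```elixir
defmodule BoundsToPixels do
  # Hypothetical helper; not part of ExPdfium.
  # Converts a bounds map in PDF points (origin bottom-left) into a pixel
  # rectangle (origin top-left) for a bitmap rendered at `dpi`.
  # Assumptions: rotation is 0 and the page area starts at (0, 0).
  def to_pixel_rect(%{left: l, bottom: b, right: r, top: t}, page_height_pt, dpi) do
    scale = dpi / 72

    %{
      x: round(l * scale),
      # Flip the vertical axis: distance from the top edge of the page.
      y: round((page_height_pt - t) * scale),
      width: round((r - l) * scale),
      height: round((t - b) * scale)
    }
  end
end
```

The page height in points is the height value from ExPdfium.page_info/2; each map in a segment's bounds or a match's rects can be passed through this function.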
Metadata, geometry & permissions
{:ok, meta} = ExPdfium.metadata(doc)
# => %{title: "…", author: "…", creation_date: "D:…", producer: "…", ...}
{:ok, info} = ExPdfium.page_info(doc, 0)
# => %{width: 612.0, height: 792.0, rotation: 0, label: nil,
# boxes: %{media: %{left: 0.0, bottom: 0.0, right: 612.0, top: 792.0},
# crop: nil, bleed: nil, trim: nil, art: nil}} # non-media boxes often nil
{:ok, perms} = ExPdfium.permissions(doc)
# => %{print_high_quality: true, extract_text_and_graphics: true, modify_content: true, ...}
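The date fields in the example are raw PDF date strings beginning with "D:". If you need them as Elixir date-times, a small parser is enough for the common case. This sketch assumes the full D:YYYYMMDDHHmmSS form and ignores any trailing time-zone offset; shorter forms, which the PDF format also permits, return :error. PdfDate is a hypothetical module, not part of ExPdfium.

```elixir
defmodule PdfDate do
  # Hypothetical helper; not part of ExPdfium.
  # Parses "D:YYYYMMDDHHmmSS..." into a NaiveDateTime.
  # Assumption: all 14 digits are present; any time-zone suffix is ignored.
  def parse(
        <<"D:", y::binary-size(4), mo::binary-size(2), d::binary-size(2),
          h::binary-size(2), mi::binary-size(2), s::binary-size(2), _offset::binary>>
      ) do
    with [y, mo, d, h, mi, s] <- to_ints([y, mo, d, h, mi, s]),
         {:ok, datetime} <- NaiveDateTime.new(y, mo, d, h, mi, s) do
      {:ok, datetime}
    else
      _ -> :error
    end
  end

  def parse(_other), do: :error

  # Converts every part to an integer, or returns :error if any part
  # is not a plain number.
  defp to_ints(parts) do
    parsed = Enum.map(parts, &Integer.parse/1)

    if Enum.all?(parsed, &match?({_, ""}, &1)) do
      Enum.map(parsed, &elem(&1, 0))
    else
      :error
    end
  end
end
```

Invalid calendar values (for example month 13) and non-string input both produce :error rather than raising.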
Structure & navigation
{:ok, tree} = ExPdfium.outline(doc) # bookmark tree
# => [%{title: "Chapter 1", page: 0, children: [%{title: "1.1", page: 0, children: []}]}, ...]
{:ok, links} = ExPdfium.links(doc, 0) # links on a page
# => [%{bounds: %{...}, uri: "https://example.com", page: nil},
# %{bounds: %{...}, uri: nil, page: 1}] # internal link to page 1
{:ok, files} = ExPdfium.attachments(doc) # => [%{index: 0, name: "note.txt", size: 25}]
{:ok, bytes} = ExPdfium.attachment_data(doc, 0)
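The outline is a nested tree, which is awkward when all you want is a table of contents. A short recursive walk flattens it into rows with a depth. This is a sketch over the map shape shown in the example above; OutlineWalk is a hypothetical module, not part of ExPdfium.

```elixir
defmodule OutlineWalk do
  # Hypothetical helper; works on the %{title:, page:, children:} maps
  # shown in the outline example.
  # Returns {depth, title, page} tuples in document order.
  def flatten(nodes, depth \\ 0) do
    Enum.flat_map(nodes, fn %{title: title, page: page, children: children} ->
      [{depth, title, page} | flatten(children, depth + 1)]
    end)
  end
end
```

Each tuple can then be printed with indentation proportional to its depth, for example String.duplicate("  ", depth) <> title.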
Forms & annotations (read)
{:ok, :acrobat} = ExPdfium.form_type(doc) # :none | :acrobat | :xfa_full | :xfa_foreground
{:ok, fields} = ExPdfium.form_fields(doc) # AcroForm fields, one per widget
# => [%{name: "full_name", type: :text, value: "Ada Lovelace", checked: nil,
# read_only: false, required: false, page: 0, bounds: %{...}},
# %{name: "subscribe", type: :checkbox, value: "Yes", checked: true, ...}]
{:ok, anns} = ExPdfium.annotations(doc, 0) # annotations on a page (markup + widgets)
# => [%{type: :highlight, contents: "Important", bounds: %{...}, name: nil,
# hidden: false, printed: false}, ...]
Reading is the whole scope: ExPdfium does not create, fill, or save PDFs. XFA form data needs a V8-enabled pdfium build, which is not shipped;
:xfa_full documents may expose an empty or partial AcroForm view.
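Since form_fields/1 returns one entry per widget, a common next step is to collapse the list into a map keyed by field name. The sketch below does that over the field maps shown above. FormValues is a hypothetical module, and the choice to report checked for checkboxes and to keep a list when several widgets share a name is an assumption of this example, not library behavior.

```elixir
defmodule FormValues do
  # Hypothetical helper; operates on the field maps from the example above.
  # Returns %{name => value}. When several widgets share a name, the
  # values are kept as a list in widget order.
  def by_name(fields) do
    fields
    |> Enum.group_by(& &1.name, &field_value/1)
    |> Map.new(fn
      {name, [single]} -> {name, single}
      {name, many} -> {name, many}
    end)
  end

  # Checkboxes report their checked state; everything else its value.
  defp field_value(%{type: :checkbox, checked: checked}), do: checked
  defp field_value(%{value: value}), do: value
end
```

For the two fields in the example, this yields %{"full_name" => "Ada Lovelace", "subscribe" => true}.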
Development
The shipped NIF binds pdfium dynamically and loads a libpdfium bundled
inside the precompiled tarball, right beside the NIF (bblanchon/pdfium-binaries
publishes no static libpdfium.a, so static linking is not an option). For local
work, download a libpdfium once and point the tests at it:
just fetch-pdfium # downloads libpdfium for this host into priv/pdfium
just test # EXPDFIUM_BUILD=1 mix test (forces a from-source build)
just fmt # mix format + cargo fmt
EXPDFIUM_BUILD=1 forces a from-source NIF build instead of downloading a
precompiled one. CI runs the full gate: mix format --check-formatted,
cargo fmt --check, cargo clippy -- -D warnings,
mix compile --warnings-as-errors, and mix test.
Releasing
See UPDATE_PROCEDURE.md.
In short: just release bumps
the version, rolls the CHANGELOG, tags, and pushes; the tag triggers a build
matrix that attaches one NIF per target to a GitHub release; checksums are
regenerated from those artifacts; Hex publish is gated behind a manual approval.
License
MIT — see LICENSE. pdfium itself is BSD-3-Clause (Google/Chromium); the precompiled pdfium binaries come from bblanchon/pdfium-binaries.