Native Elixir PDF Utilities

A small, native Elixir toolkit for working with classic PDFs:

No NIFs, no ports, no external binaries: just Elixir. It targets structural tasks and best-effort merging of common PDFs.

Status


Features

Installation

This project is not yet published on Hex. Until then, add it as a local path dependency, or use a Git reference once the repository is public.

Local path (for development):

def deps do
  [
    {:native_elixir_pdf_utilities, path: "../native_elixir_pdf_utilities"}
  ]
end

Once the package is published to Hex, the dependency will look like:

def deps do
  [
    {:native_elixir_pdf_utilities, "~> 0.1.0"}
  ]
end

Quick Start

Interactive shell:

iex -S mix

Tokenize a PDF binary:

alias NativeElixirPdfUtilities.Tokenizer

pdf = File.read!("sample.pdf")
st = Tokenizer.new(pdf)
tokens = Tokenizer.tokenize_all(st)
IO.inspect(tokens, limit: 50)
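
For readers new to PDF syntax, the sketch below illustrates what "tokenizing" means here: splitting raw PDF text into lexical units such as dictionary delimiters, names, and numbers. It is not this library's Tokenizer. The module name TokenSketch and the token shapes ({:name, ...}, {:integer, ...}, and so on) are hypothetical and chosen only for illustration, and the sketch ignores strings, comments, and streams.

```elixir
defmodule TokenSketch do
  @moduledoc "Illustrative only; NOT NativeElixirPdfUtilities.Tokenizer."

  # Built in a function (not a module attribute) so it works on any OTP release.
  # Alternation order: dict delimiters, array brackets, names, numbers, keywords.
  defp pattern do
    ~r{<<|>>|\[|\]|/[^\s/\[\]<>()]+|[+-]?\d+(?:\.\d+)?|[A-Za-z]+}
  end

  @doc "Splits a fragment of PDF syntax into hypothetical token terms."
  def tokenize(text) when is_binary(text) do
    pattern()
    |> Regex.scan(text)
    |> Enum.map(fn [lexeme] -> classify(lexeme) end)
  end

  defp classify("<<"), do: :dict_open
  defp classify(">>"), do: :dict_close
  defp classify("["), do: :array_open
  defp classify("]"), do: :array_close
  defp classify("/" <> name), do: {:name, name}

  defp classify(lexeme) do
    # Try integer first, then real; anything else is treated as a keyword.
    case Integer.parse(lexeme) do
      {int, ""} ->
        {:integer, int}

      _ ->
        case Float.parse(lexeme) do
          {real, ""} -> {:real, real}
          _ -> {:keyword, lexeme}
        end
    end
  end
end
```

Calling TokenSketch.tokenize("<< /Type /Page >>") yields a dict-open marker, two names, and a dict-close marker. The real tokenizer's output may be shaped differently; inspect its tokens as shown above to see the actual forms.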

Merge PDFs:

alias NativeElixirPdfUtilities.Merge

bins = [
  File.read!("a.pdf"),
  File.read!("b.pdf")
]

{:ok, merged} = Merge.merge(bins)
File.write!("merged.pdf", merged)
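
The snippet above raises if a file is missing or the merge does not return {:ok, merged}. A minimal sketch of a non-raising variant follows. The module MergePaths and its run/3 function are hypothetical helpers, not part of this library. The merge function is passed in as an argument so the sketch runs on its own; in a project you would presumably pass &NativeElixirPdfUtilities.Merge.merge/1. The sketch assumes only that a failed merge returns something other than {:ok, binary}, which it hands back unchanged.

```elixir
defmodule MergePaths do
  @moduledoc "Hypothetical helper: read paths, merge, write, without raising."

  # merge_fun is injected; this sketch assumes it returns {:ok, binary} on success.
  def run(paths, out_path, merge_fun) when is_list(paths) and is_function(merge_fun, 1) do
    with {:ok, bins} <- read_all(paths),
         {:ok, merged} <- merge_fun.(bins),
         :ok <- File.write(out_path, merged) do
      {:ok, out_path}
    end
  end

  # Reads every file in order, stopping at the first failure.
  defp read_all(paths) do
    result =
      Enum.reduce_while(paths, {:ok, []}, fn path, {:ok, acc} ->
        case File.read(path) do
          {:ok, bin} -> {:cont, {:ok, [bin | acc]}}
          {:error, reason} -> {:halt, {:error, {:read_failed, path, reason}}}
        end
      end)

    # Restore the original order; an error tuple passes through untouched.
    with {:ok, bins} <- result, do: {:ok, Enum.reverse(bins)}
  end
end
```

Usage would look like MergePaths.run(["a.pdf", "b.pdf"], "merged.pdf", &NativeElixirPdfUtilities.Merge.merge/1), matching on {:ok, path} or an error term.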

Tokenizer API

Module: NativeElixirPdfUtilities.Tokenizer

Token forms include:

Notes:


Merging PDFs

Module: NativeElixirPdfUtilities.Merge

Example:

{:ok, out} = NativeElixirPdfUtilities.Merge.merge([
  File.read!("doc1.pdf"),
  File.read!("doc2.pdf"),
  File.read!("doc3.pdf")
])
File.write!("merged.pdf", out)

Behavior and constraints:


Running Tests

mix test

The test suite exercises the tokenizer: numbers, names, strings, dicts, arrays, operators, and streams (located via /Length or, failing that, by fallback scanning).
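
To illustrate the two stream strategies mentioned above, here is a minimal sketch of the general technique: trust the declared /Length when the endstream keyword follows the data, otherwise scan forward for endstream. This is not the library's implementation. The module StreamSketch, its extract/2 function, and the :length / :scan tags are hypothetical, and the input is assumed to start right after the end-of-line that follows the stream keyword.

```elixir
defmodule StreamSketch do
  @moduledoc "Illustrative only: /Length first, then fallback scanning."

  # rest: bytes immediately after the "stream" keyword's end-of-line.
  # declared_length: the /Length value from the stream dictionary, or nil.
  def extract(rest, declared_length) when is_binary(rest) do
    case by_length(rest, declared_length) do
      {:ok, data} -> {:ok, data, :length}
      :error -> fallback(rest)
    end
  end

  defp by_length(rest, len) when is_integer(len) and len >= 0 and byte_size(rest) >= len do
    <<data::binary-size(len), tail::binary>> = rest
    # Accept /Length only if "endstream" follows (after an optional end-of-line).
    if at_endstream?(tail), do: {:ok, data}, else: :error
  end

  defp by_length(_rest, _len), do: :error

  defp at_endstream?(<<"\r\n", "endstream", _::binary>>), do: true
  defp at_endstream?(<<"\n", "endstream", _::binary>>), do: true
  defp at_endstream?(<<"\r", "endstream", _::binary>>), do: true
  defp at_endstream?(<<"endstream", _::binary>>), do: true
  defp at_endstream?(_), do: false

  # Fallback: take everything up to the first "endstream", minus one trailing EOL.
  defp fallback(rest) do
    case :binary.match(rest, "endstream") do
      {pos, _len} -> {:ok, trim_eol(binary_part(rest, 0, pos)), :scan}
      :nomatch -> {:error, :no_endstream}
    end
  end

  defp trim_eol(data) do
    size = byte_size(data)

    cond do
      size >= 2 and binary_part(data, size - 2, 2) == "\r\n" -> binary_part(data, 0, size - 2)
      size >= 1 and binary_part(data, size - 1, 1) in ["\n", "\r"] -> binary_part(data, 0, size - 1)
      true -> data
    end
  end
end
```

The scan is best-effort: it stops at the first endstream it finds, so binary stream data that happens to contain those bytes would be cut short. That is one reason to prefer /Length whenever it checks out.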


Roadmap / Ideas


License

MIT. See LICENSE.