FastDecimal

Fast arbitrary-precision decimal arithmetic for Elixir.

A pure-Elixir alternative to decimal — designed for the hot paths fintech, ledger, and pricing code live in: add, sub, mult, div, sum, parse, format. Drop-in via a compat shim, ships with Ecto integration. No native dependencies.

import FastDecimal

~d"1.23"
|> FastDecimal.add(~d"4.567")
|> FastDecimal.mult(~d"2")
|> FastDecimal.to_string()
# => "11.594"

FastDecimal.sum([~d"1.5", ~d"2.5", ~d"3"])
# => ~d"7.0"

FastDecimal.round(~d"1.236", 2)         # ~d"1.24"
FastDecimal.sqrt(~d"2", precision: 10)   # ~d"1.414213562"
FastDecimal.div(~d"10", ~d"3", precision: 5)  # ~d"3.3333"

Benchmarks

mix bench reproduces the headline summary in about a minute.

Methodology

Every number below is the median across 7 independent samples × 200,000 iterations per scenario. Each sample runs in a fresh process (resetting BEAM state), with 1,000 warmup iterations and a forced GC before measurement. Times use :erlang.monotonic_time(:nanosecond).

We report median (p25–p75 IQR) — the interquartile range survives outliers from GC pauses and scheduler steals. A row is marked stable when even the pessimistic ratio (FastDecimal's p75 vs Decimal's p25) clears 2×.

The geometric mean speedup is reproducible across runs (observed: 11.11× – 11.28× across 4 consecutive runs on the same JIT-enabled OTP install). Specific per-op nanosecond values shift 5-10% per run due to macOS scheduler noise (E-core vs P-core dispatch, GC interactions); the speedup ratios are stable. Numbers below are from one representative run — run mix bench to see your own.
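The measurement loop described above can be sketched as follows (hypothetical `BenchSketch` module, not the actual bench/ code): fresh-process sampling, warmup, forced GC, monotonic nanosecond timing, and median/IQR reporting.

```elixir
# Hypothetical sketch of the methodology above, not the actual bench/ code.
# Each sample runs in a fresh process (Task), does 1,000 warmup iterations,
# forces a GC, then times 200,000 iterations with monotonic_time.
defmodule BenchSketch do
  @warmup 1_000
  @iters 200_000
  @samples 7

  # Returns {p25, median, p75} in nanoseconds per iteration over 7 samples.
  def run(fun) do
    sorted = Enum.sort(for _ <- 1..@samples, do: sample(fun))
    {percentile(sorted, 0.25), percentile(sorted, 0.50), percentile(sorted, 0.75)}
  end

  defp sample(fun) do
    Task.async(fn ->
      Enum.each(1..@warmup, fn _ -> fun.() end)
      :erlang.garbage_collect()
      t0 = :erlang.monotonic_time(:nanosecond)
      Enum.each(1..@iters, fn _ -> fun.() end)
      t1 = :erlang.monotonic_time(:nanosecond)
      (t1 - t0) / @iters
    end)
    |> Task.await(:infinity)
  end

  # Nearest-rank percentile; good enough for 7 samples.
  defp percentile(sorted, p), do: Enum.at(sorted, round(p * (length(sorted) - 1)))
end
```

Reporting the quartiles of the sorted samples is what makes the IQR-based "stable" check above possible.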

Headline summary (mix bench)

Tested on macOS arm64 / 10 cores against two flavors of OTP 26 on the same hardware:

Emulator Geometric mean speedup Scenarios faster Stable ≥2× at IQR edges
OTP 26, BEAMAsm JIT (asdf 26.0.2, emu_flavor=jit) 9.87× 21/22 19/22
OTP 26, threaded-code interpreter (asdf 26.2.4, emu_flavor=emu) 7.71× 21/22 18/22

JIT helps FastDecimal proportionally more than decimal (FastDecimal's hot paths have more inlining opportunities per work-unit), so the speedup ratio is larger on JIT — but even on the older interpreter without JIT, FastDecimal is still ~8× faster on average.

Detailed table (BEAMAsm JIT, OTP 26)

Format: median (p25–p75 IQR). Speedup column: median (pessimistic – optimistic ratios).

op size decimal FastDecimal speedup
add medium 282 ns (264–294) 15 ns (13–16) 19× (17–23)
add large 1.72 µs (1.68–1.77) 20 ns (20–21) 81× (79–86)
sub medium 359 ns (332–585) 14 ns (12–24) 25× (14–47)
sub large 747 ns (744–797) 22 ns (21–23) 34× (33–37)
mult medium 224 ns (224–227) 13 ns (13–13) 18× (17–18)
mult large 1.94 µs (1.93–1.96) 20 ns (20–21) 97× (92–98)
div p=28 medium 2.92 µs (2.91–2.94) 374 ns (371–378) 7.8× (7.7–7.9)
div p=28 large 6.88 µs (6.84–6.98) 416 ns (414–421) 17× (16–17)
div_int medium 128 ns (128–131) 15 ns (15–16) 8.4× (8.2–8.6)
div_rem medium 139 ns (137–141) 50 ns (50–51) 2.8× (2.7–2.8)
compare medium 85 ns (84–85) 8.5 ns (8.5–8.8) 10× (10–10)
compare large 302 ns (298–304) 16 ns (15–17) 19× (17–20)
negate medium 181 ns (178–182) 15 ns (15–16) 12× (11–12)
abs medium 162 ns (159–164) 15 ns (14–15) 11× (11–12)
round (3dp) medium 433 ns (427–435) 33 ns (32–35) 13× (12–14)
normalize medium 180 ns (176–181) 18 ns (18–18) 10× (10–10)
parse small 179 ns (177–181) 52 ns (51–57) 3.4× (3.1–3.5)
parse medium 242 ns (235–246) 65 ns (64–66) 3.7× (3.6–3.8)
to_string medium 137 ns (137–138) 135 ns (134–136) 1.0× — parity
to_string sci medium 137 ns (136–138) 181 ns (180–182) 0.76× — regression
to_integer medium 16 ns (16–17) 11 ns (10–11) 1.5× (1.5–1.6)
sum of 100 23.4 µs (22.7–24.3) 785 ns (775–804) 30× (28–31)

At-parity ops (called out honestly): to_string medium runs at parity (1.0×), and to_string sci is a 0.76× regression; see the table above.

Realistic workloads (mix run bench/realistic.exs)

Production-style code patterns. Speedups vary 10-25% across runs (the workload code allocates more, so GC interactions vary), but every arithmetic-heavy workload comes in 10×+ faster than decimal; CSV parsing, at roughly 3×, is the one exception:

Workload typical speedup
Invoice total (50 line items × price) 14-17×
10% discount + 8.25% tax × 100 prices 18-22×
FX conversion + round 2dp × 100 prices 12-15×
Sum + min + max over 1000 amounts 23-28×
Parse 100 CSV strings 2.7-3.2×

Allocations + reductions (mix run bench/profile.exs)

op dec time fd time dec alloc fd alloc dec reds fd reds
add (medium) 266 ns 12 ns 266 B 53 B 63 4
add (large) 1536 ns 19 ns 552 B 12 B 164 4
mult (large) 1970 ns 20 ns 777 B 11 B 273 4
compare 85 ns 8 ns 0 B 0 B 20 4
sum of 100 22.0 µs 0.88 µs 983 B 4947 B 6214 307

4 reductions per add is at the BEAM floor — no operation on a struct can do less.
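The reduction counts can be spot-checked from any IEx session (a sketch; bench/profile.exs is the real profiler, and a plain addition stands in for the workload here). Note that the `process_info` snapshots themselves consume reductions, so the real profiler subtracts a measured baseline.

```elixir
# Spot-check the reduction cost of a call via process_info. The snapshot
# calls themselves cost reductions, so treat the delta as an upper bound.
measure_reds = fn fun ->
  {:reductions, r0} = :erlang.process_info(self(), :reductions)
  result = fun.()
  {:reductions, r1} = :erlang.process_info(self(), :reductions)
  {result, r1 - r0}
end

{3, reds} = measure_reds.(fn -> 1 + 2 end)
```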

Reproduce

The whole suite is in bench/ and runs from mix. No Docker, no setup beyond mix deps.get. See the Benchmark suite page for methodology and per-file detail.

mix deps.get
mix test                  # 13 doctests + 35 properties + 277 unit tests = 325 total
mix bench                 # → bench/summary.exs (headline table, ~1 minute)
mix bench.all             # → every bench file end-to-end (~20 minutes)

# Or run a specific bench:
mix run bench/division.exs        # div / div_int / div_rem / rem
mix run bench/rounding.exs        # round/3 × 7 modes
mix run bench/sqrt.exs            # sqrt at 6 precisions
mix run bench/conversion.exs      # to_string formats, cast, to_int/float
mix run bench/special_values.exs  # NaN/Inf overhead
mix run bench/realistic.exs       # fintech-style workloads
mix run bench/batch.exs           # sum/product at 4 list sizes
mix run bench/profile.exs         # per-op time + alloc + reductions
mix run bench/parse.exs           # parser strategy shootout
mix run bench/representation.exs  # struct vs raw tuple
mix run bench/disasm.exs          # BEAM bytecode dump

See bench/README.md for what each script measures and the design decision it backed.

Test coverage

The suite is the regression gate for future optimization work and the correctness floor for trusting outputs.

Run with mix test. Full suite finishes in under a second.

Total: 344 tests/properties/doctests, stable across consecutive runs: the 325 above plus 19 dedicated security regression tests covering CVE-2026-32686-class exponent-amplification DoS protection.

Security

FastDecimal is not vulnerable to CVE-2026-32686 (exponent-amplification DoS that affected ericmj/decimal < 2.4.0). Three layers of defense:

  1. Parser rejects scientific-notation inputs with explicit exponent magnitude > 65,535. FastDecimal.parse("1e1000000000") returns :error rather than producing a value whose materialization would OOM the BEAM.
  2. pow10/1 internal cap raises on n > 100,000. Catches operations that would materialize huge values even when the value was constructed directly via new(coef, exp) bypassing the parser.
  3. to_string(_, :normal) refuses to produce output larger than 1 MB. The :scientific and :raw formats remain available for legitimate large-exponent values (they don't materialize the zeros).

These bounds are well above any practical use case (IEEE 754 decimal128 itself tops out at exp ±6,144) but kill the runaway path. Regression tests live at test/fastdecimal/security_test.exs.
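The layer-1 cap can be pictured as a standalone guard (hypothetical `ExpGuard` module and `check_exponent/1` name, not FastDecimal's actual parser): scan the input for an explicit exponent and reject before any value is built.

```elixir
# Sketch of the layer-1 exponent cap (hypothetical helper, not the actual
# FastDecimal parser): reject explicit exponents whose magnitude exceeds
# 65_535 before anything is materialized.
defmodule ExpGuard do
  @max_exp 65_535

  def check_exponent(string) do
    case Regex.run(~r/[eE]([+-]?\d+)\z/, string) do
      nil -> :ok
      [_, digits] ->
        if abs(String.to_integer(digits)) > @max_exp, do: :error, else: :ok
    end
  end
end

ExpGuard.check_exponent("1e1000000000")  # => :error
ExpGuard.check_exponent("1.23e10")       # => :ok
```

Because the check runs on the raw string, the hostile input is rejected in O(input length) with no large integer ever constructed.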

Where the two libraries legitimately diverge

FastDecimal does exact arithmetic; decimal rounds to its Context.precision (28 by default). For inputs whose true result has >28 significant digits, the two libraries produce different values — that's a documented design difference, not a bug. The differential tests constrain inputs so the result stays within 28 sig figs (where the libs should agree); the property tests document the divergence explicitly.
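A standalone illustration of that divergence, using plain integers in place of decimal coefficients: multiplying two 20-digit coefficients exactly yields a 40-digit result, which a precision-28 context must round away.

```elixir
# Two 20-digit coefficients. The exact product has 40 significant digits,
# so a context capped at 28 digits must round, while exact arithmetic
# keeps every digit.
a = 12345678901234567890
b = 98765432109876543210
exact = a * b

sig = String.length(Integer.to_string(exact))   # 40 significant digits
drop = sig - 28

# Truncate to 28 significant digits, as a bounded context would
# (ignoring the rounding mode for simplicity):
rounded = div(exact, Integer.pow(10, drop)) * Integer.pow(10, drop)

exact != rounded   # true: the libraries legitimately disagree here
```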

Installation

def deps do
  [
    {:fastdecimal, "~> 1.0"}
  ]
end

Feature surface

Construction

import FastDecimal

~d"1.23"                          # Compile-time literal (zero parse cost at runtime)
~d"1.23e10"                       # Scientific notation
~d"Infinity"                      # +∞
~d"-Inf"                          # -∞
~d"NaN"                           # NaN

FastDecimal.new("1.23")           # Runtime parse, raises on bad input
FastDecimal.new(42)               # From integer
FastDecimal.new(123, -2)          # From coef + exp
FastDecimal.parse("1.23")         # {:ok, t} | :error  — no raise
FastDecimal.cast(value)           # Soft parse, accepts FastDecimal/Decimal/int/string/float/nil

Arithmetic

FastDecimal.add(a, b)
FastDecimal.sub(a, b)
FastDecimal.mult(a, b)
FastDecimal.div(a, b, precision: 28, rounding: :half_even)
FastDecimal.div_int(a, b)         # Truncated integer division
FastDecimal.div_rem(a, b)         # {quotient, remainder}
FastDecimal.rem(a, b)
FastDecimal.negate(a)
FastDecimal.abs(a)
FastDecimal.sqrt(a, precision: 28)  # Newton-Raphson
FastDecimal.round(a, places, mode)   # All 7 rounding modes
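For reference, the default :half_even (banker's) rounding mode on a non-negative integer coefficient can be sketched like this (hypothetical `HalfEven` module, not FastDecimal's internal code):

```elixir
# Sketch of :half_even rounding on a non-negative integer coefficient:
# drop the last n digits, rounding exact ties to the nearest even result.
defmodule HalfEven do
  def round_coef(coef, n) when coef >= 0 and n > 0 do
    pow = Integer.pow(10, n)
    {q, r} = {div(coef, pow), rem(coef, pow)}
    half = div(pow, 2)

    cond do
      r > half -> q + 1
      r < half -> q
      rem(q, 2) == 0 -> q   # exact tie: keep the even quotient
      true -> q + 1
    end
  end
end

HalfEven.round_coef(1236, 1)  # => 124  (1.236 rounds up at 2dp)
HalfEven.round_coef(1250, 2)  # => 12   (exact tie rounds to even)
HalfEven.round_coef(1350, 2)  # => 14   (exact tie rounds to even)
```

Ties go to the even neighbor so that repeated rounding introduces no systematic upward bias, which is why :half_even is the usual default for financial aggregation.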

Batch

FastDecimal.sum(list)             # Tight Elixir-side reduce
FastDecimal.product(list)

Comparison & predicates

FastDecimal.compare(a, b)         # :lt | :eq | :gt | :nan
FastDecimal.equal?(a, b)
FastDecimal.lt?(a, b) ; FastDecimal.gt?(a, b)
FastDecimal.min(a, b) ; FastDecimal.max(a, b)

FastDecimal.zero?(d) ; FastDecimal.positive?(d) ; FastDecimal.negative?(d)
FastDecimal.nan?(d)  ; FastDecimal.inf?(d)     ; FastDecimal.finite?(d)

Conversion

FastDecimal.to_string(d)              # "1.23"
FastDecimal.to_string(d, :scientific) # "1.23" — IEEE compact (only emits E for very small/large)
FastDecimal.to_string(d, :raw)        # "123E-2"
FastDecimal.to_string(d, :xsd)        # XSD canonical (= :normal for our repr)

FastDecimal.to_integer(d)             # raises on fractional
FastDecimal.to_float(d)               # lossy for non-terminating binaries
FastDecimal.normalize(d)              # strips trailing zeros

Guard-safe macro

require FastDecimal

def process(d) when FastDecimal.is_decimal(d), do: ...

Migrating from decimal

The 30-second version, for the common case:

defmodule MyLedger do
  alias FastDecimal.Compat, as: Decimal   # add this line, rest stays the same

  def total(items) do
    Enum.reduce(items, Decimal.new(0), fn item, acc ->
      Decimal.add(acc, item.amount)
    end)
  end
end

The Compat shim mirrors decimal's public surface and auto-coerces inputs (real %Decimal{}, %FastDecimal{}, strings, integers, floats). It costs 5-15% vs calling FastDecimal.* directly.
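The kind of coercion the shim performs can be pictured with a toy version (hypothetical `CoerceSketch`, not the real shim; it ignores scientific notation and special values) that normalizes integers, plain decimal strings, and floats into a `{coef, exp}` pair.

```elixir
# Toy sketch of Compat-style input coercion (not the real implementation):
# normalize mixed inputs into a {coefficient, exponent} pair.
defmodule CoerceSketch do
  def coerce(n) when is_integer(n), do: {n, 0}

  def coerce(s) when is_binary(s) do
    case String.split(s, ".") do
      [int] -> {String.to_integer(int), 0}
      [int, frac] -> {String.to_integer(int <> frac), -String.length(frac)}
    end
  end

  def coerce(f) when is_float(f), do: f |> Float.to_string() |> coerce()
end

CoerceSketch.coerce("1.23")  # => {123, -2}
CoerceSketch.coerce(42)      # => {42, 0}
```

Doing this normalization once at the call boundary is what keeps the shim's overhead in the quoted 5-15% range rather than paying a type dispatch inside every arithmetic step.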

Several constructs don't translate cleanly; the differences table below summarizes them.

See MIGRATION.md for the full guide — decision tree, mechanical steps, real before/after examples, and an FAQ. Most projects migrate in under an hour; some need a wrapper module around precision-policy code; a few should stay on decimal.

Differences from decimal (summary)

decimal FastDecimal
Precision context Per-process (Decimal.Context) Per call (only div, sqrt, round take precision)
Default rounding mode :half_up :half_even (the Compat shim uses :half_up for parity)
NaN distinction :sNaN, :qNaN Single :nan (no signaling NaN)
Sign storage Separate sign field In coef
Negative zero -0 distinguishable from 0 Collapsed to 0
Arithmetic semantics Bounded by context precision Exact — chain add/mult without rounding
compare/2 with NaN Raises Returns :nan
DoS protection (CVE-2026-32686) Sticky-bit precision-bounded scaling, per-call :max_digits/:max_exponent opts Hardcoded global limits (parser caps at exp ±65,535; pow10 caps internally at n=100,000; to_string :normal caps output at 1 MB). No per-call options.

Ecto integration

defmodule MyApp.Invoice do
  use Ecto.Schema

  schema "invoices" do
    field :total, FastDecimal.Ecto.Type
  end
end

FastDecimal.Ecto.Type is automatically compiled when Ecto is in your deps. It bridges between Decimal (what the database adapter speaks) and FastDecimal (what your code holds). cast/1, load/1, dump/1, equal?/2 are all implemented.

Design philosophy

Every operation's implementation was chosen by running a benchmark, not by guessing. The full decision record — with the measurements behind each call — lives in bench/README.md.

The rule: if you have a hypothesis about a faster way, write the bench, run it, commit the script. Negative results stay in the tree so we don't re-test the same idea.

Why pure Elixir (the bench data)

NIF dispatch overhead is ~36 ns on this machine. A pure-Elixir add total is ~12 ns. The dispatch cost alone is 3× the work-cost for every cheap op. The Rust NIF prototype we built and benchmarked confirmed this — it lost on every per-op arithmetic and only won at high-precision div and long-string parse. Not enough to justify shipping a binary dependency that requires Rust on every consumer's machine. FastDecimal is pure Elixir; no native compilation step.

License

MIT. See LICENSE.