murmur_nif
Erlang NIF wrapper around MurmurHash3 (x64_128) with a Cassandra-compatible signed-byte variant for token-aware routing against Cassandra and Scylla.
Why
Replaces git-ref dependencies on hand-rolled Murmur3 NIF forks with a
single package that has a modern build toolchain (correct OTP 27+
-eval order, macOS -undefined dynamic_lookup), dirty-scheduler
dispatch, and enif_consume_timeslice accounting on the inline path.
It is tested against OTP 25-28 in CI and published to hex.pm.
Install
{deps, [{murmur_nif, "0.1.0"}]}.
Requires a C compiler (cc) on the build host -- universally
available on systems that already run Erlang.
API
-spec murmur_nif:murmur3_x64_128(binary()) -> binary().
-spec murmur_nif:murmur3_cassandra_x64_128(binary()) -> binary().

Both functions return a fixed 16-byte binary representing the 128-bit hash, using seed 0.
1> murmur_nif:murmur3_x64_128(<<"hello">>).
<<2,155,189,65,179,167,216,203,25,29,174,72,106,144,30,91>>

Which variant to use
- murmur3_x64_128/1 -- Austin Appleby's standard MurmurHash3 x64_128. Use for general-purpose hashing.
- murmur3_cassandra_x64_128/1 -- Cassandra/Scylla-compatible variant. The input bytes are interpreted as signed (matching Java's signed byte type), which changes the sign-extension of the tail-block accumulator and produces hashes that match Cassandra's partitioner. Use to compute partition tokens for token-aware routing.
For pure-ASCII inputs (all bytes < 128) the two variants produce identical output. They can diverge only when the input contains a byte with its high bit set (value >= 128).
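
To make the sign-extension point concrete, here is a minimal C sketch of how a single trailing byte could contribute to the 64-bit tail accumulator under each interpretation. It is an illustration only, not the code in c_src/: the function names tail_term_unsigned and tail_term_signed are hypothetical, and the sketch assumes the usual two's-complement conversion from uint8_t to int8_t.

```c
#include <stdint.h>

/* Hypothetical helper: contribution of one tail byte when it is read as
   an unsigned value (0..255), as in standard MurmurHash3. The result is
   what would be XOR-ed into the tail accumulator. */
uint64_t tail_term_unsigned(uint8_t byte, unsigned shift) {
    return (uint64_t)byte << shift;
}

/* Hypothetical helper: the same term when the byte is first read as a
   signed 8-bit value, as a Java `byte` is. Values >= 0x80 become
   negative and sign-extend to 64 bits before the shift, so the upper
   bits of the term are filled with ones.
   Assumes two's-complement narrowing (true on mainstream compilers). */
uint64_t tail_term_signed(uint8_t byte, unsigned shift) {
    return (uint64_t)(int64_t)(int8_t)byte << shift;
}
```

For an ASCII byte such as 0x41 both helpers return the same term, which is why ASCII-only inputs hash identically. For a byte such as 0xE9 the signed term carries extra high bits, so the accumulator and therefore the final hash differ.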
Behaviour notes
- Dirty CPU scheduler for inputs above 20 KB. In practice hash inputs are small (partition keys are typically tens to hundreds of bytes), but the threshold protects against scheduler hogs on large inputs.
- Inline path reduction accounting via enif_consume_timeslice, proportional to bytes processed. Cost model: ~500 bytes/reduction (calibrated for ~5 GB/s hash throughput), 4000-reduction timeslice.
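
The arithmetic behind these two notes can be sketched in C as follows. This is a rough model built only from the numbers quoted above, not the NIF's actual source: the macro and function names are hypothetical, 20 KB is assumed to mean 20 * 1024 bytes, and the clamp to 1..100 is an assumption based on the percentage range that enif_consume_timeslice accepts.

```c
#include <stddef.h>

/* Constants taken from the notes above; names are illustrative. */
#define DIRTY_THRESHOLD_BYTES (20 * 1024) /* assumes 1 KB = 1024 bytes */
#define BYTES_PER_REDUCTION   500
#define REDUCTIONS_PER_SLICE  4000

/* Returns nonzero when an input of `len` bytes is above the threshold
   and would be dispatched to a dirty CPU scheduler. */
int use_dirty_scheduler(size_t len) {
    return len > DIRTY_THRESHOLD_BYTES;
}

/* Returns the share of a timeslice, in percent, that hashing `len`
   bytes inline would account for under the stated cost model.
   The result is clamped to 1..100 (assumed clamping policy). */
int timeslice_percent(size_t len) {
    size_t reductions = len / BYTES_PER_REDUCTION;
    size_t percent = reductions * 100 / REDUCTIONS_PER_SLICE;
    if (percent < 1) return 1;
    if (percent > 100) return 100;
    return (int)percent;
}
```

Under these assumptions a 20 KB input costs about 40 reductions, roughly 1% of a 4000-reduction timeslice, so every input small enough to stay on the inline path reports the minimum percentage.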
Build
rebar3 compile runs c_src/build.sh:
- Resolves ERTS_INCLUDE_DIR via erl -noshell -eval ... -s init stop (option order is correct for OTP 27+).
- Compiles c_src/murmur_nif.c + c_src/murmur3/murmur3.c with -O3 -march=native.
- Outputs priv/murmur_nif.so.
Env vars honored:
| Var | Effect |
|---|---|
| ERTS_INCLUDE_DIR | Skip the erl probe; use this path for erl_nif.h. |
| CC | Compiler (default cc). |
| CFLAGS | Extra flags appended after defaults. |
| MURMUR_NIF_NO_NATIVE | If set, omit -march=native/-mtune=native (use for portable cross-platform builds). |
License
The Erlang wrapper code (src/, c_src/murmur_nif.c) is MIT.
The MurmurHash3 algorithm in c_src/murmur3/ was written by Austin
Appleby and placed in the public domain. The Cassandra-compatible
variant uses signed integer arithmetic to match Java's reference
implementation; the algorithmic modification is trivial enough to
remain in the public domain alongside the upstream code.