Fuzler

A tiny, Rust‑powered string‑similarity helper for Elixir.

Fuzler gives you one public function:

Fuzler.similarity_score(query :: String.t(), target :: String.t()) :: float

It returns a normalised score in the range [0.0, 1.0] that tells you how closely two pieces of text match. The score is robust to typos, word-order swaps, case differences and basic punctuation.

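Behind the scenes it calls a compiled Rust NIF that mixes several string-similarity measures, including Hamming and Levenshtein edit distances, a windowed partial ratio and token-bag Jaccard.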

The result is symmetric (score(a,b) ≈ score(b,a)), length‑normalised and remains meaningful from single words to multi‑sentence paragraphs.
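To make "symmetric" and "length-normalised" concrete, here is a minimal Rust sketch of an edit-distance similarity. It is an illustration only, not Fuzler's actual NIF code: the names `levenshtein`, `normalise` and `edit_similarity` are hypothetical, and the real library blends several measures rather than using this one alone.

```rust
/// Two-row dynamic-programming Levenshtein distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed prefix of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0usize; b.len() + 1];
    for i in 1..=a.len() {
        curr[0] = i;
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let deletion = prev[j] + 1;
            let insertion = curr[j - 1] + 1;
            let substitution = prev[j - 1] + cost;
            curr[j] = deletion.min(insertion).min(substitution);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Hypothetical pre-processing: lowercase and drop punctuation.
fn normalise(s: &str) -> String {
    s.chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .flat_map(|c| c.to_lowercase())
        .collect()
}

/// Similarity in [0.0, 1.0]: 1 minus the distance divided by the longer length.
/// Dividing by the longer string keeps the score comparable across lengths,
/// and Levenshtein distance is symmetric, so the score is too.
fn edit_similarity(a: &str, b: &str) -> f64 {
    let (a, b) = (normalise(a), normalise(b));
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0; // two empty strings are treated as identical
    }
    1.0 - levenshtein(&a, &b) as f64 / longest as f64
}
```

With this sketch, `edit_similarity("Ciao!", "ciao")` is 1.0 because case and punctuation are removed first, while `edit_similarity("ciao", "cioa")` is 0.5 because plain Levenshtein counts a swap of two letters as two edits.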


Installation

Add to your mix.exs:

def deps do
  [
    {:fuzler, "~> 0.1.2"}
  ]
end

You need Rust ≥ 1.70 installed; rustler will compile the NIF automatically.


Quick examples

iex> Fuzler.similarity_score("ciao", "ciao")
1.0

iex> Fuzler.similarity_score("bella ciao", "ciao bella")
0.70       # same words, different order

iex> long_text = "bella ciao come va oggi spero che tu stia bene ..."
iex> Fuzler.similarity_score("ciao", long_text)
0.75       # query appears once inside a 40‑token paragraph

iex> Fuzler.similarity_score("bonjour", long_text)
0.12       # word not present

When should I use it?

| Use case | Why it works well |
| --- | --- |
| typo-tolerant autocomplete / "did-you-mean" | Hamming + Levenshtein catch small edits fast |
| matching short queries inside long blobs | windowed partial ratio focuses on the best slice |
| order-agnostic key comparison | token-bag Jaccard treats "ciao bella" = "bella ciao" |
| quick relevance scoring in Elixir | pure NIF call, no external service needed |

Fuzler is not a full-text search engine or a semantic synonym matcher; use Tantivy or embeddings for those jobs.
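The order-agnostic behaviour attributed to token-bag Jaccard can be sketched in a few lines of Rust. This is an assumption-laden illustration rather than Fuzler's implementation: `tokens` and `token_jaccard` are hypothetical names, and the sketch uses a set of tokens (duplicates collapse), whereas a true bag would also count repetitions.

```rust
use std::collections::HashSet;

/// Split on anything that is not a letter or digit, lowercase the pieces,
/// and collect them into a set (hypothetical tokenizer).
fn tokens(s: &str) -> HashSet<String> {
    s.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Jaccard index: shared tokens divided by all distinct tokens.
/// Word order never enters the computation.
fn token_jaccard(a: &str, b: &str) -> f64 {
    let (ta, tb) = (tokens(a), tokens(b));
    if ta.is_empty() && tb.is_empty() {
        return 1.0; // two empty inputs are treated as identical
    }
    let shared = ta.intersection(&tb).count();
    let total = ta.union(&tb).count();
    shared as f64 / total as f64
}
```

Here `token_jaccard("ciao bella", "bella ciao")` is 1.0, and `token_jaccard("ciao bella", "ciao mondo")` is one third (one shared token out of three distinct ones). Fuzler's published scores for swapped word order are lower than 1.0, which is consistent with it combining this kind of measure with order-sensitive ones.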


API

@doc "Returns a similarity score ∈ [0.0, 1.0]"
@spec similarity_score(String.t(), String.t()) :: float

If the NIF fails to load, calling the function raises:

:erlang.nif_error(:nif_not_loaded)

so your code can decide to fall back or skip tests.


How good is the score?

| Query / Target | Score ≈ |
| --- | --- |
| identical strings (any case / punctuation) | 1.00 |
| same words, swapped order | 0.68 – 0.72 |
| one-word query present once in 45-token paragraph | ~0.75 |
| one-word query absent from paragraph | ≤ 0.15 |
| 80-token paragraph vs same with 1 typo | ≥ 0.90 |
| "ciao bella" with +30 random filler tokens appended | ~0.58 |
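Why can a one-word query still score well against a long paragraph? A windowed partial ratio compares the query only against its best-matching slice of the target. The Rust sketch below shows the idea under stated assumptions: `ratio` and `partial_ratio` are hypothetical names, windows are measured in whitespace-separated tokens, and the numbers it produces will not match the table above because Fuzler mixes this with other measures.

```rust
/// Levenshtein distance over char slices, one DP row at a time.
fn levenshtein(a: &[char], b: &[char]) -> usize {
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let best = (prev[j + 1] + 1).min(curr[j] + 1).min(prev[j] + cost);
            curr.push(best);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Length-normalised edit similarity in [0.0, 1.0].
fn ratio(a: &str, b: &str) -> f64 {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let longest = a.len().max(b.len());
    if longest == 0 {
        return 1.0;
    }
    1.0 - levenshtein(&a, &b) as f64 / longest as f64
}

/// Slide a window as wide as the query (in tokens) across the target
/// and keep the best `ratio` found.
fn partial_ratio(query: &str, target: &str) -> f64 {
    let q: Vec<String> = query.split_whitespace().map(|t| t.to_lowercase()).collect();
    let t: Vec<String> = target.split_whitespace().map(|t| t.to_lowercase()).collect();
    if q.is_empty() || t.is_empty() {
        return if q.is_empty() && t.is_empty() { 1.0 } else { 0.0 };
    }
    let q_joined = q.join(" ");
    let width = q.len().min(t.len()); // at least 1 because both are non-empty
    t.windows(width)
        .map(|w| ratio(&q_joined, &w.join(" ")))
        .fold(0.0, f64::max)
}
```

In this sketch `partial_ratio("ciao", "bella ciao come va oggi")` is 1.0 regardless of how much text surrounds the match, whereas a whole-string ratio would shrink as the paragraph grows. A query whose tokens never appear, such as "bonjour" against the same text, stays low.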

Running the test suite

Running mix test executes a handful of ExUnit cases.

All similarity tests auto‑skip if the NIF isn’t loaded (e.g. on CI without Rust).


License

MIT License