FaissEx

Elixir NIF bindings for FAISS — Facebook's library for efficient similarity search and clustering of dense vectors.

Binds to FAISS via its official C API (libfaiss_c). No external dependencies beyond elixir_make.

Features

- Index creation via FAISS factory strings (Flat, IVF, HNSW, PQ, and more)
- Add, train, search, reconstruct, and residual computation
- Custom vector IDs via the IDMap wrapper
- Saving/loading indexes to disk, cloning, and resetting
- K-means clustering with centroid extraction and cluster assignment
- Optional CUDA GPU support

Prerequisites

- Elixir and Erlang/OTP
- cmake and a C++ compiler (when building FAISS from source)
- OpenMP (on macOS: brew install libomp)

Installation

Add faiss_ex to your dependencies in mix.exs:

def deps do
  [
    {:faiss_ex, "~> 0.1.0"}
  ]
end

Then fetch and compile:

mix deps.get
mix compile

Option A: Build FAISS from source (default)

By default, the first mix compile clones FAISS from GitHub and builds it from source. This takes ~5-15 minutes but requires no pre-installed FAISS. The result is cached at ~/.cache/faiss_ex/ — subsequent builds are fast.

# Just works — no extra setup needed (besides cmake and a C++ compiler)
mix compile

Option B: Use a system-installed FAISS

If you already have FAISS installed (via a package manager or custom build), point FAISS_PREFIX at it to skip building from source entirely:

# macOS (Homebrew)
brew install faiss
FAISS_PREFIX=$(brew --prefix faiss) mix compile

# Ubuntu/Debian
sudo apt install libfaiss-dev
FAISS_PREFIX=/usr mix compile

# Conda
conda install -c pytorch faiss-cpu
FAISS_PREFIX=$CONDA_PREFIX mix compile

# Custom install location
FAISS_PREFIX=/opt/faiss mix compile

FAISS_PREFIX should point to a directory containing include/ (with c_api/ headers) and lib/ (with libfaiss and libfaiss_c shared libraries).

Note: Most system packages ship libfaiss but not libfaiss_c (the C API wrapper). If you get linker errors about libfaiss_c, you may need to build from source — just omit FAISS_PREFIX and let FaissEx handle it.
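Before pointing FAISS_PREFIX at a package-manager install, it can save a confusing linker error to check for the C API wrapper up front. A quick pre-flight check (an illustrative snippet, not part of FaissEx):

```elixir
# Does a candidate prefix actually ship the libfaiss_c shared library?
# Falls back to /usr/local when FAISS_PREFIX is unset.
prefix = System.get_env("FAISS_PREFIX", "/usr/local")

case Path.wildcard(Path.join(prefix, "lib/libfaiss_c.*")) do
  [] -> IO.puts("no libfaiss_c under #{prefix}/lib; build from source instead")
  libs -> IO.puts("found: #{Enum.join(libs, ", ")}")
end
```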

Build configuration

Variable         Default  Description
FAISS_PREFIX     (unset)  Path to a system FAISS install. When set, skips building from source.
FAISS_OPT_LEVEL  generic  SIMD optimization level (see below). Only used when building from source.
USE_CUDA         false    Set to true to enable GPU support.
FAISS_GIT_REPO   https://github.com/facebookresearch/faiss.git  FAISS git repository (ignored when FAISS_PREFIX is set).
FAISS_GIT_REV    v1.10.0  FAISS version tag or commit (ignored when FAISS_PREFIX is set).

SIMD optimization

FAISS uses SIMD instructions for fast distance computation. By default it builds with generic (portable) code. Set FAISS_OPT_LEVEL to enable optimized codepaths for your CPU:

Value    Platform  Instructions
generic  Any       Portable, no special instructions
avx2     x86-64    AVX2 + FMA (most Intel/AMD since ~2015)
avx512   x86-64    AVX-512 (Intel Xeon, AMD Zen 4+)
sve      aarch64   SVE (ARM Neoverse V1+)

# Build with AVX2 optimizations
FAISS_OPT_LEVEL=avx2 mix compile

# Force a fresh FAISS build after changing opt level
rm -rf ~/.cache/faiss_ex && FAISS_OPT_LEVEL=avx2 mix compile

Note: Apple Silicon (M1-M4) uses NEON which is always enabled — FAISS_OPT_LEVEL has no effect on ARM macOS.

Usage

All functions accept plain lists (single vector) or lists of lists (batch):

# Single vector
FaissEx.Index.add(index, [1.0, 2.0, 3.0])
# Batch of vectors
FaissEx.Index.add(index, [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

Creating an index and searching

# Create a flat L2 index with 128 dimensions
{:ok, index} = FaissEx.Index.new(128, "Flat")

# Add vectors
:ok = FaissEx.Index.add(index, [[0.1, 0.2, ...], [0.3, 0.4, ...]])

# Search for 5 nearest neighbors
{:ok, %{distances: distances, labels: labels}} = FaissEx.Index.search(index, query, 5)
# distances: [[float]] — n rows of k distances
# labels: [[integer]] — n rows of k vector indices

Index types

FAISS uses index factory strings to create different index types:

# Flat (exact search, no training needed)
{:ok, index} = FaissEx.Index.new(128, "Flat")

# IVF with flat quantizer (needs training)
{:ok, index} = FaissEx.Index.new(128, "IVF256,Flat")

# HNSW graph-based index
{:ok, index} = FaissEx.Index.new(128, "HNSW32")

# Inner product metric instead of L2
{:ok, index} = FaissEx.Index.new(128, "Flat", metric: :inner_product)

Training (IVF, PQ indexes)

Some index types require training before adding vectors:

{:ok, index} = FaissEx.Index.new(128, "IVF256,Flat")

# Train on representative data (list of lists)
training_data = for _ <- 1..10_000, do: for(_ <- 1..128, do: :rand.uniform())
:ok = FaissEx.Index.train(index, training_data)

# Now add and search
:ok = FaissEx.Index.add(index, training_data)
{:ok, results} = FaissEx.Index.search(index, query, 10)

Adding vectors with IDs

{:ok, index} = FaissEx.Index.new(128, "IDMap,Flat")

vectors = for _ <- 1..100, do: for(_ <- 1..128, do: :rand.uniform())
ids = Enum.to_list(1000..1099)
:ok = FaissEx.Index.add_with_ids(index, vectors, ids)

Saving and loading

:ok = FaissEx.Index.to_file(index, "/tmp/my_index.faiss")
{:ok, loaded} = FaissEx.Index.from_file("/tmp/my_index.faiss")

Cloning and resetting

{:ok, copy} = FaissEx.Index.clone(index)
:ok = FaissEx.Index.reset(index)  # clear all vectors

Index properties

{:ok, dim} = FaissEx.Index.dim(index)
{:ok, count} = FaissEx.Index.ntotal(index)
{:ok, trained?} = FaissEx.Index.trained?(index)

Reconstructing vectors

{:ok, vectors} = FaissEx.Index.reconstruct(index, [0, 5, 10])
# vectors: [[float]] — 3 rows of dim floats

Computing residuals

Compute the difference between vectors and their quantized reconstructions:

{:ok, index} = FaissEx.Index.new(4, "Flat")
vectors = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
:ok = FaissEx.Index.add(index, vectors)

{:ok, residuals} = FaissEx.Index.compute_residuals(index, vectors, [0, 1])
# For a flat index, residuals are zero (exact reconstruction)
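The residual is simply the element-wise difference between a vector and its reconstruction from the index. A sketch of that definition in plain Elixir (not the NIF itself):

```elixir
# residual = vector - reconstruction, element by element
vector = [1.0, 2.0, 3.0, 4.0]
reconstruction = [1.0, 2.0, 3.0, 4.0]  # a Flat index stores vectors exactly

residual = Enum.zip_with(vector, reconstruction, fn x, r -> x - r end)
# residual == [0.0, 0.0, 0.0, 0.0]
```

With a quantizing index (PQ, SQ) the reconstruction is lossy, so the residuals are generally non-zero.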

Batch search

Search with multiple query vectors at once for better throughput:

{:ok, index} = FaissEx.Index.new(128, "Flat")
:ok = FaissEx.Index.add(index, corpus_vectors)

# Search 100 queries at once
{:ok, %{distances: distances, labels: labels}} = FaissEx.Index.search(index, batch_of_queries, 10)
# distances: [[float]] — 100 rows of 10 distances
# labels: [[integer]] — 100 rows of 10 vector IDs

Example: Semantic Search with External Embeddings

FaissEx works with embeddings from any source — OpenAI, Cohere, Bumblebee, etc. Just pass them as lists of floats:

# Embeddings from any source (OpenAI, Cohere, Voyage, etc.)
embeddings = [
  [0.023, -0.041, 0.067, ...],  # 1536-dim for text-embedding-3-small
  [0.011, -0.032, 0.089, ...],
  # ...
]

dim = length(hd(embeddings))

# For cosine similarity, normalize vectors to unit length first
normalize = fn vec ->
  norm = :math.sqrt(Enum.reduce(vec, 0.0, fn x, acc -> acc + x * x end))
  Enum.map(vec, &(&1 / norm))
end

normalized = Enum.map(embeddings, normalize)

{:ok, index} = FaissEx.Index.new(dim, "Flat", metric: :inner_product)
:ok = FaissEx.Index.add(index, normalized)

# Query with a new embedding
query = normalize.(query_floats)
{:ok, %{distances: scores, labels: indices}} = FaissEx.Index.search(index, query, 10)
# scores: [[float]] — similarity scores (higher = more similar)
# indices: [[integer]] — vector IDs

Choosing an Index

Index                    Training  Memory  Speed               Recall  Best for
"Flat"                   No        High    Slow (brute force)  100%    < 100K vectors, exact results
"IVFx,Flat"              Yes       High    Fast                ~95%    100K-10M vectors
"HNSWx"                  No        High    Very fast           ~99%    Read-heavy workloads
"IVFx,PQy"               Yes       Low     Fast                ~90%    1M+ vectors, memory constrained
"IVFx,SQfp16"            Yes       Medium  Fast                ~97%    1M+ vectors, good recall/memory balance
"Flat" + :inner_product  No        High    Slow                100%    Cosine similarity (normalize first)

x = number of IVF clusters (typical: 4*sqrt(n)), y = PQ subquantizers (typical: dim/4).
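Turning those heuristics into a concrete factory string is a one-liner. The helper logic below is illustrative, not a FaissEx API:

```elixir
# Pick IVF cluster count and PQ subquantizers from corpus size and dimension.
n = 1_000_000
dim = 128

nlist = round(4 * :math.sqrt(n))  # IVF clusters: ~4*sqrt(n)
m = div(dim, 4)                   # PQ subquantizers: dim/4

factory = "IVF#{nlist},PQ#{m}"
# factory == "IVF4000,PQ32"
```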

Practical guidelines

# Cosine similarity search — normalize vectors to unit length, then use inner product
{:ok, index} = FaissEx.Index.new(128, "Flat", metric: :inner_product)
:ok = FaissEx.Index.add(index, normalized_vectors)
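A sketch of why normalization works: for unit-length vectors, the inner product equals cosine similarity, so an :inner_product index over normalized data ranks by cosine. The names below are local helpers, not FaissEx APIs:

```elixir
dot = fn a, b -> Enum.zip_with(a, b, fn x, y -> x * y end) |> Enum.sum() end
norm = fn v -> :math.sqrt(dot.(v, v)) end
normalize = fn v -> Enum.map(v, &(&1 / norm.(v))) end

a = [3.0, 4.0]
b = [4.0, 3.0]

cosine = dot.(a, b) / (norm.(a) * norm.(b))
ip_normalized = dot.(normalize.(a), normalize.(b))
# both are ~0.96: inner product of unit vectors == cosine similarity
```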

IDMap wrapper

FAISS indexes assign sequential IDs by default (0, 1, 2...). Wrap with "IDMap," to use your own IDs:

# Custom IDs with any index type
{:ok, index} = FaissEx.Index.new(128, "IDMap,Flat")
{:ok, index} = FaissEx.Index.new(128, "IDMap,HNSW32")

:ok = FaissEx.Index.add_with_ids(index, vectors, [42, 99, 1337])

{:ok, %{labels: labels}} = FaissEx.Index.search(index, query, 5)
# labels now contain your custom IDs (42, 99, 1337...)

K-means Clustering

Basic clustering

# Cluster 128-dimensional vectors into 10 groups
{:ok, clustering} = FaissEx.Clustering.new(128, 10)

data = for _ <- 1..5000, do: for(_ <- 1..128, do: :rand.uniform())
{:ok, trained} = FaissEx.Clustering.train(clustering, data)

{:ok, centroids} = FaissEx.Clustering.get_centroids(trained)
# centroids: [[float]] — 10 rows of 128 floats, center of each cluster

Assigning vectors to clusters

# Which cluster does each vector belong to?
{:ok, %{labels: labels, distances: distances}} =
  FaissEx.Clustering.get_cluster_assignment(trained, data)
# labels: [[integer]] — 5000 rows of 1 cluster ID each
# distances: [[float]] — 5000 rows of 1 distance each
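Since each row of labels holds a single cluster ID, tallying cluster sizes is plain list processing (a sketch, not a FaissEx function):

```elixir
# Labels have the shape returned by get_cluster_assignment: one-element rows.
labels = [[0], [2], [0], [1], [0]]

counts = labels |> List.flatten() |> Enum.frequencies()
# counts == %{0 => 3, 1 => 1, 2 => 1}
```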

Clustering for preprocessing

Use clustering to build an IVF quantizer or to reduce a dataset:

# Cluster embeddings, then keep only vectors near centroids
{:ok, clustering} = FaissEx.Clustering.new(dim, num_clusters)
{:ok, trained} = FaissEx.Clustering.train(clustering, embeddings)
{:ok, %{labels: labels, distances: dists}} =
  FaissEx.Clustering.get_cluster_assignment(trained, embeddings)

# Filter to vectors within distance threshold
threshold = 10.0
nearby =
  embeddings
  |> Enum.zip(List.flatten(dists))
  |> Enum.filter(fn {_vec, dist} -> dist < threshold end)
  |> Enum.map(fn {vec, _dist} -> vec end)

Thread Safety

Architecture

FaissEx uses Erlang NIFs (Native Implemented Functions) to call FAISS through its C API:

Elixir (FaissEx.Index) → NIF stubs (FaissEx.NIF) → C (faiss_ex_nif.c) → libfaiss_c → libfaiss (C++)

Troubleshooting

OpenMP not found during build

On macOS, install libomp:

brew install libomp

Library not loaded: libfaiss_c.dylib

The shared libraries weren't installed correctly. Clean and rebuild:

mix clean
mix compile

Build takes too long

The first build compiles FAISS from source. This is cached at ~/.cache/faiss_ex/. To force a clean FAISS rebuild:

rm -rf ~/.cache/faiss_ex/
mix compile

cmake not found

# macOS
brew install cmake

# Ubuntu/Debian
sudo apt install cmake

GPU Support

Building with CUDA

USE_CUDA=true mix compile

This requires the CUDA toolkit to be installed. The build adds -DFAISS_ENABLE_GPU=ON to cmake and compiles GPU-specific NIF functions.

Moving indexes to GPU

# Create index on CPU first
{:ok, index} = FaissEx.Index.new(128, "Flat")
:ok = FaissEx.Index.add(index, vectors)

# Move to GPU (device 0)
{:ok, gpu_index} = FaissEx.Index.cpu_to_gpu(index, 0)

# Search on GPU (faster for large datasets)
{:ok, results} = FaissEx.Index.search(gpu_index, query, 10)

# Move back to CPU (e.g. for saving to file)
{:ok, cpu_index} = FaissEx.Index.gpu_to_cpu(gpu_index)
:ok = FaissEx.Index.to_file(cpu_index, "/tmp/index.faiss")

GPU at index creation time

{:ok, gpu_index} = FaissEx.Index.new(128, "Flat", device: {:cuda, 0})

Checking GPU availability

{:ok, num_gpus} = FaissEx.NIF.nif_get_num_gpus()
# Returns 0 if not compiled with CUDA or no GPUs available

GPU lifecycle notes

Running GPU tests

USE_CUDA=true mix test --include cuda

Benchmarks

Measured on Apple M4 Max (16 cores, 64 GB RAM), Elixir 1.20.0-rc.1, OTP 28.3, FAISS v1.10.0. Index: 10,000 vectors, 128 dimensions, Flat L2.

Operation                      ips       avg        median     99th %
reconstruct 1 vector           215.33 K  4.64 μs    4.75 μs    11 μs
add 1 vector                   133.28 K  7.50 μs    6.92 μs    14.54 μs
reconstruct 10 vectors         23.57 K   42.42 μs   42.21 μs   79.14 μs
compute_residuals 1 vector     17.00 K   58.83 μs   55.92 μs   124.25 μs
search k=10, 1 query           12.05 K   82.95 μs   80.92 μs   104.25 μs
compute_residuals 10 vectors   7.72 K    129.46 μs  124.46 μs  217.88 μs
search k=10, 10 queries        5.13 K    194.90 μs  190.42 μs  266.53 μs
reconstruct 100 vectors        1.89 K    528.36 μs  499.75 μs  930.06 μs
compute_residuals 100 vectors  1.11 K    903.15 μs  866.35 μs  1325.00 μs
search k=10, 100 queries       0.58 K    1.72 ms    1.72 ms    1.96 ms
add 1000 vectors               0.38 K    2.65 ms    2.58 ms    3.67 ms
kmeans k=10, 1000 vectors      0.108 K   9.22 ms    9.00 ms    13.87 ms
kmeans k=50, 1000 vectors      0.098 K   10.18 ms   9.97 ms    14.32 ms
add 10000 vectors              0.018 K   56.52 ms   55.05 ms   71.46 ms

Run benchmarks yourself:

mix run bench/faiss_ex_bench.exs

License

Apache License 2.0.