SafeNIF

License: MIT

Wrap your untrusted NIFs so that they can never crash your node.

SafeNIF Is Experimental

SafeNIF is in early development and subject to changes in the behaviour and the API.

For now, it can wrap any function or MFA that might crash the BEAM node, so that the call runs in isolation and cannot take your node down. Each call currently pays the cost of peer node startup and code loading; warm node pooling is in development to reduce that cost.

Benchmarks

Benchmarks can be found in the bench directory. As of v0.1.0, SafeNIF has not implemented pooling of peer nodes. This means that it currently incurs the high cost of starting up a peer node for every call, which can take anywhere from 100ms to over a second, depending on how much code needs to be loaded onto the peer node. You can see from the benchmarks that out of the three methods benchmarked (CLI+Port, NIF, SafeNIF), SafeNIF is currently the slowest due to this incurred cost.

Pooling is planned for v0.2.0 and should make this far more efficient, since the startup cost will be paid once per node created rather than once per call. Note that pooling brings different costs, namely memory and CPU, since it spins up nodes on the same machine.

The following information was generated by Claude and reviewed by @probably-not. If you find issues in this README, feel free to open a PR to fix them!

The Problem

NIFs (Native Implemented Functions) are powerful but dangerous. A buggy or malicious NIF can crash your entire BEAM node, taking down all processes and connections with it. There's no way to catch or recover from a NIF crash - your node simply dies.

SafeNIF solves this by running untrusted code on isolated peer nodes. If the NIF crashes, only the peer dies. Your main node continues running, and you get a clean error tuple back.

Usage

Basic Usage

SafeNIF provides a single function: SafeNIF.wrap/2. Pass it an MFA (module, function, arguments) tuple and it runs on an isolated peer node:

# Successful execution returns {:ok, result}
{:ok, 6} = SafeNIF.wrap({Kernel, :+, [2, 4]})

# Complex return values work fine
{:ok, %{name: "test"}} = SafeNIF.wrap({Map, :put, [%{}, :name, "test"]})

Wrapping Potentially Dangerous NIFs

The primary use case is wrapping NIFs that might crash:

defmodule MyApp.ImageProcessor do
  def safe_process(image_binary) do
    # UntrustedNIF.process/1 might crash the BEAM
    case SafeNIF.wrap({UntrustedNIF, :process, [image_binary]}) do
      {:ok, processed} -> 
        {:ok, processed}
      {:error, :noconnection} -> 
        # The NIF crashed the peer node
        {:error, :nif_crashed}
      {:error, :timeout} -> 
        {:error, :processing_timeout}
      {:error, reason} -> 
        {:error, reason}
    end
  end
end

Timeouts

The default timeout is 5 seconds. Specify a custom timeout as the second argument using to_timeout/1:

# 30 second timeout for long-running operations
SafeNIF.wrap({HeavyComputation, :run, [data]}, to_timeout(second: 30))

# 2 minute timeout for very long operations
SafeNIF.wrap({BatchJob, :process, [items]}, to_timeout(minute: 2))

# 500ms timeout for quick operations
SafeNIF.wrap({QuickCheck, :validate, [input]}, to_timeout(millisecond: 500))

When a timeout occurs, the peer node is killed and {:error, :timeout} is returned.
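As a quick illustration of what the timeout argument contains: to_timeout/1 (part of Kernel since Elixir 1.17) converts unit keywords into a plain number of milliseconds. The snippet below uses only the standard library and does not call SafeNIF.

```elixir
# to_timeout/1 converts unit keywords into milliseconds.
thirty_seconds = to_timeout(second: 30)
two_minutes = to_timeout(minute: 2)
half_second = to_timeout(millisecond: 500)

# Units can be combined in a single call.
mixed = to_timeout(minute: 1, second: 30)

IO.inspect({thirty_seconds, two_minutes, half_second, mixed})
# {30000, 120000, 500, 90000}
```

Since the result is an ordinary integer, to_timeout(second: 30) and 30_000 are the same value.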

Anonymous Functions

Anonymous functions are supported but with an important caveat: the module that defines the function must be loadable on the peer node.

# Works
SafeNIF.wrap(fn -> 1 + 1 end)

# Works (application modules are loaded on the peer)
SafeNIF.wrap(fn -> MyApp.Worker.do_work() end)

# May fail if defined inside a code path that is not part of the application.
defmodule MyTest do
  def run_test do
    SafeNIF.wrap(fn -> :test_result end)
  end
end

For maximum reliability, prefer MFA tuples over anonymous functions.
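The caveat comes from how the BEAM represents functions: an anonymous function is tied to the module that defined it, so that module has to be loadable wherever the function runs. The sketch below shows this with Function.info/2 from the standard library. FunOrigin is a hypothetical module used only for illustration.

```elixir
defmodule FunOrigin do
  # Hypothetical module, used only to give the anonymous function a home.
  def make, do: fn -> :test_result end
end

local = FunOrigin.make()
remote = &String.upcase/1

# A local (anonymous) function belongs to the module that defined it...
IO.inspect(Function.info(local, :module))
# {:module, FunOrigin}
IO.inspect(Function.info(local, :type))
# {:type, :local}

# ...while a remote capture, like an MFA tuple, only names module/function/arity.
IO.inspect(Function.info(remote, :type))
# {:type, :external}
```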

Error Handling

SafeNIF returns tagged tuples to distinguish between successful results and failures:

case SafeNIF.wrap({SomeModule, :some_function, [arg]}) do
  {:ok, result} ->
    # Function executed successfully, result is the return value
    handle_success(result)
    
  {:error, :timeout} ->
    # Function exceeded the timeout
    handle_timeout()
    
  {:error, :noconnection} ->
    # Peer node crashed (NIF crash, :erlang.halt, etc.)
    handle_crash()
    
  {:error, :not_alive} ->
    # Current node isn't running in distributed mode
    handle_not_distributed()
    
  {:error, reason} ->
    # Function raised/exited with reason
    handle_error(reason)
end

Note that if your wrapped function returns an error tuple, it's wrapped in {:ok, ...}:

# Function returns {:error, :not_found}
{:ok, {:error, :not_found}} = SafeNIF.wrap({MyModule, :find, [123]})

This follows the same convention as Task.async_stream/5.
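For comparison, here is the same wrapping behaviour in Task.async_stream/3 from the standard library: a function that returns an error tuple still completes successfully, so its return value arrives tagged with :ok.

```elixir
results =
  [123]
  |> Task.async_stream(fn _id -> {:error, :not_found} end)
  |> Enum.to_list()

# The task finished normally, so its return value is wrapped in :ok.
IO.inspect(results)
# [ok: {:error, :not_found}]
```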

Requirements

Distributed Mode

SafeNIF requires your node to be running in distributed mode. If you call SafeNIF.wrap/2 on a non-distributed node, you'll get {:error, :not_alive}.

For development, start IEx with a node name:

iex --sname myapp -S mix

For production releases, ensure your node is started with distribution enabled.
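If you want to fail early, for example at application start, you can check for distribution yourself with Node.alive?/0. The following is a minimal sketch; DistributionCheck is a hypothetical helper and not part of SafeNIF.

```elixir
defmodule DistributionCheck do
  # Hypothetical helper, not part of SafeNIF.
  # Mirrors the {:error, :not_alive} shape described above.
  def ensure_alive(alive? \\ Node.alive?()) do
    if alive?, do: :ok, else: {:error, :not_alive}
  end
end

# On a node started without --sname/--name, node() is :nonode@nohost
# and Node.alive?() returns false.
IO.inspect({node(), Node.alive?(), DistributionCheck.ensure_alive()})
```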

Running Tests

Tests require distribution. Add this to your test/test_helper.exs:

{:ok, _} = Node.start(:test, :shortnames)
ExUnit.start()

Or run tests with:

elixir --sname test -S mix test

How It Works

When you call SafeNIF.wrap/2:

  1. A new BEAM node is started as a hidden peer using OTP's :peer module
  2. All code paths and application configuration are copied to the peer
  3. Applications are started on the peer
  4. Your function executes on the peer node
  5. The result is sent back via Erlang distribution
  6. The peer node shuts down
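The steps above can be sketched directly against OTP's :peer module. This is a simplified illustration under stated assumptions, not SafeNIF's actual implementation: PeerSketch is a hypothetical module, it uses the :standard_io control connection so that it also runs on a non-distributed node, and it omits the -hidden flag, application configuration, and application startup (step 3).

```elixir
defmodule PeerSketch do
  # Illustration only: a hypothetical stand-in, NOT SafeNIF's implementation.

  def run({mod, fun, args}, timeout \\ 5_000) do
    # Step 1: start a peer node. The :standard_io connection is an assumption
    # of this sketch, so that the origin node need not be distributed.
    peer =
      case :peer.start(%{connection: :standard_io}) do
        {:ok, pid, _node} -> pid
        {:ok, pid} -> pid
      end

    try do
      # Step 2: copy this node's code paths so the peer can load the same modules.
      :peer.call(peer, :code, :add_paths, [:code.get_path()])

      # Steps 4-5: run the MFA on the peer and bring the result back.
      {:ok, :peer.call(peer, mod, fun, args, timeout)}
    catch
      # Anything that goes wrong (exception, exit, timeout) becomes an error tuple.
      kind, reason -> {:error, {kind, reason}}
    after
      # Step 6: shut the peer down if it is still running.
      if Process.alive?(peer), do: :peer.stop(peer)
    end
  end
end

IO.inspect(PeerSketch.run({:erlang, :+, [2, 4]}))
```

Because the MFA runs in a separate OS process, a crash there terminates only the peer; the calling node sees a failed call rather than going down itself.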

Hidden Nodes

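Peer nodes are started with the -hidden flag. Hidden nodes do not appear in Node.list/0 by default, and connections to them are not transitive, so the other nodes in your cluster do not automatically connect to them.

This prevents SafeNIF's ephemeral peers from interfering with your cluster topology.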

Performance Considerations

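Starting a peer node is expensive. Each call to SafeNIF.wrap/2 incurs the cost of starting a new BEAM node, copying code paths and application configuration to it, and starting applications on it.

This can take 500ms-2s depending on your application size. SafeNIF is designed for isolation, not performance. Use it for calls where surviving a NIF crash matters more than latency; don't use it for frequent or latency-sensitive calls.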

Note: Warm node pooling is planned for a future release to amortize startup costs across multiple calls.

Installation

SafeNIF is available on Hex.

To install, add it to your dependencies in your project's mix.exs:

def deps do
  [
    {:safe_nif, ">= 0.0.1"}
  ]
end

Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/safe_nif.