NatSet

Build StatusHex.pmHex.pm

This is an Elixir library for working with sets of natural numbers (i.e. non-negative integers). It uses bitwise operations to represent such sets much more compactly than you would get when using standard MapSets.

Documentation: http://hexdocs.pm/nat_set.

Usage

Add this library to your list of dependencies in mix.exs:

def deps do
  [{:nat_set, "~> 0.0.1"}]
end

Then run mix deps.get in your shell to fetch and compile nat_set.

Examples

Start an interactive Elixir shell with iex -S mix.

iex> NatSet.new([1, 2, 0, 1, 3])
#NatSet<[0, 1, 2, 3]>

iex> multiples_of_4 = 1..100_000 |> Enum.filter(&rem(&1, 4) == 0) |> NatSet.new
#NatSet<[4, 8, 12, 16, 20, ...]>

iex> multiples_of_6 = 1..100_000 |> Enum.filter(&rem(&1, 6) == 0) |> NatSet.new
#NatSet<[6, 12, 18, 24, 30, ...]>

iex> NatSet.intersection(multiples_of_4, multiples_of_6) |> NatSet.size
8333

See the documentation for more available functionality.

Benchmarks

The following test gives a rough idea of how NatSet's performance compares to that of MapSet for storing natural numbers.

test "compare efficiency of MapSets and NatSets" do
  max = 100_000

  {mus, map_set} = :timer.tc(fn -> Enum.into(1..max, MapSet.new) end)
  IO.puts("MapSet took #{mus |> secs} seconds and is #{map_set |> size_in_kb} kb")

  {mus, nat_set} = :timer.tc(fn -> Enum.into(1..max, NatSet.new) end)
  IO.puts("NatSet took #{mus |> secs} seconds and is #{nat_set |> size_in_kb} kb")
end

defp size_in_kb(term), do: :erts_debug.size(term) * :erlang.system_info(:wordsize) / 1024.0
defp secs(mus), do: mus / 1_000_000

This produced the following on my machine:

MapSet took 0.06468 seconds and is 2947.375 kb
NatSet took 0.043118 seconds and is 26.7890625 kb

Results

The results below are produced by Benchfella using the tests in nat_set_bench.exs.

difference using NatSet          100000   15.28 µs/op
difference using MapSet            2000   897.59 µs/op

disjoint? using NatSet         10000000   0.87 µs/op
disjoint? using MapSet            50000   34.60 µs/op

equal? using NatSet           100000000   0.06 µs/op
equal? using MapSet           100000000   0.06 µs/op

intersection using NatSet        100000   16.00 µs/op
intersection using MapSet          5000   389.47 µs/op

member? using NatSet               1000   2336.44 µs/op
member? using MapSet               1000   1307.68 µs/op

put and delete using NatSet         500   5381.54 µs/op
put and delete using MapSet         500   5378.17 µs/op

size using NatSet                   500   3163.51 µs/op
size using MapSet             100000000   0.03 µs/op

subset? using NatSet             500000   7.47 µs/op
subset? using MapSet               5000   302.90 µs/op

union using NatSet               100000   12.55 µs/op
union using MapSet                20000   86.31 µs/op