Snowball

Snowball string-processing language compiler and runtime for Elixir.

Snowball is a small string-processing language designed for creating stemming algorithms for use in information retrieval. This package compiles .sbl source files into Elixir modules and provides the runtime support functions that the generated modules call into.

This package does not bundle pre-compiled stemmers. For the canonical Snowball algorithms compiled to Elixir modules (English, French, German, and 30+ more), see the companion text_stemmer package.

What's in the box

Installation

Add :snowball to your mix.exs deps:

def deps do
  [
    {:snowball, "~> 0.1"}
  ]
end

Generating stemmers from .sbl sources

Drop your Snowball sources in src/algorithms/ and run:

mix snowball.gen

By default this generates Snowball.Stemmers.<Lang> modules into lib/snowball/stemmers/. Override the module prefix, the output directory, or the sources directory with the relevant flag:

mix snowball.gen --module-prefix MyApp.Stemmers \
                 --output-dir lib/my_app/stemmers \
                 --algorithms-dir priv/snowball
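
To make the naming convention concrete, here is a minimal sketch of how an algorithm name might map to a module name and an output file. NamingSketch and both of its functions are hypothetical helpers written for illustration and are not part of the :snowball package. The camelized module suffix and the <lang>.ex file name are assumptions. Only the default prefix and default output directory come from the text above.

```elixir
defmodule NamingSketch do
  # Hypothetical illustration only; not part of the :snowball API.
  # The defaults mirror the ones described for `mix snowball.gen`.
  @default_prefix "Snowball.Stemmers"
  @default_output_dir "lib/snowball/stemmers"

  # Assumption: the module suffix is the camelized algorithm name.
  def module_name(algorithm, prefix \\ @default_prefix) do
    Module.concat([prefix, Macro.camelize(algorithm)])
  end

  # Assumption: each generated module is written to <output_dir>/<algorithm>.ex.
  def output_path(algorithm, output_dir \\ @default_output_dir) do
    Path.join(output_dir, algorithm <> ".ex")
  end
end

NamingSketch.module_name("english")
# => Snowball.Stemmers.English
NamingSketch.module_name("french", "MyApp.Stemmers")
# => MyApp.Stemmers.French
NamingSketch.output_path("french", "lib/my_app/stemmers")
# => "lib/my_app/stemmers/french.ex"
```

Check the files actually produced by mix snowball.gen before relying on these names, since the real task may use a different mapping.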

You can also generate specific algorithms by name:

mix snowball.gen english french

The generated modules depend only on Snowball.Runtime at runtime, so adding :snowball to your deps is sufficient. See text_stemmer for the canonical implementation of Snowball-based stemmers.

Documentation

Full API documentation is published at https://hexdocs.pm/snowball.

License

Apache-2.0. See LICENSE.md.