Khafra Search

Khafra lets you deploy and maintain a Sphinx search cluster, similar in purpose to Elasticsearch. You will want to use Khafra if:

Khafra is a dependency that can be added to any Elixir or Phoenix Framework project. It uses Quantum to handle job execution schedules (for non-real-time indexing use cases).

To query your running Sphinx environment, you can use the Giza Sphinx Client for Elixir.

Installation

def deps do
  [
    {:khafra_search, "~> 0.1"}
  ]
end

# Add to your application or supervisor
def start(_type, _args) do
  import Supervisor.Spec

  # List all child processes to be supervised
  children = [
    ...,
    supervisor(Khafra.Supervisor, [])
  ]

  opts = [strategy: :one_for_one, name: YourApp.Supervisor]
  Supervisor.start_link(children, opts)
end

Getting Started

First, add some indexing config to your application so Khafra can connect to your database and index a table:

config :khafra_search, :source_sqldb,
  adapter: :postgres,
  database: "database_name",
  username: "db_user_name",
  password: "db_user_pass",
  hostname: "localhost"

# Note the \\ line-continuation delimiter at the end of each query line that continues onto the next
config :khafra_search, :source_person,
  parent: :source_sqldb,
  query: """
    SELECT id, name, company, title, updated_at \\
    FROM persons
  """,
  attributes: [
    updated_at: :datetime
  ],
  fields: [
    name:    :string,
    company: :string,
    title:   :string
  ]

config :khafra_search, :i_person,
  parent: :index_defaults,
  source: :source_person

# Listing the indices explicitly lets you change which indices are built and rotated per environment
config :khafra_search,
  indices: [:i_person]
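
The `\\` in the query heredoc above can be checked in isolation. The following is a minimal sketch in plain Elixir (no Khafra calls) showing that `\\` inside a heredoc produces a single backslash immediately before the line break, which is the line-continuation marker Sphinx accepts in its config file. A lone `\` would instead be consumed by Elixir's own escape handling.

```elixir
# Same shape as the query above: "\\" is an escaped backslash inside a heredoc.
query = """
SELECT id, name, company, title, updated_at \\
FROM persons
"""

# Print the string as it would be written out: the first line ends in one "\".
IO.puts(query)

# Check for a single backslash directly followed by a newline.
IO.inspect(String.contains?(query, "\\\n"))
# => true
```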

Now let's do a local Sphinx install and query some data:

# May take time depending on connection speed
> mix khafra.sphinx.download linux_64

> mix khafra.gen.sphinxconf

# Try out your config
> mix khafra.sphinx.index all

# Start the search daemon
> mix khafra.sphinx.searchd

# You can now query Sphinx (recommendation: use the Giza Elixir search client).
# Next, rotate the index while the daemon is running
> mix khafra.sphinx.index rotate all

Index rotation can be put on a schedule using the options under Advanced Configuration. You should now be able to test locally; how you deploy depends entirely on how you run your Elixir releases.

Advanced Configuration

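Configure the indexer to rotate on a schedule (see Quantum for more details). The example defines two jobs, one that runs every minute and one that runs once per day; keep only the schedule you need: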

config :khafra_search, Khafra.Scheduler,
  timezone: "America/Los_Angeles",
  global: true,
  timeout: :infinity,
  jobs: [
    # Every minute
    {"* * * * *", {Khafra.Job.Index, :run, [
      [{:option, "--rotate"}, {:option, "--all"}]
    ]}},
    # Once per day (arguments wrapped in a list, matching the job above)
    {"@daily", {Khafra.Job.Index, :run, [
      [{:option, "--rotate"}, {:option, "--all"}]
    ]}}
  ]
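
The nested list in the first job's arguments is easy to misread. Quantum jobs written as `{module, function, args}` are invoked much like `apply/3`, so the outer list is the argument list and the inner list is a single argument. The sketch below uses a hypothetical stand-in module, `ArgsDemo`, rather than `Khafra.Job.Index`, and assumes the job function takes one argument (a list of option tuples).

```elixir
defmodule ArgsDemo do
  # Hypothetical stand-in for a job module: takes ONE argument,
  # a list of {:option, flag} tuples, and returns the flags.
  def run(options) when is_list(options) do
    for {:option, flag} <- options, do: flag
  end
end

# Same {module, function, args} shape as the job tuples in the config.
{mod, fun, args} = {ArgsDemo, :run, [[{:option, "--rotate"}, {:option, "--all"}]]}

# The outer list holds the arguments; its only element is the option list.
IO.inspect(apply(mod, fun, args))
# => ["--rotate", "--all"]
```

Without the outer list, the two tuples would be passed as two separate arguments, which calls a two-argument function instead.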

Configure other indexer defaults and generate wordforms (see the Sphinx docs for details):

# Note the [cwd!] placeholder, which lets the generator use an absolute path in every environment
config :khafra_search, :index_defaults,
  type: "plain",
  source: {:sql, :source_sqldb},
  morphology: "none",
  min_stemming_len: "1",
  min_word_len: "1",
  min_infix_len: "2",
  html_strip: "0",
  preopen: "0",
  wordforms: "[cwd!]/sphinx/wordforms.txt"

> mix khafra.gen.wordform "s02e02" "season 2 episode 2"

> mix khafra.sphinx.index rotate all
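
For orientation, a Sphinx wordforms file holds one mapping per line in the form `source > destination`, and the destination may contain several tokens. The sketch below is not Khafra's implementation; `WordformSketch` is a hypothetical module that only illustrates the line such a file would contain for the arguments used above.

```elixir
defmodule WordformSketch do
  # Builds one line in the Sphinx wordforms format: "source > destination".
  def line(source, destination), do: "#{source} > #{destination}\n"

  # Appends the line to the given file, creating the file if it does not exist.
  # The directory in `path` must already exist.
  def append(path, source, destination) do
    File.write!(path, line(source, destination), [:append])
  end
end

IO.write(WordformSketch.line("s02e02", "season 2 episode 2"))
# prints: s02e02 > season 2 episode 2
```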

Deployment Example

Coming soon

Needed features:

#!/bin/sh
# rel/commands/download_sphinx.sh

release_ctl eval --mfa "Khafra.ReleaseTasks.download_sphinx/1" --argv -- "$@"

#!/bin/sh
# rel/commands/gen_config.sh

release_ctl eval --mfa "Khafra.ReleaseTasks.generate_config/1" --argv -- "$@"

#!/bin/sh
# rel/commands/indexer.sh

release_ctl eval --mfa "Khafra.ReleaseTasks.indexer/1" --argv -- "$@"

#!/bin/sh
# rel/commands/searchd.sh

release_ctl eval --mfa "Khafra.ReleaseTasks.searchd/1" --argv -- "$@"

Then register the commands in your rel/config.exs:

environment :prod do
  set include_erts: true
  set include_src: false
  set cookie: :"some complicated cookie"
  set commands: [
    index: "rel/commands/indexer.sh",
    searchd: "rel/commands/searchd.sh",
    gen_config: "rel/commands/gen_config.sh",
    download_sphinx: "rel/commands/download_sphinx.sh"
  ]
end