ExoSQL

Build Status

Universal SQL engine for Elixir.

This library implements the SQL engine to perform queries on user provided databases using a simple interface based on Foreign Data Wrappers from PostgreSQL.

This allows to use SQL on your own data and virtual tables.

For example it includes a CSV reader and an HTTP client, so that you can do queries as:

SELECT url, status_code
  FROM urls
  INNER JOIN request
  ON urls.url = request.url

There is a simple repl to be able to test ExoSQL:

iex> ExoSQL.repl()
exosql> SELECT m, SUM(price) FROM generate_series(10) as m LEFT JOIN (SELECT width_bucket(price, 0, 200, 10) AS n, price FROM products) ON n = m GROUP BY m
tmp.m.m | tmp.tmp.col_2
--------------------------
1       | 31
2       | 30
3       | 0
4       | 0
5       | 0
6       | 0
7       | 0
8       | 0
9       | 0
10      | 0

Origin

The origin of the library is as a SQL layer to all the services connected to you Serverboards.

Each service can export tables to be accessed via SQL and then can show the data in the Dashboards, the notebook, or used in the rules.

Installation

The package can be installed by adding exosql to your list of dependencies in mix.exs:

def deps do
  [
    {:exosql, "~> 0.2"}
  ]
end

Features

Check the tests for current available features.

Variables

Variables can be passed as a dictionary at __vars__ inside the context, and referenced as $name at the SQL expression. This may change in the future to streamline it more with standard SQL (no need for $).

INNER JOIN

Because some columns may need to be autogenerated depending on the query, if you want to access those columns you may need to use INNER JOINS. This way the planner asks for those specific column values.

For example:

SELECT * FROM request

does not know to which URL you want to access, but:

SELECT * FROM request WHERE url = 'http://serverboards.io'

knows the URL and can get the data.

Then same way, on INNER JOINS this can be used to access to auto generated data:

SELECT url, status_code
  FROM urls
  INNER JOIN request
  ON urls.url = request.url

Builtins

String operations

format(format_str, args...)

Formats a String using C sprintf-like parameters. Known placeholders are:

lower(str)

Lower case a full string

join(str, sep=",")

Joins all elements from a list into a string, using the given separator.

join([1,2,3,4], "/")
"1/2/3/4"

split(str, sep=[", ", ",", " "])

Splits a string into a list using the given separator.

split("1, 2,3 4")
["1", "2", "3", "4"]

substr(str, start, end=10000) / substr(str, end)

Extracts a substring from the first argument.

Can use negative indexes to start to count from the end.

substr('#test#', 1, -1)
"test"

to_string(arg)

Converts the given argument into a string.

to_string(1)
"1"

upper(str)

Upper cases a full string

Date time functions

strftime(datetime, format_str)

Convert a datetime to a string. Can be used also to extract some parts of a date, as the day, year and so on.

Normally strftime can be used directly with a string or an integer as it does the conversion to datetime implicitly.

It is based on Timexformatting.

Most common markers:

to_datetime(str | int)

Converts the given string or integer to a date.

The string must be in ISO8859 sub string format:

This is called implicitly on strftime calls, and normally is not needed.

Boolean functions

bool(arg)

Converts to boolean. Equivalent to NOT NOT arg

Logical functions

if(cond, then, else)

Evaluates the condition and if true returns the then value, or else the else value.

Currently it is a function, not a macro nor expression, so it executes both sides which can result in an error or performance problems.

Aggregation functions

avg(expr)

Calculates the average of the calculated expression on the group rows. Equivalent to sum(expr) / count(expr).

If no rows, returns NULL.

count(*)

Counts the number of rows of the aggregates expression.

max(expr)

Returns the maximum value of the given expression for the group.

min

Returns the minimum value of the given expression for the group.

sum(expr)

For each of the grouped rows, calculates the expression and returns the sum. If there are no rows, returns 0.

Miscellaneous functions

generate_series(end) / generate_series(start, end, step=0)

This function generates a virtual table with one column and on each row a value of the series.

Can be reverse with a larger start than end and negative step.

It can be used to for example fill all holes in a temporal serie:

SELECT month, SUM(value)
  FROM generate_series(12) AS month
LEFT JOIN purchases
  ON strftime(purchases.datetime, "%m") == month
GROUP BY month

This will return 0 for empty months on the purchases table.

jp(json, selector)

Does JSON Pointer selection:

regex(str, regex, query \\ nil)

Performs a regex search on the string.

It uses elixir regex, so use it as reference.

Can use groups and named groups for matching and it will return a list of a map with the result. It can optionally use directly JSON pointer queries. See jp function.

If matches the result will be “trueish” (or “falsy” if doesn’t) so can be used as a boolean.

round(number, precision=0)

Returns the number rounded to the given precission. May be convert to integer if precission is 0.

urlparse(string, sel="")

Parses an URL and returns a JSON.

If selector is given it does the equivalent of callong jp with that selector.

width_bucket(n, start, end, buckets)

Given a n value it is assigned a bucket between 0 and buckets, that correspond to the full width between start and end.

If a value is out of bounds it is set either to 0 or to buckets - 1.

This helper eases the generation of histograms.

For example an histogram of prices:

SELECT n, SUM(price)
  FROM (SELECT width_bucket(price, 0, 200, 10) AS n, price
          FROM products)
  GROUP BY n

or more complete, with filling zeroes:

SELECT m, SUM(price)
  FROM generate_series(10) AS m
  LEFT JOIN (
        SELECT width_bucket(price, 0, 200, 10) AS n, price
          FROM products
    )
    ON n = m
 GROUP BY m

Included extractors

ExoSQL has been developed with the idea of connecting to Serverboards services, and as such it does not provide more than some test extractors:

Creating new ones is a very straightforward process. The HTTP example can be followed.

This is not intended a full database system, but to be embedded into other Elixir programs and accessible from them by end users. As such it does contain only some basic extractors that are needed for proper testing.

Using ExoSQL

There is no formal documentation yet, but you can check the esql_test.exs file to get an idea of how to use ExoSQL.

Example:

context = %{
  "A" => {ExoSQL.Csv, path: "test/data/csv/"},
  "B" => {ExoSQL.HTTP, []}.
  "__vars__" => %{ "start" => "2018-01-01" }
}
{:ok, result} = ExoSQL.query("
  SELECT urls.url, request.status_code
    FROM urls
   INNER JOIN request
      ON urls.url = request.url
", context)
%ExoSQL.Result{
  columns: [{"A", "urls", "url"}, {"B", "request", "status_code"}],
  rows: [
    ["https://serverboards.io/e404", 404],
    ["http://www.facebook.com", 302],
    ["https://serverboards.io", 200],
    ["http://www.serverboards.io", 301],
    ["http://www.google.com", 302]
  ]}

A Simple extractor can be:

defmodule MyExtractor do
  def schema(_config), do: {:ok, ["week"]}
  def schema(_config, "week"), do: {:ok, %{ columns: ["id", "nr", "name", "weekend"] }}
  def execute(_config, "week", _quals, _columns) do
    {:ok, %{
      columns: ["id", "nr", "name", "weekend"],
      rows: [
        [1, 0, "Sunday", true],
        [2, 1, "Monday", false],
        [3, 2, "Tuesday", false],
        [4, 3, "Wednesday", false],
        [5, 4, "Thursday", false],
        [6, 5, "Friday", false],
        [7, 6, "Saturday", true],
      ]
    }}
  end
end

And then a simple query:

{:ok, res} = ExoSQL.query("SELECT * FROM week WHERE weekend", %{ "A" => {MyExtractor, []}})               
ExoSQL.format_result(res)
A.week.id A.week.nr A.week.name A.week.weekend
1 0 Sunday true
7 6 Saturday true

Related libraries

There are other implementations of this very same idea:

If you know any other, please ping me and I will add it here.

I develop ExoSQL as I needed an elixir solution for an existing project, and to learn how to create an SQL engine. ExoSQL is currently used in Serverboards KPI.

Known BUGS