PGPool

PGPool is a PosgreSQL client that automatically uses connection pools and handles reconnections in case of errors.

PGPool also optimizes all of your statements, by preparing them and caching them for you under the hood.

It uses:

epgsql as the PostgreSQL driver.
poolboy as pooling library.

Install

If you’re using rebar3, add pgpool as a dependency in your project’s rebar.config file:

{pgpool, {git, "git://github.com/ostinelli/pgpool.git", {tag, "1.1.1"}}}

Or, if you’re using Hex.pm as package manager (with the rebar3_hex plugin):

{pgpool, "1.2.0"}

Then, compile:

$ rebar3 compile

Usage

Setup

Ensure to start PGPool from your application. This can be done by either providing it as a dependency in your .app file, or by starting it manually:

pgpool:start().

Specify Databases

Databases can be set in the environment variable pgpool. You’re probably best off using an application configuration file (in releases, sys.config):


{pgpool, [
  {databases, [
    {db1_name, [
      {pool, [
        %% poolboy options <https://github.com/devinus/poolboy>
        %% The `name` and `worker_module` options here will be ignored.
        {size, 10},         %% maximum pool size
        {max_overflow, 20}, %% maximum number of workers created if pool is empty
        {strategy, lifo}    %% can be lifo or fifo (default is lifo)
      ]},
      {connection, [
        {host, "localhost"},
        {user, "postgres"},
        {pass, ""},
        {options, [
          %% epgsql connect_options() <https://github.com/epgsql/epgsql>
          {port, 5432},
          {ssl, false},
          {database, "db1"}
        ]}
      ]}
    ]},

    {db2_name, [
      {pool, [
        {size, 10},
        {max_overflow, 20},
        {strategy, lifo}
      ]},
      {connection, [
        {host, "localhost"},
        {user, "postgres"},
        {pass, ""},
        {options, [
          {port, 5432},
          {ssl, false},
          {database, "db2"}
        ]}
      ]}
    ]}
  ]}
]}

Queries

Please refer to epgsql README for how to perform queries. Currently, PGPool supports the following.

Simple Query

pgpool:squery(DatabaseName, Sql) -> Result

Types:
  DatabaseName = atom()
  Sql = string() | iodata()
  Result =  {ok, Count} | {ok, Count, Rows} | {error, no_connection}
    Count =  non_neg_integer()
    Rows = (see epgsql for more details)

For example:

pgpool:squery(db1_name, "SELECT * FROM users;").

Simple queries cannot be optimized by PGPool since they cannot be prepared. If you want to optimize and cache your queries, consider using equery/3,4 or batch/2 instead.

Retries

In case there’s no available connection to the database, the standard squery/2 function will return {error, no_connection}. If you want to keep retrying until a connection is available, you can use squery/3.

Note however that this is a blocking call, and should be used only if needed.

pgpool:squery(DatabaseName, Sql, RetryTimeout) -> Result

Types:
  DatabaseName = atom()
  Sql = string() | iodata()
  RetryTimeout = non_neg_integer() | infinity
  Result = {ok, Count} | {ok, Count, Rows} | {error, no_connection}
    Count = non_neg_integer()
    Rows = (see epgsql for more details)

RetryTimeout specifies how much time (in milliseconds) will be spent waiting to retry (that is, excluding the time taken to call the database). Set to infinity if you want the call to block forever until a connection becomes available.

For example:

pgpool:squery(db1_name, "SELECT * FROM users;", 60000).

Extended Query

pgpool:equery(DatabaseName, Statement, Params) -> Result

Types:
  DatabaseName = atom()
  Statement = string()
  Params = list()
  Result = {ok, Count} | {ok, Count, Rows} | {error, no_connection}
    Count = non_neg_integer()
    Rows = (see epgsql for more details)

For example:

pgpool:equery(db1_name, "SELECT * FROM users WHERE id = $1;", [3]).

PGPool will prepare your statements and cache them for you, which results in considerable speed improvements. If you use a lot of different statements, consider memory usage because the statements are not garbage collected.

Retries

In case there’s no available connection to the database, the standard equery/3 function will return {error, no_connection}. If you want to keep retrying until a connection is available, you can use equery/4.

Note however that this is a blocking call, and should be used only if needed.

pgpool:equery(DatabaseName, Statement, Params, RetryTimeout) -> Result

Types:
  DatabaseName = atom()
  Statement = string()
  Params = list()
  RetryTimeout = non_neg_integer() | infinity
  Result = {ok, Count} | {ok, Count, Rows} | {error, no_connection}
    Count = non_neg_integer()
    Rows = (see epgsql for more details)

For example:

pgpool:equery(db1_name, "SELECT * FROM users WHERE id = $1;", [3], 60000).

Batch Queries

To execute a batch:

pgpool:batch(DatabaseName, StatementsWithParams) -> Result

Types:
  DatabaseName = atom()
  StatementsWithParams = [{Statement, Params}]
  Statement = string()
  Params = list()
  Result =  [{ok, Count} | {ok, Count, Rows}].
    Count =  non_neg_integer()
    Rows = (see epgsql for more details)

For example:

S = "INSERT INTO users (name) VALUES ($1);",

[{ok, 1}, {ok, 1}] = pgpool:batch(db1_name, [
    {S, ["Hedy"]},
    {S, ["Roberto"]}
]).

PGPool will prepare your statements and cache them for you, which results in considerable speed improvements. If you use a lot of different statements, consider memory usage because the statements are not garbage collected.

Contributing

So you want to contribute? That’s great! Please follow the guidelines below. It will make it easier to get merged in.

Before implementing a new feature, please submit a ticket to discuss what you intend to do. Your feature might already be in the works, or an alternative implementation might have already been discussed.

Do not commit to master in your fork. Provide a clean branch without merge commits. Every pull request should have its own topic branch. In this way, every additional adjustments to the original pull request might be done easily, and squashed with git rebase -i. The updated branch will be visible in the same pull request, so there will be no need to open new pull requests when there are changes to be applied.

Ensure to include proper testing. To run PGPool tests, you need to create the database pgpool_test for user postgres with no password, and then simply run from the project’s root directory:

$ make tests