speed_trap

A basic Token Bucket rate limiter written in Erlang with no external dependencies.

Developed at Klarna

Features

A basic Token Bucket based rate limiter for Erlang. This is a rate limiter in its most basic form: it uses atomics to keep track of whether or not a request should be rate limited, on a per-node basis.

Benchmarks were performed across atomics-, ets- and gen_server-based implementations.

Although the benchmark is not provided as part of speed_trap, the results are as follows:

| Implementation | Throughput     | Standard deviation |
|----------------|----------------|--------------------|
| atomics        | 31,000,000 cps | 0.915 ms           |
| ets            | 760,000 cps    | 11 ms              |
| gen_server     | 640,000 cps    | 0.517 ms           |
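To illustrate why atomics perform well here, below is a minimal sketch of an atomics-based token bucket. This is not speed_trap's actual implementation; the module and function names are illustrative only.

```erlang
%% Minimal sketch of an atomics-based token bucket (illustrative only).
-module(bucket_sketch).
-export([new/1, try_pass/1, refill/3]).

%% A new bucket starts full.
new(BucketSize) ->
    Ref = atomics:new(1, [{signed, true}]),
    atomics:put(Ref, 1, BucketSize),
    Ref.

%% Atomically take one token; no locks, no message passing.
try_pass(Ref) ->
    case atomics:sub_get(Ref, 1, 1) of
        N when N >= 0 ->
            {ok, N};
        _ ->
            atomics:add(Ref, 1, 1),   % hand the token back
            {error, too_many_requests}
    end.

%% Called every refill interval; the bucket is capped at its size.
refill(Ref, RefillCount, BucketSize) ->
    case atomics:add_get(Ref, 1, RefillCount) of
        N when N > BucketSize -> atomics:put(Ref, 1, BucketSize);
        _ -> ok
    end.
```

Because every check is a single atomic operation on a shared counter, no process mailbox or table lookup sits on the hot path, which is consistent with the throughput gap in the table above.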

How to parameterise a rate limiter?

A token bucket rate limiter is defined by four mandatory parameters (these correspond to the options shown in the usage example below):

  * bucket_size – the maximum number of tokens the bucket can hold;
  * refill_interval – how often the bucket is refilled, in milliseconds;
  * refill_count – the number of tokens added on each refill;
  * delete_when_full – whether to delete the bucket once it is full again.

Optional parameters:

  * override – none for normal operation; an override can bypass the rate limit entirely (try_pass/1 then returns {ok, rate_limit_not_enforced}) or block all requests.

You can pass any other key-value pair when configuring a speed trap. They will be present when inspecting speed traps (by calling speed_trap:all()).
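For example (the owner key below is arbitrary, chosen purely for illustration):

```erlang
Options = #{bucket_size => 40,
            refill_interval => timer:seconds(6),
            refill_count => 1,
            delete_when_full => false,
            override => none,
            owner => team_payments},   % custom key, stored with the options
speed_trap:new({<<"/api/v1/captcha_challenge">>, <<"POST">>}, Options),
speed_trap:all().                      % the owner key is visible here
```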

A new speed trap always starts with a full bucket.

There are several different ways to achieve the same rate limit. For example, suppose one wants a base rate of 100 requests per second. All of the following definitions are correct:

  1. Set the refill interval to 1 second, and the refill count to 100;
  2. Set the refill interval to 100 milliseconds, and the refill count to 10;
  3. Set the refill interval to 10 milliseconds, and the refill count to 1.
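Assuming a bucket size of 100 and refill_interval expressed in milliseconds (as in the usage example below), the three configurations could be written as:

```erlang
%% All three yield a base rate of 100 requests per second:
Opts1 = #{bucket_size => 100, refill_interval => timer:seconds(1),
          refill_count => 100, delete_when_full => false},
Opts2 = #{bucket_size => 100, refill_interval => 100,
          refill_count => 10, delete_when_full => false},
Opts3 = #{bucket_size => 100, refill_interval => 10,
          refill_count => 1, delete_when_full => false}.
```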

In all cases all available tokens can be consumed as soon as possible, after which a client gets {error, too_many_requests} until the bucket is refilled again. The difference, however, is how new tokens are distributed across the 1-second period. Assume that one creates speed traps with a bucket size of 100.

Let's say a trap with the refill options from the first configuration was created at 00.000s and a client consumes one token per millisecond. The bucket becomes empty at 00.100s and is refilled to 100 at 01.000s.

| Time    | Available tokens   |
|---------|--------------------|
| 00.000s | 100 (full bucket)  |
| 00.001s | 99 (1 consumed)    |
| 00.002s | 98 (1 consumed)    |
| ...     | ...                |
| 00.100s | 0 (1 consumed)     |
| 01.000s | 100 (100 refilled) |

In the second and third cases, token availability is distributed evenly across the corresponding refill intervals. Let's take the third configuration as an example.

| Time    | Available tokens  |
|---------|-------------------|
| 00.000s | 100 (full bucket) |
| 00.001s | 99 (1 consumed)   |
| 00.002s | 98 (1 consumed)   |
| ...     | ...               |
| 00.010s | 90 (1 consumed)   |
| 00.010s | 91 (1 refilled)   |
| 00.011s | 90 (1 consumed)   |
| 00.011s | 89 (1 consumed)   |
| ...     | ...               |
| 00.020s | 81 (1 consumed)   |
| 00.020s | 82 (1 refilled)   |
| 00.021s | 81 (1 consumed)   |

Notice that the momentary peak rate can be as high as the bucket size. This is especially important when requests are distributed unevenly: if one wants to support short bursts of requests on a system with a relatively low base rate, one can set the bucket size to a higher value.

It does not make sense to set the bucket size to a lower value than the refill count: the bucket is capped at its size, so the excess tokens from each refill are discarded anyway.

Usage example

Example of setting up and using a rate limiter for POST requests to a captcha page that should handle a base load of 10 requests per minute and allow peaks of 50 requests per minute:

application:ensure_all_started(speed_trap).
Id = {<<"/api/v1/captcha_challenge">>, <<"POST">>}.
Options = #{bucket_size => 40,                   % difference of peak and base rates
            refill_interval => timer:seconds(6), % how often the bucket is refilled
            refill_count => 1,                   % number of tokens to refill
            delete_when_full => false,           % Do not delete bucket when full
            override => none                     % No overrides
           },
speed_trap:new(Id, Options).                     % ok | raises the bad_options error
speed_trap:try_pass(Id).                         % {ok, RemainingTokens | rate_limit_not_enforced} | {error, too_many_requests}

In order to modify a rate limiter one can use modify/2:

NewRefillInterval = timer:seconds(1),
speed_trap:modify(Id, #{refill_interval => NewRefillInterval}).

In order to delete a rate limiter:

speed_trap:delete(Id).

Template rate limiters

When you do not know upfront all the rate limiters that you need, you can add templates for rate limiters and connect them to rate limiter id patterns. When a speed trap is not present when calling speed_trap:try_pass/1, speed_trap looks for a pattern that matches the supplied id. If one is found, a rate limiter with that id is created from the template.

A template is a tuple of a template identifier and the same options map as used when calling speed_trap:new/2. An id pattern is a tuple of a match head and a template identifier.

Say that you want a unique rate limiter (with the same configuration) for each user of a resource called my_resource. For the user "Alex" the rate limiter id would be {my_resource, "Alex"}.

A speed trap that is started from a template will have the template id stored with its options. This can be used when dynamically reconfiguring speed traps or when debugging a running system.

{templates,
 #{my_resource_template =>
  #{bucket_size => 40,
    refill_interval => 6000,
    refill_count => 1,
    delete_when_full => true,
    override => none
   }
 }
},
{id_patterns,
 [
  {
   {my_resource, '_'},
    my_resource_template
  }
 ]
}
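With a configuration like the above in place, no explicit speed_trap:new/2 call is needed; a sketch (the id follows the example above):

```erlang
%% The first try_pass/1 for an id matching {my_resource, '_'} creates
%% the speed trap from my_resource_template.
speed_trap:try_pass({my_resource, "Alex"}).
```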

Development

In order to build the project simply run make.

This repo uses rebar3 format to ensure consistent formatting of the codebase. If the pipeline fails with:

===> The following files are not properly formatted:
src/<file>.erl

Please run make format and commit the changes.

Running tests

make test