k-ordered unique identity

The library develops an identity schema suitable for rough, lexicographical objects ordering.

Inspiration

The library provide interface to generate k-ordered unique identifiers in lock-free and decentralized manner for Erlang application.

An array A[1..n] is said to be k-ordered if A[i − k] ≤ A[i] ≤ A[i + k] for all i such that k < i ≤ n−k.

The library aims following objectives:

Background

The library has developed a dual schema: 64-bit unique identity to be used within Erlang virtual machine and 96-bit unique identity for distributed environments. Client applications benefits from short 64-bit identity using it locally and dynamically switches schema while communicating with remote peers.

The k-ordered value consists of time-stamp with millisecond resolution (50-bit) that is used to roughly sort events. The time-stamp ensures distinct sorting within virtual machine where system clock is controlled by operation system. A locally monotonic padding (14-bits) prevents the collisions (Note: 14-bits allows to have about 16K allocations per millisecond).

The time-stamp based sorting fails on distributed systems unless it implements precises time synchronization protocol. Therefore, it was decided to uses location aspect (node fingerprint) as component of k-ordered value. This component allows to keep ordering consistent even if clocks on other node is skewed. The developed schema allows to control the quality of the ordering (precision) depending on the length of neighborhood interval. The neighborhood interval is a time span where the location has higher priority then time.

The library represents k-ordered values as tuples but implements binary and url friendly base64 encoding. The serialization protocol has been changed after library version 1.2.0. New serialization improves allocation performance of k-order values. It uses Erlang OTP/18 feature erlang:unique_integer(...) to generate locally monotonic value. The library allocates about 13M k-ordered values per second on reference hardware.

64-bit, local

        20 bit         20 bit       10 bit        14 bit
   |--------------|--------------|----------|-----------------|
          A              B             C           Seq
   
   A, B, C - time stamp in micro seconds (casted to 50 bits)
   Seq     - sequence value generated by erlang:unique_integer([monotonic])

   -type l() :: {uid, t(), seq()}.

96-bit, global

        20 bit        20 - I bit      32 bit       I bit      10 bit       14 bit
   |--------------|--------------|--------------|----------|----------|-----------------|
          A              B0            Node          B1         C           Seq

   A, B, C - time stamp in microseconds (casted to 50 bits)
   Seq     - sequence value generated by erlang:unique_integer([monotonic])
   Node    - node fingerprint erlang:phash(erlang:node(), 1 bsl 32).
   I       - time neighborhood interval 

   -type g() :: {uid, id(), t(), seq()}.

Getting Started

The latest version of the library is available at its master branch. All development, including new features and bug fixes, take place on the master branch using forking and pull requests as described in contribution guidelines.

The stable library release is available via hex packages, add the library as dependency to rebar.config

{deps, [{uid}]}.

Libraries implements a simple api.

%% 
%% generate local identity
%%   {uid,{1539,625275,432128},1}
A = uid:l().

%%
%% convert local identity to global one
%%   {uid,&#39;uid@127.0.0.1&#39;,{1539,625275,432128},1}
B = uid:g(A).

%%
%% encode value to binary
%%   <<0,96,57,138,3,235,39,50,123,105,128,1>>
uid:encode(B).

%%
%% encode value to base64
%%   <<"AGA5igPrJzJ7aYAB">>
uid:encode64(B).

How To Contribute

The library is Apache 2.0 licensed and accepts contributions via GitHub pull requests:

The build process requires Erlang/OTP version 19.0 or later and essential build tools.

Build and run service in your development console. The following command boots Erlang virtual machine and opens Erlang shell.

git clone https://github.com/fogfish/uid
cd uid
make
make run

commit message

The commit message helps us to write a good release note, speed-up review process. The message should address two question what changed and why. The project follows the template defined by chapter Contributing to a Project of Git book.

Short (50 chars or less) summary of changes

More detailed explanatory text, if necessary. Wrap it to about 72 characters or so. In some contexts, the first line is treated as the subject of an email and the rest of the text as the body. The blank line separating the summary from the body is critical (unless you omit the body entirely); tools like rebase can get confused if you run the two together.

Further paragraphs come after blank lines.

Bullet points are okay, too

Typically a hyphen or asterisk is used for the bullet, preceded by a single space, with blank lines in between, but conventions vary here

bugs

If you experience any issues with the library, please let us know via GitHub issues. We appreciate detailed and accurate reports that help us to identity and replicate the issue.

License

Copyright 2012 Dmitry Kolesnikov

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.