llists

Copyright (c) 2022 John Krukoff

Version: 1.2.0

Authors: John Krukoff (github@cultist.org).

Modules

lfiles
llists
llists_utils
## Overview An Erlang/OTP library for working with iterators; an opaque type that is evaluated as a lazily evaluated list of elements. This allows for memory efficient processing of sequential data and easy composition of processing on such lazy lists. An interface identical to the stdlib `lists` module is implemented, allowing for easy conversion between list processing code and lazy list processing code. ![Lazy Construction Worker](doc/lazy.png) ## Getting Started This library is published to [hex.pm](https://hex.pm) as [llists](https://hex.pm/packages/llists). If you're using [rebar3](https://www.rebar3.org/) as your build tool, it can be added as a dependency to your rebar.config as follows: ``` {deps, [{llists}]}. ``` ## Usage To use this library, it is first necessary to create an iterator. This can either be done from an existing data structure using functions like `llists:from_list/1` or by expressing the continuation programmatically using functions like `llists:unfold/2`. Once an iterator is constructed, it can then be evaluated an element at a time using `llists:next/1` or by using one of the many utility and evaluation functions in `llists` or `llists_utils`. `llists` contains an iterator aware version of every function in the stdlib `lists` module, with guaranteed identical behaviour for side effect free and finite iterators. `lfiles` contains file utility functions replicating functionality from the `file` and `io` modules. Read functions create impure iterators that should only be evaluated once. Write functions consume iterators and write the produced data to a file. ### Iterators An iterator is an opaque record created by the `llists` module to represent a continuation. When evaluated by `llists:next/1` an iterator returns a lazy list, represented by one of two options: - `[]`: An empty list, signaling that no more elements are available. - `[Elem | Iterator]`: An improper list consisting of an element and an iterator to continue evaluation. ### Examples As an example task, let us calculate the mean difference between a list of integer pairs. These pairs are stored in a [file](http://github.com/jkrukoff/llists/blob/master/doc/example.txt) as comma separated values, one per line. We can use the `llists` module to both read lines from the file and calculate the average lazily, thus avoiding loading the entire file into memory during computation. First, we need to construct the iterator: ``` > {ok, File} = file:open("doc/example.txt", [read]). {ok,<0.227.0>} > I = llists:unfold(fun(File) -> case file:read_line(File) of {ok, Data} -> {Data, File}; eof -> file:close(File), none end end, File). {iterator,#Fun} ``` Next, a loop to parse the strings and calculate the mean difference: ``` > F = fun Calculate(I, Sum, Count) -> case llists:next(I) of [] -> Sum / Count; [Elem | Next] -> [A, B] = [list_to_integer(string:trim(Part)) || Part <- string:split(Elem, ",")], Calculate(Next, Sum + (A - B), Count + 1) end end. #Fun > F(I, 0, 0). -0.42 ``` We could also make use of the utility functions in `llists` and `lfiles` and compose the same result as follows: ``` > {ok, File} = file:open("doc/example.txt", [read]). {ok,<0.227.0>} > I = lfiles:read_line(File). {iterator,#Fun} > Split = llists:map(fun (Elem) -> string:split(Elem, ",") end, I). {iterator,#Fun} > Integers = llists:map(fun (Parts) -> [list_to_integer(string:trim(Part)) || Part <- Parts] end, Split). {iterator,#Fun} > {Sum, Count} = llists:foldl(fun ([A, B], {Sum, Count}) -> {Sum + (A - B), Count + 1} end, {0, 0}, Integers). {-42,100} > file:close(File). ok > Sum / Count. -0.42 ``` In both examples, we read only a single line of the file into memory at a time. Notice that we couldn't use `llists:sum/1` and `llists:length/1` here instead of `llists:foldl/3`, since our input iterator has side effects and can only be evaluated once. ## Contributing Please fork the repo and submit a PR. Tests are run automatically on the master branch by GitHub actions or can be run locally via: ``` make deps make check ``` If a Unix environment is not available, tests can be run inside a docker container via: ``` docker-compose build docker-compose run check ``` Documentation is autogenerated using edown and edoc via: ``` make doc ``` Development of the library should be done with an Erlang/OTP version of 25 or greater. The library requires an Erlang/OTP version of 21 or greater to function. Earlier versions may work if functions involving maps are avoided. ## Lineage Streams and lazily evaluated lists are common language constructs and much prior art exists. Scheme's [SRFI-41](https://srfi.schemers.org/srfi-41/srfi-41.html) served as a useful design document to base this work on. Other implementations that were used for reference: - Elixir's standard library [Stream](https://hexdocs.pm/elixir/Stream.html) module. - The Erlang stream module from the [Datum library](https://github.com/fogfish/datum/blob/master/src/stream/stream.erl). - The [zlist](https://github.com/egobrain/zlist) Erlang library. - The [infinite lists](http://erlang.org/documentation/doc-5.8/doc/programming_examples/funs.html) example from the Erlang documentation.