erldn
An EDN parser for BEAM languages, to read Clojure's Extensible Data Notation.
erldn is a low-level parser: it simply returns an Erlang data structure.
This project implements EDN support using leex and yecc. Results are tested with eunit.
Notes on how this fork differs from the original:
- provides a new top-level parse/1 function
- supports binary input (in addition to the original string input)
- supports file input (if the passed string is a file that exists and ends with .edn, it will be read)
- provides a parse_file/1 function
- adds support for multiple top-level EDN data elements in a single input (returns a list of results)
- WIP: add support for special numerical values ##Inf, ##-Inf, and ##NaN
- and more to come ...
Add Dependency
In your project's rebar.config:
{deps, [
    {erldn, "1.1.0", {pkg, erlsci_edn}}
]}.

Usage Examples
1> erldn:parse("{}").
{ok,{map,[]}}
2> erldn:parse("1").
{ok,1}
3> erldn:parse("true").
{ok,true}
4> erldn:parse("nil").
{ok,nil}
5> erldn:parse("[1 true nil]").
{ok,{vector,[1,true,nil]}}
6> erldn:parse("(1 true nil :foo)").
{ok,[1,true,nil,foo]}
7> erldn:parse("(1 true nil :foo ns/foo)").
{ok,[1,true,nil,foo,{symbol,'ns/foo'}]}
8> erldn:parse("#{1 true nil :foo ns/foo}").
{ok,{set,[1,true,nil,foo,{symbol,'ns/foo'}]}}
9> erldn:parse("#myapp/Person {:first \"Fred\" :last \"Mertz\"}").
{ok,{tag,'myapp/Person',
{map,[{first,<<"Fred">>},{last,<<"Mertz">>}]}}}
10> erldn:parse("#{1 true #_ nil :foo ns/foo}").
{ok,{set,[1,true,{ignore,nil},foo,{symbol,'ns/foo'}]}}
11> erldn:parse("#{1 true #_ 42 :foo ns/foo}").
{ok,{set,[1,true,{ignore,42},foo,{symbol,'ns/foo'}]}}
% to_string
12> {ok, Result} = erldn:parse("{:a 42}").
{ok,{map,[{a,42}]}}
13> io:format("~s~n", [erldn:to_string(Result)]).
{:a 42}
ok
% to_erlang
14> erldn:to_erlang(element(2, erldn:parse("[1, nil, :nil, \"asd\"]"))).
[1,nil,nil,<<"asd">>]

API
parse/1
High-level parsing function that accepts either binary or string input. If the
input is the name of an existing file ending in .edn, the file is read and
parsed; otherwise the input itself is parsed. Returns the unwrapped result for
a single value and a list for multiple values.
parse_file/1
Parses an EDN file by reading its contents and parsing them. The filename must
end with the .edn extension. Supports both single and multiple top-level values.
parse_str/1
Parses a string of EDN into an Erlang data structure, preserving all the details of the original EDN. Returns the unwrapped result for a single value and a list for multiple values.
to_string/1
Converts the result of the parsing functions into an EDN string representation.
to_erlang/1
Converts the result of the parsing functions into an Erlang-friendly version of itself; see "To Erlang Mappings" below.
to_erlang/2
Like to_erlang/1, but accepts a tuple list as a second argument. Each pair holds
a tag as its first element and a handler function (fun (Tag, Value, OtherHandlers) -> .. end)
as its second; the handler is called for tagged values carrying that tag.
lex_str/1
Tokenizes an EDN string into a list of lexical tokens. Primarily used internally by the parser, but can be useful for debugging or custom parsing scenarios.
Be sure to check the unit tests for usage examples; there are hundreds of them.
Parser Type Mappings
This table shows how EDN data types are represented in Erlang after parsing with erldn:parse/1 or erldn:parse_str/1. These are the "raw" parsed representations that preserve EDN semantics and can be converted back to EDN strings.
| EDN Type | EDN Example | Erlang Representation | Erlang Example |
|---|---|---|---|
| nil | nil | nil (atom) | nil |
| boolean | true, false | boolean atoms | true, false |
| integer | 42, -17, +5 | integer | 42, -17, 5 |
| integer with N suffix | 42N | integer (arbitrary precision marker ignored) | 42 |
| float | 3.14, 1.2e5 | float | 3.14, 120000.0 |
| float with M suffix | 3.14M | float (exact precision marker ignored) | 3.14 |
| character | \c, \A, \newline | {char, Integer} | {char, 99}, {char, 65}, {char, 10} |
| string | "hello world" | binary (UTF-8) | <<"hello world">> |
| keyword (simple) | :foo | atom | foo |
| keyword (namespaced) | :ns/foo | atom | 'ns/foo' |
| keyword (special case) | :nil | {keyword, nil} | {keyword, nil} |
| symbol | foo, ns/bar, / | {symbol, Atom} | {symbol, foo}, {symbol, 'ns/bar'}, {symbol, '/'} |
| list | (1 2 3) | list | [1, 2, 3] |
| vector | [1 2 3] | {vector, List} | {vector, [1, 2, 3]} |
| map | {:a 1 :b 2} | {map, PropList} | {map, [{a, 1}, {b, 2}]} |
| set | #{1 2 3} | {set, List} | {set, [1, 2, 3]} |
| tagged element | #inst "2024-01-01" | {tag, Symbol, Value} | {tag, 'inst', <<"2024-01-01">>} |
| discard element | #_ 42 | {ignore, Value} | {ignore, 42} |
| comments | ; comment | (ignored during parsing) | (not represented) |
Implementation Status
| Feature | Status | Notes |
|---|---|---|
| Ratios | ❌ Not implemented | 22/7 will parse as symbol, not ratio |
| Advanced integers | ❌ Not implemented | 0xFF, 0777, 36rZ not supported |
| Unicode chars | ❌ Limited | \uNNNN format not supported |
| Octal chars | ❌ Not implemented | \oNNN format not supported |
| String escapes | ⚠️ Partial | Basic escapes only |
| Metadata | ❌ Not implemented | ^{:meta true} value not supported |
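Since ratios come back as symbols, a caller that needs them can post-process the parsed value. The following is a minimal sketch, assuming that 22/7 parses to {symbol, '22/7'} (the exact atom is an assumption drawn from the note in the table). The module edn_ratio and the {ratio, Num, Den} tuple are hypothetical names and not part of erldn.

```erlang
-module(edn_ratio).
-export([from_symbol/1]).

%% Hypothetical helper, not part of erldn.
%% Turns a symbol such as {symbol, '22/7'} into {ratio, 22, 7}.
%% Anything that is not of the form Integer/Integer yields not_a_ratio.
from_symbol({symbol, Sym}) when is_atom(Sym) ->
    %% string:split/2 splits at the first "/" only.
    case string:split(atom_to_list(Sym), "/") of
        [Num, Den] ->
            try
                {ratio, list_to_integer(Num), list_to_integer(Den)}
            catch
                %% list_to_integer/1 raises badarg on non-numeric text.
                error:badarg -> not_a_ratio
            end;
        _ ->
            not_a_ratio
    end;
from_symbol(_) ->
    not_a_ratio.
```

Namespaced symbols such as {symbol, 'ns/foo'} fall through to not_a_ratio, so the helper can be applied to any symbol without first checking its shape.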
Notes
- Keywords vs Symbols: Keywords start with : and become atoms. Symbols become {symbol, atom} tuples to distinguish them from keywords.
- Namespace Handling: Both keywords and symbols can have namespaces separated by /. The entire string becomes a single atom with the / included.
- Set Uniqueness: Sets are parsed as lists and do not enforce uniqueness at parse time.
- Map Ordering: Maps are represented as property lists maintaining insertion order.
- Character Representation: Characters are tagged tuples containing the Unicode code point as an integer.
- Nil Keyword Special Case: The keyword :nil is handled specially to avoid confusion with the nil atom.
- Binary Strings: All strings are converted to UTF-8 binaries for efficient memory usage and Unicode support.
- Nested Structures: All container types (lists, vectors, maps, sets) can contain any other EDN types including other containers.
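The raw representations above are designed to be told apart by pattern matching. As an illustration, here is a small classifier over the shapes listed in the type-mapping table. The module name edn_kind and the returned atoms are hypothetical; only the matched shapes come from the table.

```erlang
-module(edn_kind).
-export([kind/1]).

%% Hypothetical helper, not part of erldn.
%% Classifies a raw parsed value by the shapes in the type-mapping table.
%% nil, true and false must be matched before the generic atom clause,
%% because plain keywords are atoms too.
kind(nil) -> nil;
kind(true) -> boolean;
kind(false) -> boolean;
kind({keyword, nil}) -> keyword;
kind({symbol, Name}) when is_atom(Name) -> symbol;
kind({char, Code}) when is_integer(Code) -> character;
kind({vector, Items}) when is_list(Items) -> vector;
kind({map, Pairs}) when is_list(Pairs) -> map;
kind({set, Items}) when is_list(Items) -> set;
kind({tag, _Tag, _Value}) -> tagged;
kind({ignore, _Value}) -> discard;
kind(Atom) when is_atom(Atom) -> keyword;
kind(Int) when is_integer(Int) -> integer;
kind(Float) when is_float(Float) -> float;
kind(Bin) when is_binary(Bin) -> string;
kind(List) when is_list(List) -> list.
```

The clause order shows why :nil needs its own {keyword, nil} tuple: a bare nil atom could not otherwise be distinguished from the EDN nil value.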
To Erlang Mappings
This table shows how the parsed EDN data structures are transformed by erldn:to_erlang/1 and erldn:to_erlang/2 into more Erlang-idiomatic representations. These transformations make the data easier to work with in Erlang but cannot be directly converted back to EDN without additional type information.
| Parsed Representation | Erlang-Friendly Result | Example Transformation |
|---|---|---|
| nil | nil | nil → nil |
| true | true | true → true |
| false | false | false → false |
| 42 | 42 | 42 → 42 |
| 3.14 | 3.14 | 3.14 → 3.14 |
| {char, 99} | "c" | {char, 99} → "c" |
| <<"hello">> | <<"hello">> | <<"hello">> → <<"hello">> |
| foo (keyword) | foo | foo → foo |
| {keyword, nil} | nil | {keyword, nil} → nil |
| {symbol, foo} | {symbol, foo} | {symbol, foo} → {symbol, foo} |
| [1, 2, 3] (list) | [1, 2, 3] | [1, 2, 3] → [1, 2, 3] |
| {vector, [1, 2, 3]} | [1, 2, 3] | {vector, [1, 2, 3]} → [1, 2, 3] |
| {map, [{a, 1}, {b, 2}]} | dict:dict() | {map, [{a, 1}, {b, 2}]} → dict with a→1, b→2 |
| {set, [1, 2, 3]} | sets:set() | {set, [1, 2, 3]} → sets with {1, 2, 3} |
| {tag, Symbol, Value} | Handler Result | Calls registered tag handler or fails |
| {ignore, Value} | Undefined | No documented transformation |
Tag Handler System
Tagged elements are processed using a configurable handler system:
Default Handlers
The to_erlang/2 function accepts handler specifications:
Handlers = [{tag_symbol, fun(Tag, Value, OtherHandlers) -> Result end}],
erldn:to_erlang(ParsedData, Handlers).

Handler Function Signature
Handler = fun(Tag, Value, OtherHandlers) -> TransformedValue end
- Tag: The tag symbol (e.g., 'inst', 'uuid')
- Value: The tagged value after transformation
- OtherHandlers: List of other available handlers for nested processing
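To make the signature concrete, here is a sketch of a handler for the #myapp/Person tag used in the usage examples. It assumes the tagged map has already been converted to a dict keyed by the keyword atoms first and last, as the signature description suggests. The module person_handler and the {person, First, Last} tuple are hypothetical names.

```erlang
-module(person_handler).
-export([handlers/0, to_person/3]).

%% Hypothetical handler for #myapp/Person tagged values.
%% Assumes Value is already a dict (maps become dicts) whose keys are
%% the keyword atoms first and last. dict:fetch/2 raises if a key is missing.
to_person(_Tag, Value, _OtherHandlers) ->
    {person, dict:fetch(first, Value), dict:fetch(last, Value)}.

%% Handler list in the {Tag, Fun} pair shape described for to_erlang/2.
handlers() ->
    [{'myapp/Person', fun to_person/3}].
```

Under these assumptions the list would be passed as erldn:to_erlang(Parsed, person_handler:handlers()). Keeping the handler as a named, exported function rather than an inline fun makes it possible to unit test it without running the parser.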
Common Tag Examples
| Tag | Example Input | Typical Handler Result |
|---|---|---|
| #inst | {tag, 'inst', <<"2024-01-01T12:00:00Z">>} | {datetime, {{2024,1,1}, {12,0,0}}} |
| #uuid | {tag, 'uuid', <<"550e8400-e29b-41d4-a716-446655440000">>} | Binary UUID or custom UUID record |
| Custom tags | {tag, 'myapp/Person', {map, [...]}} | Application-specific data structure |
Data Structure Transformations
Maps → Dicts
- Before: {map, [{key1, val1}, {key2, val2}]}
- After: dict:dict() with key-value associations
- Access: Use dict:fetch/2, dict:find/2, etc.
- Benefits: Efficient key lookup, functional updates
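The access functions differ in how they treat missing keys, which matters when EDN input is not fully trusted. The sketch below uses only the standard dict module. The from_parsed/1 function is a shallow stand-in that builds a dict from a {map, PropList} tuple without touching nested values, so it should not be read as how erldn performs the conversion. The module name edn_dict_demo is hypothetical.

```erlang
-module(edn_dict_demo).
-export([from_parsed/1, lookup/2]).

%% Shallow stand-in for the map conversion: builds a dict from the
%% property list inside {map, PropList}. Nested values are left as-is.
from_parsed({map, PropList}) ->
    dict:from_list(PropList).

%% dict:find/2 returns {ok, Value} or the atom error, so a missing key
%% can be handled without an exception. dict:fetch/2 would raise instead.
lookup(Key, Dict) ->
    case dict:find(Key, Dict) of
        {ok, Value} -> Value;
        error -> undefined
    end.
```

Use dict:fetch/2 when a key is required and a crash on absence is the desired behavior; use dict:find/2 when the key is optional.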
Sets → Sets Module
- Before: {set, [elem1, elem2, elem3]}
- After: sets:set() with unique elements
- Access: Use sets:is_element/2, sets:to_list/1, etc.
- Benefits: Automatic uniqueness, set operations
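Because the parser keeps set elements in a plain list without enforcing uniqueness, duplicates only disappear once the list goes through the sets module. The sketch below shows that effect with the standard library alone. It is a shallow illustration and does not convert nested values; edn_set_demo is a hypothetical module name.

```erlang
-module(edn_set_demo).
-export([from_parsed/1]).

%% Shallow stand-in for the set conversion: sets:from_list/1 drops
%% duplicate elements that the raw {set, List} form may still contain.
from_parsed({set, Elements}) ->
    sets:from_list(Elements).
```

For example, sets:size(edn_set_demo:from_parsed({set, [1, 2, 2, 3]})) evaluates to 3. The order returned by sets:to_list/1 is unspecified, so sort the result before comparing it in tests.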
Vectors → Lists
- Before: {vector, [1, 2, 3]}
- After: [1, 2, 3]
- Benefits: Simpler Erlang idiom
- Trade-offs: Loses type distinction from lists
Characters → Strings
- Before: {char, 65}
- After: "A"
- Benefits: More natural Erlang representation
- Note: Single-character strings, not charlists
Error Handling
Unknown Tags
When to_erlang/1 encounters a tag without a registered handler:
- Behavior: Raises an error
- Solution: Use to_erlang/2 with appropriate handlers
- Alternative: Implement a catch-all default handler
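When the set of tags in the input is not known in advance, the conversion can be wrapped so that a missing handler produces a value instead of crashing the caller. The sketch below catches every exception class because the exact error raised for an unknown tag is not documented here. The module edn_safe and its return shapes are hypothetical.

```erlang
-module(edn_safe).
-export([to_erlang/1, guard/1]).

%% Hypothetical wrapper, not part of erldn.
%% Returns {ok, Result} or {error, {Class, Reason}} instead of raising.
to_erlang(Parsed) ->
    guard(fun() -> erldn:to_erlang(Parsed) end).

%% Runs Fun and converts any exception into an error tuple.
%% All classes (error, exit, throw) are caught, since the class used
%% for unknown tags is an unknown here.
guard(Fun) when is_function(Fun, 0) ->
    try
        {ok, Fun()}
    catch
        Class:Reason -> {error, {Class, Reason}}
    end.
```

Catching everything is a deliberately broad choice for a boundary function; inside application code a narrower catch clause is usually preferable once the actual error shape has been confirmed.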
Nested Transformations
All nested values are recursively transformed:
- Map values are processed through to_erlang
- Set elements are processed through to_erlang
- List elements are processed through to_erlang
- Tagged values are processed before being passed to handlers
Usage Patterns
Simple Transformation
{ok, ParsedData} = erldn:parse("{:name \"John\" :age 30}"),
ErlangData = erldn:to_erlang(ParsedData).
% ErlangData is a dict with name→<<"John">>, age→30

With Custom Handlers
% parse_iso_date/1 and uuid:parse/1 stand in for your own conversion functions
Handlers = [
    {'inst', fun(_Tag, DateStr, _) -> parse_iso_date(DateStr) end},
    {'uuid', fun(_Tag, UuidStr, _) -> uuid:parse(UuidStr) end}
],
ErlangData = erldn:to_erlang(ParsedData, Handlers).

Limitations
- Information Loss: Cannot reconstruct original EDN types (vectors vs lists)
- Handler Dependencies: Tagged elements require appropriate handlers
- Type Ambiguity: Some transformations lose type information
- Discard Elements: No clear specification for {ignore, Value} handling
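One way to sidestep the undefined handling of discard elements is to remove them from the raw parsed value before converting it. The sketch below walks the documented container shapes and drops {ignore, Value} entries found inside lists, vectors and sets. It is an assumption about what callers may want, not erldn behavior: edn_discard is a hypothetical module, and an {ignore, Value} that appears at the top level or as a map key or value is left untouched.

```erlang
-module(edn_discard).
-export([strip/1]).

%% Hypothetical helper, not part of erldn.
%% Recursively removes {ignore, _} elements from lists, vectors and sets.
strip({vector, Items}) -> {vector, strip_items(Items)};
strip({set, Items}) -> {set, strip_items(Items)};
strip({map, Pairs}) -> {map, [{strip(K), strip(V)} || {K, V} <- Pairs]};
strip({tag, Tag, Value}) -> {tag, Tag, strip(Value)};
strip(Items) when is_list(Items) -> strip_items(Items);
strip(Other) -> Other.

%% Drops discarded elements, then recurses into the ones that remain.
strip_items(Items) ->
    [strip(Item) || Item <- Items, not is_ignored(Item)].

is_ignored({ignore, _}) -> true;
is_ignored(_) -> false.
```

Applied to the parsed form of #{1 true #_ nil :foo ns/foo} from the usage examples, strip/1 yields {set, [1, true, foo, {symbol, 'ns/foo'}]}.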
Best Practices
- Use with Tag Handlers: Always provide handlers for expected tagged elements
- Document Transformations: Keep track of which data came from EDN for debugging
- Test Round-trips: Verify data integrity when relevant
- Handle Errors: Account for missing tag handlers in production code
License
The MIT License