erlang-idna
A pure Erlang IDNA implementation following RFC 5891.
Current Unicode version: 17.0.0
Features
- IDNA 2008 compliance with RFC 5891
- IDNA 2003 backward compatibility
- UTS #46 compatibility processing (Unicode Technical Standard #46)
- Full label validation:
  - NFC normalization check
  - Hyphen placement rules
  - Leading combining marks check
  - Contextual rules (CONTEXTJ/CONTEXTO)
  - Bidirectional text rules (RFC 5893)
Installation
Rebar3
Add to your rebar.config:
{deps, [
    {idna, "7.1.0"}
]}.
Mix (Elixir)
Add to your mix.exs:
defp deps do
  [
    {:idna, "~> 7.1"}
  ]
end
Quick Start
Encoding (Unicode → ASCII/Punycode)
%% Basic encoding
1> idna:encode("münchen.de").
"xn--mnchen-3ya.de"
2> idna:encode("βόλος.com").
"xn--nxasmq5b.com"
%% Japanese domain with UTS #46 processing
3> idna:encode("日本語.JP", [uts46]).
"xn--wgv71a119e.jp"
Decoding (ASCII/Punycode → Unicode)
1> idna:decode("xn--mnchen-3ya.de").
"münchen.de"
2> idna:decode("xn--nxasmq5b.com").
"βόλος.com"
Options
The encode/2 and decode/2 functions accept an options list:
| Option | Default | Description |
|---|---|---|
| uts46 | false | Enable UTS #46 compatibility processing |
| std3_rules | false | Enforce STD3 ASCII rules |
| transitional | false | Use transitional processing (IDNA 2003 compatibility) |
| strict | false | Only use ASCII period (.) as label separator |
Examples with Options
%% UTS #46 processing normalizes and maps characters
1> idna:encode("Ⅷ.com", [uts46]).
"viii.com"
%% Transitional processing (ß → ss)
2> idna:encode("faß.de", [uts46, transitional]).
"fass.de"
%% Non-transitional (default) preserves ß
3> idna:encode("faß.de", [uts46]).
"xn--fa-hia.de"
%% STD3 rules reject certain characters
4> idna:encode("_example.com", [uts46, std3_rules]).
** exception exit: {invalid_codepoint,95}
API Reference
Main Functions
| Function | Description |
|---|---|
| encode/1,2 | Encode a Unicode domain name to ASCII (Punycode) |
| decode/1,2 | Decode an ASCII domain name to Unicode |
| alabel/1 | Convert a single label to ASCII form (A-label) |
| ulabel/1 | Convert a single label to Unicode form (U-label) |
Validation Functions
| Function | Description |
|---|---|
| check_label/1,4 | Validate a domain label |
| check_nfc/1 | Check NFC normalization |
| check_hyphen/1 | Check hyphen placement rules |
| check_context/1 | Check contextual rules |
| check_initial_combiner/1 | Check for leading combining marks |
| check_label_length/1 | Check label length (max 63 octets) |
Compatibility Functions (Deprecated)
| Function | Replacement |
|---|---|
| to_ascii/1 | Use encode/1 |
| to_unicode/1 | Use decode/1 |
| from_ascii/1 | Use decode/1 |
| utf8_to_ascii/1 | Use encode/1 |
Documentation
Full API documentation is available on HexDocs.
Generate documentation locally:
rebar3 ex_doc
Updating Unicode Data
This library currently supports Unicode 17.0.0. To update to a new Unicode version:
1. Download Unicode Data Files
Replace VERSION with the target version (e.g., 17.0.0):
# Core Unicode data files
wget -O uc_spec/UnicodeData.txt https://www.unicode.org/Public/VERSION/ucd/UnicodeData.txt
wget -O uc_spec/ArabicShaping.txt https://www.unicode.org/Public/VERSION/ucd/ArabicShaping.txt
wget -O uc_spec/Scripts.txt https://www.unicode.org/Public/VERSION/ucd/Scripts.txt
# IDNA-specific files (path structure as of Unicode 17.0.0)
wget -O uc_spec/IdnaMappingTable.txt https://www.unicode.org/Public/VERSION/idna/IdnaMappingTable.txt
wget -O test/IdnaTestV2.txt https://www.unicode.org/Public/VERSION/idna/IdnaTestV2.txt
2. Generate IDNA Table
Use the kjd/idna Python tool:
git clone --depth 1 https://github.com/kjd/idna.git /tmp/kjd-idna
python3 /tmp/kjd-idna/tools/idna-data make-table --version VERSION > uc_spec/idna-table.txt
rm -rf /tmp/kjd-idna
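The download and table-generation commands above can be wrapped in a small script so the version is supplied once instead of being substituted by hand. This is a minimal sketch, not part of the repository: the helper names `ucd_url`, `fetch_unicode_data`, and `generate_idna_table` and the script name are hypothetical, and it assumes `wget`, `git`, and `python3` are installed and that it runs from the repository root.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of erlang-idna.
set -eu

# Build the download URL for a file below a given Unicode version directory.
# $1 = Unicode version (e.g. 17.0.0), $2 = path below that version
ucd_url() {
    printf 'https://www.unicode.org/Public/%s/%s\n' "$1" "$2"
}

# Download the data files listed in step 1.
fetch_unicode_data() {
    version=$1
    # Each entry is "<local destination>:<remote path>".
    for entry in \
        uc_spec/UnicodeData.txt:ucd/UnicodeData.txt \
        uc_spec/ArabicShaping.txt:ucd/ArabicShaping.txt \
        uc_spec/Scripts.txt:ucd/Scripts.txt \
        uc_spec/IdnaMappingTable.txt:idna/IdnaMappingTable.txt \
        test/IdnaTestV2.txt:idna/IdnaTestV2.txt
    do
        dest=${entry%%:*}     # text before the colon
        remote=${entry#*:}    # text after the colon
        wget -O "$dest" "$(ucd_url "$version" "$remote")"
    done
}

# Generate the IDNA table with the kjd/idna tool, as in step 2.
generate_idna_table() {
    version=$1
    git clone --depth 1 https://github.com/kjd/idna.git /tmp/kjd-idna
    python3 /tmp/kjd-idna/tools/idna-data make-table --version "$version" \
        > uc_spec/idna-table.txt
    rm -rf /tmp/kjd-idna
}

# Run only when a version argument is given.
if [ "$#" -ge 1 ]; then
    fetch_unicode_data "$1"
    generate_idna_table "$1"
fi
```

Saved as, say, `update_unicode.sh` (a hypothetical name), it would be invoked as `sh update_unicode.sh 17.0.0`. The IDNA file paths follow the layout described above for Unicode 17.0.0 and may differ for other versions.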
If the tool needs additional files, pass the --source option (run this before removing /tmp/kjd-idna):
python3 /tmp/kjd-idna/tools/idna-data make-table --version VERSION --source uc_spec > uc_spec/idna-table.txt
3. Regenerate Erlang Modules
cd uc_spec
./gen_idnadata_mod.escript
./gen_idna_table_mod.escript
./gen_idna_mapping_mod.escript
cd ..
4. Run Tests
rebar3 eunit
License
MIT License - see LICENSE for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.