BIC Exporter
An Elixir library to extract BIC (Bank Identifier Code) directory data from ISO 9362 PDF files.
The PDF is published by Swift.
👉 The latest version should be downloadable from this url.
⚠️ DISCLAIMER: THIS IS NOT AN API
PDF files are not an official API. This parser might stop working if the PDF format changes even in small ways.
It's a tool to help you explore data. It is not a replacement for official APIs provided by Swift.
Requirements
- Elixir 1.19 or higher
- Rust 1.70 or higher (for compiling the NIF)
Useful to know
Why Rust
The first iteration of this was in Python, which has tons of PDF utils.
However, Python took more than 10 minutes to process a real file, even after optimizing and switching to multithreading. We would also have needed to figure out how to deploy and call the Python script.
In contrast, this library takes ~7 seconds on the same machine without any optimization and can be called directly from Elixir.
Safety
We're using Rustler, which catches Rust panics before they unwind into C.
So in theory this lib shouldn't bring the entire BEAM down.
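
As a rough illustration of what this means for calling code, the sketch below wraps an extraction call so that a raised error becomes an error tuple instead of crashing the caller. It assumes that a caught panic surfaces as an exception raised in the calling process, which is worth verifying against the Rustler documentation. `SafeExtract` is a hypothetical helper and not part of bic_exporter; the anonymous functions stand in for a real call to the library.

```elixir
defmodule SafeExtract do
  # Hypothetical helper, not part of bic_exporter.
  # Runs a zero-arity function and turns any raised exception
  # into an {:error, message} tuple; other results pass through unchanged.
  def run(fun) when is_function(fun, 0) do
    fun.()
  rescue
    exception -> {:error, Exception.message(exception)}
  end
end

# Stand-in for something like:
#   fn -> BicExporter.extract_table_from_path("/path/to/ISOBIC.pdf") end
ok_result = SafeExtract.run(fn -> {:ok, [["1997-03-01", "2024-06-06", "ABORCA82"]]} end)

# Stand-in for a call that raises (assumption: this is how a caught panic shows up).
failed_result = SafeExtract.run(fn -> raise "simulated NIF failure" end)

IO.inspect(ok_result, label: "ok")
IO.inspect(failed_result, label: "failed")
```

Running the extraction in its own process (for example a `Task`) is another option if a failed extraction should not affect the caller at all.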
Choice of PDF Library
This project uses the pdf library from pdf-rs.
There are a few PDF libraries for Rust, but most of them either wrap something else or only do wall-of-text extraction. For example, Extractous wraps Apache Tika, and pdf-extract is pure Rust but extracts only text.
Use of AI
⚠️ A lot of this has been written with the help of AI. It helped figure out all the magic of extracting rows from the PDF. There is a chance there are simpler ways to do this.
Installation
Add bic_exporter to your list of dependencies in mix.exs:
def deps do
  [
    {:bic_exporter, "~> 0.2.0"}
  ]
end
Then run:
mix deps.get
mix compile
Usage
Extract from file path
{:ok, records} = BicExporter.extract_table_from_path("/path/to/ISOBIC.pdf")
Extract from binary data
Useful when the PDF is already in memory (e.g., downloaded from a URL):
pdf_data = File.read!("/path/to/ISOBIC.pdf")
{:ok, records} = BicExporter.extract_table_from_binary(pdf_data)
Get column headers
BicExporter.headers()
# => ["Record creation date", "Last Update date", "BIC", "Brch Code",
# "Full legal name", "Registered address", "Operational address",
# "Branch description", "Branch address", "Instit. Type"]
Record format
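
Because the header list has one entry per column, a record can be paired with it position by position to get a map keyed by column name. The following is a minimal sketch of that idea: `BicRecords` is a hypothetical module that is not part of bic_exporter, and the header list is copied from the output above so that the example runs without the library. In real code you would likely pass `BicExporter.headers()` instead.

```elixir
defmodule BicRecords do
  # Hypothetical helper, not part of bic_exporter.
  # Column names copied from the BicExporter.headers() output above.
  @headers [
    "Record creation date", "Last Update date", "BIC", "Brch Code",
    "Full legal name", "Registered address", "Operational address",
    "Branch description", "Branch address", "Instit. Type"
  ]

  def headers, do: @headers

  # Pairs each value with its column name.
  # Returns an error tuple when the record length does not match the headers.
  def to_map(record, headers \\ @headers)

  def to_map(record, headers) when length(record) == length(headers) do
    {:ok, headers |> Enum.zip(record) |> Map.new()}
  end

  def to_map(_record, _headers), do: {:error, :unexpected_column_count}
end

record = [
  "1997-03-01", "2024-06-06", "ABORCA82", "XXX", "ABOR BANK",
  "123 Main Street", "456 Business Ave", "Main office", "", "BANK"
]

{:ok, row} = BicRecords.to_map(record)
IO.puts(row["BIC"])
```

Checking the column count up front is a deliberate choice here: since the PDF format may change, a mismatch is reported as an error instead of silently producing a partial map.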
Each record is a list of 10 strings corresponding to the CSV columns:
[
"1997-03-01", # Record creation date
"2024-06-06", # Last Update date
"ABORCA82", # BIC
"XXX", # Branch Code
"ABOR BANK", # Full legal name
"123 Main Street", # Registered address
"456 Business Ave", # Operational address
"Main office", # Branch description (optional)
"", # Branch address (optional)
"BANK" # Institution Type
]
Development
# From root /
# Fetch dependencies
mix deps.get
# Compile (includes Rust NIF)
mix compile
# Run tests
mix test
# Format Elixir code
mix format
# From native/bic_exporter
# Format Rust code
cargo fmt
# Run Rust linter
cargo clippy --all-targets --all-features
# Run Rust tests
cargo test
Release
For maintainers:
- Bump the version in mix.exs
- Create a new tag, e.g. git tag v0.2.0
- Push the tag to GitHub: git push origin main --tags
- ⚠️ Wait for the GitHub Actions workflows to complete successfully.
- Run mix rustler_precompiled.download BicExporter.Native --all --print. You may need to set RUSTLER_PRECOMPILATION_FORCE_BUILD=true when running this.
- Publish the package on Hex: mix hex.publish
- 🍰