BIC Exporter

An Elixir library to extract BIC (Bank Identifier Code) directory data from ISO 9362 PDF files.

The PDF is published by Swift.

👉 The latest version should be downloadable from this url.

⚠️ DISCLAIMER: THIS IS NOT AN API

PDF files are not an official API. This parser might stop working if the PDF format changes even in small ways.

It's a tool to help you explore data. It is not a replacement for official APIs provided by Swift.

Requirements

Useful to know

Why Rust

The first iteration of this was in Python, which has tons of PDF utils.

However, Python took more than 10 minutes to process a real file, even after optimizing and switching to multithreading. We would also have needed to figure out how to deploy and call the Python script.

In contrast, the Rust implementation takes ~7 seconds on the same machine without any optimization and can be called directly from Elixir.

Safety

We're using rustler, which "catches rust panics before they unwind into C."

So in theory this lib shouldn't bring the entire BEAM down.

Choice of PDF Library

This project uses the pdf library from pdf-rs.

There are a few PDF libraries for Rust, but most of them seem to either wrap another tool or only do wall-of-text extraction. For example, Extractous wraps Apache Tika, while pdf-extract is pure Rust but extracts only plain text.

Use of AI

⚠️ A lot of this has been written with the help of AI. It helped figure out all the magic of extracting rows from the PDF. There is a chance there are simpler ways to do this.

Installation

Add bic_exporter to your list of dependencies in mix.exs:

def deps do
  [
    {:bic_exporter, "~> 0.2.0"}
  ]
end

Then run:

mix deps.get
mix compile

Usage

Extract from file path

{:ok, records} = BicExporter.extract_table_from_path("/path/to/ISOBIC.pdf")

Extract from binary data

Useful when the PDF is already in memory (e.g., downloaded from a URL):

pdf_data = File.read!("/path/to/ISOBIC.pdf")
{:ok, records} = BicExporter.extract_table_from_binary(pdf_data)

Get column headers

BicExporter.headers()
# => ["Record creation date", "Last Update date", "BIC", "Brch Code",
#     "Full legal name", "Registered address", "Operational address",
#     "Branch description", "Branch address", "Instit. Type"]

Record format

Each record is a list of 10 strings corresponding to the CSV columns:

[
  "1997-03-01",           # Record creation date
  "2024-06-06",           # Last Update date
  "ABORCA82",             # BIC
  "XXX",                  # Branch Code
  "ABOR BANK",            # Full legal name
  "123 Main Street",      # Registered address
  "456 Business Ave",     # Operational address
  "Main office",          # Branch description (optional)
  "",                     # Branch address (optional)
  "BANK"                  # Institution Type
]
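
Since each record is positional, it can be handy to pair it with the headers. The following is a minimal sketch, not part of the library: the module name BicRecordSketch and its to_map/2 function are hypothetical helpers, and the headers and record are copied inline from the examples above so the snippet runs without a PDF.

```elixir
defmodule BicRecordSketch do
  # Hypothetical helper, not provided by bic_exporter.
  # Pairs each header with the value at the same position in a record.
  # The guard rejects records whose length does not match the headers.
  def to_map(headers, record) when length(headers) == length(record) do
    headers
    |> Enum.zip(record)
    |> Map.new()
  end
end

# Headers as documented for BicExporter.headers() above.
headers = [
  "Record creation date", "Last Update date", "BIC", "Brch Code",
  "Full legal name", "Registered address", "Operational address",
  "Branch description", "Branch address", "Instit. Type"
]

# Sample record copied from the "Record format" example.
record = [
  "1997-03-01", "2024-06-06", "ABORCA82", "XXX", "ABOR BANK",
  "123 Main Street", "456 Business Ave", "Main office", "", "BANK"
]

bic_map = BicRecordSketch.to_map(headers, record)
IO.inspect(bic_map["BIC"])
# prints "ABORCA82"
```

With real data, the same helper could be mapped over the records returned by BicExporter.extract_table_from_path, using BicExporter.headers() as the first argument.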

Development

# From root /
# Fetch dependencies
mix deps.get

# Compile (includes Rust NIF)
mix compile

# Run tests
mix test

# Format Elixir code
mix format

# From native/bic_exporter
# Format Rust code
cargo fmt

# Run Rust linter
cargo clippy --all-targets --all-features

# Run Rust tests
cargo test

Release

For maintainers: