kreuzcrawl

BindingsRustPythonNode.jsWASMJavaGoC#PHPRubyElixirDartKotlinSwiftZigC FFIDockerLicenseDocumentation
Kreuzcrawl
Join Discord

Elixir bindings for kreuzcrawl — a high-performance Rust web crawling engine. Uses Rustler NIFs for native BEAM integration with OTP-compatible error tuples and ResourceArc handles.

What This Package Provides

Installation

def deps do
[{:kreuzcrawl, "~> 0.3.0-rc.42"}]
end

Quick Start

# Simplest case: scrape a single page with default settings.
{:ok, engine} = Kreuzcrawl.create_engine()
{:ok, scrape_json} = Kreuzcrawl.scrape_async(engine, "https://example.com/")
scrape = Jason.decode!(scrape_json)
IO.puts("Title: #{scrape["metadata"]["title"]}")
IO.puts("Status: #{scrape["status_code"]}")
IO.puts("Links found: #{length(scrape["links"] || [])}")
# Crawl from a seed URL, limited to one hop and a handful of pages.
config_json = Jason.encode!(%Kreuzcrawl.CrawlConfig{max_depth: 1, max_pages: 5})
{:ok, crawl_engine} = Kreuzcrawl.create_engine(config_json)
{:ok, crawl_json} =
Kreuzcrawl.crawl_async(crawl_engine, "https://en.wikipedia.org/wiki/Web_scraping")
crawl = Jason.decode!(crawl_json)
IO.puts("Pages crawled: #{length(crawl["pages"] || [])}")

API Reference

Full API documentation is available at docs.kreuzcrawl.kreuzberg.dev.

Key functions:

Contributing

Contributions are welcome! Please see our Contributing Guide for details.

Part of Kreuzberg.dev

License

This project is licensed under Elastic License 2.0.