MarkdownLd

High-performance Markdown processing with SIMD optimizations and JSON-LD integration for Elixir.

MarkdownLd is built for production systems that need high performance and reliability. Using Rust SIMD optimizations, memory pooling, and advanced parsing algorithms, it processes Markdown 10-50x faster than traditional pure-Elixir solutions.

⚡ Performance Highlights

🚀 Quick Start

Add to your mix.exs:

def deps do
  [
    {:markdown_ld, "~> 0.3.0"}
  ]
end

Basic usage:

# Parse markdown content
{:ok, result} = MarkdownLd.parse("""
# Hello World

This is **bold** text with a [link](https://example.com).

```elixir
def hello, do: :world
```

- [ ] Todo item
- [x] Completed item
""")

# Result contains structured data
IO.inspect(result.headings)
# [%{level: 1, text: "Hello World", line: 1}]

IO.inspect(result.links) 
# [%{text: "link", url: "https://example.com", line: 3}]

IO.inspect(result.code_blocks)
# [%{language: "elixir", content: "def hello, do: :world", line: 5}]

IO.inspect(result.tasks)
# [%{completed: false, text: "Todo item", line: 9},
#  %{completed: true, text: "Completed item", line: 10}]
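Because the result fields are ordinary lists of maps, the standard Enum functions are enough to post-process them. The sketch below is an illustration only: it does not call the library, and the result map is sample data copied from the output shown above.

```elixir
# Sample data copied from the output above; no MarkdownLd call is made here.
result = %{
  headings: [%{level: 1, text: "Hello World", line: 1}],
  tasks: [
    %{completed: false, text: "Todo item", line: 9},
    %{completed: true, text: "Completed item", line: 10}
  ]
}

# Split the task list into completed and pending items.
{done, pending} = Enum.split_with(result.tasks, & &1.completed)

IO.puts("#{length(done)}/#{length(result.tasks)} tasks complete")

Enum.each(pending, fn task ->
  IO.puts("pending (line #{task.line}): #{task.text}")
end)
```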

📖 Features

Core Parsing

Performance Optimizations

Batch Processing

# Process multiple documents in parallel
documents = ["# Doc 1", "# Doc 2", "# Doc 3"]

# Elixir-side parallel processing  
{:ok, results} = MarkdownLd.parse_batch(documents, max_workers: 4)

# Rust-side parallel processing (fastest)
{:ok, results} = MarkdownLd.parse_batch_rust(documents)

# Stream processing with backpressure
results = MarkdownLd.parse_stream(document_stream, max_workers: 8)
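One way to picture the Elixir-side mode with max_workers is a bounded Task.async_stream. The following is a minimal sketch under that assumption, not MarkdownLd's actual implementation. BatchSketch and parse_stub/1 are hypothetical names; the stub only extracts level-1 headings so the snippet runs without the library.

```elixir
defmodule BatchSketch do
  # Hypothetical stand-in for a real parser: returns level-1 headings only.
  def parse_stub(markdown) do
    headings =
      markdown
      |> String.split("\n")
      |> Enum.with_index(1)
      |> Enum.filter(fn {line, _n} -> String.starts_with?(line, "# ") end)
      |> Enum.map(fn {line, n} ->
        %{level: 1, text: String.replace_prefix(line, "# ", ""), line: n}
      end)

    {:ok, %{headings: headings}}
  end

  # Parse documents concurrently with at most `max_workers` tasks in flight.
  # Task.async_stream/3 returns results in input order by default.
  def parse_batch(documents, opts \\ []) do
    max_workers = Keyword.get(opts, :max_workers, System.schedulers_online())

    results =
      documents
      |> Task.async_stream(&parse_stub/1, max_concurrency: max_workers, timeout: 5_000)
      |> Enum.map(fn {:ok, {:ok, result}} -> result end)

    {:ok, results}
  end
end

{:ok, results} = BatchSketch.parse_batch(["# Doc 1", "# Doc 2", "# Doc 3"], max_workers: 4)
IO.inspect(Enum.map(results, fn r -> hd(r.headings).text end))
# ["Doc 1", "Doc 2", "Doc 3"]
```

Task.async_stream/3 is lazy and caps the number of running tasks, so the same shape also accepts a stream as input and only pulls new documents as workers become free.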

Performance Tracking

# Get performance metrics
{:ok, stats} = MarkdownLd.get_performance_stats()
IO.inspect(stats)
# %{
#   "simd_operations" => 1_250_000,
#   "cache_hit_rate" => 85.2,
#   "memory_pool_usage" => 2_048_576,
#   "pattern_cache_size" => 128
# }

# Reset counters
MarkdownLd.reset_performance_stats()
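Note that the sample output above uses string keys, so the values are read with Map.fetch!/2 or stats["key"] rather than dot access. The sketch below works on a copy of that sample map and calls nothing in the library. It assumes "cache_hit_rate" is a percentage and "memory_pool_usage" is in bytes, and the 80.0 target is a hypothetical threshold chosen for illustration.

```elixir
# Sample map copied from the output above.
stats = %{
  "simd_operations" => 1_250_000,
  "cache_hit_rate" => 85.2,
  "memory_pool_usage" => 2_048_576,
  "pattern_cache_size" => 128
}

# Assumption: hit rate is a percentage, pool usage is in bytes.
hit_rate = Map.fetch!(stats, "cache_hit_rate")
pool_mib = Float.round(Map.fetch!(stats, "memory_pool_usage") / (1024 * 1024), 2)

IO.puts("cache hit rate: #{hit_rate}%, pool usage: #{pool_mib} MiB")

# Hypothetical target value, not a library default.
if hit_rate < 80.0 do
  IO.puts("cache hit rate is below the 80% target")
end
```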

⚙️ Configuration

Configure default options in your config.exs:

config :markdown_ld,
  # Performance options
  parallel: true,
  max_workers: System.schedulers_online(),
  
  # Optimization options  
  cache_patterns: true,
  track_performance: true,
  memory_pool_size: 1024 * 1024,  # 1MB
  pattern_cache_size: 500,
  
  # Processing options
  enable_tables: true,
  enable_strikethrough: true,
  enable_footnotes: true,
  
  # SIMD options (auto-detected)
  simd_enabled: true,
  simd_features: [:neon, :avx2],  # Auto-detected based on CPU
  
  # Batch processing
  batch_size: 100,
  batch_timeout: 5_000,  # 5 seconds
  
  # Development options
  debug_performance: false,
  log_slow_operations: true,
  slow_operation_threshold: 1000  # microseconds
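To make the slow_operation_threshold setting concrete, here is a rough sketch of how a microsecond threshold check could work. It is an assumption about the general technique, not the library's code: SlowOpSketch and timed/3 are hypothetical names, and the timing uses Erlang's :timer.tc/1, which reports elapsed time in microseconds.

```elixir
defmodule SlowOpSketch do
  # Run `fun`, and write a warning to stderr if it took longer than
  # `threshold_us` microseconds. Returns {elapsed_us, result}.
  def timed(label, threshold_us, fun) when is_function(fun, 0) do
    {elapsed_us, result} = :timer.tc(fun)

    if elapsed_us > threshold_us do
      IO.puts(:stderr, "slow operation: #{label} took #{elapsed_us}us (threshold #{threshold_us}us)")
    end

    {elapsed_us, result}
  end
end

# A 5 ms sleep stands in for a slow parse; it exceeds the 1000 us threshold.
{elapsed_us, :done} =
  SlowOpSketch.timed("parse", 1_000, fn ->
    Process.sleep(5)
    :done
  end)

IO.puts("measured #{elapsed_us}us")
```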

Runtime Configuration

You can also configure options at runtime:

# Per-operation configuration
{:ok, result} = MarkdownLd.parse(content, 
  parallel: false,
  cache_patterns: true,
  track_performance: true,
  max_workers: 2
)

# Application-wide configuration  
Application.put_env(:markdown_ld, :max_workers, 8)
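A common way to combine these two levels is to let per-call options override application-wide values, which in turn override built-in defaults. The sketch below illustrates that precedence under stated assumptions. OptionsSketch, resolve/1, and the default values are hypothetical, and the library may resolve options differently.

```elixir
defmodule OptionsSketch do
  # Hypothetical built-in defaults; the real library's defaults may differ.
  @defaults [parallel: true, cache_patterns: true, track_performance: false, max_workers: 4]

  # Precedence, lowest to highest: defaults, application env, per-call options.
  def resolve(call_opts \\ []) do
    from_env =
      for {key, default} <- @defaults do
        {key, Application.get_env(:markdown_ld, key, default)}
      end

    Keyword.merge(from_env, call_opts)
  end
end

# Application-wide override, as in the example above.
Application.put_env(:markdown_ld, :max_workers, 8)

opts = OptionsSketch.resolve(parallel: false)
IO.inspect(opts[:parallel])     # false (per-call option wins)
IO.inspect(opts[:max_workers])  # 8 (application env wins over the default)
```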

🏗️ Advanced Build System

MarkdownLd includes a comprehensive build system with multiple optimization profiles:

# Development build (fast compilation)
make dev

# Production build (maximum optimization)  
make prod

# Benchmark build (with profiling symbols)
make bench

# Profile-Guided Optimization
make pgo

# Run comprehensive benchmarks
make bench

Build Profiles

📊 Benchmarks

Based on comprehensive benchmarking:

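| Document Size | Processing Time | Throughput | vs Pure Elixir |
|---------------|-----------------|------------|----------------|
| Small (1KB)   | 3-7μs           | 150MB/s    | 10-20x faster  |
| Medium (10KB) | 5-10μs          | 1GB/s      | 10-20x faster  |
| Large (100KB) | 15-35μs         | 3GB/s      | 10-25x faster  |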
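The throughput figures can be sanity-checked from the other two columns: bytes divided by microseconds gives megabytes per second. The quick calculation below assumes decimal units (1 KB = 1,000 bytes) and uses the slow end of each time range, which lands close to the Throughput values listed.

```elixir
# bytes / microseconds == MB/s when 1 MB = 1_000_000 bytes (assumed here).
throughput_mb_s = fn size_bytes, time_us -> size_bytes / time_us end

# {label, size in bytes, slow end of the time range in microseconds}
rows = [{"Small", 1_000, 7}, {"Medium", 10_000, 10}, {"Large", 100_000, 35}]

for {label, bytes, us} <- rows do
  IO.puts("#{label}: #{round(throughput_mb_s.(bytes, us))} MB/s")
end
# Small: 143 MB/s
# Medium: 1000 MB/s
# Large: 2857 MB/s
```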

Extraction Functions

Run benchmarks yourself:

mix run bench/turbo_benchmark.exs

🚦 Production Usage

MarkdownLd is designed for high-throughput production systems:

Scalability

Reliability

Integration

🔬 Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Elixir API    │───▶│   Rust NIF Core  │───▶│  SIMD Optimized │
│                 │    │                  │    │   Operations    │
│ • Batch Proc.   │    │ • Memory Pools   │    │ • Pattern Match │
│ • Streaming     │    │ • Pattern Cache  │    │ • String Ops    │
│ • Error Handle  │    │ • Parallel Proc. │    │ • Word Count    │
└─────────────────┘    └──────────────────┘    └─────────────────┘

🛠️ Development

# Install dependencies
make install

# Run tests
make test  

# Format code
make format

# Lint code
make lint

# Run benchmarks
make bench

# Generate documentation
make docs

📚 Documentation

📄 License

MIT License - see LICENSE for details.

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes with tests
  4. Run the full test suite (make ci)
  5. Submit a pull request

Development Guidelines


MarkdownLd - Built for production systems that demand extreme performance.

Developed with ❤️ for the Elixir community.