
JsonRemedy


A comprehensive, production-ready JSON repair library for Elixir that intelligently fixes malformed JSON strings from any source: LLMs, legacy systems, data pipelines, streaming APIs, and human input.

JsonRemedy uses a sophisticated pre-processing stage followed by a 5-layer repair pipeline where each layer employs the most appropriate technique: pattern detection, content cleaning, state machines for structural repairs, character-by-character parsing for syntax normalization, and battle-tested parsers for validation. The result is a robust system that handles virtually any JSON malformation while preserving valid content.

The Problem

Malformed JSON is everywhere in real-world systems:

// LLM output with mixed issues
```json
{
  users: [
    {name: 'Alice Johnson', active: True, scores: [95, 87, 92,]},
    {name: "Bob Smith", active: False /* incomplete
  ],
  metadata: None

# Legacy Python system output
{'users': [{'name': 'Alice', 'verified': True, 'data': None}]}

// Copy-paste from JavaScript console
{name: "Alice", getValue: function() { return "test"; }, data: [1,2,3]}

// Streaming API with connection drop
{"status": "processing", "results": [{"id": 1, "name": "Alice"

// Gemini API max_output_tokens truncation (fills rest with dots)
{"title": "Report", "content": "Analysis of.............................

// Gemini API echoed key bug (gemini-2.5-flash-lite)
{"coach_notes": "coach_notes": "You showed grace under pressure."}

// Human input with common mistakes
{name: Alice, "age": 30, "scores": [95 87 92], active: true,}

Standard JSON parsers fail completely on these inputs. JsonRemedy fixes them intelligently.

Comprehensive Repair Capabilities

🔄 Pre-processing Pipeline (v0.1.5+)

Runs before the main layer pipeline to handle complex patterns that subsequent layers would otherwise break, such as multiple concatenated JSON values, object boundary merging, and placeholder filtering.

Inspired by patterns from the json_repair Python library

🧹 Content Cleaning (Layer 1)

🏗️ Structural Repairs (Layer 2)

✨ Syntax Normalization (Layer 3)

🔧 Hardcoded Patterns (ported from json_repair Python library)

Layer 3 includes battle-tested cleanup patterns for edge cases commonly found in LLM output:

These patterns run as a pre-processing step and can be controlled via feature flags. See examples/hardcoded_patterns_examples.exs for demonstrations.

🚀 Fast Path Validation (Layer 4)

🛟 Tolerant Parsing (Layer 5) ⏳ FUTURE

🧠 Context-Aware Intelligence

JsonRemedy understands JSON structure to preserve valid content:

# ✅ PRESERVE: Comma inside string content
{"message": "Hello, world", "status": "ok"}

# ✅ REMOVE: Trailing comma
{"items": [1, 2, 3,]}

# ✅ PRESERVE: Numbers stay numbers
{"count": 42}

# ✅ QUOTE: Unquoted keys get quoted
{name: "Alice"}

# ✅ PRESERVE: Boolean content in strings
{"note": "Set active to True"}

# ✅ NORMALIZE: Boolean values
{"active": True}

# ✅ PRESERVE: Escape sequences in strings
{"path": "C:\\Users\\Alice"}

# ✅ PARSE: Unicode escapes
{"unicode": "\\u0048\\u0065\\u006c\\u006c\\u006f"}
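
To make the "preserve inside strings, repair outside" idea concrete, here is a minimal sketch of a context-aware trailing-comma remover. It is not JsonRemedy's implementation; the module name `TrailingCommaSketch` and its `strip/1` function are hypothetical and only illustrate the technique of tracking whether the scanner is inside a string.

```elixir
defmodule TrailingCommaSketch do
  @moduledoc """
  Illustrative only (hypothetical module, not part of JsonRemedy):
  removes trailing commas that sit outside of string literals.
  """

  @doc "Drops a comma when the next non-whitespace character is `]` or `}`."
  def strip(input) when is_binary(input), do: scan(input, false, [])

  # End of input: the accumulator holds chunks in reverse order.
  defp scan(<<>>, _in_string?, acc), do: acc |> Enum.reverse() |> IO.iodata_to_binary()

  # Inside a string, a backslash and the byte after it are copied as a pair,
  # so an escaped quote never ends the string.
  defp scan(<<"\\", c::binary-size(1), rest::binary>>, true, acc),
    do: scan(rest, true, [c, "\\" | acc])

  # An unescaped double quote toggles the in-string state.
  defp scan(<<"\"", rest::binary>>, in_string?, acc),
    do: scan(rest, not in_string?, ["\"" | acc])

  # Outside a string, a comma is dropped only if a closing delimiter follows.
  defp scan(<<",", rest::binary>>, false, acc) do
    if rest |> String.trim_leading() |> String.starts_with?(["]", "}"]) do
      scan(rest, false, acc)
    else
      scan(rest, false, ["," | acc])
    end
  end

  # Every other byte is copied through unchanged.
  defp scan(<<c::binary-size(1), rest::binary>>, in_string?, acc),
    do: scan(rest, in_string?, [c | acc])
end
```

Calling `TrailingCommaSketch.strip(~s|{"items": [1, 2, 3,]}|)` removes the comma before `]`, while the comma in `"Hello, world"` is left alone because the scanner is inside a string at that point.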

Quick Start

Add JsonRemedy to your mix.exs:

def deps do
  [
    {:json_remedy, "~> 0.2.1"}
  ]
end

Basic Usage

# Simple repair and parse
malformed = ~s|{name: "Alice", age: 30, active: True}|
{:ok, data} = JsonRemedy.repair(malformed)
# => %{"name" => "Alice", "age" => 30, "active" => true}

# Get the repaired JSON string
{:ok, fixed_json} = JsonRemedy.repair_to_string(malformed)
# => "{\"name\":\"Alice\",\"age\":30,\"active\":true}"

# Track what was repaired
{:ok, data, repairs} = JsonRemedy.repair(malformed, logging: true)
# => repairs: [
#      %{layer: :syntax_normalization, action: "quoted unquoted key 'name'"},
#      %{layer: :syntax_normalization, action: "normalized boolean True -> true"}
#    ]

Real-World Examples

# LLM output with multiple issues
llm_output = """
Here's the user data you requested:

```json
{
  // User information
  users: [
    {
      name: 'Alice Johnson',
      email: "alice@example.com",
      age: 30,
      active: True,
      scores: [95, 87, 92,],  // Test scores
      profile: {
        city: "New York",
        interests: ["coding", "music", "travel",]
      },
    },
    {
      name: 'Bob Smith',
      email: "bob@example.com", 
      age: 25,
      active: False
      // Missing comma above
    }
  ],
  metadata: {
    total: 2,
    updated: "2024-01-15"
    // Missing closing brace
```

That should give you what you need!
"""

{:ok, clean_data} = JsonRemedy.repair(llm_output)
# Handles code fences, comments, quotes, booleans, trailing commas, and missing delimiters

# Legacy Python-style JSON
python_json = ~s|{'users': [{'name': 'Alice', 'active': True, 'metadata': None}]}|
{:ok, data} = JsonRemedy.repair(python_json)
# => %{"users" => [%{"name" => "Alice", "active" => true, "metadata" => nil}]}

# JavaScript object literals
js_object = ~s|{name: "Alice", getValue: function() { return 42; }, data: [1,2,3]}|
{:ok, data} = JsonRemedy.repair(js_object)
# => %{"name" => "Alice", "data" => [1, 2, 3]} (function removed)

# Streaming/incomplete data
incomplete = ~s|{"status": "processing", "data": [1, 2, 3|
{:ok, data} = JsonRemedy.repair(incomplete)
# => %{"status" => "processing", "data" => [1, 2, 3]}

# Human input with common mistakes
human_input = ~s|{name: Alice, age: 30, scores: [95 87 92], active: true,}|
{:ok, data} = JsonRemedy.repair(human_input)
# => %{"name" => "Alice", "age" => 30, "scores" => [95, 87, 92], "active" => true}

Examples

JsonRemedy includes comprehensive examples demonstrating real-world usage scenarios.

🚀 Run All Examples

To see all examples in action with their full output:

./run-examples.sh

This will execute all example scripts and show a summary of results.

📚 Individual Examples

Run specific examples to see detailed output:

Basic Usage Examples

mix run examples/basic_usage.exs

Learn the fundamentals with step-by-step examples:

🔧 Hardcoded Patterns Examples ✨ NEW in v0.1.4

mix run examples/hardcoded_patterns_examples.exs

Demonstrates advanced cleanup patterns ported from Python's json_repair library:

🌐 HTML Content Examples ✨ NEW in v0.1.5

mix run examples/html_content_examples.exs

Demonstrates handling of unquoted HTML content in JSON values (common when APIs return error pages):

This example showcases the HTML detection and quoting capabilities added in v0.1.5, which handle real-world scenarios where API endpoints return HTML error pages instead of JSON.

🧮 HTML Metadata Examples ✨ NEW in v0.1.8

mix run examples/html_metadata_examples.exs

Inspect the metadata returned when quoting HTML fragments:

🌍 Real-World Scenarios

mix run examples/real_world_scenarios.exs

See JsonRemedy handle realistic problematic JSON:

⚡ Performance Examples

mix run examples/quick_performance.exs

Understand JsonRemedy's performance characteristics:

🔬 Stress Testing

mix run examples/simple_stress_test.exs

Verify reliability under load:

🪟 Windows CI Examples ✨ NEW in v0.1.8

mix run examples/windows_ci_examples.exs

Validate the cross-platform pipeline:

📊 Example Output

Here's what you'll see when running the real-world scenarios:

=== JsonRemedy Real-World Scenarios ===

Example 1: LLM/ChatGPT Output with Code Fences
==============================================
Input (LLM response with code fences and explanatory text):
Here's the user data you requested:

{
  "users": [
    {name: "Alice Johnson", age: 32, role: "engineer"},
    {name: "Bob Smith", age: 28, role: "designer"}
  ],
  "metadata": {
    generated_at: "2024-01-15",
    total_count: 2,
    active_only: True
  }
}


Processing LLM Output through JsonRemedy pipeline...

✓ Layer 1 (Content Cleaning): Applied 1 repairs
✓ Layer 3 (Syntax Normalization): Applied 4 repairs
✓ Layer 4 (Validation): SUCCESS - Valid JSON produced!

Final repaired JSON:
-------------------
{
  "users": [
    {
      "name": "Alice Johnson",
      "age": 32,
      "role": "engineer"
    },
    {
      "name": "Bob Smith", 
      "age": 28,
      "role": "designer"
    }
  ],
  "metadata": {
    "generated_at": "2024-01-15",
    "total_count": 2,
    "active_only": true
  }
}

Total repairs applied: 5
Repair summary:
  1. removed code fences and wrapper text
  2. normalized unquoted key 'name' to "name"
  3. normalized unquoted key 'age' to "age"  
  4. normalized unquoted key 'role' to "role"
  5. normalized boolean True -> true

All examples include detailed output showing:

🎯 Custom Examples

Create your own examples using the same patterns:

# examples/my_custom_example.exs
defmodule MyCustomExample do
  def test_my_json do
    malformed = ~s|{my: 'problematic', json: True}|
    
    # With logging: true, the third tuple element is the list of repairs
    case JsonRemedy.repair(malformed, logging: true) do
      {:ok, result, repairs} ->
        IO.puts("✓ Repaired successfully!")
        IO.puts("Result: #{Jason.encode!(result, pretty: true)}")
        IO.puts("Repairs: #{length(repairs)}")
      {:error, reason} ->
        IO.puts("✗ Failed: #{inspect(reason)}")
    end
  end
end

MyCustomExample.test_my_json()

Run with: mix run examples/my_custom_example.exs

🔧 Example Status & Known Issues

All examples have been thoroughly tested and optimized for v0.1.1:

Example                | Status     | Performance | Notes
-----------------------|------------|-------------|------------------
Basic Usage            | ✅ Stable  | ~10ms       | 8 fundamental examples, all patterns work
Real World Scenarios   | ✅ Stable  | ~15-30s     | 8 complex scenarios, handles LLM/legacy data
Quick Performance      | ✅ Stable  | ~2-5s       | 4 benchmarks, includes throughput analysis
Simple Stress Test     | ✅ Stable  | ~10-15s     | 1000+ operations, memory stability verified
Performance Benchmarks | ⚠️ Limited | May hang    | Complex analysis may time out on large datasets

Known Issue: Performance Benchmarks

The examples/performance_benchmarks.exs script may hang when processing large datasets (5000+ objects). This stems from the computational complexity of the benchmark itself, not from a bug in the library:

# Full benchmark suite:
mix run examples/performance_benchmarks.exs  # May hang on large datasets

# Alternatives that complete successfully:
mix run examples/quick_performance.exs       # Lightweight performance testing
mix run examples/simple_stress_test.exs      # Stress testing without hanging

Workaround: For comprehensive benchmarking, use smaller dataset sizes or the quick performance example which provides sufficient performance insights.

Recent Fixes (v0.1.1)

Implementation Status

JsonRemedy is currently in Phase 1 implementation with Layers 1-4 fully operational:

Layer   | Status      | Description
--------|-------------|------------------
Layer 1 | ✅ Complete | Content cleaning (code fences, comments, encoding)
Layer 2 | ✅ Complete | Structural repair (delimiters, nesting, concatenation)
Layer 3 | ✅ Complete | Syntax normalization (quotes, booleans, commas)
Layer 4 | ✅ Complete | Fast validation (Jason.decode optimization)
Layer 5 | ⏳ Planned  | Tolerant parsing (aggressive error recovery)

The current implementation handles ~95% of real-world malformed JSON through Layers 1-4. Layer 5 will add edge case handling for the remaining challenging scenarios.

🗺️ Roadmap

Current Release (v0.1.1): Production-ready Layers 1-4

Next Release (v0.2.0): Layer 5 - Tolerant Parsing

✅ Previously Missing Patterns - Now Implemented! (v0.1.5)

Based on comprehensive analysis of the json_repair Python library, the following patterns were initially documented as missing but are now fully implemented in v0.1.5:

Implemented Advanced Patterns (all tests passing):

  1. Multiple JSON Values Aggregation ✅ - test/missing_patterns/pattern1_multiple_json_test.exs

    • Pattern: []{} → [[],{}]
    • Status: ✅ 10/10 tests pass
    • Implementation: MultipleJsonDetector utility in pre-processing pipeline
    • Wraps multiple complete JSON values into an array

  2. Object Boundary Merging ✅ - test/missing_patterns/pattern2_object_merging_test.exs

    • Pattern: {"a":"b"},"c":"d"} → {"a":"b","c":"d"}
    • Status: ✅ 10/10 tests pass
    • Implementation: ObjectMerger module in Layer 3
    • Merges additional key-value pairs after premature object close

  3. Ellipsis Filtering ✅ - test/missing_patterns/pattern3_ellipsis_test.exs

    • Pattern: [1,2,3,...] → [1,2,3]
    • Status: ✅ 10/10 tests pass
    • Implementation: EllipsisFilter module in Layer 3
    • Filters unquoted ... placeholders from arrays (common in LLM output)

  4. Comment Keywords Filtering ✅ - test/missing_patterns/pattern4_comment_keywords_test.exs

    • Pattern: {"a":1, COMMENT "b":2} → {"a":1,"b":2}
    • Status: ✅ 10/10 tests pass
    • Implementation: KeywordFilter module in Layer 3
    • Filters unquoted keywords: COMMENT, SHOULD_NOT_EXIST, DEBUG_INFO, PLACEHOLDER, TODO, FIXME, etc.

These advanced patterns handle edge cases commonly found in LLM outputs, debug logs, and malformed API responses. All 40 pattern tests pass with 100% success rate.
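
As an illustration of the first pattern (`[]{}` becoming `[[],{}]`), the sketch below splits input wherever bracket depth returns to zero and wraps the pieces in an array. It is a simplified stand-in, not the library's `MultipleJsonDetector`: the module name `MultipleJsonSketch` is hypothetical, and it assumes every top-level value is a complete object or array.

```elixir
defmodule MultipleJsonSketch do
  @moduledoc """
  Illustrative only (hypothetical module): wraps several complete
  top-level objects/arrays into a single JSON array.
  """

  def aggregate(input) when is_binary(input) do
    case split(input, 0, false, "", []) do
      values when length(values) > 1 -> "[" <> Enum.join(values, ",") <> "]"
      _zero_or_one -> input
    end
  end

  # End of input: collect finished chunks plus any trailing text, dropping blanks.
  defp split(<<>>, _depth, _in_string?, current, done) do
    [current | done]
    |> Enum.reverse()
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&(&1 == ""))
  end

  # Inside a string: copy escape pairs verbatim so \" does not end the string.
  defp split(<<"\\", c::binary-size(1), rest::binary>>, depth, true, cur, done),
    do: split(rest, depth, true, cur <> "\\" <> c, done)

  # An unescaped quote toggles string state.
  defp split(<<"\"", rest::binary>>, depth, in_string?, cur, done),
    do: split(rest, depth, not in_string?, cur <> "\"", done)

  defp split(<<c::binary-size(1), rest::binary>>, depth, true, cur, done),
    do: split(rest, depth, true, cur <> c, done)

  # Outside a string: openers increase depth.
  defp split(<<c::binary-size(1), rest::binary>>, depth, false, cur, done)
       when c in ["{", "["],
       do: split(rest, depth + 1, false, cur <> c, done)

  # A closer at depth 1 completes one top-level value.
  defp split(<<c::binary-size(1), rest::binary>>, 1, false, cur, done)
       when c in ["}", "]"],
       do: split(rest, 0, false, "", [cur <> c | done])

  # Any other closer just decreases depth (never below zero).
  defp split(<<c::binary-size(1), rest::binary>>, depth, false, cur, done)
       when c in ["}", "]"],
       do: split(rest, max(depth - 1, 0), false, cur <> c, done)

  defp split(<<c::binary-size(1), rest::binary>>, depth, false, cur, done),
    do: split(rest, depth, false, cur <> c, done)
end
```

A single well-formed value is returned unchanged, so the step is safe to run before the other repairs in this simplified model.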

The Pre-processing + 5-Layer Architecture

JsonRemedy's strength comes from its pragmatic, layered approach where each stage uses the optimal technique:

defmodule JsonRemedy.LayeredRepair do
  def repair(input) do
    input
    |> PreProcessing.detect_and_fix()  # Pre-process: Multiple JSON, object merging, filtering
    |> Layer1.content_cleaning()       # Cleaning: Remove wrappers, comments, normalize encoding
    |> Layer2.structural_repair()      # State machine: Fix delimiters, nesting, structure
    |> Layer3.syntax_normalization()   # Char parsing: Fix quotes, booleans, commas
    |> Layer4.validation_attempt()     # Jason.decode: Fast path for clean JSON
    |> Layer5.tolerant_parsing()       # Custom parser: Handle edge cases gracefully (FUTURE)
  end
end
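
The pipeline above is conceptual. As a rough, runnable approximation of the same idea, the sketch below threads text through a list of layer functions and accumulates a repair log tagged by layer. Everything in it is an assumption for illustration: `PipelineSketch`, `strip_fences/1`, and `normalize_literals/1` are hypothetical names, and the two layers are deliberately naive.

```elixir
defmodule PipelineSketch do
  @moduledoc "Illustrative only (hypothetical module): a layered pipeline with a repair log."

  @doc """
  Runs `input` through each `{name, fun}` layer in order. Each layer
  returns `{text, actions}`, where `actions` is a list of descriptions.
  """
  def run(input, layers) when is_binary(input) and is_list(layers) do
    Enum.reduce(layers, {input, []}, fn {name, layer_fun}, {text, log} ->
      {new_text, actions} = layer_fun.(text)
      entries = Enum.map(actions, fn action -> %{layer: name, action: action} end)
      {new_text, log ++ entries}
    end)
  end

  # Stand-in for content cleaning: keeps only the body of the first code fence.
  def strip_fences(text) do
    case Regex.run(~r/```(?:json)?\s*(.*?)\s*```/s, text, capture: :all_but_first) do
      [inner] -> {inner, ["removed code fence"]}
      nil -> {text, []}
    end
  end

  # Stand-in for syntax normalization (naive: does not check string context).
  def normalize_literals(text) do
    new_text =
      text
      |> String.replace(~r/\bTrue\b/, "true")
      |> String.replace(~r/\bFalse\b/, "false")
      |> String.replace(~r/\bNone\b/, "null")

    if new_text == text, do: {text, []}, else: {new_text, ["normalized Python literals"]}
  end
end
```

Passing `[content_cleaning: &PipelineSketch.strip_fences/1, syntax_normalization: &PipelineSketch.normalize_literals/1]` as the layer list yields the repaired text plus one log entry per repair. Keeping each layer a plain function makes it easy to test in isolation.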

🔄 Pre-processing Stage (v0.1.5)

Technique: Pattern detection and early transformation

🧹 Layer 1: Content Cleaning

Technique: String operations

🏗️ Layer 2: Structural Repair

Technique: State machine with context tracking
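
A minimal sketch of what a state machine with context tracking can look like for truncated input: it tracks whether it is inside a string and keeps a stack of expected closers, then appends whatever is still open. This is an assumption-laden illustration, not the Layer 2 source; `DelimiterCloserSketch` is a hypothetical name.

```elixir
defmodule DelimiterCloserSketch do
  @moduledoc """
  Illustrative only (hypothetical module): appends the closers that a
  truncated JSON document is missing.
  """

  def close(input) when is_binary(input), do: walk(input, :outside, [], input)

  # End of input inside a string: terminate the string, then close containers.
  defp walk(<<>>, :in_string, stack, original), do: original <> "\"" <> Enum.join(stack)
  defp walk(<<>>, :outside, stack, original), do: original <> Enum.join(stack)

  # Escaped bytes inside a string never change state.
  defp walk(<<"\\", _::binary-size(1), rest::binary>>, :in_string, stack, o),
    do: walk(rest, :in_string, stack, o)

  # Unescaped quotes switch between the two states.
  defp walk(<<"\"", rest::binary>>, :in_string, stack, o), do: walk(rest, :outside, stack, o)
  defp walk(<<"\"", rest::binary>>, :outside, stack, o), do: walk(rest, :in_string, stack, o)

  # Openers push the closer they expect; a matching closer pops it.
  defp walk(<<"{", rest::binary>>, :outside, stack, o), do: walk(rest, :outside, ["}" | stack], o)
  defp walk(<<"[", rest::binary>>, :outside, stack, o), do: walk(rest, :outside, ["]" | stack], o)
  defp walk(<<"}", rest::binary>>, :outside, ["}" | stack], o), do: walk(rest, :outside, stack, o)
  defp walk(<<"]", rest::binary>>, :outside, ["]" | stack], o), do: walk(rest, :outside, stack, o)

  # Everything else (including mismatched closers) is skipped over.
  defp walk(<<_::binary-size(1), rest::binary>>, state, stack, o), do: walk(rest, state, stack, o)
end
```

Because the stack stores closers in the order they must be emitted, joining it at end of input produces correctly nested output such as `]}`.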

✨ Layer 3: Syntax Normalization

Technique: Character-by-character parsing with context awareness
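
The following sketch shows one way character-by-character parsing with context awareness might normalize Python-style literals while leaving string content untouched. It is a simplified assumption, not the Layer 3 code: `LiteralNormalizerSketch` is a hypothetical module, and for brevity it does not check word boundaries around the literals.

```elixir
defmodule LiteralNormalizerSketch do
  @moduledoc """
  Illustrative only (hypothetical module): rewrites True/False/None
  to JSON literals, but only outside of string values.
  """

  @replacements [{"True", "true"}, {"False", "false"}, {"None", "null"}]

  def normalize(input) when is_binary(input), do: scan(input, false, [])

  defp scan(<<>>, _in_string?, acc), do: acc |> Enum.reverse() |> IO.iodata_to_binary()

  # Inside a string, escape pairs are copied verbatim.
  defp scan(<<"\\", c::binary-size(1), rest::binary>>, true, acc),
    do: scan(rest, true, [c, "\\" | acc])

  # An unescaped quote toggles string state.
  defp scan(<<"\"", rest::binary>>, in_string?, acc),
    do: scan(rest, not in_string?, ["\"" | acc])

  # Inside a string, every byte is preserved.
  defp scan(<<c::binary-size(1), rest::binary>>, true, acc),
    do: scan(rest, true, [c | acc])

  # Outside a string, replace a literal if one starts here; otherwise copy one byte.
  defp scan(<<c::binary-size(1), rest::binary>> = input, false, acc) do
    case Enum.find(@replacements, fn {from, _to} -> String.starts_with?(input, from) end) do
      {from, to} -> scan(String.replace_prefix(input, from, ""), false, [to | acc])
      nil -> scan(rest, false, [c | acc])
    end
  end
end
```

With this approach `{"note": "Set active to True", "active": True}` keeps the text inside the note while the bare `True` becomes `true`.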

🚀 Layer 4: Validation

Technique: Battle-tested Jason.decode
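
The fast-path idea can be sketched independently of any particular decoder: try strict decoding first and only run repairs when that fails. In the sketch below, `FastPathSketch` is a hypothetical module and both functions are supplied by the caller; in a project that depends on Jason, `&Jason.decode/1` would be a natural `decode_fun`.

```elixir
defmodule FastPathSketch do
  @moduledoc "Illustrative only (hypothetical module): strict decode first, repair on failure."

  @doc """
  `decode_fun` is any strict decoder returning `{:ok, term}` or `{:error, reason}`.
  `repair_fun` is any string-to-string repair step.
  """
  def decode(input, decode_fun, repair_fun)
      when is_function(decode_fun, 1) and is_function(repair_fun, 1) do
    case decode_fun.(input) do
      {:ok, term} ->
        # Already valid: no repair work was needed.
        {:ok, term, :fast_path}

      {:error, _reason} ->
        case decode_fun.(repair_fun.(input)) do
          {:ok, term} -> {:ok, term, :repaired}
          {:error, reason} -> {:error, reason}
        end
    end
  end
end
```

Valid input pays only the cost of one decode call, which is why this check is cheap to run on every document.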

🛟 Layer 5: Tolerant Parsing ⏳ FUTURE

Technique: Custom recursive descent with error recovery (planned)

API Reference

Core Functions

# Main repair function
JsonRemedy.repair(json_string, opts \\ [])
# Returns: {:ok, term} | {:ok, term, repairs} | {:error, reason}

# Repair to JSON string
JsonRemedy.repair_to_string(json_string, opts \\ [])  
# Returns: {:ok, json_string} | {:error, reason}

# Repair from file
JsonRemedy.from_file(path, opts \\ [])
# Returns: {:ok, term} | {:ok, term, repairs} | {:error, reason}

Options

[
  # Return detailed repair log as third tuple element
  logging: true,
  
  # How aggressive to be with repairs
  strictness: :lenient,  # :strict | :lenient | :permissive
  
  # Stop after successful layer (for performance)
  early_exit: true,
  
  # Maximum input size (security)
  max_size_mb: 10,
  
  # Processing timeout
  timeout_ms: 5000,
  
  # Custom repair rules for Layer 3
  custom_rules: [
    %{
      name: "fix_custom_pattern",
      pattern: ~r/special_pattern/,
      replacement: "fixed_pattern",
      condition: nil
    }
  ]
]

Advanced APIs

# Layer-specific processing (for custom pipelines)
JsonRemedy.Layer1.ContentCleaning.process(input, context)
JsonRemedy.Layer2.StructuralRepair.process(input, context)  
JsonRemedy.Layer3.SyntaxNormalization.process(input, context)

# Individual repair functions
JsonRemedy.Layer3.SyntaxNormalization.normalize_quotes(input)
JsonRemedy.Layer3.SyntaxNormalization.fix_commas(input)
JsonRemedy.Layer3.SyntaxNormalization.normalize_escape_sequences(input)

# Health checking
JsonRemedy.health_check()
# => %{status: :healthy, layers: [...], performance: {...}}

Streaming API

For large files or real-time processing:

# Process large files efficiently
"huge_log.jsonl"
|> File.stream!()
|> JsonRemedy.repair_stream()
|> Stream.map(&process_record/1)
|> Stream.each(&store_record/1)
|> Stream.run()

# Real-time stream processing with buffering
websocket_stream
|> JsonRemedy.repair_stream(buffer_incomplete: true, chunk_size: 1024)
|> Stream.each(&handle_json/1)
|> Stream.run()

# Batch processing with error collection
inputs
|> JsonRemedy.repair_stream(collect_errors: true)
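|> Enum.reduce({[], []}, fn
  # The accumulator is {successes, errors}; each result goes into one of the two lists
  {:ok, data}, {successes, errors} -> {[data | successes], errors}
  {:error, err}, {successes, errors} -> {successes, [err | errors]}
end)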

Performance Characteristics

JsonRemedy prioritizes correctness first and performance second.

Note: Performance benchmarks below reflect Layers 1-4 implementation. Layer 5 performance will be added in v0.2.0.

Benchmarks

Input Type                | Throughput | Memory | Notes
--------------------------|------------|--------|------------------
Valid JSON (Layer 4 only) | TODO       | TODO   | Jason.decode fast path
Simple malformed          | TODO       | TODO   | Layers 1-3 processing
Complex malformed         | TODO       | TODO   | Full pipeline
Large files (streaming)   | TODO       | TODO   | Constant memory usage
LLM output (typical)      | TODO       | TODO   | Mixed complexity

Performance Strategy

Run benchmarks:

mix run bench/comprehensive_benchmark.exs
mix run bench/memory_profile.exs

Real-World Use Cases

🤖 LLM Integration

defmodule MyApp.LLMProcessor do
  require Logger

  def extract_structured_data(llm_response) do
    case JsonRemedy.repair(llm_response, logging: true, timeout_ms: 3000) do
      {:ok, data, []} -> 
        {:clean, data}
      {:ok, data, repairs} -> 
        Logger.info("LLM output required #{length(repairs)} repairs")
        maybe_retrain_model(repairs)
        {:repaired, data}
      {:error, reason} -> 
        Logger.error("Unparseable LLM output: #{inspect(reason)}")
        {:unparseable, reason}
    end
  end
  
  defp maybe_retrain_model(repairs) do
    # Analyze repair patterns to improve LLM prompts
    serious_issues = Enum.filter(repairs, &(&1.layer == :structural_repair))
    if length(serious_issues) > 3, do: schedule_model_retraining()
  end
end

📊 Data Pipeline Healing

defmodule DataPipeline.JSONHealer do
  require Logger

  def process_external_api(response) do
    response.body
    |> JsonRemedy.repair(strictness: :lenient, max_size_mb: 50)
    |> case do
      {:ok, data} -> 
        validate_and_transform(data)
      {:error, reason} -> 
        send_to_deadletter_queue(response, reason)
        {:error, :unparseable}
    end
  end
  
  def heal_legacy_export(file_path) do
    file_path
    |> JsonRemedy.from_file(logging: true)
    |> case do
      # Match a non-empty repair list directly instead of calling length/1 in a guard
      {:ok, data, [_ | _] = repairs} ->
        Logger.warning("Legacy file required healing: #{inspect(repairs)}")
        maybe_update_source_system(file_path, repairs)
        {:ok, data}
      result -> result
    end
  end
end

🔧 Configuration Recovery

defmodule MyApp.ConfigLoader do
  require Logger

  def load_with_auto_repair(path) do
    case JsonRemedy.from_file(path, logging: true) do
      {:ok, config, []} -> 
        {:ok, config}
      {:ok, config, repairs} ->
        Logger.warning("Config file auto-repaired: #{format_repairs(repairs)}")
        maybe_write_fixed_config(path, config, repairs)
        {:ok, config}
      {:error, reason} ->
        {:error, "Config file unrecoverable: #{inspect(reason)}"}
    end
  end
  
  defp maybe_write_fixed_config(path, config, repairs) do
    if mostly_syntax_fixes?(repairs) do
      backup_path = path <> ".backup"
      File.cp!(path, backup_path)
      
      fixed_json = Jason.encode!(config, pretty: true)
      File.write!(path, fixed_json)
      
      Logger.info("Auto-fixed config saved. Backup at #{backup_path}")
    end
  end
end

🌊 Stream Processing

defmodule LogProcessor do
  def process_json_logs(file_path) do
    file_path
    |> File.stream!(read_ahead: 100_000)
    |> JsonRemedy.repair_stream(
      buffer_incomplete: true,
      collect_errors: true,
      timeout_ms: 1000
    )
    |> Stream.filter(&valid_log_entry?/1)
    |> Stream.map(&enrich_log_entry/1)
    |> Stream.chunk_every(1000)
    |> Stream.each(&bulk_insert_logs/1)
    |> Stream.run()
  end
  
  def process_realtime_stream(websocket_pid) do
    websocket_pid
    |> stream_from_websocket()
    |> JsonRemedy.repair_stream(
      buffer_incomplete: true,
      max_buffer_size: 64_000,
      early_exit: true
    )
    |> Stream.each(&handle_realtime_event/1)
    |> Stream.run()
  end
end

🔬 Quality Assurance

defmodule QualityControl do
  def analyze_data_quality(source) do
    results = source
    |> stream_data()
    |> JsonRemedy.repair_stream(logging: true)
    |> Enum.reduce(%{total: 0, clean: 0, repaired: 0, failed: 0, repairs: []}, 
      fn result, acc ->
        case result do
          {:ok, _data, []} -> 
            %{acc | total: acc.total + 1, clean: acc.clean + 1}
          {:ok, _data, repairs} -> 
            %{acc | total: acc.total + 1, repaired: acc.repaired + 1, 
              repairs: acc.repairs ++ repairs}
          {:error, _} -> 
            %{acc | total: acc.total + 1, failed: acc.failed + 1}
        end
      end)
    
    generate_quality_report(results)
  end
  
  defp generate_quality_report(%{total: total, clean: clean, repaired: repaired, 
                                 failed: failed, repairs: repairs}) do
    %{
      summary: %{
        quality_score: (clean + repaired) / total * 100,
        clean_percentage: clean / total * 100,
        repair_rate: repaired / total * 100,
        failure_rate: failed / total * 100
      },
      top_issues: repair_frequency_analysis(repairs),
      recommendations: generate_recommendations(repairs)
    }
  end
end

Comparison with Alternatives

Feature           | JsonRemedy          | Poison        | Jason         | Python json-repair | JavaScript jsonrepair
------------------|---------------------|---------------|---------------|--------------------|----------------------
Repair Capability | ✅ Comprehensive    | ❌ None       | ❌ None       | ⚠️ Basic           | ⚠️ Limited
Architecture      | 🏗️ 5-layer pipeline | 📦 Monolithic | 📦 Monolithic | 📦 Single-pass     | 📦 Single-pass
Context Awareness | ✅ Advanced         | ❌ No         | ❌ No         | ⚠️ Limited         | ⚠️ Basic
Streaming Support | ✅ Yes              | ❌ No         | ❌ No         | ❌ No              | ❌ No
Repair Logging    | ✅ Detailed         | ❌ No         | ❌ No         | ⚠️ Basic           | ❌ No
Performance       | ⚡ Optimized        | ⚡ Good       | 🚀 Excellent  | 🐌 Slow            | ⚡ Good
Unicode Support   | ✅ Full             | ✅ Yes        | ✅ Yes        | ⚠️ Limited         | ✅ Yes
Error Recovery    | ✅ Aggressive       | ❌ No         | ❌ No         | ⚠️ Basic           | ⚠️ Basic
LLM Output        | ✅ Specialized      | ❌ No         | ❌ No         | ⚠️ Partial         | ⚠️ Partial
Production Ready  | ✅ Yes              | ✅ Yes        | ✅ Yes        | ⚠️ Limited         | ⚠️ Limited

Advanced Features

Custom Repair Rules

# Define domain-specific repair rules
custom_rules = [
  %{
    name: "fix_currency_format",
    pattern: ~r/\$(\d+)/,
    replacement: ~S({"amount": \1, "currency": "USD"}),
    condition: &(!JsonRemedy.LayerBehaviour.inside_string?(&1, 0))
  },
  %{
    name: "normalize_dates",
    pattern: ~r/(\d{4})-(\d{2})-(\d{2})/,
    replacement: ~S("\1-\2-\3T00:00:00Z"),
    condition: nil
  }
]

{:ok, data} = JsonRemedy.repair(input, custom_rules: custom_rules)
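
How the library evaluates these rule maps internally is not documented here. As one plausible reading, the sketch below applies each rule with `Regex.replace/3` when its pattern matches and its optional condition passes. `CustomRuleSketch` and `apply_rules/2` are hypothetical names, and the condition is assumed to take the current text as its only argument.

```elixir
defmodule CustomRuleSketch do
  @moduledoc """
  Illustrative only (hypothetical module): applies rule maps shaped
  like %{name, pattern, replacement, condition} in order.
  """

  @doc "Returns `{rewritten_text, names_of_rules_that_fired}`."
  def apply_rules(input, rules) when is_binary(input) and is_list(rules) do
    {text, applied} =
      Enum.reduce(rules, {input, []}, fn rule, {text, applied} ->
        if rule_applies?(rule, text) do
          {Regex.replace(rule.pattern, text, rule.replacement), [rule.name | applied]}
        else
          {text, applied}
        end
      end)

    {text, Enum.reverse(applied)}
  end

  # A rule fires when its pattern matches and its condition (if any) returns true.
  defp rule_applies?(%{pattern: pattern, condition: condition}, text) do
    Regex.match?(pattern, text) and (is_nil(condition) or condition.(text))
  end
end
```

Returning the names of the rules that fired gives the caller the same kind of audit trail that the `logging: true` option provides for built-in repairs.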

Health Monitoring

# System health and performance monitoring
health = JsonRemedy.health_check()
# => %{
#   status: :healthy,
#   layers: [
#     %{layer: :content_cleaning, status: :healthy, avg_time_us: 45},
#     %{layer: :structural_repair, status: :healthy, avg_time_us: 120},
#     # ...
#   ],
#   performance: %{
#     cache_hit_rate: 0.85,
#     avg_repair_time_us: 850,
#     memory_usage_mb: 12.3
#   }
# }

# Performance statistics
stats = JsonRemedy.performance_stats()
# => %{success_rate: 0.94, avg_time_us: 680, cache_hits: 1205}

Error Analysis

# Detailed error analysis for debugging
case JsonRemedy.repair(malformed_input, logging: true) do
  {:ok, data, repairs} ->
    analyze_repair_patterns(repairs)
    {:success, data}
    
  {:error, reason} ->
    case JsonRemedy.analyze_failure(malformed_input) do
      {:analyzable, issues} -> 
        Logger.error("Repair failed: #{inspect(issues)}")
        {:partial_analysis, issues}
      {:unanalyzable, _} -> 
        {:complete_failure, reason}
    end
end

Limitations and Design Philosophy

What JsonRemedy Excels At

What JsonRemedy Doesn't Do

Design Philosophy

Security Considerations

# Built-in security features
JsonRemedy.repair(input, [
  max_size_mb: 10,           # Prevent memory exhaustion
  timeout_ms: 5000,          # Prevent infinite processing
  max_nesting_depth: 50,     # Prevent stack overflow
  disable_custom_rules: true # Disable user rules in untrusted contexts
])
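
To show what size and time limits of this kind amount to, here is a small sketch that wraps an arbitrary repair function with a byte-size check and a `Task`-based timeout. It is not how JsonRemedy enforces its options; `GuardSketch`, `guarded/3`, and the error atoms are assumptions made for the example.

```elixir
defmodule GuardSketch do
  @moduledoc """
  Illustrative only (hypothetical module): size and time limits around
  any one-argument repair function.
  """

  @bytes_per_mb 1_048_576

  def guarded(input, repair_fun, opts \\ [])
      when is_binary(input) and is_function(repair_fun, 1) do
    max_bytes = Keyword.get(opts, :max_size_mb, 10) * @bytes_per_mb
    timeout_ms = Keyword.get(opts, :timeout_ms, 5_000)

    if byte_size(input) > max_bytes do
      # Reject oversized input before doing any work.
      {:error, :input_too_large}
    else
      task = Task.async(fn -> repair_fun.(input) end)

      # Task.yield/2 returns nil on timeout; Task.shutdown/2 then stops the task.
      case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
        {:ok, result} -> result
        {:exit, reason} -> {:error, {:exit, reason}}
        nil -> {:error, :timeout}
      end
    end
  end
end
```

Checking the size first keeps oversized input from ever reaching the repair function, and the timeout bounds the cost of pathological input that passes the size check.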

Contributing

JsonRemedy follows a test-driven development approach with comprehensive quality standards:

# Development setup
git clone https://github.com/nshkrdotcom/json_remedy.git
cd json_remedy
mix deps.get

# Run test suites
mix test                        # All tests
mix test --only unit            # Unit tests only  
mix test --only integration     # Integration tests
mix test --only performance     # Performance validation
mix test --only property        # Property-based tests

# Quality assurance
mix credo --strict              # Code quality
mix dialyzer                    # Type analysis
mix format --check-formatted    # Code formatting
mix test.coverage               # Coverage analysis

# Windows CI parity (PowerShell)
mix run examples/windows_ci_examples.exs

# Benchmarking
mix run bench/comprehensive_benchmark.exs
mix run bench/memory_profile.exs

Architecture Overview

lib/
├── json_remedy.ex                     # Main API with pre-processing
├── json_remedy/
│   ├── layer_behaviour.ex             # Common interface for all layers
│   ├── utils/
│   │   └── multiple_json_detector.ex  # ✅ Pre-processing: Multiple JSON aggregation
│   ├── layer1/
│   │   └── content_cleaning.ex        # ✅ Code fences, comments, wrappers
│   ├── layer2/
│   │   └── structural_repair.ex       # ✅ Delimiters, nesting, state machine
│   ├── layer3/
│   │   ├── syntax_normalization.ex    # ✅ Quotes, booleans, char-by-char parsing
│   │   ├── object_merger.ex           # ✅ Pre-processing: Object boundary merging
│   │   ├── ellipsis_filter.ex         # ✅ Filter unquoted ellipsis
│   │   └── keyword_filter.ex          # ✅ Filter debug keywords
│   ├── layer4/
│   │   └── validation.ex              # ✅ Jason.decode optimization
│   ├── layer5/                        # ⏳ PLANNED
│   │   └── tolerant_parsing.ex        # ⏳ Custom parser with error recovery
│   ├── pipeline.ex                    # Layer orchestration with pre-processing
│   ├── performance.ex                 # Monitoring and health checks
│   └── config.ex                      # Configuration management

Adding New Repair Capabilities

# 1. Add repair rule to Layer 3
@repair_rules [
  %{
    name: "fix_my_pattern",
    pattern: ~r/custom_pattern/,
    replacement: "fixed_pattern",
    condition: &my_condition_check/1
  }
  # existing rules...
]

# 2. Add test cases
test "fixes my custom pattern" do
  input = "input with custom_pattern"
  expected = "input with fixed_pattern"
  
  {:ok, result, context} = SyntaxNormalization.process(input, %{repairs: [], options: []})
  assert result == expected
  assert Enum.any?(context.repairs, &String.contains?(&1.action, "fix_my_pattern"))
end

# 3. Add to API documentation
@doc """
Fix my custom pattern in JSON strings.
"""
@spec fix_my_pattern(input :: String.t()) :: {String.t(), [repair_action()]}
def fix_my_pattern(input), do: apply_rule(input, @my_pattern_rule)

Roadmap

Version 0.2.0 - Enhanced Capabilities

Version 0.3.0 - Ecosystem Integration

Version 0.4.0 - Advanced Features

License

JsonRemedy is released under the MIT License. See LICENSE for details.


JsonRemedy: Industrial-strength JSON repair for the real world. When your JSON is broken, we fix it right.

Built with ❤️ by developers who understand that perfect JSON is a luxury, but working JSON is a necessity.