StatusCodeTracker

An Elixir library for tracking HTTP status code rates and monitoring service health. It counts 5xx responses within a sliding time window and flags the service as unhealthy when the count reaches a configured threshold.

Installation

Add status_code_tracker to your list of dependencies in mix.exs:

def deps do
  [
    {:status_code_tracker, "~> 0.1.3"}
  ]
end

Configuration

Configure the library in your application config:

config :status_code_tracker, :settings,
  time_window_seconds: 60,
  error_threshold: 10,
  keep_unhealthy?: false,
  unhealthy_action: fn -> YourModule.on_unhealthy() end,
  healthy_action: fn -> YourModule.on_healthy() end,
  extra_checks: fn -> YourModule.extra_checks() end,
  unhealthy_status_code: 503,
  verbose?: true,
  unhealthy_message: "Service unhealthy due to many 5xx",
  extra_checks_error_message: "Extra checks failed"

Configuration Options

| Option | Default | Description |
| --- | --- | --- |
| time_window_seconds | 60 | Sliding time window (in seconds) for counting errors |
| error_threshold | 10 | Number of 5xx errors within the time window that triggers unhealthy state |
| keep_unhealthy? | false | If true, service stays unhealthy until manually reset. If false, service auto-recovers when errors drop below threshold |
| unhealthy_action | fn -> :noop end | Callback function triggered when service becomes unhealthy |
| healthy_action | fn -> :noop end | Callback function triggered when service recovers and becomes healthy again |
| extra_checks | fn -> false end | Custom validation function for additional health checks beyond error rate |
| unhealthy_status_code | 503 | HTTP status code returned when service is unhealthy |
| verbose? | false | Enable detailed logging of health status changes |
| unhealthy_message | "Service unhealthy due to many 5xx" | Custom message returned when unhealthy due to error threshold |
| extra_checks_error_message | "Extra checks failed" | Custom message returned when extra checks fail |
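
The configuration example refers to a placeholder module, YourModule. As a rough sketch of what such a module might look like, the block below defines the three zero-arity functions used there. The function bodies are purely illustrative assumptions; only the names and the false-means-healthy return convention of extra_checks come from this README.

```elixir
defmodule YourModule do
  # Hypothetical callback module matching the names used in the config example.

  # Called when the service becomes unhealthy (illustrative body).
  def on_unhealthy do
    IO.puts(:stderr, "service became unhealthy")
    :ok
  end

  # Called when the service recovers (illustrative body).
  def on_healthy do
    IO.puts(:stderr, "service recovered")
    :ok
  end

  # Returns false when there are no issues, true when a check failed.
  def extra_checks do
    false
  end
end
```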

Usage

Adding the Health Check Endpoint

You can add the health check endpoint to your router:

scope "/health" do
  get("/", StatusCodeTracker.HealthPlug, [json: true, body: "{\"status\":\"success\"}"])
end

Or add it to your endpoint:

plug StatusCodeTracker.HealthPlug, path: "/health"

Adding the Error Tracker

Add the tracker plug to your endpoint to automatically track all 5xx errors:

plug StatusCodeTracker.Plug

How it Works

Error Tracking

The library uses an ETS table to store timestamps of 5xx errors. When a request results in a 5xx status code, the timestamp is recorded. The health check endpoint is automatically excluded from tracking to prevent recursive errors.

Health Evaluation

When a health check is performed:

  1. The library counts errors that occurred within time_window_seconds
  2. If the count exceeds error_threshold, the service is marked unhealthy
  3. Optionally, the extra_checks function is called for additional validation
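
The counting step above can be sketched with a small ETS-backed module. This is an illustration under stated assumptions, not the library's actual implementation: the module name WindowSketch, the table name, the one-element row layout, and the use of plain integer seconds are all hypothetical.

```elixir
defmodule WindowSketch do
  # Hypothetical table name; the library's real table is not documented here.
  @table :window_sketch_errors

  # Create a table that keeps one row per recorded error timestamp.
  def init do
    :ets.new(@table, [:named_table, :public, :duplicate_bag])
    :ok
  end

  # Record one 5xx response at time `now` (in seconds).
  def record_error(now), do: :ets.insert(@table, {now})

  # Count rows whose timestamp is newer than the start of the window.
  def count_recent(now, window_seconds) do
    cutoff = now - window_seconds
    :ets.select_count(@table, [{{:"$1"}, [{:>, :"$1", cutoff}], [true]}])
  end

  # Step 2 says the count must exceed the threshold, so the comparison is strict.
  def unhealthy?(now, window_seconds, threshold) do
    count_recent(now, window_seconds) > threshold
  end
end
```

With errors recorded at seconds 90, 130, 150 and 155, a check at second 160 with a 60-second window counts three errors, because the one at second 90 falls outside the window.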

Automatic Cleanup

A periodic cleanup process removes old timestamps (older than time_window_seconds) to prevent memory growth.
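
One way such a cleanup could work is a single :ets.select_delete/2 call that drops every timestamp outside the window. The sketch below assumes a hypothetical table of one-element {timestamp} rows and integer seconds; the library's actual storage layout and scheduling are not documented here.

```elixir
defmodule CleanupSketch do
  # Hypothetical table name and row layout, for illustration only.
  @table :cleanup_sketch_errors

  def init do
    :ets.new(@table, [:named_table, :public, :duplicate_bag])
    :ok
  end

  # Record one error timestamp (in seconds).
  def record_error(now), do: :ets.insert(@table, {now})

  # Delete every row at or before the window start.
  # Returns the number of rows removed.
  def cleanup(now, window_seconds) do
    cutoff = now - window_seconds
    :ets.select_delete(@table, [{{:"$1"}, [{:"=<", :"$1", cutoff}], [true]}])
  end

  # Number of rows currently stored.
  def size, do: :ets.info(@table, :size)
end
```

A long-running process could call cleanup/2 at a fixed interval, for example by rescheduling itself with Process.send_after/3.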

Health State Behavior

When keep_unhealthy?: false (default)

The health status is transient: it is re-evaluated on every health check.

[errors spike] → unhealthy → [errors drop] → automatically healthy

When keep_unhealthy?: true

The health status is sticky: once unhealthy, the service stays unhealthy until it is manually reset.

[errors spike] → unhealthy → [errors drop] → STILL unhealthy (requires manual reset)
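
The two behaviors can be summarized as a small state-transition function. This is a sketch of the rules described above, not the library's code; the module and function names are hypothetical, and the manual reset is not modeled.

```elixir
defmodule StickySketch do
  # next_state(previous_state, over_threshold?, keep_unhealthy?)

  # Sticky mode: once unhealthy, stay unhealthy regardless of the error count.
  def next_state(:unhealthy, _over_threshold?, true), do: :unhealthy

  # Otherwise the state simply follows the current error count.
  def next_state(_previous, true, _keep_unhealthy?), do: :unhealthy
  def next_state(_previous, false, _keep_unhealthy?), do: :healthy
end
```

For instance, StickySketch.next_state(:unhealthy, false, false) gives :healthy, while the same call with keep_unhealthy? set to true gives :unhealthy.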

Action Callbacks

unhealthy_action

Triggered when the service transitions from healthy to unhealthy. For example, to log the transition and send an alert:

unhealthy_action: fn ->
  Logger.error("Service became unhealthy!")
  AlertService.send_alert("Service down")
end

healthy_action

Triggered when the service transitions from unhealthy back to healthy (only when keep_unhealthy?: false). For example, to log the recovery and send a notification:

healthy_action: fn ->
  Logger.info("Service recovered!")
  AlertService.send_recovery("Service recovered")
end

Extra Checks

You can define custom health checks beyond error rate monitoring:

extra_checks: fn ->
  case check_database_connection() do
    :ok -> false  # false means no issues
    :error -> true  # true means check failed
  end
end
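
When several conditions need checking, they can be combined into a single boolean that follows the same convention as the example above (false for no issues, true for a failure). The helper below is a hypothetical sketch; it assumes each check is a zero-arity function returning :ok on success, as check_database_connection/0 does in that example.

```elixir
defmodule ExtraChecksSketch do
  # Returns true if any check reports something other than :ok.
  # An empty list of checks counts as "no issues".
  def failed?(checks) do
    Enum.any?(checks, fn check -> check.() != :ok end)
  end
end
```

It could then be wired in as extra_checks: fn -> ExtraChecksSketch.failed?([&check_database_connection/0]) end, where the check function is one you define yourself.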

The extra_checks function should return false when there are no issues and true when a check has failed.

API Reference

StatusCodeTracker.Server

License

MIT License. See LICENSE file for details.