# Kernel Change Point Detection
An Elixir implementation of the Kernel Change Point Detection (KCPD) algorithm. It locates structural breaks in univariate or multivariate time series by minimising a kernel-based cost function via exact dynamic programming (DYNP).
## How it works
KCPD frames change point detection as a penalised optimal partitioning problem [1]. Given a signal of length n and a desired number of change points K, the algorithm finds the partition into K + 1 segments that minimises the total cost.
### 1. Kernel matrix
A symmetric n × n kernel matrix K is computed from the signal, where K[i][j] = k(xᵢ, xⱼ). The kernel k is a positive-definite function that measures similarity between observations. Only the upper triangle is stored; the lower triangle is accessed by symmetry.
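As a sketch of this step, the Gram matrix can be built from any 2-arity kernel, computing only the upper triangle. The `GramSketch` module name, the map-based storage, and the `at/3` lookup are illustrative assumptions, not the library's internals:

```elixir
defmodule GramSketch do
  # Build the upper triangle of the n × n Gram matrix K[i][j] = k(xᵢ, xⱼ).
  def kernel_matrix(signal, kernel_fun) do
    xs = List.to_tuple(signal)
    n = tuple_size(xs)

    # Only pairs with i <= j are stored; the rest is read by symmetry.
    for i <- 0..(n - 1), j <- i..(n - 1), into: %{} do
      {{i, j}, kernel_fun.(elem(xs, i), elem(xs, j))}
    end
  end

  # Symmetric lookup: requests for the lower triangle swap indices.
  def at(matrix, i, j) when i > j, do: Map.fetch!(matrix, {j, i})
  def at(matrix, i, j), do: Map.fetch!(matrix, {i, j})
end
```

A map keyed by `{i, j}` keeps the sketch dependency-free; a real implementation would more likely use a flat tuple or binary for cache-friendly indexing.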
### 2. Segment cost (MMD-based)
The cost of a segment S = [a, b) measures how spread out the observations are within that segment. It is derived from the Maximum Mean Discrepancy (MMD) [2]:
C(a, b) = Σᵢ∈S k(xᵢ, xᵢ) − (1/|S|) Σᵢ∈S Σⱼ∈S k(xᵢ, xⱼ)

Intuitively, the first term is the sum of self-similarities and the second is the mean pairwise similarity. A homogeneous segment (all observations drawn from the same distribution) has a low cost; a segment spanning a structural change has a high cost.
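Read literally, the formula translates to an O(|S|²) loop like the sketch below (`GramSketch.at/3` is the hypothetical symmetric lookup from the previous sketch). Step 3 replaces this with a constant-time query:

```elixir
# Naive C(a, b): sum of self-similarities minus the mean pairwise similarity.
naive_cost = fn matrix, a, b ->
  len = b - a
  self_sim = Enum.sum(for i <- a..(b - 1), do: GramSketch.at(matrix, i, i))
  pair_sum = Enum.sum(for i <- a..(b - 1), j <- a..(b - 1), do: GramSketch.at(matrix, i, j))
  self_sim - pair_sum / len
end
```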
### 3. O(1) segment cost via prefix sums
Naively evaluating C(a, b) requires O(|S|²) work. Instead, two auxiliary structures are built once from K:
- 2-D prefix sum: `P[i][j] = Σ_{r<i} Σ_{c<j} K[r][c]` allows any rectangular sum over `K` to be retrieved in O(1) using the identity `Σ_{r=a}^{b-1} Σ_{c=a}^{b-1} K[r][c] = P[b][b] − P[a][b] − P[b][a] + P[a][a]`.
- Diagonal prefix sum: `D[i] = Σ_{r<i} K[r][r]` gives the self-similarity sum `Σᵢ∈S k(xᵢ, xᵢ) = D[b] − D[a]` in O(1).
Both structures are computed in O(n²) time and space.
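Under the same illustrative map-based storage as the earlier sketches, the two structures and the resulting O(1) query could look like this:

```elixir
defmodule PrefixSketch do
  # P[{i, j}] = Σ_{r<i} Σ_{c<j} K[r][c], built row by row in O(n²).
  # Missing keys (any index 0) default to 0.0, matching P[0][·] = P[·][0] = 0.
  def prefix_2d(matrix, n) do
    Enum.reduce(0..(n - 1), %{}, fn i, acc ->
      Enum.reduce(0..(n - 1), acc, fn j, acc ->
        v =
          GramSketch.at(matrix, i, j) +
            Map.get(acc, {i, j + 1}, 0.0) +
            Map.get(acc, {i + 1, j}, 0.0) -
            Map.get(acc, {i, j}, 0.0)

        Map.put(acc, {i + 1, j + 1}, v)
      end)
    end)
  end

  # D[i] = Σ_{r<i} K[r][r].
  def prefix_diag(matrix, n) do
    Enum.reduce(0..(n - 1), %{0 => 0.0}, fn i, acc ->
      Map.put(acc, i + 1, acc[i] + GramSketch.at(matrix, i, i))
    end)
  end

  # C(a, b) in O(1) via the two identities above.
  def cost(p, d, a, b) do
    rect =
      Map.get(p, {b, b}, 0.0) - Map.get(p, {a, b}, 0.0) -
        Map.get(p, {b, a}, 0.0) + Map.get(p, {a, a}, 0.0)

    d[b] - d[a] - rect / (b - a)
  end
end
```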
### 4. Dynamic programming (DYNP)
With O(1) cost queries available, the globally optimal segmentation is found by the classic DYNP recurrence [3]:
dp[k][t] = min_{s < t} ( dp[k-1][s] + C(s, t) )
where dp[k][t] is the minimum total cost to cover [0, t) with exactly k segments. The base case dp[1][t] = C(0, t) is filled for all t, and then k is incremented up to K + 1. The optimal breakpoints are recovered by backtracking the stored argmin pointers.
The full search runs in O(n² · K) time.
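A sketch of the recurrence with argmin backtracking; `cost` is assumed to be the O(1) query from the previous sketch, and `k_segments = K + 1`:

```elixir
defmodule DynpSketch do
  # Returns the K + 1 exclusive segment ends for [0, n);
  # `cost` is a 2-arity fun giving C(s, t).
  def segment(n, k_segments, cost) do
    # Base case: dp[1][t] = C(0, t), with argmin pointer 0.
    base = for t <- 1..n, into: %{}, do: {{1, t}, {cost.(0, t), 0}}

    dp =
      Enum.reduce(2..k_segments//1, base, fn k, dp ->
        # k segments need at least k observations, so t starts at k.
        Enum.reduce(k..n, dp, fn t, dp ->
          s_best =
            Enum.min_by((k - 1)..(t - 1), fn s ->
              elem(dp[{k - 1, s}], 0) + cost.(s, t)
            end)

          total = elem(dp[{k - 1, s_best}], 0) + cost.(s_best, t)
          Map.put(dp, {k, t}, {total, s_best})
        end)
      end)

    backtrack(dp, k_segments, n, [])
  end

  # Follow the stored argmin pointers from dp[K + 1][n] back to the base case.
  defp backtrack(_dp, 0, _t, acc), do: acc

  defp backtrack(dp, k, t, acc) do
    {_cost, s} = dp[{k, t}]
    backtrack(dp, k - 1, s, [t | acc])
  end
end
```

On the step signal from the usage examples below, `DynpSketch.segment(6, 2, cost)` should reproduce `[3, 6]`.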
## Installation
Add `:kcpd` to your `mix.exs` dependencies:
```elixir
def deps do
  [
    {:kcpd, "~> 0.1.0"}
  ]
end
```

## Usage
```elixir
# Univariate — detect one change point in a step signal
KCPD.detect([0, 0, 0, 5, 5, 5], 1)
#=> [3, 6]

# Two change points
KCPD.detect([0.0, 0.0, 1.0, 1.0, 0.0, 0.0], 2)
#=> [2, 4, 6]

# Multivariate (list of lists)
KCPD.detect([[0, 0], [0, 0], [0, 0], [5, 5], [5, 5], [5, 5]], 1)
#=> [3, 6]

# Laplacian kernel with auto bandwidth
KCPD.detect(signal, 2, kernel: :laplacian, bandwidth: :auto)

# Custom kernel function
KCPD.detect(signal, 1, kernel: fn xi, xj -> :math.exp(-abs(xi - xj)) end)
```

## Return value convention
The return value is a sorted list of exclusive end positions: each value is the index just past the last observation of its segment.

`detect/3` returns segment end positions, not raw change point indices. The result always contains K + 1 values (one per segment), and the final value is always `length(signal)`.
This convention means the output is self-contained: consecutive pairs of values (prepending 0) fully describe every segment without the caller needing to supply the signal length separately.
```elixir
breakpoints = KCPD.detect(signal, k)  # e.g. [3, 6]

# Reconstruct segments using 0 as the implicit start:
[0 | breakpoints]
|> Enum.chunk_every(2, 1, :discard)
|> Enum.map(fn [a, b] -> Enum.slice(signal, a, b - a) end)
#=> [[0, 0, 0], [5, 5, 5]]
```

If you only need the change point indices (i.e. where each new segment begins), drop the last element:
```elixir
breakpoints |> List.delete_at(-1)  #=> [3]
```

This convention matches the ruptures Python library [4], making it straightforward to cross-validate results between the two implementations.
## Options
| Option | Default | Description |
|---|---|---|
| `:kernel` | `:rbf` | `:rbf`, `:linear`, `:laplacian`, or a 2-arity `fn xi, xj -> ...` function |
| `:bandwidth` | `1.0` | Bandwidth σ for RBF / Laplacian. Pass `:auto` to use the median pairwise-distance heuristic. |
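For illustration, a univariate median-heuristic sketch follows; the exact variant `:auto` computes (raw vs. squared distances, any subsampling) is an assumption here, not documented behaviour:

```elixir
# σ = median of all pairwise distances (univariate sketch).
auto_bandwidth = fn signal ->
  indexed = Enum.with_index(signal)

  dists =
    for {xi, i} <- indexed, {xj, j} <- indexed, i < j, do: abs(xi - xj)

  dists |> Enum.sort() |> Enum.at(div(length(dists), 2))
end
```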
## Kernels
| Name | Formula |
|---|---|
| `:rbf` | `exp(−‖xᵢ − xⱼ‖² / (2σ²))` |
| `:linear` | `xᵢᵀ xⱼ` |
| `:laplacian` | `exp(−‖xᵢ − xⱼ‖₁ / σ)` |
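For reference, the same formulas written as 2-arity Elixir functions over scalars (vector inputs would substitute the corresponding norms; these sketches are not the library's source):

```elixir
sigma = 1.0

# Scalar versions of the three built-in kernels from the table above.
rbf       = fn xi, xj -> :math.exp(-:math.pow(xi - xj, 2) / (2 * sigma * sigma)) end
linear    = fn xi, xj -> xi * xj end
laplacian = fn xi, xj -> :math.exp(-abs(xi - xj) / sigma) end
```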
## Complexity
| Phase | Time | Space |
|---|---|---|
| Kernel matrix | O(n²) | O(n²) |
| 2-D prefix sums | O(n²) | O(n²) |
| DYNP (K segments) | O(n²K) | O(n²) |
## References
1. Arlot, S., Celisse, A., & Harchaoui, Z. (2019). A kernel multiple change-point algorithm via model selection. Journal of Machine Learning Research, 20(162), 1–56. https://www.jmlr.org/papers/v20/16-155.html
2. Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., & Smola, A. (2012). A kernel two-sample test. Journal of Machine Learning Research, 13(25), 723–773. https://www.jmlr.org/papers/v13/gretton12a.html
3. Jackson, B., Scargle, J. D., Barnes, D., Arabhi, S., Alt, A., Gioumousis, P., Gwin, E., Sangtrakulcharoen, P., Tan, L., & Tsai, T. T. (2005). An algorithm for optimal partitioning of data on an interval. IEEE Signal Processing Letters, 12(2), 105–108. https://doi.org/10.1109/LSP.2001.838216
4. Truong, C., Oudre, L., & Vayatis, N. (2020). Selective review of offline change point detection methods. Signal Processing, 167, 107299. https://doi.org/10.1016/j.sigpro.2019.107299