# Kernel Change Point Detection
An Elixir implementation of the Kernel Change Point Detection (KCPD) algorithm. It locates structural breaks in univariate or multivariate time series by minimising a kernel-based cost function via exact dynamic programming (DYNP).
## How it works
KCPD frames change point detection as a penalised optimal partitioning problem [1]. Given a signal of length n and a desired number of change points K, the algorithm finds the partition into K + 1 segments that minimises the total cost.
### 1. Kernel matrix
A symmetric n × n kernel matrix K is computed from the signal, where K[i][j] = k(xᵢ, xⱼ). The kernel k is a positive-definite function that measures similarity between observations. Only the upper triangle is stored; the lower triangle is accessed by symmetry.
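As a sketch of this step, the Gram matrix can be built from any 2-arity kernel, computing only the upper triangle. The `GramSketch` module name, the map-based storage, and the `at/3` lookup are illustrative assumptions, not the library's internals:

```elixir
defmodule GramSketch do
  # Build the upper triangle of the n × n Gram matrix K[i][j] = k(xᵢ, xⱼ).
  def kernel_matrix(signal, kernel_fun) do
    xs = List.to_tuple(signal)
    n = tuple_size(xs)

    # Only pairs with i <= j are stored; the rest is read by symmetry.
    for i <- 0..(n - 1), j <- i..(n - 1), into: %{} do
      {{i, j}, kernel_fun.(elem(xs, i), elem(xs, j))}
    end
  end

  # Symmetric lookup: requests for the lower triangle swap indices.
  def at(matrix, i, j) when i > j, do: Map.fetch!(matrix, {j, i})
  def at(matrix, i, j), do: Map.fetch!(matrix, {i, j})
end
```

A map keyed by `{i, j}` keeps the sketch dependency-free; a real implementation would more likely use a flat tuple or binary for cache-friendly indexing.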
### 2. Segment cost (MMD-based)
The cost of a segment S = [a, b) measures how spread out the observations are within that segment. It is derived from the Maximum Mean Discrepancy (MMD) [2]:
C(a, b) = Σᵢ∈S k(xᵢ, xᵢ) − (1/|S|) Σᵢ∈S Σⱼ∈S k(xᵢ, xⱼ)

Intuitively, the first term is the sum of self-similarities and the second is the mean pairwise similarity. A homogeneous segment (all observations drawn from the same distribution) has a low cost; a segment spanning a structural change has a high cost.
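Read literally, the formula translates to an O(|S|²) loop like the sketch below (`GramSketch.at/3` is the hypothetical symmetric lookup from the previous sketch). Step 3 replaces this with a constant-time query:

```elixir
# Naive C(a, b): sum of self-similarities minus the mean pairwise similarity.
naive_cost = fn matrix, a, b ->
  len = b - a
  self_sim = Enum.sum(for i <- a..(b - 1), do: GramSketch.at(matrix, i, i))
  pair_sum = Enum.sum(for i <- a..(b - 1), j <- a..(b - 1), do: GramSketch.at(matrix, i, j))
  self_sim - pair_sum / len
end
```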
### 3. O(1) segment cost via prefix sums
Naively evaluating C(a, b) requires O(|S|²) work. Instead, two auxiliary structures are built once from K:
- 2-D prefix sum: `P[i][j] = Σ_{r<i} Σ_{c<j} K[r][c]` allows any rectangular sum over `K` to be retrieved in O(1) using the identity `Σ_{r=a}^{b-1} Σ_{c=a}^{b-1} K[r][c] = P[b][b] − P[a][b] − P[b][a] + P[a][a]`.
- Diagonal prefix sum: `D[i] = Σ_{r<i} K[r][r]` gives the self-similarity sum `Σᵢ∈S k(xᵢ, xᵢ) = D[b] − D[a]` in O(1).
Both structures are computed in O(n²) time and space.
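Under the same illustrative map-based storage as the earlier sketches, the two structures and the resulting O(1) query could look like this:

```elixir
defmodule PrefixSketch do
  # P[{i, j}] = Σ_{r<i} Σ_{c<j} K[r][c], built row by row in O(n²).
  # Missing keys (any index 0) default to 0.0, matching P[0][·] = P[·][0] = 0.
  def prefix_2d(matrix, n) do
    Enum.reduce(0..(n - 1), %{}, fn i, acc ->
      Enum.reduce(0..(n - 1), acc, fn j, acc ->
        v =
          GramSketch.at(matrix, i, j) +
            Map.get(acc, {i, j + 1}, 0.0) +
            Map.get(acc, {i + 1, j}, 0.0) -
            Map.get(acc, {i, j}, 0.0)

        Map.put(acc, {i + 1, j + 1}, v)
      end)
    end)
  end

  # D[i] = Σ_{r<i} K[r][r].
  def prefix_diag(matrix, n) do
    Enum.reduce(0..(n - 1), %{0 => 0.0}, fn i, acc ->
      Map.put(acc, i + 1, acc[i] + GramSketch.at(matrix, i, i))
    end)
  end

  # C(a, b) in O(1) via the two identities above.
  def cost(p, d, a, b) do
    rect =
      Map.get(p, {b, b}, 0.0) - Map.get(p, {a, b}, 0.0) -
        Map.get(p, {b, a}, 0.0) + Map.get(p, {a, a}, 0.0)

    d[b] - d[a] - rect / (b - a)
  end
end
```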
### 4. Dynamic programming (DYNP)
With O(1) cost queries available, the globally optimal segmentation is found by the classic DYNP recurrence [3]:
dp[k][t] = min_{s < t} ( dp[k-1][s] + C(s, t) )
where dp[k][t] is the minimum total cost to cover [0, t) with exactly k segments. The base case dp[1][t] = C(0, t) is filled for all t, and then k is incremented up to K + 1. The optimal breakpoints are recovered by backtracking the stored argmin pointers.
The full search runs in O(n² · K) time.
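A sketch of the recurrence with argmin backtracking; `cost` is assumed to be the O(1) query from the previous sketch, and `k_segments = K + 1`:

```elixir
defmodule DynpSketch do
  # Returns the K + 1 exclusive segment ends for [0, n);
  # `cost` is a 2-arity fun giving C(s, t).
  def segment(n, k_segments, cost) do
    # Base case: dp[1][t] = C(0, t), with argmin pointer 0.
    base = for t <- 1..n, into: %{}, do: {{1, t}, {cost.(0, t), 0}}

    dp =
      Enum.reduce(2..k_segments//1, base, fn k, dp ->
        # k segments need at least k observations, so t starts at k.
        Enum.reduce(k..n, dp, fn t, dp ->
          s_best =
            Enum.min_by((k - 1)..(t - 1), fn s ->
              elem(dp[{k - 1, s}], 0) + cost.(s, t)
            end)

          total = elem(dp[{k - 1, s_best}], 0) + cost.(s_best, t)
          Map.put(dp, {k, t}, {total, s_best})
        end)
      end)

    backtrack(dp, k_segments, n, [])
  end

  # Follow the stored argmin pointers from dp[K + 1][n] back to the base case.
  defp backtrack(_dp, 0, _t, acc), do: acc

  defp backtrack(dp, k, t, acc) do
    {_cost, s} = dp[{k, t}]
    backtrack(dp, k - 1, s, [t | acc])
  end
end
```

On the step signal from the usage examples below, `DynpSketch.segment(6, 2, cost)` should reproduce `[3, 6]`.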
## Installation
Add `:kcpd` to your `mix.exs` dependencies:
```elixir
def deps do
  [
    {:kcpd, "~> 0.1.0"}
  ]
end
```

## Usage
```elixir
# Univariate — detect one change point in a step signal
KCPD.detect([0, 0, 0, 5, 5, 5], 1)
#=> [3, 6]

# Two change points
KCPD.detect([0.0, 0.0, 1.0, 1.0, 0.0, 0.0], 2)
#=> [2, 4, 6]

# Multivariate (list of lists)
KCPD.detect([[0, 0], [0, 0], [0, 0], [5, 5], [5, 5], [5, 5]], 1)
#=> [3, 6]

# Laplacian kernel with auto bandwidth
KCPD.detect(signal, 2, kernel: :laplacian, bandwidth: :auto)

# Custom kernel function
KCPD.detect(signal, 1, kernel: fn xi, xj -> :math.exp(-abs(xi - xj)) end)
```

## Return value convention
The return value is a sorted list of exclusive end positions: each value is the index just past the last observation of its segment.

`detect/3` returns segment end positions, not raw change point indices. The result always contains K + 1 values (one per segment), and the final value is always `length(signal)`.
This convention means the output is self-contained: consecutive pairs of values (prepending 0) fully describe every segment without the caller needing to supply the signal length separately.
```elixir
breakpoints = KCPD.detect(signal, k)  # e.g. [3, 6]

# Reconstruct segments using 0 as the implicit start:
[0 | breakpoints]
|> Enum.chunk_every(2, 1, :discard)
|> Enum.map(fn [a, b] -> Enum.slice(signal, a, b - a) end)
#=> [[0, 0, 0], [5, 5, 5]]
```

If you only need the change point indices (i.e. where each new segment begins), drop the last element:
```elixir
breakpoints |> List.delete_at(-1)  #=> [3]
```

This convention matches the ruptures Python library [4], making it straightforward to cross-validate results between the two implementations.
## Options
| Option | Default | Description |
|---|---|---|
| `:kernel` | `:rbf` | `:rbf`, `:linear`, `:laplacian`, or a 2-arity `fn xi, xj -> ...` function |
| `:bandwidth` | `1.0` | Bandwidth σ for RBF / Laplacian. Pass `:auto` to use the median pairwise-distance heuristic. |
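For illustration, a univariate median-heuristic sketch follows; the exact variant `:auto` computes (raw vs. squared distances, any subsampling) is an assumption here, not documented behaviour:

```elixir
# σ = median of all pairwise distances (univariate sketch).
auto_bandwidth = fn signal ->
  indexed = Enum.with_index(signal)

  dists =
    for {xi, i} <- indexed, {xj, j} <- indexed, i < j, do: abs(xi - xj)

  dists |> Enum.sort() |> Enum.at(div(length(dists), 2))
end
```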
## Kernels
| Name | Formula |
|---|---|
| `:rbf` | `exp(−‖xᵢ − xⱼ‖² / (2σ²))` |
| `:linear` | `xᵢᵀ xⱼ` |
| `:laplacian` | `exp(−‖xᵢ − xⱼ‖₁ / σ)` |
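For reference, the same formulas written as 2-arity Elixir functions over scalars (vector inputs would substitute the corresponding norms; these sketches are not the library's source):

```elixir
sigma = 1.0

# Scalar versions of the three built-in kernels from the table above.
rbf       = fn xi, xj -> :math.exp(-:math.pow(xi - xj, 2) / (2 * sigma * sigma)) end
linear    = fn xi, xj -> xi * xj end
laplacian = fn xi, xj -> :math.exp(-abs(xi - xj) / sigma) end
```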
## Complexity
| Phase | Time | Space |
|---|---|---|
| Kernel matrix | O(n²) | O(n²) |
| 2-D prefix sums | O(n²) | O(n²) |
| DYNP (K segments) | O(n²K) | O(n²) |
## References
1. Arlot, S., Celisse, A., & Harchaoui, Z. (2019). A kernel multiple change-point algorithm via model selection. Journal of Machine Learning Research, 20(162), 1–56. https://www.jmlr.org/papers/v20/16-155.html
2. Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., & Smola, A. (2012). A kernel two-sample test. Journal of Machine Learning Research, 13(25), 723–773. https://www.jmlr.org/papers/v13/gretton12a.html
3. Jackson, B., Scargle, J. D., Barnes, D., Arabhi, S., Alt, A., Gioumousis, P., Gwin, E., Sangtrakulcharoen, P., Tan, L., & Tsai, T. T. (2005). An algorithm for optimal partitioning of data on an interval. IEEE Signal Processing Letters, 12(2), 105–108. https://doi.org/10.1109/LSP.2001.838216
4. Truong, C., Oudre, L., & Vayatis, N. (2020). Selective review of offline change point detection methods. Signal Processing, 167, 107299. https://doi.org/10.1016/j.sigpro.2019.107299