QUICK REVIEW

[Paper Review] High-dimensional change point estimation via sparse projection

Tengyao Wang, Richard J. Samworth | arXiv (Cornell University) | Jun 20, 2016

Statistical Methods and Inference | 42 references | 16 citations

TL;DR

This paper proposes a novel two-stage method, inspect, for detecting changepoints in high-dimensional time series where mean changes occur in a sparse subset of coordinates. It first identifies an optimal projection direction via a convex relaxation of the k-sparse leading left singular vector problem on the CUSUM-transformed data matrix, then applies univariate changepoint detection to the projected series, achieving strong theoretical guarantees on changepoint number and location estimation under high-dimensional asymptotics.

ABSTRACT

Changepoints are a very common feature of Big Data that arrive in the form of a data stream. In this paper, we study high-dimensional time series in which, at certain time points, the mean structure changes in a sparse subset of the coordinates. The challenge is to borrow strength across the coordinates in order to detect smaller changes than could be observed in any individual component series. We propose a two-stage procedure called 'inspect' for estimation of the changepoints: first, we argue that a good projection direction can be obtained as the leading left singular vector of the matrix that solves a convex optimisation problem derived from the CUSUM transformation of the time series. We then apply an existing univariate changepoint estimation algorithm to the projected series. Our theory provides strong guarantees on both the number of estimated changepoints and the rates of convergence of their locations, and our numerical studies validate its highly competitive empirical performance for a wide range of data generating mechanisms. Software implementing the methodology is available in the R package 'InspectChangepoint'.
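
For reference, the CUSUM transformation mentioned in the abstract is usually written as below for a data matrix X with p rows (coordinates) and n columns (time points). The symbols X, p, n, j, t and the script T are notation assumed here for illustration; the abstract itself does not fix any notation.

```latex
\[
  [\mathcal{T}(X)]_{j,t}
  = \sqrt{\frac{t(n-t)}{n}}
    \left( \frac{1}{n-t}\sum_{r=t+1}^{n} X_{j,r}
         - \frac{1}{t}\sum_{r=1}^{t} X_{j,r} \right),
  \qquad 1 \le j \le p,\; 1 \le t \le n-1 .
\]
```

Each entry is a scaled difference between the mean of coordinate j after time t and its mean up to time t, so a mean change at time t in coordinate j produces a large entry in row j near column t.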

Motivation & Objective

Address the challenge of detecting sparse mean changes in high-dimensional time series where traditional univariate methods lack power.
Develop a method that borrows strength across coordinates to detect smaller, otherwise undetectable changepoints.
Provide theoretical guarantees on the number of estimated changepoints and the convergence rates of their locations.
Enable practical application through an efficient algorithm and publicly available R package InspectChangepoint.
Extend the framework to handle multiple changepoints using Wild Binary Segmentation in a recursive manner.

Proposed method

Apply the CUSUM transformation to the high-dimensional time series to construct a matrix that captures cumulative deviations from the mean.
Formulate a convex relaxation of the k-sparse leading left singular vector problem to estimate a projection direction aligned with the vector of mean changes.
Project the original data onto the estimated direction to reduce dimensionality while preserving changepoint signal.
Apply an existing univariate changepoint detection algorithm (e.g., CUSUM-based) to the projected series to locate changepoints.
Use a Wild Binary Segmentation scheme to detect multiple changepoints: apply the single-changepoint procedure to randomly drawn sub-intervals of the series, then recurse on the segments to either side of each detected changepoint.
Leverage theoretical results on singular vector perturbation and concentration inequalities to establish consistency and convergence rates.
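
The single-changepoint steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the InspectChangepoint implementation: the convex programme is replaced by entrywise soft-thresholding of the CUSUM matrix followed by a singular value decomposition, and the function names, the threshold `lam`, and the simulated data are all hypothetical choices made for this example.

```python
import numpy as np


def cusum_transform(x):
    """CUSUM transformation of a p x n matrix; returns a p x (n - 1) matrix."""
    p, n = x.shape
    t = np.arange(1, n)                        # candidate split points 1..n-1
    left_sums = np.cumsum(x, axis=1)[:, :-1]   # sum of the first t columns
    total = x.sum(axis=1, keepdims=True)
    left_mean = left_sums / t
    right_mean = (total - left_sums) / (n - t)
    return np.sqrt(t * (n - t) / n) * (right_mean - left_mean)


def soft_threshold(a, lam):
    """Shrink every entry of `a` towards zero by `lam`."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def sparse_projection_direction(x, lam):
    """Leading left singular vector of the soft-thresholded CUSUM matrix.

    Assumption: soft-thresholding stands in for the convex relaxation.
    """
    m_hat = soft_threshold(cusum_transform(x), lam)
    if not m_hat.any():
        raise ValueError("lam is too large: every CUSUM entry was zeroed")
    u, _, _ = np.linalg.svd(m_hat, full_matrices=False)
    return u[:, 0]


def locate_single_changepoint(x, lam):
    """Project onto the estimated direction, then maximise the univariate CUSUM."""
    v_hat = sparse_projection_direction(x, lam)
    projected = v_hat @ x                      # univariate series of length n
    stat = np.abs(cusum_transform(projected[np.newaxis, :]))[0]
    return int(np.argmax(stat)) + 1, v_hat     # +1: columns index t = 1..n-1


# Hypothetical simulation: 50 coordinates, 200 time points, and a mean shift
# of size 2 in the first 5 coordinates after time 80.
rng = np.random.default_rng(0)
p, n, tau, k = 50, 200, 80, 5
data = rng.standard_normal((p, n))
data[:k, tau:] += 2.0

lam = np.sqrt(2 * np.log(p * n))               # illustrative threshold only
tau_hat, v_hat = locate_single_changepoint(data, lam)
print("estimated changepoint:", tau_hat)
print("squared weight on the changed coordinates:", float(np.sum(v_hat[:k] ** 2)))
```

The sign of the singular vector is arbitrary, which is why the sketch maximises the absolute value of the projected CUSUM statistic. In this sketch `lam` trades off noise suppression against signal loss, so it should be treated as a tuning parameter rather than a recommended value.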

Experimental results

Research questions

RQ1: Can a convex relaxation of the sparse singular vector problem provide a consistent estimate of the projection direction for high-dimensional changepoint detection?
RQ2: Does projecting high-dimensional data onto the estimated direction enhance the power to detect small mean changes in sparse coordinates?
RQ3: What are the theoretical guarantees on the number of estimated changepoints and the rate of convergence of their locations?
RQ4: How does the method perform under temporal dependence in the data, such as in weakly dependent or autoregressive structures?
RQ5: Can the method be extended to multiple changepoints with provable consistency in high-dimensional settings?

Key findings

The method achieves consistent estimation of both the number and locations of changepoints under high-dimensional asymptotics, with convergence rates established for changepoint location estimation.
Theoretical analysis shows that the estimated projection direction converges to the true direction of mean changes at a rate depending on the sparsity and signal strength.
For the single changepoint case, the method achieves optimal detection power when the signal-to-noise ratio exceeds a threshold related to the sparsity and dimensionality.
Numerical studies demonstrate competitive empirical performance across diverse data-generating mechanisms, including independent, weakly dependent, and correlated error structures.
The method maintains high power even when only a small fraction of coordinates undergo mean changes, outperforming univariate and naive multivariate approaches.
Theoretical guarantees extend to spatially dependent data, with explicit bounds on estimation error under covariance structures such as autoregressive and equicorrelation models.

This review was created by AI and reviewed by human editors.