[Paper Review] Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification
The paper proves that ordinary least-squares (OLS) attains nearly minimax optimal rates for identifying linear dynamical systems from a single trajectory, without relying on mixing-time arguments, by leveraging a generalized small-ball method for dependent data.
We prove that the ordinary least-squares (OLS) estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory. Our upper bound relies on a generalization of Mendelson's small-ball method to dependent data, eschewing the use of standard mixing-time arguments. Our lower bounds reveal that these upper bounds match up to logarithmic factors. In particular, we capture the correct signal-to-noise behavior of the problem, showing that more unstable linear systems are easier to estimate. This behavior is qualitatively different from arguments which rely on mixing-time calculations that suggest that unstable systems are more difficult to estimate. We generalize our technique to provide bounds for a more general class of linear response time-series.
Motivation & Objective
- Motivate the study of sample complexity in linear system identification from a single trajectory.
- Characterize how system dynamics, via the controllability Gramian, affect estimation rates.
- Provide near-minimax upper bounds for OLS in the marginally stable regime (ρ(A_*) ≤ 1).
- Establish lower bounds matching upper bounds up to logarithmic factors to reveal signal-to-noise behavior.
- Extend techniques to a broader class of linear response time-series.
Proposed method
- Model the system as X_{t+1}=A_*X_t+η_t with η_t ~ N(0, σ^2 I).
- Analyze the OLS estimator Â(T) = argmin_A ∑_{t=1}^T 1/2 ||X_{t+1} - A X_t||_2^2.
- Introduce the finite-time controllability Gramian Γ_T = ∑_{s=0}^{T-1} A_*^s (A_*^s)^T and state the error bounds in terms of it.
- Generalize Mendelson's small-ball method to dependent data via k-block martingale small-ball (BMSB) conditions.
- Develop high-probability bounds that tie the scale of the estimation error to the minimum eigenvalue λ_min(Γ_k).
- Provide a general theorem (Theorem 2.4) for linear responses with martingale small-ball conditions.
- Apply corollaries to specific system classes (scalar, scaled orthogonal, diagonalizable).
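The setup in the list above can be sketched numerically. The following is a minimal illustration, not the authors' code: it simulates X_{t+1} = A_* X_t + η_t with Gaussian noise, computes the OLS estimate from one trajectory, and builds the finite-time Gramian Γ_T. The function names (`simulate`, `ols_estimate`, `gramian`), the initial state X_0 = 0, and the zero-based time indexing are assumptions made for this example.

```python
import numpy as np


def simulate(A, T, sigma=1.0, rng=None):
    """Roll out X_{t+1} = A X_t + eta_t, eta_t ~ N(0, sigma^2 I), from X_0 = 0 (assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    d = A.shape[0]
    X = np.zeros((T + 1, d))
    for t in range(T):
        X[t + 1] = A @ X[t] + sigma * rng.standard_normal(d)
    return X


def ols_estimate(X):
    """OLS estimate of A from a single trajectory X of shape (T + 1, d)."""
    past, future = X[:-1], X[1:]
    # lstsq solves past @ B ~= future in the least-squares sense, so B = A^T.
    B, *_ = np.linalg.lstsq(past, future, rcond=None)
    return B.T


def gramian(A, T):
    """Finite-time controllability Gramian: sum_{s=0}^{T-1} A^s (A^s)^T."""
    d = A.shape[0]
    G = np.zeros((d, d))
    P = np.eye(d)  # holds A^s, starting at s = 0
    for _ in range(T):
        G += P @ P.T
        P = A @ P
    return G


if __name__ == "__main__":
    theta = 0.3
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta), np.cos(theta)]])
    A_star = 0.9 * rotation  # a scaled orthogonal system with rho = 0.9
    X = simulate(A_star, T=2000, rng=np.random.default_rng(0))
    A_hat = ols_estimate(X)
    print("operator-norm error:", np.linalg.norm(A_hat - A_star, 2))
    print("lambda_min(Gamma_10):", np.linalg.eigvalsh(gramian(A_star, 10)).min())
```

Running the script with different values of ρ (here 0.9) gives a rough empirical view of how the error changes with the system's excitability; it does not reproduce the paper's constants or logarithmic factors.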
Experimental results
Research questions
- RQ1: How many samples (in a single trajectory) are needed to estimate A_* with high probability in operator norm?
- RQ2: How does the finite-time controllability Gramian influence the estimation rate across stable and marginally stable regimes?
- RQ3: Can OLS achieve minimax-optimal rates without mixing-time arguments in dependent data settings?
- RQ4: How do different system structures (scalar, scaled orthogonal, diagonalizable) affect the rates and constants of OLS?
- RQ5: Can the results extend to general linear time-series with linear responses beyond dynamical systems?
Key findings
- OLS achieves estimation error bounds that scale with 1/√(T λ_min(Γ_k)) up to log factors.
- Bounds hold for any marginally stable A_* (ρ(A_*) ≤ 1) and do not rely on mixing-time arguments.
- For stable systems, the bounds can be stated without explicit block length dependence for large T (Corollary 2.2).
- The estimation rate depends on the excitability of the system via the controllability Gramian; larger λ_min(Γ_k) yields faster learning.
- Lower bounds show minimax optimality up to logarithmic factors in certain regimes (Theorem 2.3).
- The framework extends to a general time series with linear responses through a martingale small-ball condition (Theorem 2.4).
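To see why a larger λ_min(Γ_k) means faster learning, the scaled orthogonal case can be worked through by hand. The derivation below is a heuristic illustration, not a restatement of the paper's corollaries: it assumes A_* = ρO with O orthogonal and 0 ≤ ρ ≤ 1, drops logarithmic and dimension factors, and treats the choice of k proportional to T in the last line as an assumption.

```latex
\Gamma_k = \sum_{s=0}^{k-1} A_*^s (A_*^s)^\top
         = \Big(\sum_{s=0}^{k-1} \rho^{2s}\Big) I,
\qquad
\lambda_{\min}(\Gamma_k) =
\begin{cases}
\dfrac{1-\rho^{2k}}{1-\rho^{2}}, & 0 \le \rho < 1,\\[6pt]
k, & \rho = 1.
\end{cases}

% Plugging into the 1/\sqrt{T \lambda_{\min}(\Gamma_k)} scaling:
\rho < 1:\quad \lambda_{\min}(\Gamma_k) \le \frac{1}{1-\rho^{2}}
\;\Longrightarrow\;
\frac{1}{\sqrt{T\,\lambda_{\min}(\Gamma_k)}} \ge \sqrt{\frac{1-\rho^{2}}{T}},
\qquad
\rho = 1,\ k \propto T:\quad
\frac{1}{\sqrt{T\,k}} \propto \frac{1}{T}.
```

For fixed T, the quantity √((1-ρ²)/T) shrinks as ρ approaches 1, which is consistent with the finding that less stable systems are easier to estimate.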
This review was created by AI and reviewed by human editors.