QUICK REVIEW

[Paper Review] Towards Riemannian Accelerated Gradient Methods

Hongyi Zhang, Suvrit Sra | arXiv (Cornell University) | Jun 7, 2018

Stochastic Gradient Optimization Techniques | 29 references | 33 citations

TL;DR

This paper proposes a computationally tractable Riemannian accelerated gradient method (RAGD) that achieves accelerated convergence for geodesically smooth and strongly convex optimization on Riemannian manifolds within a neighborhood of the minimizer. The method relies on a novel estimate sequence and a tangent space distance comparison theorem to bound nonlinear metric distortion. The convergence rate depends on the condition number, while the radius of the neighborhood depends on both the condition number and the sectional curvature.

ABSTRACT

We propose a Riemannian version of Nesterov's Accelerated Gradient algorithm (RAGD), and show that for geodesically smooth and strongly convex problems, within a neighborhood of the minimizer whose radius depends on the condition number as well as the sectional curvature of the manifold, RAGD converges to the minimizer with acceleration. Unlike the algorithm in (Liu et al., 2017) that requires the exact solution to a nonlinear equation which in turn may be intractable, our algorithm is constructive and computationally tractable. Our proof exploits a new estimate sequence and a novel bound on the nonlinear metric distortion, both ideas may be of independent interest.

Motivation & Objective

To develop a computationally tractable Riemannian generalization of Nesterov’s accelerated gradient method that avoids intractable nonlinear equations.
To establish local convergence with acceleration for geodesically smooth and strongly convex problems on Riemannian manifolds.
To overcome the challenge of nonlinear metric distortion in Riemannian optimization through new analytical tools.
To identify conditions under which acceleration is achievable in non-Euclidean spaces, despite the absence of linear structure.
To provide a convergence analysis that relaxes assumptions from prior work and applies to matrix manifolds with tractable exponential maps.

Proposed method

Proposes a Riemannian accelerated gradient algorithm (RAGD) using a modified estimate sequence tailored for nonlinear Riemannian geometry.
Introduces a tangent space distance comparison theorem to bound metric distortion between geodesic distances and their Euclidean approximations in tangent spaces.
Employs constant parameters, with step size $ h = \frac{1}{L} $ and $ \beta = \frac{1}{5}\sqrt{\frac{\mu}{L}} $, for which convergence is guaranteed when the iterates start within a neighborhood of the minimizer.
Uses a novel estimate sequence that accounts for curvature-induced distortion, relaxing assumptions from Nesterov’s original construction.
Relies on the tractability of Riemannian gradient, exponential map, and its inverse—feasible for many matrix manifolds.
Applies induction and curvature-dependent bounds to ensure that the comparison inequality (numbered (8) in the paper) holds at each iteration, which enables the convergence proof.
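The steps above can be sketched on the unit sphere, where the exponential map and its inverse have closed forms. This is a minimal illustration, not the authors' implementation: the update formulas for `y`, `x`, and `v` and the derived constants `alpha`, `gamma`, `gamma_bar` are not given in this review and are assumed here in the style of a constant-parameter Nesterov estimate-sequence scheme; only $h = 1/L$ and $\beta = \frac{1}{5}\sqrt{\mu/L}$ come from the text. The test problem (a Karcher mean of four nearby points), the helper names, and the values `L=1.0`, `mu=0.9` are hypothetical choices for the demo.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def lincomb(s, a, t, b):
    """Return s*a + t*b for vectors stored as lists."""
    return [s * ai + t * bi for ai, bi in zip(a, b)]

def sphere_exp(x, v):
    """Exponential map on the unit sphere: move from x along tangent vector v."""
    n = norm(v)
    if n < 1e-16:
        return list(x)
    p = lincomb(math.cos(n), x, math.sin(n) / n, v)
    s = norm(p)  # renormalize to counter rounding drift
    return [pi / s for pi in p]

def sphere_log(x, p):
    """Inverse exponential map: tangent vector at x of length d(x, p) pointing to p."""
    c = dot(x, p)
    u = lincomb(1.0, p, -c, x)  # component of p orthogonal to x
    s = norm(u)
    if s < 1e-16:
        return [0.0] * len(x)
    theta = math.atan2(s, c)  # geodesic distance, accurate for small angles
    return [theta * ui / s for ui in u]

def ragd(grad, x0, L, mu, iters):
    """Constant-parameter accelerated scheme (update formulas are an assumption)."""
    h = 1.0 / L                          # step size from the review
    beta = 0.2 * math.sqrt(mu / L)       # beta from the review
    s = math.sqrt(beta ** 2 + 4.0 * (1.0 + beta) * mu * h)
    alpha = (s - beta) / 2.0             # assumed: solves alpha^2 = h * gamma_bar
    gamma = (s - beta) / (s + beta) * mu
    gamma_bar = (1.0 + beta) * gamma
    x, v = list(x0), list(x0)
    for _ in range(iters):
        # y: point on the geodesic from x toward v
        c = alpha * gamma / (gamma + alpha * mu)
        y = sphere_exp(x, [c * t for t in sphere_log(x, v)])
        g = grad(y)
        # x: Riemannian gradient step from y
        x = sphere_exp(y, [-h * gi for gi in g])
        # v: combine the direction to the old v with the gradient, in the tangent space at y
        w = lincomb((1.0 - alpha) * gamma / gamma_bar, sphere_log(y, v),
                    -alpha / gamma_bar, g)
        v = sphere_exp(y, w)
    return x

# Hypothetical demo: Karcher mean of four points near the north pole.
raw = [[0.1, 0.0, 1.0], [-0.1, 0.1, 1.0], [0.0, -0.15, 1.0], [0.05, 0.05, 1.0]]
points = [[c / norm(p) for c in p] for p in raw]

def karcher_grad(x):
    """Riemannian gradient of f(x) = (1/2n) * sum_i d(x, p_i)^2."""
    g = [0.0] * len(x)
    for p in points:
        g = lincomb(1.0, g, -1.0 / len(points), sphere_log(x, p))
    return g

x_star = ragd(karcher_grad, [0.0, 0.0, 1.0], L=1.0, mu=0.9, iters=60)
```

The only manifold-specific pieces are `sphere_exp`, `sphere_log`, and the gradient, which matches the review's point that the method needs just these three operations to be tractable. Swapping in another manifold would mean replacing those functions and supplying suitable `L` and `mu`.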

Experimental results

Research questions

RQ1: Can Nesterov-style acceleration be achieved in Riemannian optimization, despite the absence of linear structure?
RQ2: Is it possible to construct a computationally tractable Riemannian accelerated gradient method that avoids solving intractable nonlinear equations at each step?
RQ3: What conditions on curvature and condition number ensure local acceleration on Riemannian manifolds?
RQ4: Can a novel estimate sequence and metric distortion bound be developed to handle non-Euclidean geometry in first-order optimization?
RQ5: Does the nonlinear nature of Riemannian geometry fundamentally prevent global acceleration, or is local acceleration achievable?

Key findings

The proposed RAGD algorithm converges locally with an accelerated rate of $ \left(1 - \frac{9}{10}\sqrt{\frac{\mu}{L}}\right)^k $ for geodesically smooth and strongly convex problems.
Convergence is guaranteed within a neighborhood $ \mathcal{B}_{x^*, D} $ of radius $ D = \frac{1}{20\sqrt{K}}\left(\frac{\mu}{L}\right)^{\frac{3}{4}} $, depending on the condition number and sectional curvature.
The analysis introduces a new estimate sequence that handles metric distortion on Riemannian manifolds, relaxing assumptions from classical Nesterov methods.
A tangent space distance comparison theorem provides sufficient conditions to bound nonlinear metric distortion, a key technical contribution.
The method avoids intractable nonlinear equations—unlike prior work (Liu et al., 2017)—making it practically implementable on matrix manifolds.
The side length $ d(y_k, v_{k+1}) $ of the geodesic triangle used in the analysis can grow faster than $ d(y_k, x^*) $, indicating that global control of distortion is inherently difficult in nonlinear spaces.
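To get a feel for the two formulas above, the following sketch evaluates the per-iteration factor and the neighborhood radius for given constants. The function names and the sample values are hypothetical, and `K` is assumed to be a positive bound on the magnitude of the sectional curvature. The iteration count simply solves $\text{rate}^k \le \varepsilon$ and ignores the constant in front of the bound.

```python
import math

def ragd_local_guarantee(mu, L, K):
    """Return (rate, radius) from the formulas quoted in the key findings."""
    ratio = mu / L
    rate = 1.0 - 0.9 * math.sqrt(ratio)                # per-iteration factor
    radius = ratio ** 0.75 / (20.0 * math.sqrt(K))     # neighborhood radius D
    return rate, radius

def iterations_to_shrink(rate, eps):
    """Smallest k with rate**k <= eps."""
    return math.ceil(math.log(eps) / math.log(rate))

# Hypothetical example: condition number L/mu = 100 on a manifold with K = 1.
rate, radius = ragd_local_guarantee(mu=1.0, L=100.0, K=1.0)
steps = iterations_to_shrink(rate, 1e-6)
print(f"rate={rate:.4f}  D={radius:.6f}  steps={steps}")
```

For these sample values the radius is small (on the order of $10^{-3}$), which illustrates why the guarantee is a local one: a larger condition number or larger curvature bound shrinks the region where acceleration is proven.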


This review was created by AI and reviewed by human editors.