QUICK REVIEW

[Paper Review] A Variance Reduction Method for Non-Convex Optimization with Improved Convergence under Large Condition Number

Zaiyi Chen, Tianbao Yang|arXiv (Cornell University)|Sep 18, 2018

Stochastic Gradient Optimization Techniques | Cited by 4

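One-Sentence Summary

The paper proposes a new SVRG-style accelerated stochastic algorithm for non-convex optimization problems that include a non-smooth convex term, achieving better gradient complexity when the condition number is large. Convergence is established for a stagewise averaged solution sampled with increasing probabilities, which gives stronger guarantees than earlier SVRG-style methods while keeping memory usage low.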

ABSTRACT

In this paper, we propose a new SVRG-style accelerated stochastic algorithm for solving a family of non-convex optimization problems whose objective consists of a sum of $n$ smooth functions and a non-smooth convex function. Our major goal is to improve the convergence of SVRG-style stochastic algorithms to stationary points under a setting with a large condition number $c$ - the ratio between the smoothness constant and the negative curvature constant. The proposed algorithm achieves the best known gradient complexity when $c\geq \Omega(n)$, which was achieved previously by a SAGA-style accelerated stochastic algorithm. Compared with the SAGA-style accelerated stochastic algorithm, the proposed algorithm is more practical due to its low memory cost that is inherited from previous SVRG-style algorithms. Compared with previous studies on SVRG-style stochastic algorithms, our theory provides much stronger results in terms of (i) reduced gradient complexity under a large condition number; and (ii) that the convergence is proved for a sampled stagewise averaged solution that is selected from all stagewise averaged solutions with increasing sampling probabilities instead of for a uniformly sampled solution across all iterations.

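Motivation and Objectives

Address the slow convergence of SVRG-style algorithms on non-convex problems when the condition number is large.
Improve the gradient complexity for non-convex problems whose objective is a sum of smooth functions plus a non-smooth convex term.
Match the convergence guarantees of SAGA-style methods while keeping the low memory cost of SVRG.
Establish convergence for a non-uniformly sampled stagewise averaged solution, strengthening the theory.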
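
For orientation, the problem class described in the abstract can be written out as below. This is a sketch under stated assumptions: the symbols $F$, $f_i$, $\psi$, $L$ and $\mu$, the $1/n$ normalization, and the weak-convexity reading of the "negative curvature constant" are a common formalization and are not quoted from the paper.

```latex
% Composite objective: n smooth (possibly non-convex) components plus a
% non-smooth convex term \psi. The 1/n scaling is an assumed normalization.
\min_{x \in \mathbb{R}^d} \; F(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x) + \psi(x)

% Assumed regularity: each f_i has an L-Lipschitz gradient, and the smooth
% part has negative curvature bounded by \mu, i.e.
% \frac{1}{n}\sum_i f_i(x) + \frac{\mu}{2}\|x\|^2 is convex.

% Condition number as the abstract defines it (smoothness constant over
% negative curvature constant); the regime of interest is c \geq \Omega(n).
c = \frac{L}{\mu}
```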

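Proposed Method

The algorithm uses an SVRG-style variance reduction mechanism that periodically stores a gradient snapshot.
It runs in stages and forms an averaged solution for each stage.
It incorporates acceleration techniques from stochastic optimization to reduce the gradient complexity under a large condition number.
The returned solution is sampled from all stagewise averaged solutions with increasing sampling probabilities, and convergence to a stationary point is proved for this sampled solution.
It keeps memory usage low by storing only the current iterate and a single snapshot with its full gradient, rather than the per-component gradient table that SAGA-style methods maintain.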
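
The variance reduction mechanism described above can be illustrated with a generic proximal SVRG loop. This is a minimal sketch, not the paper's accelerated algorithm: it omits the acceleration and stagewise averaging, and it uses a convex toy problem (least squares with an L1 term) purely for simplicity. All function names, the step size, and the stage lengths are hypothetical choices made for illustration.

```python
import random

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied coordinate-wise."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def grad_i(x, a, b):
    """Gradient of the toy smooth component f_i(x) = 0.5 * (a.x - b)^2."""
    r = sum(aj * xj for aj, xj in zip(a, x)) - b
    return [r * aj for aj in a]

def full_grad(x, A, y):
    """Average of all component gradients at x (one full pass over the data)."""
    n = len(A)
    g = [0.0] * len(x)
    for a, b in zip(A, y):
        g = [gj + gij / n for gj, gij in zip(g, grad_i(x, a, b))]
    return g

def objective(x, A, y, lam):
    """Toy composite objective: mean squared loss plus lam * ||x||_1."""
    n = len(A)
    smooth = sum(
        0.5 * (sum(aj * xj for aj, xj in zip(a, x)) - b) ** 2
        for a, b in zip(A, y)
    ) / n
    return smooth + lam * sum(abs(xj) for xj in x)

def prox_svrg(A, y, lam, step, stages, inner_iters, seed=0):
    """Generic proximal SVRG (assumed illustrative variant, not the paper's)."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    for _ in range(stages):
        # Memory footprint: one snapshot point and its full gradient, O(d).
        snapshot = list(x)
        mu = full_grad(snapshot, A, y)
        for _ in range(inner_iters):
            i = rng.randrange(n)
            gx = grad_i(x, A[i], y[i])
            gs = grad_i(snapshot, A[i], y[i])
            # Variance-reduced gradient estimate: unbiased for the full gradient.
            v = [p - q + m for p, q, m in zip(gx, gs, mu)]
            # Gradient step on the smooth part, then prox of the non-smooth term.
            x = soft_threshold(
                [xj - step * vj for xj, vj in zip(x, v)], step * lam
            )
    return x
```

The loop keeps only `x`, `snapshot`, and `mu`, so its extra storage is on the order of the dimension d. A SAGA-style method would instead keep a table with one stored gradient per component, which is the memory gap the summary refers to.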

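Results

Research Questions

RQ1: Can SVRG-style algorithms achieve improved gradient complexity on non-convex problems with a large condition number?
RQ2: Can convergence be proved for a non-uniformly sampled stagewise averaged solution, rather than only for a uniformly sampled one?
RQ3: Under a large condition number, can the convergence rate of SVRG-style methods be improved to match or exceed that of SAGA-style accelerated methods?
RQ4: Can low memory usage be maintained in non-convex optimization while retaining SAGA-style convergence performance?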

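Key Findings

When the condition number satisfies c ≥ Ω(n), the proposed algorithm achieves the best known gradient complexity, matching the earlier SAGA-style result.
By sampling among stagewise averaged solutions with increasing probabilities, the method obtains stronger theoretical convergence guarantees than previous SVRG-style methods.
The algorithm has a low memory cost, which makes it more practical than the SAGA-style method despite comparable convergence guarantees.
Convergence is proved for a non-uniformly sampled stagewise averaged solution, rather than for a solution sampled uniformly across all iterations as in previous work.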
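
Selecting the output with increasing sampling probabilities can be sketched as follows. The summary does not state the actual weights, so the polynomial weighting `s ** power` here is an assumption made for illustration only, and the function names are hypothetical.

```python
import random

def increasing_probs(num_stages, power=1.0):
    """Probabilities p_s proportional to s ** power for s = 1..num_stages.

    The polynomial form is an assumed weighting, not taken from the paper.
    Any power > 0 makes later stages strictly more likely than earlier ones.
    """
    weights = [float(s) ** power for s in range(1, num_stages + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def sample_stage_solution(stage_averages, power=1.0, seed=None):
    """Pick one stagewise averaged solution, favoring later stages."""
    rng = random.Random(seed)
    probs = increasing_probs(len(stage_averages), power)
    idx = rng.choices(range(len(stage_averages)), weights=probs, k=1)[0]
    return idx, stage_averages[idx]
```

With `power=1.0` and four stages the probabilities are 0.1, 0.2, 0.3 and 0.4, so the last stage is four times as likely to be returned as the first. Uniform sampling corresponds to `power=0`, which is the baseline the findings contrast against.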

This review was generated by AI and checked by a human editor.