QUICK REVIEW

[Paper Review] Linear Convergence of Variance-Reduced Stochastic Gradient without Strong Convexity

Pinghua Gong, Jieping Ye|arXiv (Cornell University)|Jun 4, 2014

Stochastic Gradient Optimization Techniques | References: 29 | Citations: 28

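One-Sentence Summary

This paper establishes linear convergence of variance-reduced stochastic gradient methods, specifically VRPSG and Prox-SVRG, on the non-strongly convex problems that are common in machine learning. The main technical contribution is a new Semi-Strongly Convex (SSC) inequality that makes linear convergence possible without assuming strong convexity. The inequality applies to both constrained and regularized settings.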

ABSTRACT

Stochastic gradient algorithms estimate the gradient based on only one or a few samples and enjoy low computational cost per iteration. They have been widely used in large-scale optimization problems. However, stochastic gradient algorithms are usually slow to converge and achieve sub-linear convergence rates, due to the inherent variance in the gradient computation. To accelerate the convergence, some variance-reduced stochastic gradient algorithms, e.g., proximal stochastic variance-reduced gradient (Prox-SVRG) algorithm, have recently been proposed to solve strongly convex problems. Under the strongly convex condition, these variance-reduced stochastic gradient algorithms achieve a linear convergence rate. However, many machine learning problems are convex but not strongly convex. In this paper, we introduce Prox-SVRG and its projected variant called Variance-Reduced Projected Stochastic Gradient (VRPSG) to solve a class of non-strongly convex optimization problems widely used in machine learning. As the main technical contribution of this paper, we show that both VRPSG and Prox-SVRG achieve a linear convergence rate without strong convexity. A key ingredient in our proof is a Semi-Strongly Convex (SSC) inequality which is the first to be rigorously proved for a class of non-strongly convex problems in both constrained and regularized settings. Moreover, the SSC inequality is independent of algorithms and may be applied to analyze other stochastic gradient algorithms besides VRPSG and Prox-SVRG, which may be of independent interest. To the best of our knowledge, this is the first work that establishes the linear convergence rate for the variance-reduced stochastic gradient algorithms on solving both constrained and regularized problems without strong convexity.

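Motivation and Objectives

Close the gap in convergence guarantees for variance-reduced stochastic gradient methods on non-strongly convex problems.
Establish linear convergence of VRPSG and Prox-SVRG in both constrained and regularized optimization settings.
Formulate and rigorously prove a new Semi-Strongly Convex (SSC) inequality that applies to non-strongly convex problems.
Show that the SSC inequality is independent of any particular algorithm and can be applied to other stochastic gradient methods.
Provide theoretical grounds for linear convergence on practical machine learning problems such as least squares and logistic regression, which are often not strongly convex.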
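As a small illustration of why least squares is often convex but not strongly convex, the sketch below checks the Hessian of the loss when there are fewer samples than features. The problem sizes and random data are assumptions made for this example and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 50                      # fewer samples than features (assumed sizes)
A = rng.standard_normal((n, d))

# The Hessian of the least-squares loss (1/2n) * ||A w - b||^2 is A^T A / n.
hessian = A.T @ A / n
eigvals = np.linalg.eigvalsh(hessian)   # eigenvalues in ascending order

# The Hessian has rank at most n < d, so its smallest eigenvalue is zero
# up to floating-point error: the loss is convex but not strongly convex.
min_eig = eigvals[0]
max_eig = eigvals[-1]
rank = np.linalg.matrix_rank(A)
```

Strong convexity would require the smallest eigenvalue to be bounded away from zero, which cannot happen here because the Hessian is rank deficient.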

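Proposed Method

Introduce Prox-SVRG and its projected variant, VRPSG, for a class of non-strongly convex problems.
Introduce a new Semi-Strongly Convex (SSC) inequality that holds without strong convexity and upper-bounds the distance to the optimal solution set by the objective gap.
Use the SSC inequality to derive a recursive error bound under somewhat restrictive conditions and establish linear convergence.
Apply the SSC inequality to constrained problems via projection and to regularized problems via proximal steps.
Analyze the decrease in the expected objective gap per iteration in terms of the step size and the inner-loop parameter, and derive the convergence rate.
Adopt non-uniform sampling proportional to the Lipschitz constants to improve convergence efficiency, and validate it experimentally.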
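The steps above can be sketched in code. The following is a minimal Prox-SVRG loop for an L1-regularized least-squares problem, with sampling proportional to per-sample Lipschitz constants. It is an illustration under stated assumptions, not the authors' implementation: the function names `soft_threshold` and `prox_svrg_lasso`, the choice of averaging the inner iterates for the snapshot, the synthetic data, and the use of the average Lipschitz constant `L_avg` in the step size are all assumptions made here.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1 (elementwise soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def prox_svrg_lasso(A, b, lam, eta, m, n_outer, seed=0):
    """Sketch of Prox-SVRG for (1/2n) * ||A w - b||^2 + lam * ||w||_1."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    lips = np.sum(A * A, axis=1)       # L_i = ||a_i||^2 for the squared loss
    probs = lips / lips.sum()          # sample i with probability proportional to L_i
    w_tilde = np.zeros(d)              # snapshot point
    history = []
    for _ in range(n_outer):
        full_grad = A.T @ (A @ w_tilde - b) / n   # full gradient at the snapshot
        w = w_tilde.copy()
        w_sum = np.zeros(d)
        for _ in range(m):
            i = rng.choice(n, p=probs)
            a_i = A[i]
            # Component gradient at w minus component gradient at the snapshot.
            g_diff = a_i * (a_i @ (w - w_tilde))
            # The weight 1 / (n * p_i) keeps the gradient estimate unbiased.
            v = g_diff / (n * probs[i]) + full_grad
            # Proximal step handles the L1 regularizer.
            w = soft_threshold(w - eta * v, eta * lam)
            w_sum += w
        w_tilde = w_sum / m            # assumption: snapshot = average of inner iterates
        obj = 0.5 * np.mean((A @ w_tilde - b) ** 2) + lam * np.abs(w_tilde).sum()
        history.append(obj)
    return w_tilde, history


# Usage on a small underdetermined problem (n < d, so not strongly convex).
rng = np.random.default_rng(1)
n, d = 40, 80
A = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0
b = A @ w_true
L_avg = np.mean(np.sum(A * A, axis=1))
w_hat, history = prox_svrg_lasso(A, b, lam=0.01, eta=0.2 / L_avg, m=n, n_outer=50)
```

For the constrained (VRPSG-style) setting, the soft-thresholding call would be replaced by a Euclidean projection onto the constraint set. The inner-loop length `m = n` and the step size below `0.25 / L_avg` are illustrative choices only.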

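Experimental Results

Research Questions

RQ1: Can variance-reduced stochastic gradient methods achieve linear convergence without assuming strong convexity?
RQ2: What new structural condition enables linear convergence in the non-strongly convex setting?
RQ3: Is the Semi-Strongly Convex (SSC) inequality valid and rigorously provable for both constrained and regularized optimization problems?
RQ4: In practice, how does the performance of VRPSG depend on the sampling strategy, inner-loop length, and step size?
RQ5: Can the SSC inequality be used to analyze stochastic gradient algorithms other than VRPSG and Prox-SVRG?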

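Key Findings

VRPSG and Prox-SVRG achieve a linear convergence rate on non-strongly convex problems, a result previously known only under a strong convexity assumption.
The Semi-Strongly Convex (SSC) inequality is rigorously proved for a class of non-strongly convex problems in both constrained and regularized settings.
The SSC inequality bounds the distance to the optimal solution set in terms of the objective gap, which is what enables the linear convergence analysis.
Experiments show that non-uniform sampling proportional to the Lipschitz constants converges noticeably faster than uniform sampling.
VRPSG is robust to the choice of step size: although the theoretical bound requires $\eta < 0.25/L_P$, it converged quickly with both $\eta = 1/L_P$ and $\eta = 5/L_P$.
Setting the inner-loop length $m$ to $0.5n$ or $n$ gave the most stable performance, suggesting that intermediate values are best.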
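To make the first three findings concrete, the SSC inequality and the resulting linear rate can be written schematically as follows. The notation is an assumption made for this sketch: $F$ is the objective, $F^\star$ its optimal value, $\Pi_{\mathcal{X}^\star}$ the projection onto the optimal solution set, $\tilde{x}^{k}$ the $k$-th outer iterate, and $\beta$, $\rho$ unspecified constants. The precise constants and the region where the inequality holds are given in the paper.

```latex
% SSC inequality (assumed notation): the objective gap controls the squared
% distance from x to its projection onto the optimal solution set.
F(x) - F^\star \;\ge\; \frac{\beta}{2}\,\bigl\| x - \Pi_{\mathcal{X}^\star}(x) \bigr\|^2,
\qquad \beta > 0 .

% Linear convergence of the outer iterates (assumed notation):
\mathbb{E}\bigl[ F(\tilde{x}^{k}) \bigr] - F^\star
\;\le\; \rho^{k} \bigl( F(\tilde{x}^{0}) - F^\star \bigr),
\qquad 0 < \rho < 1 .
```

The first line plays the role that strong convexity usually plays in such proofs, but it measures distance to the whole solution set rather than to a unique minimizer.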

This review was generated by AI and checked by a human editor.