QUICK REVIEW

[Paper Review] Sample Complexity of Stochastic Variance-Reduced Cubic Regularization for Nonconvex Optimization

Zhe Wang, Yi Zhou|arXiv (Cornell University)|Feb 20, 2018

Sparse and Compressive Sensing Techniques | References: 36 | Cited by: 14

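One-Sentence Summary

This paper proposes a stochastic variance-reduced cubic-regularized (SVRC) Newton's method for nonconvex optimization that improves on the sample complexity of standard cubic regularization (CR). By applying variance reduction under sampling both with and without replacement, SVRC attains a total Hessian sample complexity of $\mathcal{O}(N^{8/11} \epsilon^{-3/2})$ without replacement and $\mathcal{O}(N^{3/4}\epsilon^{-3/2})$ with replacement, improving on CR and earlier sub-sampled variants while keeping the same convergence rate.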

ABSTRACT

The popular cubic regularization (CR) method converges with first- and second-order optimality guarantee for nonconvex optimization, but encounters a high sample complexity issue for solving large-scale problems. Various sub-sampling variants of CR have been proposed to improve the sample complexity. In this paper, we propose a stochastic variance-reduced cubic-regularized (SVRC) Newton's method under both sampling with and without replacement schemes. We characterize the per-iteration sample complexity bounds which guarantee the same rate of convergence as that of CR for nonconvex optimization. Furthermore, our method achieves a total Hessian sample complexity of $\mathcal{O}(N^{8/11} \epsilon^{-3/2})$ and $\mathcal{O}(N^{3/4}\epsilon^{-3/2})$ respectively under sampling without and with replacement, which improve that of CR as well as other sub-sampling variant methods via the variance reduction scheme. Our result also suggests that sampling without replacement yields lower sample complexity than that of sampling with replacement. We further compare the practical performance of SVRC with other cubic regularization methods via experiments.

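Research Motivation and Goals

Address the high sample complexity of cubic regularization (CR) in large-scale nonconvex optimization.
Develop a stochastic variance-reduced variant of CR that retains the convergence rate of standard CR.
Characterize per-iteration and total sample complexity bounds under sampling with and without replacement.
Show that, in the stochastic setting, sampling without replacement yields lower sample complexity than sampling with replacement.
Compare the practical performance of SVRC empirically against other cubic regularization methods.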

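Proposed Method

Propose a stochastic variance-reduced cubic-regularized (SVRC) Newton's method for nonconvex optimization.
Introduce a variance reduction technique to stabilize the Hessian estimate in the stochastic setting.
Analyze convergence when Hessians are sampled with and without replacement.
Derive per-iteration sample complexity bounds that preserve the convergence rate of standard CR.
Use a recursive Hessian estimation strategy to reduce the variance of the Hessian approximation.
Establish theoretical bounds on the total Hessian sample complexity for both sampling schemes.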
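
The idea of a variance-reduced Hessian estimate under the two sampling schemes can be illustrated with a small sketch. This review does not spell out the paper's estimator, so the code below assumes the common SVRG-style construction: keep the exact Hessian `h_ref` at a reference point `x_ref`, then add a mini-batch average of the differences between component Hessians at `x` and at `x_ref`. The toy components $f_i(x) = \cos(a_i x)$, the one-dimensional setting (so each Hessian is a scalar), and all function names are hypothetical choices made for illustration only.

```python
import math
import random

def second_derivs(a, x, idx):
    """Second derivatives of the toy components f_i(x) = cos(a_i * x), i in idx."""
    return [-(a[i] ** 2) * math.cos(a[i] * x) for i in idx]

def full_hessian(a, x):
    """Exact Hessian of F(x) = (1/N) * sum_i f_i(x); a scalar in one dimension."""
    return sum(second_derivs(a, x, range(len(a)))) / len(a)

def sample_indices(n, batch_size, rng, replace):
    """Draw a mini-batch of indices with or without replacement."""
    if replace:
        return rng.choices(range(n), k=batch_size)
    return rng.sample(range(n), batch_size)

def vr_hessian(a, x, x_ref, h_ref, batch_size, rng, replace):
    """Assumed SVRG-style estimate: h_ref plus a sampled correction term."""
    idx = sample_indices(len(a), batch_size, rng, replace)
    at_x = second_derivs(a, x, idx)
    at_ref = second_derivs(a, x_ref, idx)
    correction = sum(p - q for p, q in zip(at_x, at_ref)) / batch_size
    return h_ref + correction

rng = random.Random(0)
a = [rng.uniform(-2.0, 2.0) for _ in range(200)]
x_ref, x = 0.3, 0.35
h_ref = full_hessian(a, x_ref)  # computed once per reference point
estimate = vr_hessian(a, x, x_ref, h_ref, batch_size=20, rng=rng, replace=False)
```

Under this assumed construction the correction vanishes when `x == x_ref`, and a full batch drawn without replacement recovers the exact Hessian at `x`, which is one intuition for why sampling without replacement can need fewer samples.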

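Experimental Results

Research Questions

RQ1: Can a stochastic variance-reduced cubic regularization method be designed that lowers sample complexity while retaining the convergence rate of standard cubic regularization?
RQ2: How does the choice between sampling with and without replacement affect the total Hessian sample complexity of stochastic cubic regularization?
RQ3: What are the theoretical sample complexity bounds for SVRC under the two sampling schemes?
RQ4: Does variance reduction in the Hessian estimate improve convergence behavior in nonconvex optimization?
RQ5: How does SVRC compare in practice with other cubic regularization methods in convergence speed and sample efficiency?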

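Key Findings

SVRC retains the convergence rate of standard cubic regularization (CR) while lowering the total Hessian sample complexity below that of CR.
Under sampling without replacement, the total Hessian sample complexity is $\mathcal{O}(N^{8/11} \epsilon^{-3/2})$, improving on CR and earlier sub-sampled variants.
Under sampling with replacement, the total Hessian sample complexity is $\mathcal{O}(N^{3/4}\epsilon^{-3/2})$, which also improves on CR.
Sampling without replacement yields lower sample complexity than sampling with replacement, indicating a theoretical advantage in variance reduction efficiency.
Experiments indicate that SVRC outperforms other cubic regularization methods in practical convergence speed and sample efficiency.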
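
To get a feel for the gap between the two exponents, the short sketch below evaluates the $N$-dependent factors of the bounds for a given $N$. It drops all constants and the shared $\epsilon^{-3/2}$ factor, so the numbers indicate scaling only. The full-batch baseline of $N$ Hessian samples per iteration is an assumption added for comparison (it corresponds to CR evaluating every component Hessian at each step), and the function name is hypothetical.

```python
def hessian_sample_factors(n):
    """N-dependent factors of the total Hessian sample complexity, constants dropped."""
    return {
        "svrc_without_replacement": n ** (8 / 11),
        "svrc_with_replacement": n ** (3 / 4),
        # Assumption: full-batch CR touches all N component Hessians per iteration.
        "full_batch_cr": float(n),
    }

factors = hessian_sample_factors(10 ** 6)
for name, value in factors.items():
    print(f"{name}: {value:.0f}")
```

For $N = 10^6$ the two SVRC factors come out at roughly $2.3 \times 10^4$ and $3.2 \times 10^4$, against $10^6$ for the full batch. The difference between the two sampling schemes is modest at this scale and widens slowly as $N$ grows, since the ratio is $N^{1/44}$.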

This review was generated by AI and checked by human editors.