QUICK REVIEW

[Paper Review] Linear-Memory and Decomposition-Invariant Linearly Convergent Conditional Gradient Algorithm for Structured Polytopes

Dan Garber, Ofer Meshi|arXiv (Cornell University)|May 1, 2016

Stochastic Gradient Optimization Techniques | Citations: 23

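One-Sentence Summary

This paper proposes a new conditional gradient algorithm for structured polytopes that converges linearly while keeping memory and per-iteration computational overhead linear in the dimension. By using decomposition-invariant away-steps, it replaces a dimension-dependent factor in the convergence rate with one that depends on the sparsity of the optimal solution, which gives a markedly faster rate when the optimal solution is sparse.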

ABSTRACT

Recently, several works have shown that natural modifications of the classical conditional gradient method (aka Frank-Wolfe algorithm) for constrained convex optimization, provably converge with a linear rate when the feasible set is a polytope, and the objective is smooth and strongly-convex. However, all of these results suffer from two significant shortcomings: i) large memory requirement due to the need to store an explicit convex decomposition of the current iterate, and as a consequence, large running-time overhead per iteration, and ii) the worst case convergence rate depends unfavorably on the dimension. In this work we present a new conditional gradient variant and a corresponding analysis that improves on both of the above shortcomings. In particular, both memory and computation overheads are only linear in the dimension, and in addition, in case the optimal solution is sparse, the new convergence rate replaces a factor which is at least linear in the dimension in previous works, with a linear dependence on the number of non-zeros in the optimal solution. At the heart of our method, and corresponding analysis, is a novel way to compute decomposition-invariant away-steps. While our theoretical guarantees do not apply to any polytope, they apply to several important structured polytopes that capture central concepts such as paths in graphs, perfect matchings in bipartite graphs, marginal distributions that arise in structured prediction tasks, and more. Our theoretical findings are complemented by empirical evidence that shows that our method delivers state-of-the-art performance.

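Motivation and Objectives

Address the high memory and computational cost of existing linearly convergent conditional gradient methods for polytope constraints.
Replace the dimension-dependent factor in the convergence rate with a term that depends on the sparsity of the optimal solution.
Develop a method that keeps linear convergence while reducing per-iteration complexity and storage.
Establish theoretical guarantees for structured polytopes that arise in graph paths, matchings, structured prediction, and similar settings.
Provide empirical evidence of state-of-the-art performance on the relevant optimization tasks.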

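Proposed Method

Introduces a new formulation of decomposition-invariant away-steps that does not depend on a convex decomposition of the current iterate, enabling stable and efficient updates.
Keeps a linear memory footprint by never explicitly storing a full convex decomposition of the iterate.
Adopts a modified line-search strategy that ensures sufficient decrease of the objective while preserving the convergence guarantee.
Relies on a new theoretical framework that ties the convergence rate to the sparsity of the optimal solution rather than to the ambient dimension.
Is tailored to structured polytopes in which sparsity arises naturally, such as marginal polytopes and matching polytopes.
The main innovation is the use of decomposition-invariant away-steps, which avoid recomputing or storing a full decomposition at each iteration.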
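The idea of an away-step that needs no stored decomposition can be illustrated with a minimal sketch. It assumes the feasible set is the probability simplex (chosen here only because its vertices are the standard basis vectors), a quadratic objective, and an exact line search; the function name `dicg_simplex` and its parameters are hypothetical, and the paper's own step-size rule and analysis are not reproduced. The away vertex is found by a linear maximization restricted to the support of the current iterate, so only the iterate itself is kept in memory.

```python
def dicg_simplex(grad, x0, line_search, iters=200, tol=1e-12):
    """Away-step conditional gradient sketch on the probability simplex.

    The away vertex is read off the support of x, not from a stored
    convex decomposition (illustrative assumption: simplex vertices e_i).
    """
    x = list(x0)
    n = len(x)
    for _ in range(iters):
        g = grad(x)
        # Frank-Wolfe vertex e_plus: coordinate with the smallest gradient entry.
        i_plus = min(range(n), key=lambda i: g[i])
        # Away vertex e_minus: largest gradient entry among coordinates with x_j > 0.
        support = [j for j in range(n) if x[j] > 0]
        i_minus = max(support, key=lambda j: g[j])
        # gap = -<g, e_plus - e_minus>; nonpositive gap means no descent direction.
        gap = g[i_minus] - g[i_plus]
        if gap <= tol:
            break
        # x + gamma * (e_plus - e_minus) stays feasible while gamma <= x[i_minus].
        gamma = min(line_search(gap), x[i_minus])
        x[i_plus] += gamma
        x[i_minus] -= gamma
    return x


# Usage: minimize 0.5 * ||x - b||^2 over the simplex, where b is a sparse
# point of the simplex. Along e_plus - e_minus the exact step is gap / 2.
b = [0.5, 0.5, 0.0, 0.0, 0.0]
x_final = dicg_simplex(
    grad=lambda x: [xi - bi for xi, bi in zip(x, b)],
    x0=[0.0, 0.0, 0.0, 0.0, 1.0],
    line_search=lambda gap: gap / 2.0,
)
```

On other structured polytopes the two coordinate scans would be replaced by calls to a linear optimization oracle, for example a shortest-path or matching solver, with the coordinates where the iterate is zero excluded from the away-vertex search.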

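Experimental Results

Research Questions

RQ1: Is there a conditional gradient variant for structured polytopes that converges linearly with memory and per-iteration computational cost linear in the dimension?
RQ2: Can the convergence rate be made to depend on the sparsity of the optimal solution rather than on the ambient dimension?
RQ3: Can away-steps be designed that do not depend on the choice of convex decomposition, and does this improve stability and efficiency?
RQ4: Do the theoretical improvements translate into practical performance gains on real structured optimization problems?
RQ5: Which classes of structured polytopes admit such decomposition-invariant, linearly convergent algorithms?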

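Key Findings

The proposed method converges linearly with memory and per-iteration overhead that grow only linearly in the problem dimension, in contrast to earlier linearly convergent variants, which must store an explicit convex decomposition of the iterate and therefore incur large memory and running-time overhead.
In the convergence rate, a factor that is at least linear in the dimension is replaced by a factor that is linear in the number of non-zeros of the optimal solution.
The method applies to important structured polytopes representing paths in graphs, perfect matchings in bipartite graphs, and marginal distributions in structured prediction.
The experiments show state-of-the-art performance, supporting the theoretical advantages in practice.
When the optimal solution is sparse, the new rate is substantially better than the rates of previous methods.
The theoretical analysis does not cover arbitrary polytopes; it holds for a class of structured polytopes in which sparsity arises naturally.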
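Schematically, the change in the rate described in the findings above can be written as follows. This is only an illustration of the dependence on the dimension $n$ versus the number of non-zeros $\mathrm{card}(x^*)$: here $h_t = f(x_t) - f(x^*)$, $\alpha$ and $\beta$ denote the strong-convexity and smoothness parameters, and $D$ the diameter of the polytope. The constants $c, c' > 0$, the leading factor $h_0$, and the exact form of the remaining factors are assumptions made for illustration, not values taken from the paper.

```latex
\[
  \underbrace{h_t \;\lesssim\; h_0 \exp\!\left(-\,\frac{c\,\alpha}{\beta\, n\, D^2}\, t\right)}_{\text{previous analyses (schematic)}}
  \qquad\longrightarrow\qquad
  \underbrace{h_t \;\lesssim\; h_0 \exp\!\left(-\,\frac{c'\,\alpha}{\beta\, \mathrm{card}(x^*)\, D^2}\, t\right)}_{\text{sparsity-dependent rate (schematic)}}
\]
```

Under this reading, the number of iterations needed to reach a fixed accuracy scales with $\mathrm{card}(x^*)$ instead of $n$, which is where the gain for sparse optimal solutions comes from.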


This review was generated by AI and checked by a human editor.