QUICK REVIEW

[Paper Review] Fast Algorithms for Robust PCA via Gradient Descent

Xinyang Yi, Dohyung Park|arXiv (Cornell University)|May 25, 2016

Sparse and Compressive Sensing Techniques | References: 4 | Citations: 138

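One-line summary

This paper proposes a non-convex, gradient-descent-based approach to robust PCA that works in both the fully observed and partially observed settings, reducing run time while preserving robustness guarantees.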

ABSTRACT

We consider the problem of Robust PCA in the fully and partially observed settings. Without corruptions, this is the well-known matrix completion problem. From a statistical standpoint this problem has been recently well-studied, and conditions on when recovery is possible (how many observations do we need, how many corruptions can we tolerate) via polynomial-time algorithms is by now understood. This paper presents and analyzes a non-convex optimization approach that greatly reduces the computational complexity of the above problems, compared to the best available algorithms. In particular, in the fully observed case, with $r$ denoting rank and $d$ dimension, we reduce the complexity from $\mathcal{O}(r^2d^2\log(1/\varepsilon))$ to $\mathcal{O}(rd^2\log(1/\varepsilon))$ -- a big savings when the rank is big. For the partially observed case, we show the complexity of our algorithm is no more than $\mathcal{O}(r^4d \log d \log(1/\varepsilon))$. Not only is this the best-known run-time for a provable algorithm under partial observation, but in the setting where $r$ is small compared to $d$, it also allows for near-linear-in-$d$ run-time that can be exploited in the fully-observed case as well, by simply running our algorithm on a subset of the observations.
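For rough intuition about the scaling claims in the abstract, the leading terms can be compared numerically. This is only an illustrative sketch: the function name `leading_terms`, the example values of `d`, `r`, and `eps`, and the decision to drop all hidden constants are assumptions made here, so the numbers indicate growth rates, not actual run times.

```python
import math

def leading_terms(d, r, eps=1e-6):
    """Leading terms of the complexities quoted in the abstract.

    Hidden constants are ignored (an assumption of this sketch), so the
    values are only meaningful relative to one another.
    """
    log_eps = math.log(1 / eps)
    return {
        "prior_full": r**2 * d**2 * log_eps,                    # O(r^2 d^2 log(1/eps))
        "proposed_full": r * d**2 * log_eps,                    # O(r d^2 log(1/eps))
        "proposed_partial": r**4 * d * math.log(d) * log_eps,   # O(r^4 d log d log(1/eps))
    }

terms = leading_terms(d=10_000, r=5)
for name, value in terms.items():
    print(f"{name:>17}: {value:.3e}")
```

With these example values the fully observed bound improves on the prior one by a factor of r, and the partially observed bound is smaller still because r is small relative to d.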

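Motivation and objectives

Motivates robust PCA, which handles missing and corrupted entries, as a scalable alternative to SVD-based PCA.
Develops a two-stage algorithm that initializes a low-rank factorization and then refines it under a sparsity constraint.
Extends the approach to partial observations so that it handles erasures without weakening its tolerance to corruptions.
Provides theoretical guarantees on initialization quality, convergence, and sample complexity.
Demonstrates practical performance gains over convex methods and prior non-convex approaches.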

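Proposed method

Formulates robust PCA as Y = M* + S*, where M* is low-rank and S* is sparse (under a deterministic or random corruption model).
Introduces a two-stage algorithm: i) a sparse estimator produces S_init, and a rank-r SVD of Y - S_init initializes (U0, V0); ii) projected gradient descent in the factor space with sparsity-aware updates.
Defines a sparsification operator T_alpha[A] that keeps only the entries that are among the top α fraction in both their row and their column.
Performs gradient-based updates on the factors U and V, with a regularization term that encourages U^T U ≈ V^T V, a projection onto an incoherence constraint set, and a sparse estimate S_t at each iteration.
In the partially observed case, evaluates the loss over the observed entries only and adapts the sparsification operator to the observed support so that the convergence guarantees are preserved.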
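The sparsification operator T_alpha[A] from the list above can be sketched in a few lines. This is a minimal pure-Python illustration, not the authors' implementation: the function name `sparsify`, the list-of-lists matrix representation, rounding α·n down with `int`, ranking entries by absolute value, and keeping ties at the threshold are all assumptions made here.

```python
def sparsify(A, alpha):
    """Keep A[i][j] only if its magnitude is among the largest
    alpha-fraction of entries in BOTH row i and column j.

    Entries tied with the threshold are kept (an assumption of this
    sketch), so slightly more than alpha * n entries may survive.
    """
    n_rows, n_cols = len(A), len(A[0])
    k_row = min(n_cols, int(alpha * n_cols))  # entries kept per row
    k_col = min(n_rows, int(alpha * n_rows))  # entries kept per column
    if k_row == 0 or k_col == 0:
        return [[0.0] * n_cols for _ in range(n_rows)]

    # The k-th largest magnitude in each row / column acts as a threshold.
    row_thr = [sorted((abs(x) for x in row), reverse=True)[k_row - 1]
               for row in A]
    col_thr = [sorted((abs(A[i][j]) for i in range(n_rows)),
                      reverse=True)[k_col - 1]
               for j in range(n_cols)]

    return [[A[i][j]
             if abs(A[i][j]) >= row_thr[i] and abs(A[i][j]) >= col_thr[j]
             else 0.0
             for j in range(n_cols)]
            for i in range(n_rows)]

# Hypothetical residual: small noise plus three large corruptions.
residual = [
    [0.1,  9.0, -0.2,  0.0],
    [0.3,  0.1,  0.2, -8.0],
    [7.0, -0.1,  0.2,  0.1],
    [0.0,  0.2,  0.1,  0.3],
]
S = sparsify(residual, alpha=0.25)
for row in S:
    print(row)
```

In this example the 0.3 in the last row is the largest entry of its row but not of its column, so it is zeroed out. Requiring both conditions is what stops the operator from flagging ordinary entries in rows or columns that contain no corruption.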

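Experimental results

Research questions

RQ1: Can the non-convex gradient descent approach recover the low-rank component under gross corruptions and missing data?
RQ2: What initialization and step-size conditions guarantee linear convergence in the fully observed setting?
RQ3: Can the robustness guarantees be extended to erasures without inflating the computational cost?
RQ4: What are the sample and time complexities in the fully and partially observed settings, and how do they compare with convex relaxation?
RQ5: For small rank, can subsampling achieve near-linear run time while maintaining robustness?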

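Key findings

In the fully observed case, the proposed algorithm achieves robust recovery with complexity O(r d^2 log(1/ε)), improving on the previous best of O(r^2 d^2 log(1/ε)).
Under suitable initialization and parameters, the gradient iterations converge linearly: each iteration reduces the error by a fraction of order 1/κ, so reaching ε-accuracy takes O(κ log(1/ε)) iterations.
In the partially observed setting, O(μ^2 r^2 d log d) observed entries suffice and the run time is O(μ^3 r^4 d log d log(1/ε)), which is nearly linear in d when r is small.
Exact matrix completion for general rectangular matrices is achievable with O(μ^2 r^2 d log d) samples and O(μ^3 r^4 d log d log(1/ε)) time, improving on several prior results.
In certain regimes the approach matches SVD-like run time, and in experiments it outperforms AltProj and the convex IALM in both speed and robustness.
Empirical results on synthetic data and on video foreground-background separation show faster convergence and improved separation quality, particularly when the data are subsampled.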
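The O(κ log(1/ε)) iteration count in the linear-convergence finding follows directly from the per-iteration contraction. A brief derivation, assuming the error after t iterations, written e_t here, contracts by a factor of (1 - c/κ) for some constant 0 < c ≤ 1 (both e_t and c are notation introduced for this illustration, not taken from the review):

```latex
\begin{align*}
e_t &\le \left(1 - \frac{c}{\kappa}\right)^{t} e_0
     \le \exp\!\left(-\frac{c\,t}{\kappa}\right) e_0
     && \text{since } 1 - x \le e^{-x}, \\
e_t &\le \varepsilon\, e_0
     \quad \text{whenever} \quad
     t \ge \frac{\kappa}{c}\,\log\frac{1}{\varepsilon}
     = \mathcal{O}\!\left(\kappa \log\frac{1}{\varepsilon}\right).
\end{align*}
```

A better-conditioned problem (smaller κ) therefore needs proportionally fewer iterations to reach the same relative accuracy.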

This review was generated by AI and checked by a human editor.