QUICK REVIEW

[Paper Review] Row Sampling for Matrix Algorithms via a Non-Commutative Bernstein Bound

Malik Magdon‐Ismail|arXiv (Cornell University)|Aug 3, 2010

Sparse and Compressive Sensing Techniques|References 34|Cited by 29

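One-Line Summary

This paper presents o(md²) row-sampling algorithms for matrix approximation, using a non-commutative Bernstein bound to obtain relative-error guarantees for matrix multiplication, sparse matrix reconstruction, and ℓ2 regression. It proposes what the author describes as the first methods that compute the key parameters, the leverage scores (which define the sampling probabilities), with fast random projections instead of an SVD. The number of sampled rows depends nearly linearly on the stable rank, and unlike plain random-projection methods, the approximation is built from actual rows of the matrix.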

ABSTRACT

We focus on the use of \emph{row sampling} for approximating matrix algorithms. We give applications to matrix multiplication; sparse matrix reconstruction; and, \math{\ell_2} regression. For a matrix \math{\matA\in\R^{m\times d}} which represents \math{m} points in \math{d\ll m} dimensions, all of these tasks can be achieved in \math{O(md^2)} via the singular value decomposition (SVD). For appropriate row-sampling probabilities (which typically depend on the norms of the rows of the \math{m\times d} left singular matrix of \math{\matA}, the \emph{leverage scores}), we give row-sampling algorithms with linear (up to polylog factors) dependence on the stable rank of \math{\matA}. This result is achieved through the application of non-commutative Bernstein bounds. We then give, to our knowledge, the first algorithms for computing approximations to the appropriate row-sampling probabilities without going through the SVD of \math{\matA}. Thus, these are the first \math{o(md^2)} algorithms for row-sampling based approximations to the matrix algorithms which use leverage scores as the sampling probabilities. The techniques we use to approximate sampling according to the leverage scores use some powerful recent results in the theory of random projections for embedding, and may be of some independent interest. We confess that one may perform all these matrix tasks more efficiently using these same random projection methods, however the resulting algorithms are in terms of a small number of linear combinations of all the rows. In many applications, the actual rows of \math{\matA} have some physical meaning and so methods based on a small number of the actual rows are of interest.
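
For reference, the two quantities the abstract relies on can be written out as follows. This is a sketch using the standard definitions and assuming \(A\) has full column rank \(d\); the symbols \(\ell_i\), \(p_i\), and \(\rho(A)\) are notation chosen here and may differ from the paper's.

```latex
\[
A = U \Sigma V^{\top}, \qquad U \in \mathbb{R}^{m \times d}, \qquad
\ell_i = \lVert U_{(i)} \rVert_2^2, \qquad
\sum_{i=1}^{m} \ell_i = d, \qquad
p_i = \frac{\ell_i}{d},
\]
\[
\rho(A) = \frac{\lVert A \rVert_F^2}{\lVert A \rVert_2^2}
\;\le\; \operatorname{rank}(A) \;\le\; d .
\]
```

Here \(U_{(i)}\) is the \(i\)-th row of the left singular matrix, \(\ell_i\) is the leverage score of row \(i\), and \(\rho(A)\) is the stable rank, which can be much smaller than \(d\) when the spectrum of \(A\) decays quickly.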

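Motivation and Objectives

Develop efficient row-sampling algorithms for matrix approximation tasks such as matrix multiplication, sparse matrix reconstruction, and ℓ2 regression.
Achieve relative-error guarantees using sampling probabilities that depend on the leverage scores, i.e. the row norms of the left singular matrix.
Compute approximate leverage scores in o(md²) time without performing a full SVD.
Preserve the physical interpretability of the original rows in applications where random projections, which return linear combinations of rows, are not adequate.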

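Proposed Method

Apply the non-commutative Bernstein bound to derive sampling guarantees for relative-error matrix multiplication.
Use random projections to approximate the leverage scores efficiently, avoiding the O(md²) cost of the SVD.
Use Johnson-Lindenstrauss-type embeddings and fast distance-preserving projections to estimate the row norms (leverage scores) in o(md²) time.
Introduce a thresholding mechanism that stabilizes the normalization of the estimated scores and prevents small entries from being inflated.
Derive bounds on the trace of the projected matrix to guarantee concentration of the leverage-score estimates.
Combine these techniques to construct sampling probabilities that achieve relative-error approximation with high probability.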
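
The idea of estimating leverage scores without an SVD can be illustrated with a minimal NumPy sketch. It uses a generic sketch-then-QR recipe, which is an assumption made here for illustration and not necessarily the paper's algorithm. The function names and the parameters `sketch_rows` and `jl_cols` are hypothetical. A dense Gaussian sketch is used for brevity, so this version does not itself run in o(md²) time; a fast structured transform would be needed for that. The thresholding step is omitted because this review does not give its details.

```python
import numpy as np


def exact_leverage_scores(A):
    """Squared row norms of the left singular matrix (thin SVD, O(m d^2))."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return np.sum(U**2, axis=1)


def approx_leverage_scores(A, sketch_rows, jl_cols, rng):
    """Estimate leverage scores with two random projections (illustrative only)."""
    m, d = A.shape
    # Step 1: compress the m rows of A into sketch_rows rows (Gaussian sketch).
    S = rng.standard_normal((sketch_rows, m)) / np.sqrt(sketch_rows)
    # Step 2: QR of the small sketch; A @ inv(R) is then close to orthonormal.
    _, R = np.linalg.qr(S @ A)
    # Step 3: Johnson-Lindenstrauss projection, so the row norms of A @ inv(R)
    # are estimated from jl_cols columns.
    G = rng.standard_normal((d, jl_cols)) / np.sqrt(jl_cols)
    Z = A @ np.linalg.solve(R, G)
    return np.sum(Z**2, axis=1)


def to_probabilities(scores):
    """Normalize non-negative scores into sampling probabilities."""
    return scores / scores.sum()
```

For example, with `rng = np.random.default_rng(0)` and a tall matrix `A`, `to_probabilities(approx_leverage_scores(A, 400, 400, rng))` gives probabilities that can be compared against `to_probabilities(exact_leverage_scores(A))`.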

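Results

Research Questions

RQ1: Can row-sampling algorithms achieve relative-error approximations for matrix multiplication, sparse matrix reconstruction, and ℓ2 regression in o(md²) time?
RQ2: Can the leverage scores, which are essential for non-uniform sampling, be approximated without computing a full SVD?
RQ3: Can random projections estimate the leverage scores with only polylogarithmic overhead in running time?
RQ4: How does estimation error in the leverage scores affect the final approximation quality of the matrix algorithms?
RQ5: Can the estimated sampling probabilities be normalized effectively so that small, noisy values do not distort them?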

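Key Findings

Using the non-commutative Bernstein bound, a relative-error approximation for matrix multiplication is achieved with r = O(ρ log d / ε²) sampled rows, where ρ is the stable rank.
Approximating the leverage scores with random projections yields, to the author's knowledge, the first o(md²) algorithms that avoid the SVD.
Fast embeddings allow the leverage scores to be estimated to within a polylogarithmic factor of their true values, enabling efficient sampling without the SVD.
The estimated sampling probabilities are guaranteed to lie within a constant factor of the true leverage scores, which preserves approximation quality.
The thresholding strategy prevents distortion caused by small, noisy estimates and stabilizes the normalization.
The result is efficient, row-preserving algorithms for ℓ2 regression and sparse matrix reconstruction, with theoretical guarantees on error and running time.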
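
How sampled rows are used downstream can be sketched as follows. This is a generic importance-sampling illustration, assuming sampling probabilities `probs` are already available (for instance from leverage scores); the function names are hypothetical and the code is not taken from the paper. Each sampled row is rescaled by 1/sqrt(r p_i) so that the sampled Gram matrix equals AᵀA in expectation, and the same rescaled rows give a smaller ℓ2 regression problem.

```python
import numpy as np


def sample_rows(A, probs, r, rng):
    """Draw r row indices i.i.d. from probs and return rescaling weights.

    With weights 1 / sqrt(r * p_i), the sampled matrix A_s satisfies
    E[A_s.T @ A_s] = A.T @ A.
    """
    idx = rng.choice(A.shape[0], size=r, replace=True, p=probs)
    scale = 1.0 / np.sqrt(r * probs[idx])
    return idx, scale


def sampled_least_squares(A, b, probs, r, rng):
    """Solve min ||A x - b||_2 approximately using only r sampled rows."""
    idx, scale = sample_rows(A, probs, r, rng)
    A_s = A[idx] * scale[:, None]  # rescaled actual rows of A
    b_s = b[idx] * scale           # matching rescaled entries of b
    x, *_ = np.linalg.lstsq(A_s, b_s, rcond=None)
    return x
```

Because `A_s` consists of rescaled actual rows of `A`, the indices in `idx` say which original data points were kept, which is the interpretability property the paper emphasizes over methods that mix all rows.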

This review was generated by AI and checked by a human editor.