QUICK REVIEW

[Paper Review] Distributed estimation of the inverse Hessian by determinantal averaging

Michał Dereziński, Michael W. Mahoney|arXiv (Cornell University)|May 28, 2019

Statistical Mechanics and Entropy | References: 27 | Citations: 3

One-line summary

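This paper proposes determinantal averaging, a new method for correcting the inversion bias in distributed estimation of the inverse Hessian. Each local inverse-Hessian estimate is weighted in proportion to the determinant of the corresponding local Hessian estimate, and the weighted estimates are then averaged. The resulting estimator is asymptotically consistent: it converges to the true Newton step as the number of local estimates grows. The main contribution is a theoretically grounded approach to communication-efficient distributed Newton's method, with finite-sample concentration guarantees.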

ABSTRACT

In distributed optimization and distributed numerical linear algebra, we often encounter an inversion bias: if we want to compute a quantity that depends on the inverse of a sum of distributed matrices, then the sum of the inverses does not equal the inverse of the sum. An example of this occurs in distributed Newton's method, where we wish to compute (or implicitly work with) the inverse Hessian multiplied by the gradient. In this case, locally computed estimates are biased, and so taking a uniform average will not recover the correct solution. To address this, we propose determinantal averaging, a new approach for correcting the inversion bias. This approach involves reweighting the local estimates of the Newton's step proportionally to the determinant of the local Hessian estimate, and then averaging them together to obtain an improved global estimate. This method provides the first known distributed Newton step that is asymptotically consistent, i.e., it recovers the exact step in the limit as the number of distributed partitions grows to infinity. To show this, we develop new expectation identities and moment bounds for the determinant and adjugate of a random matrix. Determinantal averaging can be applied not only to Newton's method, but to computing any quantity that is a linear transformation of a matrix inverse, e.g., taking a trace of the inverse covariance matrix, which is used in data uncertainty quantification.

Motivation and objectives

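Correct the inversion bias in matrix-inverse estimation: the average of local inverses does not equal the inverse of the average.
Develop a communication-efficient distributed Newton's method that avoids aggregating large matrices across nodes.
Provide theoretical convergence guarantees for a distributed Newton step based on determinant-weighted averaging.
Establish new concentration inequalities for the determinant and adjugate of a random matrix, enabling finite-sample analysis.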
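
To make the inversion bias concrete, here is a one-dimensional illustration. The two-point distribution is an assumption chosen for easy arithmetic, not an example taken from the paper. In one dimension the determinant of x is x itself, so weighting each inverse by the determinant cancels the bias exactly:

```latex
\[
x \in \{1, 3\} \ \text{with probability } \tfrac12 \text{ each}, \qquad \mathbb{E}[x] = 2,
\]
\[
\mathbb{E}\!\left[x^{-1}\right] = \tfrac12\left(1 + \tfrac13\right) = \tfrac23
\;\neq\; \tfrac12 = \left(\mathbb{E}[x]\right)^{-1},
\]
\[
\frac{\mathbb{E}\!\left[\det(x)\,x^{-1}\right]}{\mathbb{E}\!\left[\det(x)\right]}
= \frac{\mathbb{E}[1]}{\mathbb{E}[x]} = \tfrac12 .
\]
```

The uniform average of inverses overshoots (2/3 instead of 1/2), while the determinant-weighted ratio matches the inverse of the mean.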

Proposed method

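Propose determinantal averaging: a weighted average of local inverse-Hessian estimates in which each weight is proportional to the determinant of the corresponding local Hessian estimate.
Justify the weighting scheme with the identity H⁻¹ = E[det(Ĥ)Ĥ⁻¹] / E[det(Ĥ)].
Apply the law of large numbers to show that the weighted average converges almost surely to the true inverse Hessian.
Derive new matrix concentration inequalities for the determinant and adjugate of random matrices subject to rank-one positive semidefinite perturbations.
Use new moment bounds for the determinant of a random matrix to establish high-probability bounds on the convergence rate of the determinantal average.
Apply the method to distributed Newton's method: each node estimates a local step, and the steps are combined in a weighted average with weights proportional to the determinant of the local Hessian estimate.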
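
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the local Hessian estimates are assumed to come from Bernoulli row sampling with probability `p` plus a ridge term `lam`, and the function names (`local_hessian`, `uniform_average`, `determinantal_average`) are hypothetical.

```python
import numpy as np


def local_hessian(X, mask, p, lam):
    """Subsampled Hessian estimate: (1/p) * sum_i b_i x_i x_i^T + lam * I.

    `mask` is a boolean vector of Bernoulli(p) draws b_i (an assumed
    sampling model); the estimate has expectation X^T X + lam * I.
    """
    Xs = X[mask]
    return Xs.T @ Xs / p + lam * np.eye(X.shape[1])


def uniform_average(hessians, g):
    """Plain mean of local Newton steps; biased since E[H_hat^-1] != H^-1."""
    return np.mean([np.linalg.solve(H, g) for H in hessians], axis=0)


def determinantal_average(hessians, g):
    """Mean of local Newton steps weighted by det(H_hat)."""
    steps = np.array([np.linalg.solve(H, g) for H in hessians])
    weights = np.array([np.linalg.det(H) for H in hessians])
    return weights @ steps / weights.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, p, lam, m = 200, 3, 0.1, 0.1, 5000
    X = rng.normal(size=(n, d))
    g = rng.normal(size=d)
    exact = np.linalg.solve(X.T @ X + lam * np.eye(d), g)
    # Each "node" draws its own Bernoulli(p) subsample of the rows.
    hessians = [local_hessian(X, rng.random(n) < p, p, lam) for _ in range(m)]
    print("uniform error:      ",
          np.linalg.norm(uniform_average(hessians, g) - exact))
    print("determinantal error:",
          np.linalg.norm(determinantal_average(hessians, g) - exact))
```

Each node only needs to contribute its d-dimensional step and one scalar weight. For larger d the raw determinant can overflow or underflow, so a practical variant would likely work with log-determinants instead.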

Experimental results

Research questions

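RQ1: Can a principled weighting scheme correct the inversion bias in distributed matrix-inverse estimation?
RQ2: Does determinantal averaging provide an asymptotically consistent estimator of the inverse Hessian in distributed optimization?
RQ3: Can finite-sample concentration guarantees be established for the determinantal average of matrix inverses?
RQ4: Does the method generalize beyond Newton's method to linear functions of a matrix inverse, such as trace estimation?
RQ5: What new moment bounds are needed to establish convergence guarantees for the determinant and adjugate?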

Key findings

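Determinantal averaging yields the first asymptotically consistent estimator of the distributed Newton step: it converges to the exact Newton step as the number of partitions grows to infinity.
The method comes with a high-probability guarantee: with probability at least 1−δ, the determinantal average of the inverse-Hessian estimates lies within a factor of (1 ± η/√m) of the true inverse Hessian, where m is the number of local estimates.
Under suitable conditions on the Hessian and the sampling parameters, the finite-sample error bound for the Newton step is O(η/√m).
The paper establishes new moment bounds for the determinant and adjugate of a random matrix, which are of independent interest in random matrix theory.
Each node computes a local inverse-Hessian estimate and transmits only O(d) parameters, enabling communication-efficient distributed optimization.
The method generalizes to any linear function of a matrix inverse, such as estimating the trace of the inverse covariance matrix for uncertainty quantification.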
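
The generalization to linear functions of the inverse can be sketched as follows. This is an illustrative helper under assumptions, not code from the paper: the name `determinantal_average_of` is hypothetical, and the use of log-determinants with a max shift is a numerical-stability choice added here. It assumes every local matrix is positive definite, so all determinants are positive.

```python
import numpy as np


def determinantal_average_of(f, matrices):
    """Estimate f(H^-1) for a linear f by weighting f(A^-1) with det(A).

    Weights are computed from log-determinants and shifted by their maximum
    before exponentiating, which rescales all weights by a common factor and
    therefore leaves the weighted average unchanged.
    """
    values = np.array([f(np.linalg.inv(A)) for A in matrices])
    logdets = np.array([np.linalg.slogdet(A)[1] for A in matrices])
    weights = np.exp(logdets - logdets.max())
    # Contract the weight vector against the first axis of `values`, so f may
    # return a scalar, a vector, or a matrix.
    return np.tensordot(weights, values, axes=1) / weights.sum()
```

For instance, `determinantal_average_of(np.trace, local_covariances)` would estimate the trace of the inverse covariance matrix from a list of local covariance estimates, and `determinantal_average_of(lambda M: M @ g, local_hessians)` would give a Newton step for a gradient `g`.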

This review was generated by AI and checked by a human editor.