QUICK REVIEW

[Paper Review] Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms

Ping Ma, Xinlian Zhang | arXiv (Cornell University) | Feb 24, 2020

Stochastic Gradient Optimization Techniques | References: 27 | Citations: 31

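One-line summary

This paper derives the asymptotic distributions of RandNLA sampling estimators under both unconditional and conditional inference, and proposes optimal sampling schemes based on the AMSE and EAMSE criteria.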

ABSTRACT

The statistical analysis of Randomized Numerical Linear Algebra (RandNLA) algorithms within the past few years has mostly focused on their performance as point estimators. However, this is insufficient for conducting statistical inference, e.g., constructing confidence intervals and hypothesis testing, since the distribution of the estimator is lacking. In this article, we develop an asymptotic analysis to derive the distribution of RandNLA sampling estimators for the least-squares problem. In particular, we derive the asymptotic distribution of a general sampling estimator with arbitrary sampling probabilities. The analysis is conducted in two complementary settings, i.e., when the objective of interest is to approximate the full sample estimator or is to infer the underlying ground truth model parameters. For each setting, we show that the sampling estimator is asymptotically normally distributed under mild regularity conditions. Moreover, the sampling estimator is asymptotically unbiased in both settings. Based on our asymptotic analysis, we use two criteria, the Asymptotic Mean Squared Error (AMSE) and the Expected Asymptotic Mean Squared Error (EAMSE), to identify optimal sampling probabilities. Several of these optimal sampling probability distributions are new to the literature, e.g., the root leverage sampling estimator and the predictor length sampling estimator. Our theoretical results clarify the role of leverage in the sampling process, and our empirical results demonstrate improvements over existing methods.

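Motivation and objectives

Motivates statistical inference for RandNLA methods in least squares, going beyond point estimation.
Derives the asymptotic distribution of a general RandNLA sampling estimator in two settings: inferring the true model parameters and approximating the full-sample estimator.
Introduces AMSE and EAMSE as criteria for designing optimal sampling probabilities.
Proposes and analyzes new sampling schemes (inverse-covariance, root leverage, predictor-length) and compares them with existing methods.
Provides theoretical results and empirical validation showing asymptotic unbiasedness and improved variance properties.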
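
Method

Expresses the RandNLA sampling estimator as \tilde{\boldsymbol{\beta}} = (X^T W X)^{-1} X^T W Y, where W is a random diagonal matrix.
Derives asymptotic normality of \tilde{\boldsymbol{\beta}} under regularity conditions (first for fixed p, then for diverging p) in the two inference settings.
Defines AMSE and EAMSE to quantify the asymptotic mean squared error and its expectation, and uses them to derive optimal sampling probabilities.
Obtains explicit AMSE expressions for estimating \boldsymbol{\beta}_0, X\boldsymbol{\beta}_0, and X^T X \boldsymbol{\beta}_0, which lead to new sampling schemes.
Proposes optimal schemes: inverse-covariance (IC), root leverage (RL) for X\tilde{\boldsymbol{\beta}}, and predictor-length (PL) for X^T X \tilde{\boldsymbol{\beta}}.
Gives conditions under which the sampling probabilities can be computed efficiently, and discusses their relationship to leverage scores.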
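
The estimator form given in the method, (X^T W X)^{-1} X^T W Y with a random diagonal W, can be illustrated with a minimal NumPy sketch. This review does not specify how W is built, so the sketch assumes r draws with replacement and diagonal entries K_i / (r * pi_i), where K_i counts how often row i was drawn. The function name `sampling_estimator` and the simulated data are hypothetical and are not taken from the paper.

```python
import numpy as np

def sampling_estimator(X, y, probs, r, rng):
    """Weighted least squares on r rows drawn with replacement.

    Computes (X^T W X)^{-1} X^T W y, where W is diagonal with entries
    K_i / (r * probs[i]) and K_i is the number of times row i was drawn.
    The 1 / (r * probs[i]) scaling is an assumption of this sketch.
    """
    n = X.shape[0]
    idx = rng.choice(n, size=r, replace=True, p=probs)
    counts = np.bincount(idx, minlength=n)      # K_i for every row
    w = counts / (r * probs)                    # diagonal of W
    keep = counts > 0                           # rows with zero weight drop out
    Xs, ys, ws = X[keep], y[keep], w[keep]
    XtWX = Xs.T @ (ws[:, None] * Xs)
    XtWy = Xs.T @ (ws * ys)
    return np.linalg.solve(XtWX, XtWy)

# Simulated linear model (illustrative values only).
rng = np.random.default_rng(0)
n, p = 5000, 3
X = rng.normal(size=(n, p))
beta0 = np.array([1.0, -2.0, 0.5])
y = X @ beta0 + rng.normal(size=n)

uniform = np.full(n, 1.0 / n)                   # uniform sampling probabilities
beta_tilde = sampling_estimator(X, y, uniform, r=500, rng=rng)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0] # full-sample OLS for comparison
```

Comparing `beta_tilde` with `beta_ols` corresponds to the setting of approximating the full-sample estimator, while comparing it with `beta0` corresponds to the setting of inferring the true parameters.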
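
Research questions

What are the asymptotic distributions of RandNLA sampling estimators for the LS problem under unconditional and conditional inference?
How can AMSE and EAMSE be used to design optimal sampling probabilities in the RandNLA context?
Do the new sampling schemes (IC, RL, PL) outperform conventional leverage-based or uniform sampling in terms of AMSE/EAMSE?
How do these results extend to the fixed-p and diverging-p cases?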
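
Key findings

The sampling estimator is asymptotically normal and asymptotically unbiased in both the unconditional and conditional settings.
The asymptotic variance combines the full-sample OLS variance with a sandwich-type term that depends on the reciprocals of the sampling probabilities.
Inverse-covariance (IC) sampling minimizes the AMSE for estimating \boldsymbol{\beta}_0.
Root leverage (RL) sampling minimizes the AMSE for estimating X\boldsymbol{\beta}_0, owing to its leverage structure.
Predictor-length (PL) sampling minimizes the AMSE for estimating X^T X \boldsymbol{\beta}_0 and is related to the Fisher information.
Empirical results on both synthetic and real data show that the proposed estimators reduce variance and improve performance.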
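
The three sampling schemes named in the findings can be sketched as row-wise scores normalized to sum to one. This review does not state the formulas, so the sketch assumes the forms the names suggest: IC proportional to ||(X^T X)^{-1} x_i||, RL proportional to the square root of the leverage score h_ii, and PL proportional to ||x_i||. These forms, and the helper name `sampling_probabilities`, are assumptions and should be checked against the paper.

```python
import numpy as np

def sampling_probabilities(X):
    """Candidate sampling distributions over the rows of X.

    Assumes X has full column rank. The IC, RL, and PL formulas are
    assumptions of this sketch, inferred from the scheme names.
    """
    # Thin QR: the leverage score h_ii is the squared norm of row i of Q.
    Q, _ = np.linalg.qr(X)
    leverage = np.sum(Q**2, axis=1)
    # Row i of this matrix is ((X^T X)^{-1} x_i)^T.
    ic_rows = np.linalg.solve(X.T @ X, X.T).T
    scores = {
        "uniform": np.ones(X.shape[0]),
        "leverage": leverage,
        "IC": np.linalg.norm(ic_rows, axis=1),   # inverse-covariance (assumed form)
        "RL": np.sqrt(leverage),                 # root leverage (assumed form)
        "PL": np.linalg.norm(X, axis=1),         # predictor length (assumed form)
    }
    # Normalize each score vector into a probability distribution.
    return {name: s / s.sum() for name, s in scores.items()}

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
probs = sampling_probabilities(X)
```

Under these assumed forms, when the columns of X are orthonormal, IC, RL, and PL coincide, because X^T X is the identity and h_ii equals ||x_i||^2. The schemes differ only when the predictors are correlated or unevenly scaled.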

This review was generated by AI and checked by a human editor.