QUICK REVIEW

[Paper Review] Sampling-Based Accuracy Testing of Posterior Estimators for General Inference

Pablo Lemos, Adam Coogan|arXiv (Cornell University)|Feb 6, 2023

Bayesian Methods and Mixture Models | Cited by 18

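One-line summary

Introduces TARP, a sampling-based coverage test that assesses the accuracy of posterior estimators encoded in generative models, and proves that passing the test is a necessary and sufficient condition for accuracy, without evaluating the posterior directly.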

ABSTRACT

Parameter inference, i.e. inferring the posterior distribution of the parameters of a statistical model given some data, is a central problem to many scientific disciplines. Generative models can be used as an alternative to Markov Chain Monte Carlo methods for conducting posterior inference, both in likelihood-based and simulation-based problems. However, assessing the accuracy of posteriors encoded in generative models is not straightforward. In this paper, we introduce "Tests of Accuracy with Random Points" (TARP) coverage testing as a method to estimate coverage probabilities of generative posterior estimators. Our method differs from previously-existing coverage-based methods, which require posterior evaluations. We prove that our approach is necessary and sufficient to show that a posterior estimator is accurate. We demonstrate the method on a variety of synthetic examples, and show that TARP can be used to test the results of posterior inference analyses in high-dimensional spaces. We also show that our method can detect inaccurate inferences in cases where existing methods fail.

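Motivation and objectives

Enable robust assessment of posterior estimators when explicit posterior evaluations are unavailable.
Develop a theoretically grounded coverage-testing framework for certifying posterior accuracy.
Provide a practical algorithm (TARP) that implements the necessary and sufficient condition for posterior accuracy.
Demonstrate the method on synthetic data and on high-dimensional problems, including gravitational lensing images.
Offer guidance on choosing the reference-point distribution and the distance metric used in the test.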

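Proposed method

Define a posterior estimator as accurate when it equals the true posterior for all (x, θ).
Introduce positionable credible region generators and compute the expected coverage probability (ECP).
Prove that correct expected coverage for every position implies exact recovery of the posterior (Theorem 3).
Develop TARP (Tests of Accuracy with Random Points) to estimate the ECP without requiring explicit posterior evaluations.
Propose a practical algorithm (Algorithm 2) that draws samples from the posterior estimator, selects a random reference point θ_r, and forms a random-point region using a distance metric.
Demonstrate robustness to the choice of θ_r distribution and distance metric, and compare against HPD-based coverage.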
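The sampling steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' Algorithm 2 verbatim: the function name `tarp_coverage`, the Euclidean distance, the uniform reference box, and the conjugate Gaussian toy problem are all choices made here for the example.

```python
import numpy as np


def tarp_coverage(samples, theta_true, theta_ref, num_levels=21):
    """Estimate expected coverage using random-point regions (illustrative sketch).

    samples:    (n_sims, n_samples, dim) draws from the estimator, one set per simulation
    theta_true: (n_sims, dim) parameters that generated each simulation
    theta_ref:  (n_sims, dim) reference points, drawn without looking at theta_true
    """
    # Euclidean distance from the reference point to each posterior draw.
    d_samples = np.linalg.norm(samples - theta_ref[:, None, :], axis=-1)
    # Distance from the reference point to the true parameter.
    d_true = np.linalg.norm(theta_true - theta_ref, axis=-1)
    # Fraction of draws inside the ball centred on theta_ref that passes through theta_true.
    f = (d_samples < d_true[:, None]).mean(axis=1)
    # Empirical coverage: share of simulations whose fraction is below each level.
    levels = np.linspace(0.0, 1.0, num_levels)
    ecp = (f[:, None] < levels[None, :]).mean(axis=0)
    return levels, ecp


# Assumed toy problem: theta ~ N(0, I) and x = theta + N(0, I),
# so the exact posterior is N(x / 2, I / 2).
rng = np.random.default_rng(0)
n_sims, n_samples, dim = 2000, 300, 2
theta_true = rng.normal(size=(n_sims, dim))
x = theta_true + rng.normal(size=(n_sims, dim))
noise = rng.normal(size=(n_sims, n_samples, dim))
post_mean = x[:, None, :] / 2.0
post_std = np.sqrt(0.5)

# Three hypothetical estimators: exact, shifted mean, and too-narrow width.
estimators = {
    "exact": post_mean + post_std * noise,
    "biased": post_mean + 1.0 + post_std * noise,
    "overconfident": post_mean + 0.5 * post_std * noise,
}

# Reference points drawn uniformly in a fixed box, independent of theta_true.
theta_ref = rng.uniform(-4.0, 4.0, size=(n_sims, dim))

max_dev = {}
for name, samples in estimators.items():
    levels, ecp = tarp_coverage(samples, theta_true, theta_ref)
    max_dev[name] = float(np.abs(ecp - levels).max())
    print(f"{name}: max |ECP - level| = {max_dev[name]:.3f}")
```

For an accurate estimator the estimated coverage should track the diagonal (ECP close to the credibility level), while the shifted and too-narrow estimators should depart from it. One caveat of this simplified setup: the references here are drawn independently of x, and under that choice an estimator that ignores the data and returns the prior would also pass, so a reference distribution that depends on x may be preferable in practice.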

Figure 1: A graphical illustration of the proposed coverage test for assessing the quality of a posterior estimator $\hat{p}$. Given a set of simulations (panels), we draw samples from the posterior estimator (orange points). We sample a reference parameter point $\theta_{r}$, and determine the fr

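Experimental results

Research questions

RQ1: Can the accuracy of a posterior estimator be certified through coverage checks that do not require evaluating the posterior density?
RQ2: Is correct expected coverage for every positionable credible region a necessary and sufficient condition for posterior accuracy (Theorem 3)?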
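To make RQ2 concrete, the expected coverage of a credible region generator can be written out as below. The notation is an assumption chosen for this review and may differ from the paper's: $\mathcal{G}(\hat{p}, \alpha, x)$ denotes a region that holds mass $1 - \alpha$ under the estimator $\hat{p}(\theta \mid x)$, and $p(\theta, x)$ is the joint distribution of parameters and data.

$$
\int_{\mathcal{G}(\hat{p}, \alpha, x)} \hat{p}(\theta \mid x)\, d\theta = 1 - \alpha,
\qquad
\mathrm{ECP}(\hat{p}, \alpha, \mathcal{G}) = \mathbb{E}_{p(\theta, x)}\left[ \mathbb{1}\left( \theta \in \mathcal{G}(\hat{p}, \alpha, x) \right) \right]
$$

In this notation, an accurate estimator satisfies $\mathrm{ECP}(\hat{p}, \alpha, \mathcal{G}) = 1 - \alpha$ for every level $\alpha$, and RQ2 asks whether requiring this for every positionable region is also enough to force $\hat{p}$ to equal the true posterior.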

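Key findings

TARP provides an exact diagnostic: correct expected coverage across all random-point regions implies that the posterior estimator is accurate.
HPD-based coverage can be blind to certain biases and to uninformative posteriors, whereas TARP can detect such problems.
TARP is robust to the choice of reference-point distribution and distance metric in high-dimensional settings.
The method successfully detects inaccuracies in synthetic Gaussian toy models and in a high-dimensional gravitational lensing source reconstruction task.
Experiments show that TARP identifies overconfident and underconfident posteriors, as well as biases that HPD coverage misses.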

Figure 2: Results on the Gaussian toy model for all four cases described in § 4.1. The red line shows the method presented in this paper, while the blue shows the HPD region.

This review was generated by AI and checked by a human editor.