QUICK REVIEW

[Paper Review] Sampling-Based Accuracy Testing of Posterior Estimators for General Inference

Pablo Lemos, Adam Coogan|arXiv (Cornell University)|2023. 02. 06.

Bayesian Methods and Mixture Models | Citations: 18

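One-Line Summary

Introduces TARP, a sampling-based coverage test for assessing the accuracy of posterior estimators built from generative models; the authors prove that passing the test is a necessary and sufficient condition for accuracy, without evaluating the posterior density directly.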

ABSTRACT

Parameter inference, i.e. inferring the posterior distribution of the parameters of a statistical model given some data, is a central problem to many scientific disciplines. Generative models can be used as an alternative to Markov Chain Monte Carlo methods for conducting posterior inference, both in likelihood-based and simulation-based problems. However, assessing the accuracy of posteriors encoded in generative models is not straightforward. In this paper, we introduce 'Tests of Accuracy with Random Points' (TARP) coverage testing as a method to estimate coverage probabilities of generative posterior estimators. Our method differs from previously-existing coverage-based methods, which require posterior evaluations. We prove that our approach is necessary and sufficient to show that a posterior estimator is accurate. We demonstrate the method on a variety of synthetic examples, and show that TARP can be used to test the results of posterior inference analyses in high-dimensional spaces. We also show that our method can detect inaccurate inferences in cases where existing methods fail.

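Motivation and Objectives

Enable robust evaluation of posterior estimators even when explicit posterior evaluation is not possible.
Develop a theoretically grounded coverage-testing framework for certifying posterior accuracy.
Provide a practical algorithm (TARP) that implements the necessary and sufficient condition for posterior accuracy.
Demonstrate the method on synthetic and high-dimensional problems, including gravitational lensing.
Offer guidance on choosing the reference-point distribution and the distance metric used in the test.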

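Proposed Method

Defines the accuracy of a posterior estimator as equality with the true posterior for all (x, θ).
Introduces positionable credible region generators and computes the expected coverage probability (ECP).
Proves that correct expected coverage for every positioning implies exact recovery of the posterior (Theorem 3).
Develops TARP (Tests of Accuracy with Random Points) to estimate the ECP without requiring explicit posterior evaluations.
Proposes a practical algorithm (Algorithm 2) that draws samples from the posterior estimator, selects a random reference point θ_r, and uses a distance metric to form the random-point regions.
Demonstrates robustness to the choice of θ_r distribution and distance metric, and compares the results against coverage based on highest posterior density (HPD) regions.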
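
The steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `tarp_ecp`, the uniform reference-point distribution over the box spanned by the true parameters, the Euclidean distance, and the conjugate Gaussian toy setup are all choices made here for demonstration.

```python
import numpy as np


def tarp_ecp(samples, theta_true, rng, n_levels=21):
    """Estimate expected coverage with random reference points (sketch).

    samples:    (n_sims, n_samples, n_dims) draws from the posterior estimator
    theta_true: (n_sims, n_dims) parameters that generated each simulation
    Returns credibility levels and the estimated coverage at each level.
    """
    n_sims, _, n_dims = samples.shape
    # Assumption: reference points are uniform over the box spanned by the
    # true parameters; other distributions are possible.
    lo, hi = theta_true.min(axis=0), theta_true.max(axis=0)
    theta_ref = rng.uniform(lo, hi, size=(n_sims, n_dims))
    # Assumption: Euclidean distance to the reference point.
    d_samples = np.linalg.norm(samples - theta_ref[:, None, :], axis=-1)
    d_true = np.linalg.norm(theta_true - theta_ref, axis=-1)
    # Fraction of estimator samples closer to the reference than the truth.
    f = (d_samples < d_true[:, None]).mean(axis=1)
    levels = np.linspace(0.0, 1.0, n_levels)
    # Coverage at each level: share of simulations whose fraction is below it.
    ecp = np.array([(f < lvl).mean() for lvl in levels])
    return levels, ecp


# Toy problem (assumed here): theta ~ N(0, I), x = theta + sigma * noise,
# so the exact posterior is Gaussian with the mean and std computed below.
rng = np.random.default_rng(0)
n_sims, n_samples, n_dims, sigma = 2000, 500, 2, 0.5
theta = rng.normal(size=(n_sims, n_dims))
x = theta + sigma * rng.normal(size=(n_sims, n_dims))
post_mean = x / (1 + sigma**2)
post_std = sigma / np.sqrt(1 + sigma**2)


def draw(mean, std):
    """Draw n_samples Gaussian samples per simulation."""
    noise = rng.normal(size=(n_sims, n_samples, n_dims))
    return mean[:, None, :] + std * noise


levels, ecp_exact = tarp_ecp(draw(post_mean, post_std), theta, rng)
_, ecp_over = tarp_ecp(draw(post_mean, 0.5 * post_std), theta, rng)

# Largest gap between estimated coverage and the nominal credibility level.
err_exact = np.abs(ecp_exact - levels).max()
err_over = np.abs(ecp_over - levels).max()
print(f"exact posterior: {err_exact:.3f}, overconfident: {err_over:.3f}")
```

For the exact posterior the estimated coverage should track the diagonal up to sampling noise, while the estimator with its standard deviation halved should deviate visibly. Only samples from the estimator are used; its density is never evaluated.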

Figure 1: A graphical illustration of the proposed coverage test for assessing the quality of a posterior estimator $\hat{p}$. Given a set of simulations (panels), we draw samples from the posterior estimator (orange points). We sample a reference parameter point $\theta_{r}$, and determine the fr

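Experimental Results

Research Questions

RQ1: Can the accuracy of a posterior estimator be certified by a coverage test that does not require posterior density evaluations?
RQ2: Is correct expected coverage for all positionable credible regions a necessary and sufficient condition for posterior accuracy (Theorem 3)?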
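
The condition asked about in RQ2 can be written compactly. The notation below is a paraphrase assumed for this review rather than copied from the paper: $G(\hat{p}, \alpha, x, \theta_{r})$ denotes a credible region of level $1-\alpha$ built from the estimator $\hat{p}$ and positioned by the reference point $\theta_{r}$, and $\mathbb{1}$ is the indicator function.

$$
\mathrm{ECP}(\hat{p}, \alpha, G) = \mathbb{E}_{p(\theta, x)}\left[ \mathbb{1}\left( \theta \in G(\hat{p}, \alpha, x, \theta_{r}) \right) \right]
$$

$$
\hat{p}(\theta \mid x) = p(\theta \mid x) \;\; \forall\, (x, \theta)
\quad \Longleftrightarrow \quad
\mathrm{ECP}(\hat{p}, \alpha, G) = 1 - \alpha \;\; \forall\, \alpha \in (0, 1) \text{ and all positionings } \theta_{r}
$$

Read this way, RQ2 asks whether matching the nominal level for every positioning, not just for one fixed family of regions such as HPD regions, is what pins down the posterior.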

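Key Results

TARP provides an exact diagnostic: correct expected coverage across random-point regions implies that the posterior estimator is accurate.
HPD-based coverage can be blind to certain biases or to uninformative posteriors, whereas TARP can detect these problems.
TARP results are robust to the choice of reference-point distribution and distance metric in high-dimensional settings.
The method successfully detects inaccuracies in a synthetic Gaussian toy model and in a high-dimensional gravitational lensing source reconstruction task.
Experiments show that TARP identifies overconfident and underconfident posteriors, as well as biases that HPD coverage misses.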

Figure 2: Results on the Gaussian toy model for all four cases described in § 4.1. The red line shows the method presented in this paper, while the blue shows the HPD region.

This review was generated by AI and checked by a human editor.