QUICK REVIEW

[Paper Review] Uncertainty in Bayesian Leave-One-Out Cross-Validation Based Model Comparison

Tuomas Sivula, Måns Magnusson|arXiv (Cornell University)|2020. 08. 24.

Statistical Methods and Bayesian Inference | References: 24 | Citations: 100

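One-line summary

This paper analyzes the uncertainty of Bayesian LOO-CV when it is used to compare two models. It shows that standard error estimates can be unreliable in finite samples, particularly when the models make similar predictions, are misspecified, or the data are scarce; it compares normal and Bayesian bootstrap approximations of that uncertainty and offers practical guidance.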

ABSTRACT

It is useful to estimate the expected predictive performance of models planned to be used for prediction. We focus on leave-one-out cross-validation (LOO-CV), which has become a popular method for estimating predictive performance of Bayesian models. Given two models, we are interested in comparing the predictive performances and associated uncertainty, which can also be used to compute the probability of one model having better predictive performance than the other model. We study the properties of the Bayesian LOO-CV estimator and the related uncertainty quantification for the predictive performance difference, and analyse when a normal approximation of this uncertainty is well calibrated and whether taking into account higher moments could improve the approximation. We provide new results of the properties both theoretically in the linear regression case and empirically for hierarchical linear, latent linear, and spline models and discuss the challenges. We show that problematic cases include: comparing models with similar predictions, misspecified models, and small data. In these cases, there is a weak connection between the distributions of the LOO-CV estimator and its error. We show that the problematic skewness of the error distribution for the difference, which occurs when the models make similar predictions, does not fade away when the data size grows to infinity in certain situations. Based on the results, we also provide some practical recommendations for the users of Bayesian LOO-CV for comparing predictive performance of models.

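Research motivation and objectives

Assess how the uncertainty in the elpd difference behaves when models are compared with Bayesian LOO-CV.
Identify situations in which standard uncertainty estimates are unreliable (e.g., similar predictions, model misspecification, small data).
Analyze the theoretical and empirical properties of LOO-CV uncertainty in normal linear regression and in other models.
Provide practical recommendations for practitioners who use Bayesian LOO-CV.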

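Proposed method

Formalize the elpd and its LOO-CV estimator for model comparison.
Analyze the uncertainty of the difference elpd(Ma, Mb|y) through the error err_LOO and its distribution p(err_LOO).
Compare two approximations to the error distribution: a normal approximation and a Dirichlet-weighted Bayesian bootstrap.
Derive analytical results for normal linear regression and check them with experiments on several models.
Use the probability integral transform (PIT) to evaluate whether the approximated uncertainty is calibrated against the oracle distribution.
Discuss asymptotic behavior and finite-sample problems, including skewness and misspecification.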
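
The two approximations compared above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes you already have the pointwise LOO log predictive density differences between the two models (here replaced by hypothetical simulated numbers), and the function names `normal_approx` and `bayesian_bootstrap` are invented for this example. The normal approximation uses the usual standard error sqrt(n) times the sample standard deviation of the pointwise differences; the Bayesian bootstrap reweights the same pointwise differences with Dirichlet(1, ..., 1) weights.

```python
import numpy as np
from math import erf, sqrt


def normal_approx(pointwise_diff):
    """Normal approximation for the summed elpd difference.

    Returns the point estimate, its standard error, and the implied
    probability that model A predicts better than model B.
    """
    d = np.asarray(pointwise_diff, dtype=float)
    total = d.sum()
    se = sqrt(d.size) * d.std(ddof=1)
    # Standard normal CDF evaluated at total / se.
    prob_a_better = 0.5 * (1.0 + erf(total / (se * sqrt(2.0))))
    return total, se, prob_a_better


def bayesian_bootstrap(pointwise_diff, n_draws=4000, seed=0):
    """Bayesian bootstrap draws of the summed elpd difference.

    Each draw reweights the pointwise differences with Dirichlet(1, ..., 1)
    weights and rescales by n so that it is on the scale of the sum.
    """
    d = np.asarray(pointwise_diff, dtype=float)
    rng = np.random.default_rng(seed)
    weights = rng.dirichlet(np.ones(d.size), size=n_draws)  # shape (n_draws, n)
    return d.size * (weights @ d)


# Hypothetical pointwise differences (model A minus model B); in practice
# these would come from a LOO-CV computation on real fitted models.
demo_rng = np.random.default_rng(1)
diff = demo_rng.normal(loc=0.05, scale=0.5, size=200)

total, se, p_normal = normal_approx(diff)
draws = bayesian_bootstrap(diff)
p_bb = float(np.mean(draws > 0))

print(f"elpd difference estimate: {total:.2f} (SE {se:.2f})")
print(f"P(A better), normal approximation: {p_normal:.3f}")
print(f"P(A better), Bayesian bootstrap:   {p_bb:.3f}")
```

Both approximations are built from the same pointwise values, so neither can correct for information that those values do not carry; this is consistent with the review's point that the bootstrap does not always improve on the normal approximation.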

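Experimental results

Research questions

RQ1: How reliable are standard uncertainty estimates for the difference in predictive performance when two models are compared with Bayesian LOO-CV?
RQ2: In which scenarios do the normal approximation or the Bayesian bootstrap approximation fail or become poorly calibrated?
RQ3: How do skewness, misspecification, and small sample sizes affect the uncertainty of LOO-CV model comparison?
RQ4: Do the results generalize to models beyond normal linear regression (e.g., hierarchical models, Poisson GLMs, splines)?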

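Key findings

The uncertainty of the LOO-CV difference can be unreliable in finite samples, particularly when the models make similar predictions, are misspecified, or the data are limited.
The error distribution of the LOO-CV estimator can be highly skewed, which makes the normal approximation unreliable in some scenarios.
Misspecification and outliers bias LOO-CV estimates and inflate their variance, which affects model comparison conclusions.
Certain problematic skewness patterns persist even at larger data sizes and can undermine inference about which model is better.
In practice, the Bayesian bootstrap does not always outperform the normal approximation for the uncertainty of the elpd difference.
The normal linear regression results extend qualitatively to other models, and similar behavior is observed for Bayesian K-fold CV.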
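
To get a feel for the "similar predictions" case, the simulation below is one possible sketch. It is a deliberately simplified, non-Bayesian stand-in and does not reproduce the paper's experiments: it assumes ordinary least squares with a plug-in Gaussian predictive and known unit noise variance, and it looks at the sampling distribution of the LOO difference estimate rather than at the error err_LOO relative to the true elpd. The helper names (`loo_log_density`, `simulate_elpd_diff`, `sample_skewness`) and all settings are hypothetical. Model A has an intercept only; model B adds one covariate whose true coefficient is `beta`, so `beta = 0` makes the two models predict almost identically.

```python
import numpy as np


def loo_log_density(X, y):
    """Pointwise LOO log predictive density for an OLS fit.

    Assumption: plug-in Gaussian predictive with known unit noise variance.
    Uses the exact OLS identity: LOO residual = residual / (1 - leverage).
    """
    Q, _ = np.linalg.qr(X)               # reduced QR, Q has shape (n, p)
    leverage = np.sum(Q**2, axis=1)      # diagonal of the hat matrix
    residual = y - Q @ (Q.T @ y)         # full-data OLS residuals
    loo_residual = residual / (1.0 - leverage)
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * loo_residual**2


def simulate_elpd_diff(n, beta, n_rep, seed=0):
    """Simulate the LOO elpd difference (A minus B) over repeated data sets."""
    rng = np.random.default_rng(seed)
    diffs = np.empty(n_rep)
    for r in range(n_rep):
        x = rng.normal(size=n)
        y = beta * x + rng.normal(size=n)
        X_a = np.ones((n, 1))                   # model A: intercept only
        X_b = np.column_stack([np.ones(n), x])  # model B: intercept + covariate
        diffs[r] = np.sum(loo_log_density(X_a, y) - loo_log_density(X_b, y))
    return diffs


def sample_skewness(v):
    """Moment-based sample skewness."""
    z = (v - v.mean()) / v.std()
    return float(np.mean(z**3))


results = {}
for beta in (0.0, 1.0):
    d = simulate_elpd_diff(n=50, beta=beta, n_rep=2000)
    results[beta] = d
    print(f"beta={beta}: mean diff {d.mean():.2f}, skewness {sample_skewness(d):.2f}")
```

Comparing the printed skewness for `beta = 0.0` and `beta = 1.0` gives a rough sense of how the shape of the estimate's distribution depends on how different the two models' predictions are; increasing `n` lets you check whether that shape changes with more data in this toy setting.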

This review was generated by AI and reviewed by a human editor.