QUICK REVIEW

[Paper Review] Bayesian leave-one-out cross-validation approximations for Gaussian latent variable models

Aki Vehtari, Tommi Mononen | arXiv (Cornell University) | Dec 23, 2014
Gaussian Processes and Bayesian Inference | 56 references | 58 citations
TL;DR

This paper proposes fast, accurate approximations for Bayesian leave-one-out cross-validation (LOO) in Gaussian latent variable models using Laplace and expectation propagation (EP) posterior approximations. It demonstrates that computing LOO with a Gaussian approximation to the LOO marginal distribution (cavity distribution) is the most accurate and efficient method, requiring negligible additional cost after full-data posterior inference.

ABSTRACT

The future predictive performance of a Bayesian model can be estimated using Bayesian cross-validation. In this article, we consider Gaussian latent variable models where the integration over the latent values is approximated using the Laplace method or expectation propagation (EP). We study the properties of several Bayesian leave-one-out (LOO) cross-validation approximations that in most cases can be computed with a small additional cost after forming the posterior approximation given the full data. Our main objective is to assess the accuracy of the approximative LOO cross-validation estimators. That is, for each method (Laplace and EP) we compare the approximate fast computation with the exact brute force LOO computation. Secondarily, we evaluate the accuracy of the Laplace and EP approximations themselves against a ground truth established through extensive Markov chain Monte Carlo simulation. Our empirical results show that the approach based upon a Gaussian approximation to the LOO marginal distribution (the so-called cavity distribution) gives the most accurate and reliable results among the fast methods.
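For orientation, the quantities the abstract refers to can be written out as follows. This is a sketch in assumed notation ($D_{-i}$ for the data without observation $i$, $q$ for the Laplace or EP posterior approximation, $\tilde{t}_i$ for the Gaussian site term of observation $i$); the symbols are not taken from the review itself.

```latex
% LOO predictive density of observation i, given all other observations
p(y_i \mid x_i, D_{-i}) = \int p(y_i \mid f_i)\, p(f_i \mid x_i, D_{-i})\, \mathrm{d}f_i

% LOO estimate of predictive performance (mean log predictive density)
\mathrm{LOO} = \frac{1}{n} \sum_{i=1}^{n} \log p(y_i \mid x_i, D_{-i})

% Gaussian cavity approximation to the LOO marginal of f_i:
% remove the i-th site term from the full-data marginal posterior
q_{-i}(f_i) \propto \frac{q(f_i \mid D)}{\tilde{t}_i(f_i)}
```

Replacing $p(f_i \mid x_i, D_{-i})$ in the first line with $q_{-i}(f_i)$ gives the fast estimate, which needs only one-dimensional integrals once the full-data posterior approximation is available.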

Motivation & Objective

  • To develop efficient and accurate approximations for Bayesian leave-one-out cross-validation (LOO) in Gaussian latent variable models (GLVMs).
  • To evaluate the accuracy of LOO approximations based on Laplace and expectation propagation (EP) methods against exact brute-force LOO computation.
  • To compare the performance of different LOO approximation techniques in terms of accuracy and computational cost.
  • To assess the reliability of the Laplace and EP posterior approximations themselves using Markov chain Monte Carlo (MCMC) as a ground truth.
  • To establish that the cavity distribution-based LOO approximation offers the best balance of accuracy and computational efficiency.

Proposed method

  • Use the Laplace method and expectation propagation (EP) to approximate the posterior distribution over latent variables in GLVMs.
  • Apply a Gaussian approximation to the LOO marginal distribution (cavity distribution) to compute fast LOO cross-validation estimates.
  • Compute LOO approximations with minimal additional cost after forming the full-data posterior approximation.
  • Use the cavity distribution approach to estimate the predictive performance of each left-out observation.
  • Compare the cavity-based LOO with alternative fast LOO methods, including moment corrections (LA-CM2, EP-FACT), and with exact brute-force LOO as the reference.
  • Leverage the factorizing likelihood structure of GLVMs to enable efficient computation of LOO without re-running full posterior inference.
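The cavity step in the list above can be sketched in a few lines. This is a minimal illustration, not the paper's or GPstuff's implementation: it assumes the full-data fit already provides, for each observation, a Gaussian marginal posterior and a Gaussian site term (from EP, or from the local Gaussian approximation of the likelihood in the Laplace case), and it uses a probit likelihood with labels in {-1, +1} because the resulting one-dimensional integral has a closed form. The function names are hypothetical.

```python
import math

def cavity_from_site(mu_post, var_post, mu_site, var_site):
    """Divide a Gaussian site term out of a Gaussian marginal posterior.

    Works in natural parameters: precisions subtract, and so do
    precision-weighted means. Returns the cavity mean and variance.
    """
    prec_cav = 1.0 / var_post - 1.0 / var_site
    if prec_cav <= 0.0:
        raise ValueError("cavity precision must be positive")
    var_cav = 1.0 / prec_cav
    mu_cav = var_cav * (mu_post / var_post - mu_site / var_site)
    return mu_cav, var_cav

def probit_loo_log_density(y, mu_cav, var_cav):
    """log of the integral of Phi(y * f) * N(f | mu_cav, var_cav) over f.

    For the probit likelihood this integral equals
    Phi(y * mu_cav / sqrt(1 + var_cav)), with y in {-1, +1}.
    """
    z = y * mu_cav / math.sqrt(1.0 + var_cav)
    # Standard normal CDF written with the complementary error function.
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))
```

Averaging `probit_loo_log_density` over all observations gives the LOO estimate of the log predictive density. For likelihoods without a closed-form integral, the last step would instead use one-dimensional numerical quadrature.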

Experimental results

Research questions

  • RQ1: How accurate are the fast LOO cross-validation approximations based on Laplace and EP methods compared with exact brute-force LOO computation?
  • RQ2: Which LOO approximation method (cavity distribution, moment correction, or direct approximation) provides the most accurate predictive performance estimate?
  • RQ3: What is the computational overhead of computing LOO using the cavity distribution approach compared to standard posterior inference?
  • RQ4: How do the Laplace and EP approximations compare to MCMC-based ground truth in terms of posterior accuracy?
  • RQ5: Can the cavity distribution method be reliably used for model selection and performance assessment in GLVMs with minimal computational cost?
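The fast-versus-brute-force comparison behind RQ1 can be illustrated on a toy case where both sides are exact. The sketch below is an assumption-laden example, not a model from the paper: a single shared latent value f with prior N(0, prior_var) and Gaussian observations y_i ~ N(f, noise_var). With a Gaussian likelihood the site term is the likelihood itself, so the cavity-based LOO must match refitting without each observation. For non-Gaussian likelihoods the sites come from Laplace or EP and are only approximate, which is the gap the paper measures. All function names are hypothetical.

```python
import math

def posterior(ys, prior_var, noise_var):
    """Posterior N(mean, var) of a shared latent f with prior N(0, prior_var)."""
    prec = 1.0 / prior_var + len(ys) / noise_var
    mean = (sum(ys) / noise_var) / prec
    return mean, 1.0 / prec

def log_normal_pdf(y, mean, var):
    """Log density of N(y | mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)

def loo_brute_force(ys, prior_var, noise_var):
    """Refit the posterior n times, each time leaving one observation out."""
    out = []
    for i, y in enumerate(ys):
        m, v = posterior(ys[:i] + ys[i + 1:], prior_var, noise_var)
        out.append(log_normal_pdf(y, m, v + noise_var))
    return out

def loo_cavity(ys, prior_var, noise_var):
    """Fit once on all data, then remove each observation's site term."""
    m, v = posterior(ys, prior_var, noise_var)
    out = []
    for y in ys:
        # The site for observation y is N(f | y, noise_var) as a function of f.
        var_cav = 1.0 / (1.0 / v - 1.0 / noise_var)
        mu_cav = var_cav * (m / v - y / noise_var)
        out.append(log_normal_pdf(y, mu_cav, var_cav + noise_var))
    return out
```

Calling both functions on the same data, for example `loo_brute_force([0.3, -1.2, 0.8, 2.0], 4.0, 0.5)` and `loo_cavity([0.3, -1.2, 0.8, 2.0], 4.0, 0.5)`, should return the same per-observation log densities up to floating-point error, while the cavity version performs only one posterior fit.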

Key findings

  • The cavity distribution-based LOO approximation provides the most accurate and reliable results among fast LOO methods, outperforming moment correction and direct approximation techniques.
  • The cavity-based method incurs negligible additional computational cost after full-data posterior approximation, making it highly efficient.
  • For the probit likelihood, GPstuff-EP is 1.5–5 times slower than GPstuff-LA, while for log-logistic with censoring, GPstuff-EP is about 18 times slower due to slow quadrature-based moment computations.
  • GPstuff-EP is 10–25 times faster than GPML-EP due to better vectorization and parallel updates, highlighting implementation efficiency.
  • The global Gaussian variational (KL) method is 70–500 times slower than GPML-EP, confirming its high computational overhead despite similar O(n³) scaling.
  • For Student’s t likelihood, the robust-EP implementation in GPstuff performs well, while GPML-KL fails to converge properly, showing significant performance degradation.

This review was created by AI and reviewed by human editors.