QUICK REVIEW

[Paper Review] Restricting exchangeable nonparametric distributions

Sinead A. Williamson, Steve MacEachern|arXiv (Cornell University)|Sep 5, 2012
Gaussian Processes and Bayesian Inference|24 references|7 citations
TL;DR

This paper proposes a class of exchangeable nonparametric priors that restrict the domain of existing models such as the Indian buffet process (IBP), giving explicit control over the distribution of the number of features per data point. Modifying the prior to enforce a specific marginal distribution (for example, a fixed number of features or a heavy-tailed distribution) improves interpretability and predictive performance on data where the IBP's Poisson-distributed feature counts are ill-suited, as demonstrated on synthetic images and text data.

ABSTRACT

Distributions over exchangeable matrices with infinitely many columns are useful in constructing nonparametric latent variable models. However, the distribution implied by such models over the number of features exhibited by each data point may be poorly-suited for many modeling tasks. In this paper, we propose a class of exchangeable nonparametric priors obtained by restricting the domain of existing models. Such models allow us to specify the distribution over the number of features per data point, and can achieve better performance on data sets where the number of features is not well-modeled by the original distribution.

Motivation & Objective

  • To address a limitation of existing exchangeable nonparametric models such as the IBP, under which the number of features per data point is Poisson-distributed.
  • To develop a framework that allows researchers to explicitly specify the marginal distribution over the number of features per data point.
  • To improve model performance and interpretability in settings where the Poisson assumption is inappropriate, such as in text data with power-law word usage or image data with fixed feature counts.
  • To provide a principled method for restricting the domain of completely random measures in exchangeable matrix models.
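
To make the Poisson limitation concrete, the following minimal sketch compares the upper tail of a Poisson distribution with that of a negative binomial distribution with the same mean. The mean of 5, the threshold of 15, and the negative binomial parameters are illustrative assumptions, not values taken from the paper.

```python
import math

def poisson_tail(t, lam):
    """P(X >= t) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
              for k in range(t))
    return 1.0 - cdf

def neg_binomial_tail(t, r, p):
    """P(X >= t) for X ~ NegBinomial(r, p), counting failures before the
    r-th success; the mean is r * (1 - p) / p."""
    def log_pmf(k):
        return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
                + r * math.log(p) + k * math.log(1.0 - p))
    return 1.0 - sum(math.exp(log_pmf(k)) for k in range(t))

# Both distributions have mean 5 (assumed for illustration only).
tail_poisson = poisson_tail(15, 5.0)
tail_negbin = neg_binomial_tail(15, 1.0, 1.0 / 6.0)
```

With these assumed settings the negative binomial places far more mass on large feature counts than the Poisson does, which is the kind of mismatch the restricted prior is meant to correct.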

Proposed method

  • Proposes a restricted Indian buffet process (rIBP) by constraining the prior distribution over feature counts per data point, replacing the default Poisson marginal with user-specified distributions.
  • Uses a finite approximation of the IBP with a truncated number of features (e.g., 100) to enable practical inference.
  • Employs Gibbs sampling for posterior inference over feature assignments and beta process parameters, with Metropolis-Hastings steps for constrained parameters.
  • Applies importance sampling to approximate the predictive distribution under the restricted model: samples from the posterior of the beta process are weighted (Equation 12 of the paper) by the ratio of restricted to unrestricted likelihoods.
  • Uses a negative binomial distribution to model heavy-tailed feature counts in text data, replacing the Poisson assumption of the standard IBP.
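
The core idea of restricting a truncated model can be sketched as follows. This is a minimal illustration, not the paper's inference code: it assumes a finite approximation in which, given feature probabilities `pi`, each row is a vector of independent Bernoulli draws, and it samples a row conditioned on having exactly `s` active features using a dynamic-programming table. The function names and the way the counts are supplied are hypothetical.

```python
import numpy as np

def suffix_sum_table(pi):
    """T[k, j] = P(z_k + ... + z_{K-1} = j) for independent z_i ~ Bernoulli(pi[i])."""
    K = len(pi)
    T = np.zeros((K + 1, K + 1))
    T[K, 0] = 1.0  # an empty suffix sums to zero with certainty
    for k in range(K - 1, -1, -1):
        T[k, :] = (1.0 - pi[k]) * T[k + 1, :]   # feature k switched off
        T[k, 1:] += pi[k] * T[k + 1, :-1]       # feature k switched on
    return T

def sample_row_given_count(pi, s, rng, table=None):
    """Draw z in {0,1}^K from independent Bernoulli(pi), conditioned on sum(z) == s."""
    K = len(pi)
    T = suffix_sum_table(pi) if table is None else table
    if not 0 <= s <= K or T[0, s] <= 0.0:
        raise ValueError("requested feature count has zero probability")
    z = np.zeros(K, dtype=int)
    remaining = s
    for k in range(K):
        if remaining == 0:
            break  # all required features already placed
        # P(z_k = 1 | features k..K-1 must contribute `remaining` ones)
        p_on = pi[k] * T[k + 1, remaining - 1] / T[k, remaining]
        if rng.random() < p_on:
            z[k] = 1
            remaining -= 1
    return z

def sample_restricted_matrix(pi, counts, rng):
    """One row per entry of `counts`, each conditioned on its own feature count."""
    T = suffix_sum_table(pi)
    return np.stack([sample_row_given_count(pi, int(s), rng, T) for s in counts])
```

The per-row counts could be fixed (for example, 2 for every row) or drawn from a heavy-tailed distribution such as `rng.negative_binomial`, clipped to the truncation level. Very small entries of `pi` can make the table underflow, so a practical implementation would likely work in log space.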

Experimental results

Research questions

  • RQ1: Can restricting the distribution over the number of features per data point improve model interpretability in latent feature models?
  • RQ2: Does enforcing a non-Poisson marginal distribution for feature counts lead to better predictive performance on real-world data?
  • RQ3: How can we modify existing exchangeable nonparametric models like the IBP to allow flexible control over the number of features per data point?
  • RQ4: What inference techniques are effective for restricted models with constrained parameter spaces?
  • RQ5: Can domain-specific knowledge about feature counts (e.g., exactly two features per image) be encoded to improve reconstruction quality?

Key findings

  • The restricted IBP (rIBP) successfully recovers the true underlying features in synthetic image data where each data point has exactly two features, outperforming the standard IBP in reconstruction quality.
  • On the 20 Newsgroups text dataset, the rIBP with a negative binomial prior on feature counts achieved uniformly higher classification accuracy than the standard IBP across all top-n label rankings.
  • At n=5, the rIBP correctly classified 91.8% of held-out documents in the top 5 most likely labels, compared to 87.8% for the standard IBP.
  • The use of importance sampling with weighted posterior samples enabled accurate estimation of the predictive distribution under the restricted model.
  • The method demonstrates that incorporating domain knowledge about feature counts leads to more parsimonious and interpretable models.
  • The results confirm that the Poisson assumption on feature counts in the IBP is suboptimal for data with heavy-tailed feature distributions, such as natural language text.
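
The top-n metric reported above counts a document as correct when its true label appears among the n highest-scoring labels. A minimal sketch of that computation follows; the function name and the array layout (one row of label scores per document) are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def top_n_accuracy(scores, labels, n):
    """Fraction of rows whose true label is among the n highest-scoring labels.

    scores: array of shape (num_documents, num_labels), higher is more likely.
    labels: integer array of shape (num_documents,) with the true label index.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Indices of the n largest scores in each row, best first.
    top = np.argsort(-scores, axis=1)[:, :n]
    hits = (top == labels[:, None]).any(axis=1)
    return float(hits.mean())
```

Evaluating this for n = 1, 2, ... on held-out documents would give an accuracy curve of the kind used to compare the two models.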

This review was created by AI and reviewed by human editors.