QUICK REVIEW

[Paper Review] Ensemble Sampling

Xiuyuan Lu, Benjamin Van Roy|arXiv (Cornell University)|May 20, 2017

Anomaly Detection Techniques and Applications|25 citations

TL;DR

This paper introduces ensemble sampling, a tractable approximation to Thompson sampling that extends it to complex models such as neural networks. Instead of computing an exact posterior, the method maintains an ensemble of models that stands in for the posterior distribution, with the aim of keeping the benefits of Thompson sampling in high-dimensional, non-linear settings where exact inference is intractable.

ABSTRACT

Thompson sampling has emerged as an effective heuristic for a broad range of online decision problems. In its basic form, the algorithm requires computing and sampling from a posterior distribution over models, which is tractable only for simple special cases. This paper develops ensemble sampling, which aims to approximate Thompson sampling while maintaining tractability even in the face of complex models such as neural networks. Ensemble sampling dramatically expands on the range of applications for which Thompson sampling is viable. We establish a theoretical basis that supports the approach and present computational results that offer further insight.

Motivation & Objective

To address the computational intractability of exact Thompson sampling in complex models such as neural networks.
To develop a scalable approximation method that retains the theoretical advantages of Thompson sampling.
To enable practical application of Thompson sampling in real-world online decision problems involving high-dimensional, non-linear models.
To establish a theoretical foundation supporting the use of ensemble sampling as a valid approximation to Thompson sampling.

Proposed method

Uses an ensemble of models to approximate the posterior distribution over model parameters.
At each decision step, draws one model uniformly at random from the ensemble and acts greedily with respect to it, in place of the posterior sample that Thompson sampling would use.
Leverages the ensemble to estimate uncertainty and guide exploration in online decision tasks.
Applies the method to sequential decision problems such as contextual bandits and reinforcement learning.
Theoretical analysis supports treating the ensemble as an approximation to Thompson sampling, consistent with the abstract's claim of a theoretical basis for the approach.
Computational efficiency is achieved by avoiding full Bayesian inference on complex models.
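
The steps listed above can be illustrated with a minimal sketch. It assumes the simplest setting, a linear bandit with Gaussian noise and a zero-mean Gaussian prior, so that each ensemble member has a closed-form update. The class name LinearEnsembleSampler, its parameters, and the simulate helper are hypothetical names chosen for this illustration, not taken from the paper's code. Each model starts at an independent draw from the prior and is then fit to the same actions but to independently perturbed rewards, which is what keeps the ensemble spread out like a posterior.

```python
import numpy as np


class LinearEnsembleSampler:
    """Ensemble sampling sketch for a linear bandit (illustrative only).

    Assumptions made for this example:
      reward = action @ theta + N(0, noise_std^2)
      prior  theta ~ N(0, prior_var * I)
      a fixed, finite set of candidate actions
    """

    def __init__(self, actions, num_models=10, prior_var=1.0,
                 noise_std=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.actions = np.asarray(actions, dtype=float)  # shape (K, d)
        self.noise_std = noise_std
        d = self.actions.shape[1]
        # All models see the same actions, so they share one precision matrix.
        self.precision = np.eye(d) / prior_var
        # Each model starts at an independent draw from the prior.
        prior_draws = self.rng.normal(
            0.0, np.sqrt(prior_var), size=(num_models, d))
        # b[m] holds the precision-weighted target for model m,
        # so that thetas[m] solves precision @ theta = b[m].
        self.b = prior_draws / prior_var
        self.thetas = prior_draws.copy()

    def select_action(self):
        # Draw one model uniformly and act greedily with respect to it.
        m = self.rng.integers(len(self.thetas))
        return int(np.argmax(self.actions @ self.thetas[m]))

    def update(self, action_index, reward):
        a = self.actions[action_index]
        var = self.noise_std ** 2
        self.precision += np.outer(a, a) / var
        # Every model receives the observed reward plus its own noise draw.
        perturbed = reward + self.rng.normal(
            0.0, self.noise_std, size=len(self.thetas))
        self.b += np.outer(perturbed, a) / var
        # Re-solve the regularized least-squares problem for all models.
        self.thetas = np.linalg.solve(self.precision, self.b.T).T


def simulate(num_steps=200, num_actions=20, dim=5, seed=0):
    """Run the sampler on a synthetic problem and return cumulative regret."""
    rng = np.random.default_rng(seed)
    actions = rng.normal(size=(num_actions, dim))
    true_theta = rng.normal(size=dim)
    best = np.max(actions @ true_theta)
    agent = LinearEnsembleSampler(actions, num_models=10, seed=seed)
    regret = 0.0
    for _ in range(num_steps):
        k = agent.select_action()
        mean_reward = actions[k] @ true_theta
        agent.update(k, mean_reward + rng.normal())
        regret += best - mean_reward
    return regret


if __name__ == "__main__":
    print("cumulative regret:", simulate())
```

Per-step cost grows with the number of models, not with the cost of exact posterior inference. For a neural network model, the closed-form solve would presumably be replaced by incremental gradient updates on each member's perturbed data, which this sketch does not attempt.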

Experimental results

Research questions

RQ1: Can ensemble sampling provide a tractable alternative to exact Thompson sampling for complex models like neural networks?
RQ2: How well does ensemble sampling approximate the performance of exact Thompson sampling in practice?
RQ3: What theoretical guarantees can be established for the ensemble approximation method?
RQ4: How does ensemble sampling scale to high-dimensional and non-linear model spaces?
RQ5: What is the empirical performance of ensemble sampling in online decision-making tasks?

Key findings

Ensemble sampling enables effective application of Thompson sampling to complex models such as neural networks, where exact inference is intractable.
The method achieves performance close to exact Thompson sampling in benchmark online decision problems.
Theoretical analysis supports the validity of the ensemble approximation under standard regularity conditions.
Computational results demonstrate scalability and practical utility in high-dimensional settings.
Ensemble sampling maintains a strong balance between exploration and exploitation, which is crucial for online learning.
The approach is shown to be robust across various contextual bandit and reinforcement learning tasks.

This review was created by AI and reviewed by human editors.