[Paper Review] Ensemble Sampling
This paper introduces ensemble sampling, a tractable approximation to Thompson sampling that enables its application to complex models like neural networks. By using an ensemble of models to approximate the posterior distribution, the method maintains the theoretical benefits of Thompson sampling while scaling efficiently to high-dimensional, non-linear models.
Thompson sampling has emerged as an effective heuristic for a broad range of online decision problems. In its basic form, the algorithm requires computing and sampling from a posterior distribution over models, which is tractable only for simple special cases. This paper develops ensemble sampling, which aims to approximate Thompson sampling while maintaining tractability even in the face of complex models such as neural networks. Ensemble sampling dramatically expands on the range of applications for which Thompson sampling is viable. We establish a theoretical basis that supports the approach and present computational results that offer further insight.
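To make the "simple special cases" concrete, here is a minimal sketch of exact Thompson sampling in one of them: a Bernoulli bandit with independent Beta(1, 1) priors, where the posterior stays in closed form. The function name `thompson_bernoulli` and its parameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def thompson_bernoulli(true_means, horizon, seed=0):
    """Exact Thompson sampling for a Bernoulli bandit with Beta(1, 1) priors.

    Illustrative only: `true_means` is a hypothetical vector of arm success
    probabilities used to simulate rewards.
    """
    rng = np.random.default_rng(seed)
    k = len(true_means)
    alpha = np.ones(k)  # 1 + number of observed successes per arm
    beta = np.ones(k)   # 1 + number of observed failures per arm
    pulls = np.zeros(k, dtype=int)
    for _ in range(horizon):
        sampled = rng.beta(alpha, beta)   # one posterior draw per arm
        arm = int(np.argmax(sampled))     # act greedily on the sampled model
        reward = int(rng.random() < true_means[arm])
        alpha[arm] += reward              # conjugate posterior update
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls, alpha, beta
```

The conjugate update is what keeps this case tractable; with a neural network model no such closed-form posterior is available, which is the gap ensemble sampling targets.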
Motivation & Objective
- To address the computational intractability of exact Thompson sampling in complex models such as neural networks.
- To develop a scalable approximation method that retains the theoretical advantages of Thompson sampling.
- To enable practical application of Thompson sampling in real-world online decision problems involving high-dimensional, non-linear models.
- To establish a theoretical foundation supporting the use of ensemble sampling as a valid approximation to Thompson sampling.
Proposed method
- Uses an ensemble of models to approximate the posterior distribution over model parameters.
- Samples from the empirical distribution of the ensemble to simulate Thompson sampling.
- Leverages the ensemble to estimate uncertainty and guide exploration in online decision tasks.
- Applies the method to sequential decision problems such as contextual bandits and reinforcement learning.
- Provides a theoretical analysis indicating that the ensemble approximation converges to the true posterior under mild regularity conditions.
- Achieves computational efficiency by avoiding full Bayesian inference on complex models.
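The steps above can be sketched for a linear-Gaussian bandit, a setting where each model's update has a closed form. This is an illustrative sketch under stated assumptions, not the authors' implementation: each ensemble member starts as a draw from a zero-mean Gaussian prior and is refit on rewards perturbed with its own independent Gaussian noise. The function name, its parameters, and the simulated environment (`actions`, `theta_true`) are hypothetical choices made for the example.

```python
import numpy as np

def ensemble_sampling_linear(actions, theta_true, horizon, n_models=10,
                             noise_std=1.0, prior_std=1.0, seed=0):
    """Sketch of ensemble sampling on a simulated linear-Gaussian bandit.

    actions:    (n_actions, dim) array of feature vectors (assumed known).
    theta_true: (dim,) parameter used only to simulate rewards.
    """
    rng = np.random.default_rng(seed)
    n_actions, dim = actions.shape
    # Each member starts as an independent draw from the prior N(0, prior_std^2 I).
    models = rng.normal(0.0, prior_std, size=(n_models, dim))
    # The precision matrix is shared because every member sees the same actions.
    precision = np.eye(dim) / prior_std**2
    # b[m] is the precision-weighted target, so models[m] = solve(precision, b[m]).
    b = models / prior_std**2
    chosen = np.zeros(horizon, dtype=int)
    for t in range(horizon):
        m = rng.integers(n_models)                   # sample a member uniformly
        a_idx = int(np.argmax(actions @ models[m]))  # act greedily on that member
        a = actions[a_idx]
        y = a @ theta_true + rng.normal(0.0, noise_std)  # simulated reward
        # Every member is updated on the reward plus its own perturbation,
        # which keeps the ensemble spread out like posterior samples.
        perturbed = y + rng.normal(0.0, noise_std, size=n_models)
        precision += np.outer(a, a) / noise_std**2
        b += np.outer(perturbed, a) / noise_std**2
        models = np.linalg.solve(precision, b.T).T
        chosen[t] = a_idx
    return chosen, models
```

Selecting a member uniformly at random stands in for drawing a posterior sample, so the per-step cost is one greedy action plus an incremental update of each member. For a neural network model, the closed-form solve would presumably be replaced by a few gradient steps per member on its own perturbed targets.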
Experimental results
Research questions
- RQ1: Can ensemble sampling provide a tractable alternative to exact Thompson sampling for complex models like neural networks?
- RQ2: How well does ensemble sampling approximate the performance of exact Thompson sampling in practice?
- RQ3: What theoretical guarantees can be established for the ensemble approximation method?
- RQ4: How does ensemble sampling scale to high-dimensional and non-linear model spaces?
- RQ5: What is the empirical performance of ensemble sampling in online decision-making tasks?
Key findings
- Ensemble sampling enables effective application of Thompson sampling to complex models such as neural networks, where exact inference is intractable.
- The method achieves performance close to exact Thompson sampling in benchmark online decision problems.
- Theoretical analysis supports the validity of the ensemble approximation under standard regularity conditions.
- Computational results demonstrate scalability and practical utility in high-dimensional settings.
- Ensemble sampling maintains a strong exploration-exploitation balance, which is crucial for online learning.
- The approach is shown to be robust across various contextual bandit and reinforcement learning tasks.
This review was created by AI and reviewed by human editors.