QUICK REVIEW

[Paper Review] A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

Eric Brochu, Vlad M. Cora|arXiv (Cornell University)|Dec 12, 2010

Advanced Bandit Algorithms Research | References: 75 | Cited by: 2,138

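One-Sentence Summary

This tutorial gives a comprehensive introduction to Bayesian optimization of expensive cost functions, covering Gaussian process priors, acquisition functions, and two extensions applied to active user modeling and hierarchical reinforcement learning.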

ABSTRACT

We present a tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments---active user modelling with preferences, and hierarchical reinforcement learning---and a discussion of the pros and cons of Bayesian optimization based on our experiences.

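Research Motivation and Objectives

Introduce Bayesian optimization as a data-efficient method for maximizing expensive black-box objective functions.
Explain how Gaussian processes serve as a surrogate model for the unknown objective.
Describe acquisition functions that balance exploration and exploitation when choosing evaluation points.
Show how Bayesian optimization extends to active user modeling with preferences and to hierarchical reinforcement learning.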

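Proposed Method

Describe a Bayesian framework in which a prior over the objective function is updated with observations to form a posterior over f.
Model f with a Gaussian process prior with mean function m and covariance function k, and derive the predictive mean μ and standard deviation σ.
Define acquisition functions, such as expected improvement (EI), probability of improvement (PI), and upper confidence bound (UCB), that select the next evaluation by maximizing expected utility.
Discuss kernel choices (squared exponential, Matérn, automatic relevance determination (ARD)) and hyperparameter learning.
Explain how Gaussian observation noise is handled and how it affects the posterior.
Show how acquisition functions trade off exploration against exploitation.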
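
The Gaussian process step in the list above can be made concrete with a small example. The following is a minimal sketch, not the tutorial's own code: it assumes a zero mean function, scalar inputs, a squared exponential kernel, and Gaussian noise added to the diagonal of the kernel matrix. All function names, the default `length_scale` and `noise_var` values, and the toy data are illustrative assumptions. It uses only the Python standard library, so the linear algebra is written out by hand.

```python
import math


def sq_exp_kernel(a, b, length_scale=1.0):
    """Squared exponential kernel for scalar inputs a and b."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i, b_i in enumerate(b):
        y.append((b_i - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def backward_sub(L, y):
    """Solve L^T x = y for lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_posterior(x_train, y_train, x_new, noise_var=1e-6, length_scale=1.0):
    """Predictive mean and standard deviation of a zero-mean GP at x_new."""
    n = len(x_train)
    # Kernel matrix with Gaussian observation noise on the diagonal.
    K = [[sq_exp_kernel(x_train[i], x_train[j], length_scale)
          + (noise_var if i == j else 0.0) for j in range(n)] for i in range(n)]
    L = cholesky(K)
    k_star = [sq_exp_kernel(x, x_new, length_scale) for x in x_train]
    alpha = backward_sub(L, forward_sub(L, y_train))  # (K + noise_var * I)^{-1} y
    mu = sum(k * a for k, a in zip(k_star, alpha))
    v = forward_sub(L, k_star)                        # L^{-1} k_star
    var = sq_exp_kernel(x_new, x_new, length_scale) - sum(v_i * v_i for v_i in v)
    return mu, math.sqrt(max(var, 0.0))               # clamp tiny negative round-off


# Toy observations, made up for illustration only.
x_obs = [0.0, 1.0, 2.0]
y_obs = [0.0, 1.0, 0.5]
for x in (1.0, 1.5, 10.0):
    mu, sigma = gp_posterior(x_obs, y_obs, x)
    print(f"x={x:4.1f}  mu={mu:.4f}  sigma={sigma:.4f}")
```

Under these assumptions the predictive standard deviation is small near observed points and returns to the prior value far from the data. Raising `noise_var` loosens the fit at the observed points, which is one way to see how observation noise affects the posterior.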

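Experimental Results

Research Questions

RQ1: How does Bayesian optimization efficiently locate the global maximum of an expensive black-box cost function?
RQ2: Which priors and kernels best model smooth objective functions in practice?
RQ3: How do different acquisition functions (EI, PI, UCB) perform in balancing exploration and exploitation?
RQ4: How can Bayesian optimization be extended to active user modeling and hierarchical reinforcement learning?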

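Key Findings

Bayesian optimization uses a GP surrogate to model f and its uncertainty, and guides sampling through an acquisition function.
The EI, PI, and UCB acquisition functions provide practical mechanisms for trading off exploration against exploitation.
ARD and Matérn kernels offer flexibility in modeling function smoothness and in identifying relevant dimensions.
The tutorial demonstrates extensions to active user modeling with preferences and to hierarchical control problems.
Observation noise is taken into account and affects both the posterior update and acquisition decisions.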
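
To illustrate how the three acquisition functions weigh the predictive mean against the predictive uncertainty, here is a minimal sketch for a maximization problem. It uses the standard closed forms for PI, EI, and UCB given a Gaussian predictive distribution with mean `mu` and standard deviation `sigma`. The function names, the trade-off parameters `xi` and `kappa` with their default values, and the two candidate points are assumptions made for illustration, not values taken from the tutorial.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_improvement(mu, sigma, best, xi=0.01):
    """Probability that the value at this point exceeds best + xi."""
    if sigma <= 0.0:
        return 1.0 if mu > best + xi else 0.0
    return norm_cdf((mu - best - xi) / sigma)


def expected_improvement(mu, sigma, best, xi=0.01):
    """Expected amount by which the value at this point exceeds best + xi."""
    improvement = mu - best - xi
    if sigma <= 0.0:
        return max(improvement, 0.0)  # limit of the formula as sigma -> 0
    z = improvement / sigma
    return improvement * norm_cdf(z) + sigma * norm_pdf(z)


def upper_confidence_bound(mu, sigma, kappa=2.0):
    """Optimistic estimate: mean plus kappa standard deviations."""
    return mu + kappa * sigma


# Hypothetical candidates as (mu, sigma) pairs from a surrogate model.
best_so_far = 0.9
candidates = {
    "exploit": (1.0, 0.05),  # slightly better mean, low uncertainty
    "explore": (0.6, 0.80),  # worse mean, high uncertainty
}
for name, (mu, sigma) in candidates.items():
    pi = probability_of_improvement(mu, sigma, best_so_far)
    ei = expected_improvement(mu, sigma, best_so_far)
    ucb = upper_confidence_bound(mu, sigma)
    print(f"{name:8s} PI={pi:.3f}  EI={ei:.3f}  UCB={ucb:.3f}")
```

In this toy comparison PI favors the low-uncertainty candidate, while EI and UCB with these settings favor the high-uncertainty one. Larger values of `xi` or `kappa` push all three further toward exploration.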


This review was generated by AI and reviewed by human editors.