QUICK REVIEW

[Paper Review] Lower Bounds on Regret for Noisy Gaussian Process Bandit Optimization

Jonathan Scarlett, Ilija Bogunovic | arXiv (Cornell University) | May 31, 2017

Advanced Bandit Algorithms Research | Cited by 30

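One-Sentence Summary

This paper establishes the first algorithm-independent lower bounds for noisy Gaussian process bandit optimization in the non-Bayesian setting, focusing on the squared-exponential (SE) and Matérn kernels. For the SE kernel, attaining simple regret ε requires T = Ω((1/ε²) (log(1/ε))^{d/2}) rounds, which nearly matches existing upper bounds; analogous lower bounds are given for the Matérn kernel, where the gap to the upper bounds is larger.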

ABSTRACT

In this paper, we consider the problem of sequentially optimizing a black-box function $f$ based on noisy samples and bandit feedback. We assume that $f$ is smooth in the sense of having a bounded norm in some reproducing kernel Hilbert space (RKHS), yielding a commonly-considered non-Bayesian form of Gaussian process bandit optimization. We provide algorithm-independent lower bounds on the simple regret, measuring the suboptimality of a single point reported after $T$ rounds, and on the cumulative regret, measuring the sum of regrets over the $T$ chosen points. For the isotropic squared-exponential kernel in $d$ dimensions, we find that an average simple regret of $ε$ requires $T = Ω\big(\frac{1}{ε^2} (\log\frac{1}ε)^{d/2}\big)$, and the average cumulative regret is at least $Ω\big( \sqrt{T(\log T)^{d/2}} \big)$, thus matching existing upper bounds up to the replacement of $d/2$ by $2d+O(1)$ in both cases. For the Matérn-$ν$ kernel, we give analogous bounds of the form $Ω\big( (\frac{1}ε)^{2+d/ν}\big)$ and $Ω\big( T^{\frac{ν+ d}{2ν+ d}} \big)$, and discuss the resulting gaps to the existing upper bounds.
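
For reference, the two regret notions described in the abstract can be written out as follows. This is a sketch in standard notation rather than a quotation from the paper: the domain $D$, the queried points $x_t$, and the reported point $x^{(T)}$ are symbols assumed here for illustration.

```latex
\begin{align*}
  % Simple regret: suboptimality of the single point reported after T rounds
  r\big(x^{(T)}\big) &= \max_{x \in D} f(x) - f\big(x^{(T)}\big), \\
  % Cumulative regret: summed suboptimality of the T queried points
  R_T &= \sum_{t=1}^{T} \Big( \max_{x \in D} f(x) - f(x_t) \Big).
\end{align*}
```

The "average" regret in the abstract refers to the expectation of these quantities over the observation noise.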

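Research Motivation and Objectives

Narrow the gap between existing upper bounds and the theoretical limits of noisy Gaussian process bandit optimization by deriving algorithm-independent lower bounds.
Characterize the fundamental limits on simple regret and cumulative regret in the non-Bayesian setting with a bounded RKHS norm and noisy observations.
Determine whether existing upper bounds for the SE and Matérn kernels are tight or can be improved.
Examine how noise affects regret scaling, particularly in high dimensions.
Identify open problems in the Bayesian setting, where the present lower bounds may not reflect actual performance because needle-in-a-haystack functions are a poor match for the prior.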

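Proposed Method

Constructs a class of needle-in-a-haystack functions that are hard to tell apart under noisy bandit feedback, and applies a minimax argument over the class of functions with bounded RKHS norm.
Applies Fano's inequality and Pinsker's inequality to lower-bound the expected regret, by bounding the total variation distance between the observation likelihoods under different functions.
Uses a covering argument to bound the number of distinguishable functions in the RKHS class, which yields a lower bound on the number of samples needed to identify the optimal point.
Derives lower bounds on simple and cumulative regret by analyzing the expected suboptimality of the final reported point and the suboptimality accumulated over time.
Extends the analysis to high-probability regret bounds via a reverse Markov inequality, showing that regret attained with constant probability cannot beat the expected-regret bounds.
Considers two widely used kernels, the squared-exponential (SE) kernel and the Matérn kernel, by analyzing their RKHS norms and metric-entropy properties.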
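
The indistinguishability idea behind the steps above can be illustrated numerically. The sketch below is a simplified illustration, not the paper's construction: the Gaussian-shaped `needle` function, its `width` parameter, and all function names are hypothetical choices made here. It relies only on two standard facts: the KL divergence between two Gaussians with equal variance, and Pinsker's inequality.

```python
import math


def needle(x: float, center: float, width: float, height: float) -> float:
    """Hypothetical 1-D 'needle': a Gaussian-shaped bump (an assumed shape)."""
    return height * math.exp(-((x - center) / width) ** 2)


def gaussian_kl(mean_p: float, mean_q: float, sigma: float) -> float:
    """KL divergence between N(mean_p, sigma^2) and N(mean_q, sigma^2)."""
    return (mean_p - mean_q) ** 2 / (2.0 * sigma ** 2)


def pinsker_tv_bound(kl: float) -> float:
    """Pinsker's inequality: TV(P, Q) <= sqrt(KL(P || Q) / 2), capped at 1."""
    return min(1.0, math.sqrt(kl / 2.0))


def tv_bound_after_samples(height: float, sigma: float, n_samples: int) -> float:
    """Upper bound on the total variation distance between the all-zero
    function and a needle of the given height, after n_samples noisy queries
    that all land exactly on the needle's peak (the most informative case).
    The KL divergence adds up across the samples."""
    total_kl = n_samples * gaussian_kl(height, 0.0, sigma)
    return pinsker_tv_bound(total_kl)


if __name__ == "__main__":
    # With height 0.2 and noise level 1, a handful of samples says little:
    for n in (1, 25, 100):
        print(n, tv_bound_after_samples(0.2, 1.0, n))
    # A query far from the needle carries almost no information at all:
    print(gaussian_kl(needle(0.0, 0.5, 0.1, 0.2), 0.0, 1.0))
```

In this toy setting the bound on the total variation distance stays well below 1 until the number of samples on the peak is of order σ²/height². That is the intuition for the 1/ε² factor when the needle height is proportional to ε. Having to search over many candidate needle locations adds a further factor.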

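Results

Research Questions

RQ1: What is the fundamental lower bound on simple regret for the SE kernel in d dimensions in noisy GP bandit optimization?
RQ2: How does cumulative regret scale in the non-Bayesian setting, and how does it compare with existing upper bounds?
RQ3: How large is the gap between the best known upper bounds and the new lower bounds for the Matérn kernel?
RQ4: To what extent does noise affect regret scaling in GP bandit optimization?
RQ5: Can the existing upper bounds for the SE kernel be improved, or are they already nearly tight?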

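Main Findings

For the SE kernel in d dimensions, attaining simple regret ε requires at least T = Ω((1/ε²) (log(1/ε))^{d/2}) rounds, which nearly matches existing upper bounds.
The cumulative regret is lower-bounded by Ω(√(T (log T)^{d/2})), matching the best known upper bound up to the replacement of the exponent d/2 on the logarithm by 2d+O(1).
For the Matérn-ν kernel, attaining simple regret ε requires T = Ω((1/ε)^{2 + d/ν}), leaving a larger gap to existing upper bounds.
The cumulative regret lower bound for the Matérn kernel is Ω(T^{(ν + d)/(2ν + d)}), which grows strictly more slowly than the existing upper bound, leaving room for improvement.
The analysis shows that the assumption σ/B = O(√T), where σ is the noise level and B is the RKHS norm bound, keeps ε/B small enough for the bounds to hold asymptotically.
Using a reverse Markov inequality, the paper also derives high-probability regret bounds, showing that regret attained with constant probability cannot beat the expected-regret bounds.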
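
To get a feel for how the rates listed above grow, they can be evaluated directly. The following is a minimal sketch under stated assumptions: all constants hidden by the Ω notation are dropped, so are the dependencies on σ and B, and the function names are hypothetical. The returned values show scaling only and are not actual bounds.

```python
import math


def se_rounds_scaling(eps: float, d: int) -> float:
    """SE kernel, simple regret: (1/eps^2) * (log(1/eps))^(d/2).
    Constants are omitted; meaningful for 0 < eps < 1."""
    return (1.0 / eps ** 2) * math.log(1.0 / eps) ** (d / 2)


def se_cumulative_scaling(T: int, d: int) -> float:
    """SE kernel, cumulative regret: sqrt(T * (log T)^(d/2)), for T > 1."""
    return math.sqrt(T * math.log(T) ** (d / 2))


def matern_rounds_scaling(eps: float, d: int, nu: float) -> float:
    """Matern-nu kernel, simple regret: (1/eps)^(2 + d/nu)."""
    return (1.0 / eps) ** (2.0 + d / nu)


def matern_cumulative_exponent(d: int, nu: float) -> float:
    """Matern-nu kernel, cumulative regret: exponent (nu + d) / (2 nu + d) of T."""
    return (nu + d) / (2.0 * nu + d)


if __name__ == "__main__":
    for eps in (0.1, 0.01):
        print("SE, d=2:", eps, se_rounds_scaling(eps, 2))
        print("Matern nu=2.5, d=2:", eps, matern_rounds_scaling(eps, 2, 2.5))
    print("Matern exponent, nu=2.5, d=2:", matern_cumulative_exponent(2, 2.5))
```

The Matérn exponent lies strictly between 1/2 and 1 for any d ≥ 1 and ν > 0. It approaches 1/2 as ν grows (smoother functions) and approaches 1 as d grows, which matches the intuition that rougher or higher-dimensional functions are harder to optimize.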

This review was generated by AI and reviewed by human editors.