QUICK REVIEW

[Paper Review] Student-t Processes as Alternatives to Gaussian Processes

Amar Shah, Andrew Gordon Wilson|arXiv (Cornell University)|Feb 18, 2014

Gaussian Processes and Bayesian Inference | References: 12 | Cited by: 100

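One-Sentence Summary

This paper proposes Student-t processes (TPs) as a flexible alternative to Gaussian processes (GPs), deriving them by integrating out an inverse Wishart process prior placed on the GP covariance kernel. TPs retain analytic marginal and predictive distributions and support kernel-based nonparametric modeling, and their predictive covariance, unlike a GP's, depends on the observed training values. This improves robustness in Bayesian optimization and regression at no additional computational cost over GPs.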

ABSTRACT

We investigate the Student-t process as an alternative to the Gaussian process as a nonparametric prior over functions. We derive closed form expressions for the marginal likelihood and predictive distribution of a Student-t process, by integrating away an inverse Wishart process prior over the covariance kernel of a Gaussian process model. We show surprising equivalences between different hierarchical Gaussian process models leading to Student-t processes, and derive a new sampling scheme for the inverse Wishart process, which helps elucidate these equivalences. Overall, we show that a Student-t process can retain the attractive properties of a Gaussian process -- a nonparametric representation, analytic marginal and predictive distributions, and easy model selection through covariance kernels -- but has enhanced flexibility, and predictive covariances that, unlike a Gaussian process, explicitly depend on the values of training observations. We verify empirically that a Student-t process is especially useful in situations where there are changes in covariance structure, or in applications like Bayesian optimization, where accurate predictive covariances are critical for good performance. These advantages come at no additional computational cost over Gaussian processes.

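Research Motivation and Objectives

Address the limitations of Gaussian processes in modeling uncertainty and in handling misspecified or changing covariance structure.
Formalize the inverse Wishart process as a nonparametric prior over covariance matrices for use in hierarchical Gaussian process models.
Derive a Student-t process with closed-form marginal and predictive distributions, making it practical for regression and optimization.
Show that the predictive covariance of a Student-t process, unlike that of a Gaussian process, depends on the training observations, which improves robustness and captures tail dependence.
Show that the Student-t process is a drop-in replacement for the Gaussian process with no additional computational cost, while performing better in key applications such as Bayesian optimization.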

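Proposed Method

Derive the Student-t process by placing an inverse Wishart process prior on the covariance kernel of a Gaussian process and integrating it out analytically.
Use the inverse Wishart process as a nonparametric prior over covariance matrices of arbitrary size, ensuring consistency under marginalization.
Derive closed-form expressions for the TP marginal likelihood and predictive distribution, including analytic derivatives for hyperparameter optimization.
Introduce a new sampling scheme for the inverse Wishart process that clarifies equivalences between hierarchical Gaussian process models and aids interpretability.
Implement a marginalized expected improvement acquisition function for Bayesian optimization, integrating over hyperparameters with hierarchical slice sampling.
Compare TP and GP performance on synthetic and benchmark functions, using the same kernels and the same hyperparameter inference procedure.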
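
The closed forms referred to above can be written out as follows. This is a sketch that assumes the parameterization in which K is the covariance of the multivariate Student-t distribution (so ν > 2 is required); the symbols φ̃₂, K̃₂₂, and β₁ are notation introduced here for the conditional mean, the Schur complement, and the squared Mahalanobis distance of the training data.

```latex
% Multivariate Student-t density over n points: \nu > 2 degrees of freedom,
% mean \boldsymbol{\phi}, covariance K
p(\mathbf{y}) =
  \frac{\Gamma\!\left(\frac{\nu+n}{2}\right)}
       {\left((\nu-2)\pi\right)^{n/2}\,\Gamma\!\left(\frac{\nu}{2}\right)}
  \,|K|^{-1/2}
  \left(1 + \frac{(\mathbf{y}-\boldsymbol{\phi})^{\top} K^{-1} (\mathbf{y}-\boldsymbol{\phi})}{\nu-2}\right)^{-\frac{\nu+n}{2}}

% Predictive distribution for n_2 test values given n_1 training values
\mathbf{y}_2 \mid \mathbf{y}_1 \sim
  \mathrm{MVT}_{n_2}\!\left(\nu + n_1,\; \tilde{\boldsymbol{\phi}}_2,\;
  \frac{\nu + \beta_1 - 2}{\nu + n_1 - 2}\,\tilde{K}_{22}\right)

\tilde{\boldsymbol{\phi}}_2 = K_{21} K_{11}^{-1} (\mathbf{y}_1 - \boldsymbol{\phi}_1) + \boldsymbol{\phi}_2,
\qquad
\beta_1 = (\mathbf{y}_1 - \boldsymbol{\phi}_1)^{\top} K_{11}^{-1} (\mathbf{y}_1 - \boldsymbol{\phi}_1),
\qquad
\tilde{K}_{22} = K_{22} - K_{21} K_{11}^{-1} K_{12}
```

The predictive mean has the same form as the GP predictive mean, and K̃₂₂ is the GP predictive covariance. The only difference is the scalar factor (ν + β₁ − 2)/(ν + n₁ − 2), which depends on the training values through β₁ and tends to 1 as ν grows, recovering the GP.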

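Experimental Results

Research Questions

RQ1: Can the Student-t process be formulated as a hierarchical generalization of the Gaussian process with analytically tractable marginal and conditional distributions?
RQ2: How does the predictive covariance of the Student-t process differ from that of the Gaussian process, particularly in its dependence on the training values?
RQ3: What role does the inverse Wishart process play, as a nonparametric prior over covariance matrices, in constructing the Student-t process?
RQ4: In which settings does the Student-t process outperform the Gaussian process, particularly in Bayesian optimization and in regression with structural changes?
RQ5: Can the Student-t process serve as a drop-in replacement for the Gaussian process at no additional computational cost?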

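Key Findings

The Student-t process is the most general elliptically symmetric process with analytically tractable marginal and predictive distributions, extending beyond the Gaussian process.
The predictive covariance of the Student-t process depends explicitly on the values of the training observations, which is not the case for the Gaussian process, allowing better modeling of uncertainty and tail dependence.
In Bayesian optimization, the Student-t process outperforms the Gaussian process: it finds the minimum of a one-dimensional sinusoidal function in 8.1±0.4 iterations on average, versus 10.7±0.6 for the Gaussian process, about 25% fewer iterations.
On the two-dimensional Branin-Hoo function and the six-dimensional Hartmann function, the TP explores local minima more thoroughly and its progress resembles a step function, whereas the GP improves more uniformly.
The TP is more robust to model misspecification and to changes in covariance structure, especially in higher-dimensional settings.
The Student-t process can be combined with an analytic noise model that separates signal from noise, improving on earlier formulations, and it supports nonparametric kernel learning at no additional computational cost.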
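
The dependence of the TP predictive covariance on the training values can be illustrated with a minimal sketch. It assumes a zero prior mean, a squared-exponential kernel with fixed hyperparameters, and ν = 5; the function names `rbf_kernel` and `gp_and_tp_predictive` are hypothetical and not taken from the paper. The `jitter` term is numerical stabilization only, and the analytic noise model mentioned above is omitted.

```python
import numpy as np


def rbf_kernel(xa, xb, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D input arrays (an assumed choice)."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_and_tp_predictive(x_train, y_train, x_test, nu=5.0, jitter=1e-8):
    """Return (mean, gp_cov, tp_cov) under a zero-mean prior.

    nu > 2 is the TP degrees of freedom. The TP covariance is the GP
    covariance times a scalar that depends on the training values.
    """
    n = len(x_train)
    k11 = rbf_kernel(x_train, x_train) + jitter * np.eye(n)
    k12 = rbf_kernel(x_train, x_test)
    k22 = rbf_kernel(x_test, x_test)

    chol = np.linalg.cholesky(k11)                                    # K11 = L L^T
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))  # K11^{-1} y
    v = np.linalg.solve(chol, k12)                                    # L^{-1} K12

    mean = k12.T @ alpha              # shared by the GP and the TP
    gp_cov = k22 - v.T @ v            # Schur complement, independent of y_train
    beta = float(y_train @ alpha)     # squared Mahalanobis distance of y_train
    tp_cov = (nu + beta - 2.0) / (nu + n - 2.0) * gp_cov
    return mean, gp_cov, tp_cov


# Same inputs, two sets of observed values that differ only in scale.
x_train = np.array([-1.0, 0.0, 1.0])
x_test = np.array([0.5])
y_small = np.array([0.1, -0.2, 0.1])
y_large = 10.0 * y_small

mean_s, gp_s, tp_s = gp_and_tp_predictive(x_train, y_small, x_test)
mean_l, gp_l, tp_l = gp_and_tp_predictive(x_train, y_large, x_test)
print("GP variance:", gp_s[0, 0], gp_l[0, 0])
print("TP variance:", tp_s[0, 0], tp_l[0, 0])
```

The GP variance is identical in both runs because it depends only on the input locations. The TP variance shrinks when the observations are less extreme than the kernel predicts and grows when they are more extreme. The extra cost is one inner product on top of the Cholesky solve a GP already performs.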


This review was generated by AI and reviewed by a human editor.