QUICK REVIEW

[Paper Digest] Posterior Concentration for Bayesian Regression Trees and their Ensembles

Veronika Ročková, Stéphanie van der Pas | arXiv (Cornell University) | Aug 29, 2017

Bayesian Modeling and Causal Inference | Cited by 19

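One-sentence summary

By introducing a "spike-and-tree" prior, this paper establishes theoretical guarantees for Bayesian regression trees and their ensembles: the posterior concentrates around a smooth regression function. The authors show that these methods attain the optimal convergence rate up to a log factor, adapt to the unknown level of smoothness, and perform dimension reduction when p > n, which gives a theoretical grounding for their success in practice.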

ABSTRACT

Since their inception in the 1980's, regression trees have been one of the more widely used non-parametric prediction methods. Tree-structured methods yield a histogram reconstruction of the regression surface, where the bins correspond to terminal nodes of recursive partitioning. Trees are powerful, yet susceptible to over-fitting. Strategies against overfitting have traditionally relied on pruning greedily grown trees. The Bayesian framework offers an alternative remedy against overfitting through priors. Roughly speaking, a good prior charges smaller trees where overfitting does not occur. While the consistency of random histograms, trees and their ensembles has been studied quite extensively, the theoretical understanding of the Bayesian counterparts has been missing. In this paper, we take a step towards understanding why/when do Bayesian trees and their ensembles not overfit. To address this question, we study the speed at which the posterior concentrates around the true smooth regression function. We propose a spike-and-tree variant of the popular Bayesian CART prior and establish new theoretical results showing that regression trees (and their ensembles) (a) are capable of recovering smooth regression surfaces, achieving optimal rates up to a log factor, (b) can adapt to the unknown level of smoothness and (c) can perform effective dimension reduction when p>n. These results provide a piece of missing theoretical evidence explaining why Bayesian trees (and additive variants thereof) have worked so well in practice.
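The "histogram reconstruction" described in the abstract can be illustrated with a small sketch. It assumes a one-dimensional covariate on [0, 1), a fixed equal-width partition standing in for the terminal nodes of a tree, and a made-up sine target with Gaussian noise; none of these choices come from the paper. The loop at the end prints the error against the noiseless truth for several bin counts, so the effect of using very few or very many bins can be inspected directly.

```python
import math
import random


def fit_histogram(xs, ys, n_bins):
    """Fit a piecewise-constant function on [0, 1) with n_bins equal-width bins.

    Each bin plays the role of a terminal node; its fitted value is the mean
    response of the observations that fall into it (0.0 for an empty bin).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)  # index of the bin containing x
        sums[b] += y
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def predict(heights, x):
    """Evaluate the fitted step function at x."""
    n_bins = len(heights)
    return heights[min(int(x * n_bins), n_bins - 1)]


def true_f(x):
    """Hypothetical smooth regression function used only for this demo."""
    return math.sin(2 * math.pi * x)


def grid_mse(heights, grid_size=1000):
    """Mean squared error against the noiseless truth on a regular grid."""
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]
    return sum((predict(heights, x) - true_f(x)) ** 2 for x in grid) / grid_size


# Simulated data: n noisy observations of true_f (assumed noise sd 0.3).
rng = random.Random(0)
n = 200
xs = [rng.random() for _ in range(n)]
ys = [true_f(x) + rng.gauss(0.0, 0.3) for x in xs]

# Compare coarse, moderate, and very fine partitions.
for k in (2, 8, 32, 200):
    heights = fit_histogram(xs, ys, k)
    print(k, round(grid_mse(heights), 4))
```

A real regression tree chooses its split points adaptively rather than using equal-width bins; the fixed partition here only keeps the sketch short.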

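Motivation and objectives

Address the gap in the theoretical understanding of why Bayesian regression trees avoid overfitting in practice.
Study the rate at which the posterior distribution concentrates around the true smooth regression function.
Propose a "spike-and-tree" prior under which Bayesian nonparametric regression attains the optimal frequentist convergence rate.
Demonstrate adaptation to unknown smoothness and effective dimension reduction in the high-dimensional setting (p > n).
Provide a theoretical basis for the strong practical performance of Bayesian trees and their ensembles.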

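Proposed method

Proposes a "spike-and-tree" variant of the Bayesian CART prior, which combines spike-and-slab-style selection of active covariates with a tree-based partition prior.
Uses a hierarchical prior structure that favors smaller trees, regularizing the fit against overfitting.
Analyzes posterior concentration by controlling the posterior probability of Kullback-Leibler neighborhoods of the true regression function.
Establishes convergence rates using non-asymptotic concentration inequalities and metric entropy arguments.
Applies the results to both single trees and tree ensembles, with ensembles shown to be more adaptive and robust.
Shows that the posterior concentrates at the optimal rate (up to a log factor) even when the smoothness is unknown.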
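To make "a prior that favors smaller trees" concrete, here is a minimal sketch of a two-stage draw: first pick the active covariates, then pick the number of leaves from a distribution whose mass decays quickly in the tree size. The specific choices (independent inclusion with probability `include_prob`, a truncated Poisson-type weight `lam**K / K!` on the leaf count, and the names `leaf_count_prior` and `sample_tree_skeleton`) are illustrative assumptions, not the paper's exact specification.

```python
import math
import random


def leaf_count_prior(lam, k_max):
    """Weights proportional to lam**K / K! on K = 1..k_max, renormalized.

    Illustrative choice only: the factorial decay puts most mass on small K.
    """
    weights = [lam ** k / math.factorial(k) for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_tree_skeleton(p, lam, k_max, include_prob, rng):
    """Draw (active covariates, number of leaves) from the illustrative prior."""
    # Stage 1 (assumed form): each of the p covariates is active independently.
    active = [j for j in range(p) if rng.random() < include_prob]
    if not active:
        # With no active covariate the tree cannot split: root node only.
        return active, 1
    # Stage 2: number of leaves, with small trees strongly preferred.
    probs = leaf_count_prior(lam, k_max)
    k = rng.choices(range(1, k_max + 1), weights=probs, k=1)[0]
    return active, k


rng = random.Random(42)
for _ in range(5):
    print(sample_tree_skeleton(p=50, lam=1.0, k_max=20, include_prob=0.05, rng=rng))
```

The sketch stops at the skeleton (which covariates, how many leaves); a full prior would also place distributions on the split locations and on the leaf values.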

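Results

Research questions

RQ1: Can Bayesian regression trees attain the optimal posterior concentration rate for smooth regression functions?
RQ2: How does the spike-and-tree prior adapt to the unknown smoothness of the regression function?
RQ3: To what extent can Bayesian trees achieve effective dimension reduction when the number of predictors exceeds the sample size (p > n)?
RQ4: What theoretical guarantees are available for the posterior concentration of Bayesian tree ensembles?
RQ5: Why do Bayesian trees and their ensembles generalize well despite their nonparametric flexibility?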

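Main findings

Bayesian regression trees attain the optimal posterior concentration rate up to a log factor, matching the known minimax rate for nonparametric regression.
The spike-and-tree prior lets the posterior adapt to the unknown smoothness of the true regression function, without prior knowledge of the smoothness class.
The method achieves effective dimension reduction by concentrating tree splits on the relevant covariates, and continues to perform well when p > n.
The theoretical results extend to Bayesian tree ensembles, which are shown to be more adaptive and robust than single trees.
Posterior concentration is established under relatively weak regularity conditions, giving a theoretical explanation for the empirical success of Bayesian trees.
The analysis supplies what the authors describe as a missing piece of theoretical evidence for Bayesian trees in nonparametric regression, filling a gap in the literature.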
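For readers who want the phrase "optimal up to a log factor" spelled out, the display below is a sketch in assumed notation that does not appear in this digest: f_0 is the true regression function, taken to be α-Hölder and to depend on q_0 of the p covariates, and the norm is the empirical L2 norm. The first line recalls the classical minimax rate for this smoothness class; the exponent β of the log factor is left unspecified because the digest does not state it. In the sparse p > n setting a further term for selecting the active covariates typically enters the rate; it is omitted here for the same reason.

```latex
% Assumed notation (not from the digest): f_0 is \alpha-Hölder, depends on
% q_0 of the p covariates, and \|\cdot\|_n is the empirical L2 norm.
\[
  \varepsilon_n^{\mathrm{minimax}} \asymp n^{-\alpha/(2\alpha + q_0)},
  \qquad
  \varepsilon_n = n^{-\alpha/(2\alpha + q_0)} (\log n)^{\beta}
  \quad \text{for some } \beta > 0,
\]
\[
  \Pi\bigl( f : \| f - f_0 \|_n > M \varepsilon_n \,\big|\, Y_1, \dots, Y_n \bigr)
  \longrightarrow 0
  \quad \text{in } P_{f_0}\text{-probability, for a large enough constant } M.
\]
```

Adaptation means that the prior is specified without knowledge of α, yet the rate in the display is attained for each α in the range the theory covers. Because trees are step functions, that range cannot be expected to extend beyond α = 1.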


This digest was generated by AI and reviewed by a human editor.