QUICK REVIEW

[Paper Review] On the Statistical Optimality of Optimal Decision Trees

Zineng Xu, Subhroshekhar Ghosh|arXiv (Cornell University)|Mar 5, 2026
Explainable Artificial Intelligence (XAI)|Citations: 0
One-Line Summary

The paper develops a comprehensive statistical theory for empirical risk minimizer (ERM) decision trees under random design, proving oracle inequalities and minimax optimal rates over a new PSHAB function class that captures sparsity, anisotropy, and spatial heterogeneity.

ABSTRACT

While globally optimal empirical risk minimization (ERM) decision trees have become computationally feasible and empirically successful, rigorous theoretical guarantees for their statistical performance remain limited. In this work, we develop a comprehensive statistical theory for ERM trees under random design in both high-dimensional regression and classification. We first establish sharp oracle inequalities that bound the excess risk of the ERM estimator relative to the best possible approximation achievable by any tree with at most $L$ leaves, thereby characterizing the interpretability-accuracy trade-off. We derive these results using a novel uniform concentration framework based on empirically localized Rademacher complexity. Furthermore, we derive minimax optimal rates over a novel function class: the piecewise sparse heterogeneous anisotropic Besov (PSHAB) space. This space explicitly captures three key structural features encountered in practice: sparsity, anisotropic smoothness, and spatial heterogeneity. While our main results are established under sub-Gaussianity, we also provide robust guarantees that hold under heavy-tailed noise settings. Together, these findings provide a principled foundation for the optimality of ERM trees and introduce empirical process tools broadly applicable to other highly adaptive, data-driven procedures.

Research Motivation and Objectives

  • Characterize the interpretability-accuracy trade-off for ERM trees by bounding excess risk relative to the best L-leaf tree.
  • Develop a uniform concentration framework using empirically localized Rademacher complexity to derive oracle inequalities.
  • Introduce PSHAB spaces to model sparsity, anisotropy, and spatial heterogeneity and establish minimax optimal rates for regression and classification.
  • Provide robustness guarantees under heavy-tailed noise and discuss sub-Gaussian assumptions.
  • Explain the limitations of prior dyadic or grid-based analyses and highlight the advantages of non-dyadic ERM tree analysis.
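
To make the Rademacher-complexity objective above more concrete, here is a minimal Python sketch that estimates the plain empirical Rademacher complexity of one-dimensional decision stumps by Monte Carlo. It is an illustration only, not the paper's framework: the function name `empirical_rademacher_stumps` and its parameters are hypothetical, and the localization step (restricting the supremum to functions with small empirical norm) is not shown.

```python
import numpy as np

def empirical_rademacher_stumps(x, n_draws=200, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity of
    1D decision stumps x -> s * sign(x - t), with s in {-1, +1}.

    Illustrative assumption: this is the plain (non-localized) quantity.
    """
    rng = np.random.default_rng(seed)
    xs = np.sort(np.asarray(x, dtype=float))
    n = len(xs)
    # Split j puts the first j sorted points on the left. It is realizable
    # by a threshold only if it does not separate tied values.
    valid = np.concatenate([[True], xs[1:] > xs[:-1], [True]])
    total = 0.0
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)  # Rademacher signs
        prefix = np.concatenate([[0.0], np.cumsum(sigma)])
        # Stump predicting -1 left of split j and +1 right of it:
        # correlation with sigma = (sum after j) - (sum before j).
        corr = (prefix[-1] - prefix) - prefix
        # The absolute value covers both orientations s = +1 and s = -1.
        total += np.max(np.abs(corr[valid])) / n
    return total / n_draws

# The estimate shrinks as the sample grows, since stumps are a small class.
small = empirical_rademacher_stumps(np.linspace(0, 1, 20))
large = empirical_rademacher_stumps(np.linspace(0, 1, 500))
print(small, large)
```

Because the supremum over stumps reduces to a maximum over at most n + 1 split positions, it can be computed exactly for each sign draw; only the expectation over signs is approximated.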

Proposed Method

  • Define ERM regression and classification tree estimators with leaf-wise constants under constraints and penalties.
  • Derive oracle inequalities bounding excess risk in terms of the best L-leaf approximation and an estimation error term with log(nd) factors.
  • Introduce the PSHAB function class to model piecewise anisotropic Besov regularity across tree cells.
  • Establish minimax optimal rates over PSHAB spaces for both regression and classification, including heavy-tailed noise scenarios.
  • Provide explicit rate expressions for the approximation errors in PSHAB (Theorem 5.5 and Theorem 5.6) and relate them to the generalization bounds (Theorem 6.1).
  • Compare to prior work and discuss how non-dyadic ERM trees achieve spatial adaptation without assuming grid-like partitions.
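
As a toy illustration of an ERM regression tree with leaf-wise constants and a leaf budget, the following Python sketch solves the one-dimensional case exactly. In 1D, a tree with at most L leaves is a partition of the line into at most L intervals, so the empirical risk minimizer can be found by dynamic programming. This is a simplified sketch under that assumption, not the paper's estimator or algorithm; `erm_tree_1d` and its arguments are hypothetical names, and the inputs are assumed to have distinct x values.

```python
import numpy as np

def erm_tree_1d(x, y, max_leaves):
    """Exact least-squares fit by a piecewise-constant function with at most
    `max_leaves` intervals (a 1D tree with at most that many leaves).

    Returns (empirical risk, list of (x_min, x_max, leaf_value)).
    Assumes distinct x values (illustrative simplification).
    """
    order = np.argsort(x)
    xs = np.asarray(x, dtype=float)[order]
    ys = np.asarray(y, dtype=float)[order]
    n = len(ys)
    # Prefix sums give the squared error of any segment in O(1).
    s1 = np.concatenate([[0.0], np.cumsum(ys)])
    s2 = np.concatenate([[0.0], np.cumsum(ys ** 2)])

    def sse(i, j):
        # Sum of squared errors of ys[i:j] around its own mean (j > i).
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # cost[k, j]: minimal SSE when fitting ys[:j] with exactly k leaves.
    cost = np.full((max_leaves + 1, n + 1), np.inf)
    cut = np.zeros((max_leaves + 1, n + 1), dtype=int)
    cost[0, 0] = 0.0
    for k in range(1, max_leaves + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1, i] + sse(i, j)
                if c < cost[k, j]:
                    cost[k, j], cut[k, j] = c, i
    # Pick the best leaf count within the budget, then backtrack the cuts.
    best_k = int(np.argmin(cost[1:, n])) + 1
    bounds, j = [], n
    for k in range(best_k, 0, -1):
        i = cut[k, j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    leaves = [(xs[i], xs[j - 1], ys[i:j].mean()) for i, j in bounds]
    return cost[best_k, n] / n, leaves

# A step function is fit perfectly once two leaves are allowed.
x = np.arange(8.0)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
risk_one_leaf, _ = erm_tree_1d(x, y, max_leaves=1)
risk_two_leaves, leaves = erm_tree_1d(x, y, max_leaves=2)
print(risk_one_leaf, risk_two_leaves, leaves)
```

Raising `max_leaves` can only lower the empirical risk, which is the in-sample side of the interpretability-accuracy trade-off; the oracle inequalities described above concern how much of that gain carries over to the population risk.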

Experimental Results

Research Questions

  • RQ1: Can ERM-based decision trees achieve optimal statistical performance under random design for regression and classification?
  • RQ2: How does the number of leaves L affect the excess risk and interpretability-accuracy trade-off?
  • RQ3: What are the minimax rates for ERM trees over the PSHAB space, and how do sparsity, anisotropy, and spatial heterogeneity influence them?
  • RQ4: Can we obtain robust guarantees under heavy-tailed noise beyond sub-Gaussian assumptions?
  • RQ5: How do non-dyadic tree partitions compare to traditional dyadic assumptions in terms of theoretical guarantees?

Key Findings

  • ERM regression trees satisfy oracle inequalities bounding excess risk by an oracle term plus an estimation error with a log(nd) factor, highlighting an interpretability-accuracy trade-off.
  • ERM classification trees satisfy analogous oracle inequalities under Tsybakov margin conditions, with rates depending on margin parameter ρ and density decay.
  • Introduction of PSHAB spaces enables minimax optimal convergence rates for ERM trees in both regression and classification, capturing sparsity, anisotropy, and spatial heterogeneity.
  • The analysis extends beyond sub-Gaussian noise to heavy-tailed settings, providing robustness guarantees for ERM trees.
  • The framework uses empirically localized Rademacher complexity to derive uniform concentration bounds for non-dyadic, highly adaptive tree estimators.
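
For orientation, the regression oracle inequality in the first finding can be written schematically as below. This is a sketch of the generic shape such bounds usually take, not the paper's exact statement: the notation ($\hat f_L$ for the ERM tree, $\mathcal{T}_L$ for trees with at most $L$ leaves, $\mathcal{E}$ for excess risk, constants $C_1, C_2$) is assumed here, and the $L/n$ scaling of the estimation term is an assumption based on typical results for $L$-cell partitions. Only the oracle-plus-estimation structure and the $\log(nd)$ factor come from the review.

$$
\mathcal{E}(\hat f_L) \;\le\; C_1 \inf_{f \in \mathcal{T}_L} \mathcal{E}(f) \;+\; C_2 \, \frac{L \log(nd)}{n}
$$

Read this way, increasing $L$ shrinks the first (approximation) term and inflates the second (estimation) term, which is the interpretability-accuracy trade-off in quantitative form.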


This review was generated by AI and checked by a human editor.