QUICK REVIEW

[Paper Review] Robust Decision Trees Against Adversarial Examples

Hongge Chen, Huan Zhang|arXiv (Cornell University)|Feb 27, 2019

Adversarial Robustness in Machine Learning|Cited by 40

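One-Sentence Summary

This paper establishes a general robust training framework that defends decision trees and boosted trees against adversarial perturbations: it formalizes a max-min objective and approximates the robust split so that training remains scalable in practice.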

ABSTRACT

Although adversarial examples and model robustness have been extensively studied in the context of linear models and neural networks, research on this issue in tree-based models and how to make tree-based models robust against adversarial examples is still limited. In this paper, we show that tree based models are also vulnerable to adversarial examples and develop a novel algorithm to learn robust trees. At its core, our method aims to optimize the performance under the worst-case perturbation of input features, which leads to a max-min saddle point problem. Incorporating this saddle point objective into the decision tree building procedure is non-trivial due to the discrete nature of trees --- a naive approach to finding the best split according to this saddle point objective will take exponential time. To make our approach practical and scalable, we propose efficient tree building algorithms by approximating the inner minimizer in this saddle point problem, and present efficient implementations for classical information gain based trees as well as state-of-the-art tree boosting models such as XGBoost. Experimental results on real world datasets demonstrate that the proposed algorithms can substantially improve the robustness of tree-based models against adversarial examples.

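Motivation and Objectives

Show that tree-based models, like neural networks, are vulnerable to adversarial examples.
Propose a general robust training framework for decision trees that targets worst-case input perturbations.
Develop scalable approximations of the robust split that are compatible with classical information gain trees and with modern GBDT methods such as XGBoost.
Evaluate the robustness gains on real-world datasets under several attack methods.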

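Proposed Method

Formalizes robust split selection as a max-min optimization over perturbations within an ℓ∞ ball around each sample.
Defines an ambiguity set ΔI consisting of the points whose j-th feature lies close to the split threshold, and introduces binary assignment variables to model the worst-case perturbation.
Provides two practical approximations: (1) a robust split for information gain trees, with complexity O(d|I|^2) per split; (2) a robust split for boosted trees, which evaluates four representative cases for efficiency.
Extends the framework to gradient boosted decision trees by expressing the split score through a second-order Taylor expansion of the loss, with XGBoost-style scoring.
Provides algorithms (Algorithm 1 for robust information gain trees and Algorithm 2 for robust boosted trees) that make training scalable.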
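The boosted-tree approximation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's Algorithm 2: the function names are hypothetical, the gain is the usual XGBoost-style structure score with the constant 1/2 and the complexity penalty omitted, and the four cases are assumed here to be "no perturbation", "all ambiguous points moved left", "all ambiguous points moved right", and "ambiguous points swapped across the threshold", since the summary does not enumerate them.

```python
import numpy as np

def xgb_gain(g_left, h_left, g_right, h_right, lam=1.0):
    """XGBoost-style split gain from summed gradients g and Hessians h.
    The constant factor 1/2 and the gamma penalty are omitted (assumption)."""
    def score(g, h):
        return g * g / (h + lam)
    return (score(g_left, h_left) + score(g_right, h_right)
            - score(g_left + g_right, h_left + h_right))

def robust_split_score(x, g, h, eta, eps, lam=1.0):
    """Approximate worst-case gain of splitting one feature at threshold eta
    when every value in x may be shifted by at most eps."""
    x, g, h = (np.asarray(a, dtype=float) for a in (x, g, h))
    ambiguous = np.abs(x - eta) <= eps      # ambiguity set: points that may cross eta
    natural_left = x < eta                  # unperturbed assignment
    sure_left = natural_left & ~ambiguous   # points that stay left under any perturbation

    def gain(left):
        return xgb_gain(g[left].sum(), h[left].sum(),
                        g[~left].sum(), h[~left].sum(), lam)

    # Four representative assignments of the ambiguous points (assumed cases).
    cases = {
        "no_perturbation": natural_left,
        "all_ambiguous_left": sure_left | ambiguous,
        "all_ambiguous_right": sure_left,
        "ambiguous_swapped": sure_left | (ambiguous & ~natural_left),
    }
    gains = {name: gain(mask) for name, mask in cases.items()}
    return min(gains.values()), gains, ambiguous

def best_robust_threshold(x, g, h, eps, lam=1.0):
    """Outer maximization: pick the midpoint threshold with the best worst-case gain.
    Assumes x contains at least two distinct values."""
    xs = np.unique(np.asarray(x, dtype=float))
    candidates = (xs[:-1] + xs[1:]) / 2.0
    return max((robust_split_score(x, g, h, eta, eps, lam)[0], float(eta))
               for eta in candidates)

# Toy data: squared loss with a constant prediction of 0.5, so g = 0.5 - y and h = 1.
x = [0.0, 0.1, 0.45, 0.55, 0.9, 1.0]
y = np.array([0, 0, 0, 1, 1, 1])
g, h = 0.5 - y, np.ones(6)
worst, per_case, ambiguous = robust_split_score(x, g, h, eta=0.5, eps=0.1)
best_score, best_eta = best_robust_threshold(x, g, h, eps=0.1)
```

On this toy data the natural gain is highest at the threshold 0.5, but two points sit within eps of it, so its worst-case gain drops sharply and the robust criterion prefers a threshold with no points nearby.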

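Experimental Results

Research Questions

RQ1: Are tree-based models also vulnerable to adversarial perturbations similar to those that affect neural networks?
RQ2: Can we define a robust max-min objective for split decisions that improves adversarial robustness without prohibitive computational cost?
RQ3: How can robust splitting be implemented efficiently for classical decision trees and for modern boosting frameworks such as XGBoost?
RQ4: What is the empirical effect of robust training on robustness and accuracy on real-world datasets?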

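Key Findings

Tree-based models, like neural networks, are vulnerable to adversarial examples.
A max-min robust training objective can be built into the tree construction procedure to optimize worst-case performance under perturbation.
Efficient approximations make robust splitting feasible for both information gain trees and boosted trees, so robust training scales to large models such as those trained with XGBoost.
Experiments show that robust trees are substantially more robust under adversarial attack, and their test accuracy is sometimes comparable to, or even higher than, that of naturally trained trees.
Robust training increases the ℓ∞-norm perturbation required to cause a successful adversarial misclassification, which indicates improved robustness.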
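To make the last metric concrete, the sketch below computes the smallest ℓ∞ perturbation that changes the prediction of a single small tree by enumerating its leaf regions. It is an illustration only: the tuple encoding of the tree and all function names are hypothetical, and this exhaustive enumeration is not claimed to be one of the attack methods used in the paper, which this summary does not name.

```python
import math

# Hypothetical encoding: an internal node is (feature, threshold, left, right),
# a leaf is a class label. A sample goes left when x[feature] < threshold.

def predict(node, x):
    """Follow the splits from the root down to a leaf label."""
    while isinstance(node, tuple):
        feat, thr, left, right = node
        node = left if x[feat] < thr else right
    return node

def leaf_boxes(node, box=None):
    """Yield (box, label) per leaf; box maps feature -> (low, high), low <= x < high."""
    box = box or {}
    if not isinstance(node, tuple):
        yield box, node
        return
    feat, thr, left, right = node
    lo, hi = box.get(feat, (-math.inf, math.inf))
    yield from leaf_boxes(left, {**box, feat: (lo, min(hi, thr))})
    yield from leaf_boxes(right, {**box, feat: (max(lo, thr), hi)})

def min_linf_to_flip(tree, x):
    """Infimum of the l_inf distance from x to any leaf region with another label.
    Returns math.inf when no such region exists."""
    label = predict(tree, x)
    best = math.inf
    for box, leaf_label in leaf_boxes(tree):
        if leaf_label == label:
            continue
        if any(lo >= hi for lo, hi in box.values()):
            continue  # contradictory constraints: this leaf is unreachable
        # Per-feature distance to the interval; the l_inf distance is the largest one.
        dist = max((max(lo - x[f], x[f] - hi, 0.0) for f, (lo, hi) in box.items()),
                   default=0.0)
        best = min(best, dist)
    return best

# Two-feature toy tree: split on feature 0 at 0.5, then on feature 1 at 0.3.
tree = (0, 0.5, (1, 0.3, 0, 1), 1)
sample = [0.2, 0.1]
needed = min_linf_to_flip(tree, sample)  # about 0.2: raise feature 1 up to 0.3
```

Because the upper bound of each region is open, the value is an infimum: reaching a region from above its upper bound requires a perturbation marginally larger than the returned distance. A tree whose thresholds lie farther from the data yields a larger value, which is the sense in which the reported ℓ∞ distances indicate robustness.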


This review was generated by AI and checked by a human editor.