QUICK REVIEW

[Paper Review] Improved Generalization Bounds for Robust Learning

Idan Attias, Aryeh Kontorovich|arXiv (Cornell University)|Oct 4, 2018

Machine Learning and Algorithms | References: 59 | Cited by: 15

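One-Sentence Summary

This paper gives improved generalization bounds for robust learning in an adversarial setting, modeling the learner-adversary interaction as a zero-sum game in which the adversary has at most k possible corruptions per input. It tightens the earlier binary classification bound from O(1/ε⁴ log(|H|/δ)) to O(1/ε²(√(k·VC(H)) log^{3/2+α}(k·VC(H)) + log(1/δ))), extends the results to multiclass and regression tasks, and along the way provides a new analysis of the fat-shattering dimension and Rademacher complexity of k-fold maxima over function classes.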

ABSTRACT

We consider a model of robust learning in an adversarial environment. The learner gets uncorrupted training data with access to possible corruptions that may be effected by the adversary during testing. The learner's goal is to build a robust classifier, which will be tested on future adversarial examples. The adversary is limited to $k$ possible corruptions for each input. We model the learner-adversary interaction as a zero-sum game. This model is closely related to the adversarial examples model of Schmidt et al. (2018); Madry et al. (2017). Our main results consist of generalization bounds for the binary and multiclass classification, as well as the real-valued case (regression). For the binary classification setting, we both tighten the generalization bound of Feige, Mansour, and Schapire (2015), and are also able to handle infinite hypothesis classes. The sample complexity is improved from $O(\frac{1}{\epsilon^4}\log(\frac{|\mathcal{H}|}{\delta}))$ to $O\big(\frac{1}{\epsilon^2}\big(\sqrt{k \mathrm{VC}(\mathcal{H})}\log^{\frac{3}{2}+\alpha}(k\mathrm{VC}(\mathcal{H}))+\log(\frac{1}{\delta})\big)\big)$ for any $\alpha > 0$. Additionally, we extend the algorithm and generalization bound from the binary to the multiclass and real-valued cases. Along the way, we obtain results on fat-shattering dimension and Rademacher complexity of $k$-fold maxima over function classes; these may be of independent interest. For binary classification, the algorithm of Feige et al. (2015) uses a regret minimization algorithm and an ERM oracle as a black box; we adapt it for the multiclass and regression settings. The algorithm provides us with near-optimal policies for the players on a given training sample.

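Motivation and Objectives

Address generalization when learning under adversarial corruption, where the adversary can apply at most k possible corruptions to each input.
Improve existing generalization bounds for binary classification, in particular the O(1/ε⁴ log(|H|/δ)) bound of Feige et al. (2015), in order to reduce sample complexity.
Extend the robust learning framework from binary classification to the multiclass and real-valued (regression) settings.
Analyze the fat-shattering dimension and Rademacher complexity of k-fold maxima over function classes under adversarial corruption, which are the key ingredients in deriving the new bounds.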
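
To make the term "k-fold maximum" concrete, one way to write it down is sketched here. The notation ($\rho$, $z_j$, $\ell_{\mathrm{rob}}$, $\mathcal{F}_j$) is assumed for illustration and may differ from the paper's; the sketch also takes a single deterministic hypothesis $h$, whereas the game-theoretic setting allows mixed strategies.

```latex
% Assumed notation: \rho(x) = \{z_1, \dots, z_k\} is the set of k possible
% corruptions of the input x, and \ell is the base loss.
\[
  \ell_{\mathrm{rob}}\bigl(h; (x, y)\bigr)
  \;=\; \max_{z \in \rho(x)} \ell\bigl(h(z), y\bigr)
  \;=\; \max_{j \in [k]} \ell\bigl(h(z_j), y\bigr)
\]
% The k-fold maximum of function classes F_1, ..., F_k:
\[
  \max(\mathcal{F}_1, \dots, \mathcal{F}_k)
  \;=\; \bigl\{\, x \mapsto \max_{j \in [k]} f_j(x) \;:\; f_j \in \mathcal{F}_j \,\bigr\}
\]
```

Under this reading, the robust loss class is a k-fold maximum of copies of the ordinary loss class, one copy per corruption, which is why complexity measures of such maxima control the robust generalization error.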

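Proposed Method

Model the learner-adversary interaction as a zero-sum game in which the adversary applies at most k possible corruptions to each input.
Adapt the regret minimization framework of Feige et al. (2015) to the multiclass and regression settings, using an ERM oracle as a black box.
Derive generalization bounds through a refined analysis of the fat-shattering dimension and Rademacher complexity of k-fold maxima over the hypothesis class.
Give a new sample complexity bound for binary classification of order O(1/ε²(√(k·VC(H)) log^{3/2+α}(k·VC(H)) + log(1/δ))).
Establish a theoretical link between the robust generalization error and complexity measures of the hypothesis class under k-fold corruption.
Through the adapted regret minimization algorithm, obtain near-optimal policies for both the learner and the adversary on a given training sample.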
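
The interplay of regret minimization and an ERM oracle described above can be illustrated with a minimal Python sketch. It is a simplified toy variant, not the algorithm of Feige et al. (2015) or of this paper: it assumes a small finite hypothesis class searched by brute force, 0-1 loss, and a multiplicative-weights adversary that keeps one weight vector over the k corruptions of each training point. All names (`erm_oracle`, `robust_game`, `robust_loss`, `eta`, `rounds`) are hypothetical.

```python
import math


def erm_oracle(hypotheses, weighted_sample):
    """Brute-force ERM: return the hypothesis with the smallest weighted 0-1 loss."""
    def weighted_loss(h):
        return sum(w * (h(z) != y) for z, y, w in weighted_sample)
    return min(hypotheses, key=weighted_loss)


def robust_game(hypotheses, corruptions, labels, rounds=200, eta=0.5):
    """Toy learner-vs-adversary loop (assumed simplification, see lead-in).

    corruptions[i] lists the k corrupted versions of training point i.
    Returns the hypotheses played; the learner's mixed strategy is the
    uniform distribution over this list.
    """
    n, k = len(corruptions), len(corruptions[0])
    weights = [[1.0] * k for _ in range(n)]  # adversary's weights per point
    played = []
    for _ in range(rounds):
        # Adversary's current mixed strategy defines a weighted sample.
        sample = []
        for i in range(n):
            total = sum(weights[i])
            for j in range(k):
                sample.append((corruptions[i][j], labels[i], weights[i][j] / total))
        # Learner best-responds through the ERM oracle.
        h = erm_oracle(hypotheses, sample)
        played.append(h)
        # Multiplicative-weights update: boost corruptions that fooled h.
        for i in range(n):
            for j in range(k):
                if h(corruptions[i][j]) != labels[i]:
                    weights[i][j] *= math.exp(eta)
            top = max(weights[i])  # rescale to keep the numbers bounded
            weights[i] = [w / top for w in weights[i]]
    return played


def robust_loss(played, corruptions, labels):
    """Empirical robust loss of the uniform mixture over the played hypotheses."""
    total = 0.0
    for zs, y in zip(corruptions, labels):
        # The adversary picks the corruption with the highest expected loss.
        total += max(sum(h(z) != y for h in played) / len(played) for z in zs)
    return total / len(labels)
```

A usage example on one-dimensional threshold classifiers, where each point has k = 2 corruptions:

```python
thresholds = [lambda x, t=t: int(x >= t) for t in (0, 1, 2, 3, 4)]
corruptions = [(0.1, 0.4), (0.5, 0.9), (2.1, 2.5), (3.0, 3.3)]
labels = [0, 0, 1, 1]
played = robust_game(thresholds, corruptions, labels, rounds=50)
print(robust_loss(played, corruptions, labels))
```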

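Results

Research Questions

RQ1: Can the O(1/ε⁴ log(|H|/δ)) bound of Feige et al. (2015) be improved to give a tighter generalization bound for robust binary classification?
RQ2: How can the robust learning framework be extended from binary classification to the multiclass and regression settings while keeping strong generalization guarantees?
RQ3: What are the fat-shattering dimension and Rademacher complexity of k-fold maxima over function classes under adversarial corruption?
RQ4: What is the optimal sample complexity of robust learning when the adversary has at most k possible corruptions per input?
RQ5: Can the regret minimization algorithm be adapted to the multiclass and regression settings so that it yields near-optimal policies under adversarial corruption?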

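Key Findings

The generalization bound for binary classification improves from O(1/ε⁴ log(|H|/δ)) to O(1/ε²(√(k·VC(H)) log^{3/2+α}(k·VC(H)) + log(1/δ))) for any α > 0.
The new bound holds for both finite and infinite hypothesis classes, removing the earlier restriction to finite classes.
The framework extends to multiclass classification and regression, with generalization bounds for both settings.
The paper derives new theoretical results on the fat-shattering dimension and Rademacher complexity of k-fold maxima over function classes.
The algorithm produces near-optimal policies for both the learner and the adversary on a given training sample.
The analysis shows that, under k-fold corruption, the complexity of the hypothesis class is governed by √(k·VC(H)) together with logarithmic factors, and these terms determine the improved sample complexity.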
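
As a rough illustration of how the two bounds scale, the following Python sketch evaluates the expressions inside the O(·) as quoted above. It is a hedged sketch only: hidden constants are ignored, so the absolute numbers are not meaningful sample sizes, and the function names and the default `alpha=0.1` are assumptions made for the example.

```python
import math


def old_bound(eps, delta, h_size):
    """Expression inside O(.) for the earlier bound: (1/eps^4) * log(|H|/delta)."""
    return math.log(h_size / delta) / eps ** 4


def new_bound(eps, delta, k, vc, alpha=0.1):
    """Expression inside O(.) for the improved bound as quoted in this review.

    (1/eps^2) * (sqrt(k*VC) * log^(3/2+alpha)(k*VC) + log(1/delta)),
    with all hidden constants dropped (assumption for illustration).
    """
    c = k * vc
    return (math.sqrt(c) * math.log(c) ** (1.5 + alpha) + math.log(1 / delta)) / eps ** 2


# Halving eps multiplies the old expression by 16 but the new one only by 4.
ratio_old = old_bound(0.05, 0.01, 1000) / old_bound(0.1, 0.01, 1000)
ratio_new = new_bound(0.05, 0.01, 10, 5) / new_bound(0.1, 0.01, 10, 5)
print(ratio_old, ratio_new)
```

The comparison isolates the dependence on ε: the improvement from 1/ε⁴ to 1/ε² means that tightening the target accuracy is far cheaper under the new bound, independently of the complexity term.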

This review was generated by AI and checked by human editors.