QUICK REVIEW

[Paper Review] Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin

Colin Wei, Tengyu Ma|arXiv (Cornell University)|Oct 9, 2019

Adversarial Robustness in Machine Learning|References: 63|Cited by: 28

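One-Sentence Summary

This paper introduces the all-layer margin, a new generalization measure for deep neural networks that normalizes the margin across all layers and yields tighter generalization bounds without an exponential dependence on depth. The approach improves sample complexity for both standard and robust generalization, and the authors propose a training algorithm (AMO) that explicitly increases this margin and improves test accuracy on both clean and adversarial examples.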

ABSTRACT

For linear classifiers, the relationship between (normalized) output margin and generalization is captured in a clear and simple bound -- a large output margin implies good generalization. Unfortunately, for deep models, this relationship is less clear: existing analyses of the output margin give complicated bounds which sometimes depend exponentially on depth. In this work, we propose to instead analyze a new notion of margin, which we call the "all-layer margin." Our analysis reveals that the all-layer margin has a clear and direct relationship with generalization for deep models. This enables the following concrete applications of the all-layer margin: 1) by analyzing the all-layer margin, we obtain tighter generalization bounds for neural nets which depend on Jacobian and hidden layer norms and remove the exponential dependency on depth 2) our neural net results easily translate to the adversarially robust setting, giving the first direct analysis of robust test error for deep networks, and 3) we present a theoretically inspired training algorithm for increasing the all-layer margin. Our algorithm improves both clean and adversarially robust test performance over strong baselines in practice.

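Motivation and Objectives

Establish for deep networks the clear, interpretable relationship between margin and generalization that already exists for linear models.
Overcome the exponential depth dependence and complicated normalization factors found in existing generalization bounds.
Extend margin analysis to adversarially robust classification, giving the first direct generalization bound on robust test error.
Develop a theoretically grounded training algorithm that improves generalization by maximizing the all-layer margin.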

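Proposed Method

Introduce the all-layer margin, defined as the smallest margin across all layers, normalized by a per-layer complexity measure (such as weight norms or covering numbers).
Derive a generalization bound (Theorem 2.3) whose form mirrors the linear case: test error scales with the average of (sum of complexities / all-layer margin)², with no exponential dependence on depth.
Lower-bound the all-layer margin in terms of the output margin and local Lipschitz constants, which yields tighter, data-dependent generalization bounds.
Extend the all-layer margin to the adversarial setting by defining a robust all-layer margin over perturbed inputs within an ℓ∞ ball.
Apply the same generalization framework to robust classification, obtaining a bound in which the data-dependent terms are replaced by their worst-case values over the adversarial neighborhood.
Develop an adversarially regularized training algorithm (AMO) that optimizes perturbations during backpropagation to maximize the all-layer margin.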
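
The perturbation step behind an AMO-style update can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each layer's output receives an additive perturbation scaled by the norm of that layer's input, that the perturbations are found by a few steps of gradient ascent on the loss minus a squared-norm penalty, and it uses finite differences in place of backpropagation to stay dependency-light. The function names, the tiny two-layer ReLU network, the penalty weight `lam`, and all numeric values are hypothetical. The weight update that would follow (a descent step on the perturbed loss) is omitted.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def perturbed_logits(params, x, deltas):
    """Two-layer forward pass with a perturbation added after each layer.

    Assumption: each perturbation is scaled by the norm of the layer's input.
    """
    W1, W2 = params
    d1, d2 = deltas
    h1 = relu(W1 @ x) + d1 * np.linalg.norm(x)
    return W2 @ h1 + d2 * np.linalg.norm(h1)

def cross_entropy(logits, y):
    z = logits - logits.max()  # shift for numerical stability
    return float(np.log(np.exp(z).sum()) - z[y])

def inner_objective(params, x, y, deltas, lam):
    # Loss on the perturbed forward pass minus a penalty on perturbation size.
    penalty = sum(float(np.sum(d ** 2)) for d in deltas)
    return cross_entropy(perturbed_logits(params, x, deltas), y) - lam * penalty

def numerical_grad(f, v, eps=1e-5):
    # Central finite differences; a stand-in for backpropagation.
    g = np.zeros_like(v)
    for i in range(v.size):
        e = np.zeros_like(v)
        e[i] = eps
        g[i] = (f(v + e) - f(v - e)) / (2 * eps)
    return g

def ascend_perturbations(params, x, y, t=3, eta_perturb=0.05, lam=1.0):
    """Run t gradient-ascent steps on the perturbations, starting from zero."""
    W1, W2 = params
    n1 = W1.shape[0]
    flat = np.zeros(n1 + W2.shape[0])

    def unpack(v):
        return v[:n1], v[n1:]

    def objective(v):
        return inner_objective(params, x, y, unpack(v), lam)

    for _ in range(t):
        flat = flat + eta_perturb * numerical_grad(objective, flat)
    return unpack(flat)

# Hypothetical toy network and input, chosen only for illustration.
W1 = np.array([[0.5, -0.2, 0.1],
               [0.3, 0.4, -0.5],
               [-0.6, 0.2, 0.3],
               [0.1, 0.1, 0.1]])
W2 = np.array([[0.2, -0.1, 0.3, 0.0],
               [-0.3, 0.4, 0.1, 0.2],
               [0.1, 0.1, -0.2, -0.4]])
params = (W1, W2)
x = np.array([0.6, 0.8, 0.0])
y = 1

zero = (np.zeros(4), np.zeros(3))
deltas = ascend_perturbations(params, x, y)
clean_loss = cross_entropy(perturbed_logits(params, x, zero), y)
perturbed_loss = cross_entropy(perturbed_logits(params, x, deltas), y)
print(f"clean loss {clean_loss:.4f}, perturbed loss {perturbed_loss:.4f}")
```

In this sketch, `t` and `eta_perturb` play the role of the number of perturbation steps $t$ and the perturbation learning rate $\eta_{\text{perturb}}$ mentioned in the findings. A training loop would then take a gradient step on the weights using the perturbed loss, ideally with an autodiff framework rather than finite differences.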

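Experimental Results

Research Questions

RQ1: Can a unified notion of margin be defined for deep networks that characterizes generalization as cleanly as it does for linear models?
RQ2: Can generalization bounds for deep networks be derived without an exponential dependence on network depth?
RQ3: Can the all-layer margin framework be extended to provide generalization guarantees for adversarially robust models?
RQ4: Can a training algorithm based on maximizing the all-layer margin improve both clean and robust test performance?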

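Key Findings

The all-layer margin yields a generalization bound (Theorem 2.3) that avoids exponential depth dependence and has a form similar to the linear case.
For ReLU networks, the proposed bound removes the dependence on inverse pre-activations present in prior work (Nagarajan & Kolter, 2019), giving tighter and more practical guarantees.
The generalization bound for robust classification (Theorem 4.1) is the first direct analysis of robust test error; it has the same structure as the clean bound but uses worst-case quantities over the adversarial neighborhood.
The proposed AMO training algorithm reduces clean test error by 0.6 percentage points (from 5.66% to 5.06%) with VGG-19 on CIFAR-10, and reduces robust error by up to 0.98 percentage points with WideResNet28-10.
On CIFAR-100 with WRN28-10, AMO lowers error in robust generalization by 0.99 percentage points relative to dropout (from 18.77% to 17.78%), even when the dropout probability is tuned.
The method is robust to hyperparameter choices: performance stays stable across different numbers of perturbation steps $t$ and perturbation learning rates $\eta_{\text{perturb}}$.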
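
The reported improvements are differences in error rates, so it helps to separate the absolute drop (in percentage points) from the relative drop (as a fraction of the baseline error). The short check below recomputes both from the figures quoted above; the helper names are hypothetical and the numbers are taken only from this summary.

```python
def absolute_drop(before_pct, after_pct):
    """Error reduction in percentage points."""
    return before_pct - after_pct

def relative_drop(before_pct, after_pct):
    """Error reduction as a percentage of the baseline error."""
    return 100.0 * (before_pct - after_pct) / before_pct

# (baseline error %, AMO error %) as quoted in the findings above.
reported = {
    "CIFAR-10 / VGG-19": (5.66, 5.06),
    "CIFAR-100 / WRN28-10 vs. tuned dropout": (18.77, 17.78),
}

for name, (before, after) in reported.items():
    print(
        f"{name}: {absolute_drop(before, after):.2f} points absolute, "
        f"{relative_drop(before, after):.1f}% relative"
    )
```

Reading the gains this way makes clear that a change of under one percentage point can still be a noticeable relative reduction when the baseline error is already small.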

This review was generated by AI and checked by a human editor.