QUICK REVIEW

[Paper Review] Understanding and Improving Fast Adversarial Training

Maksym Andriushchenko, Nicolas Flammarion|arXiv (Cornell University)|Jul 6, 2020

Adversarial Robustness in Machine Learning|References: 47|Cited by: 45

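One-Sentence Summary

This paper analyzes why FGSM-based fast adversarial training leads to catastrophic overfitting and proposes GradAlign, a regularizer that improves robustness and narrows the gap to PGD-based training.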

ABSTRACT

A recent line of work focused on making adversarial training computationally efficient for deep learning models. In particular, Wong et al. (2020) showed that $\ell_\infty$-adversarial training with fast gradient sign method (FGSM) can fail due to a phenomenon called "catastrophic overfitting", when the model quickly loses its robustness over a single epoch of training. We show that adding a random step to FGSM, as proposed in Wong et al. (2020), does not prevent catastrophic overfitting, and that randomness is not important per se -- its main role being simply to reduce the magnitude of the perturbation. Moreover, we show that catastrophic overfitting is not inherent to deep and overparametrized networks, but can occur in a single-layer convolutional network with a few filters. In an extreme case, even a single filter can make the network highly non-linear locally, which is the main reason why FGSM training fails. Based on this observation, we propose a new regularization method, GradAlign, that prevents catastrophic overfitting by explicitly maximizing the gradient alignment inside the perturbation set and improves the quality of the FGSM solution. As a result, GradAlign allows to successfully apply FGSM training also for larger $\ell_\infty$-perturbations and reduce the gap to multi-step adversarial training. The code of our experiments is available at https://github.com/tml-epfl/understanding-fast-adv-training.

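Motivation and Objectives

Investigate when and why FGSM-based fast adversarial training yields robust models or instead suffers catastrophic overfitting.
Analyze the role of the random step in FGSM training and its practical effect on perturbation magnitude.
Relate catastrophic overfitting to gradient alignment and the local linearity of neural networks.
Propose GradAlign, which explicitly maximizes gradient alignment inside the perturbation set.
Evaluate GradAlign against other fast and multi-step adversarial training methods on CIFAR-10, SVHN, and ImageNet.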

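Proposed Method

Formalize adversarial training under the $\ell_\infty$ threat model and compare FGSM, FGSM with a random start (FGSM-RS), and PGD-based methods.
Introduce GradAlign, a gradient alignment regularizer that minimizes $1 - \cos(\theta)$, where $\theta$ is the angle between the gradients at $x$ and $x + \eta$, so that the two gradients align.
Analyze gradient alignment in a single-layer convolutional network to show how a single filter can induce non-linearity and overfitting.
Derive a theoretical bound showing that a random start reduces the expected perturbation length, and relate this to the quality of the linear approximation.
Empirically compare FGSM, FGSM-RS, FGSM+GradAlign, AT for Free, PGD-2, PGD-10, and other methods on CIFAR-10, SVHN, and ImageNet, using robustness against PGD-50-10 (PGD with 50 iterations and 10 random restarts) as the main evaluation metric.
Document training details and evaluation settings.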
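The attacks and the regularizer listed above can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a hypothetical two-unit model f(x) = v · tanh(Wx) with a logistic loss, computes the input gradient analytically, and draws eta uniformly from the $\ell_\infty$ ball of radius eps (an assumption; the summary above does not say how eta is sampled). All function names, weights, and the step size alpha are chosen for illustration only.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient(x, y, W, v):
    """Gradient of the logistic loss w.r.t. x for the toy model
    f(x) = v . tanh(W x), with label y in {0, 1}."""
    hidden = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]
    logit = sum(vi * hi for vi, hi in zip(v, hidden))
    dlogit = sigmoid(logit) - y                     # d loss / d logit
    grad = [0.0] * len(x)
    for vi, hi, row in zip(v, hidden, W):
        coeff = dlogit * vi * (1.0 - hi * hi)       # chain rule through tanh
        for j, w in enumerate(row):
            grad[j] += coeff * w
    return grad

def sign(g):
    return (g > 0) - (g < 0)

def fgsm(x, y, W, v, eps):
    """Single signed-gradient step of size eps from the clean input."""
    return [eps * sign(g) for g in input_gradient(x, y, W, v)]

def fgsm_rs(x, y, W, v, eps, alpha, rng):
    """Uniform random start, one signed step of size alpha, then
    clipping back into the l_inf ball of radius eps."""
    eta = [rng.uniform(-eps, eps) for _ in x]
    g = input_gradient([xj + e for xj, e in zip(x, eta)], y, W, v)
    return [max(-eps, min(eps, e + alpha * sign(gj))) for e, gj in zip(eta, g)]

def grad_align_penalty(x, y, W, v, eps, rng):
    """1 - cos(angle) between the input gradients at x and at x + eta.
    Assumes both gradients are non-zero."""
    eta = [rng.uniform(-eps, eps) for _ in x]
    g0 = input_gradient(x, y, W, v)
    g1 = input_gradient([xj + e for xj, e in zip(x, eta)], y, W, v)
    dot = sum(a * b for a, b in zip(g0, g1))
    norms = math.sqrt(sum(a * a for a in g0)) * math.sqrt(sum(b * b for b in g1))
    return 1.0 - dot / norms

# Hypothetical toy weights and input, for illustration only.
rng = random.Random(0)
W = [[0.5, -1.0, 0.3], [1.2, 0.4, -0.7]]
v = [1.0, -0.5]
x, y, eps = [0.2, -0.1, 0.4], 1, 0.1

delta = fgsm(x, y, W, v, eps)
delta_rs = fgsm_rs(x, y, W, v, eps, alpha=1.25 * eps, rng=rng)
penalty = grad_align_penalty(x, y, W, v, eps, rng)
print(delta, delta_rs, penalty)
```

In this toy model, a network with a single hidden unit always has an input gradient parallel to the same weight row, so the penalty vanishes up to rounding; it becomes positive only when several units make the model locally non-linear. In actual training the penalty would be added to the adversarial loss with a weight and differentiated with respect to the model parameters, which this sketch does not do.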

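Experimental Results

Research Questions

RQ1: Under what conditions can FGSM-based adversarial training avoid catastrophic overfitting?
RQ2: Does the randomness in FGSM-RS mainly reduce perturbation magnitude, or are other mechanisms at work?
RQ3: How does gradient alignment within the perturbation set relate to robustness and catastrophic overfitting?
RQ4: Can a regularizer that maximizes gradient alignment (GradAlign) prevent catastrophic overfitting and improve fast adversarial training without an expensive inner maximization?
RQ5: How does the proposed method compare with PGD-based adversarial training on standard benchmarks such as CIFAR-10, SVHN, and ImageNet?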

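Key Findings

FGSM and related fast adversarial training methods can suffer catastrophic overfitting; GradAlign prevents it and narrows the robustness gap to PGD-10.
FGSM-RS does not fundamentally solve catastrophic overfitting; simply reducing the FGSM step size achieves similar robustness without any randomness.
A random start reduces the expected perturbation length, which improves the quality of the linear approximation and explains part of the benefit of FGSM-RS.
Catastrophic overfitting is associated with a drop in gradient alignment and with misalignment between the FGSM and PGD directions.
GradAlign increases gradient alignment between $x$ and $x + \eta$, so that FGSM training remains robust at larger $\ell_\infty$ radii and approaches PGD-10 performance.
GradAlign also improves robustness when combined with PGD-2 and scales to ImageNet, although double backpropagation slows training.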
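The finding that a random start shortens the perturbation can be checked numerically. The sketch below is an illustration under stated assumptions rather than the paper's own experiment: the gradient sign in each coordinate is drawn independently of the random start, the dimension d = 3072 corresponds to a 32x32x3 image, and eps = 8/255 with alpha = 1.25·eps are assumed values not given in the text above. Under these assumptions a direct calculation gives E[delta_i^2] = -alpha^3/(6·eps) + alpha^2/2 + eps^2/3 per coordinate for alpha in [0, 2·eps], and by Jensen's inequality the square root of d times this value upper-bounds the expected $\ell_2$ norm.

```python
import math
import random

def fgsm_rs_norm(d, eps, alpha, rng):
    """L2 norm of one simulated FGSM-RS perturbation in d dimensions.
    Assumption: gradient signs are independent of the random start."""
    total = 0.0
    for _ in range(d):
        eta = rng.uniform(-eps, eps)                  # random start
        s = rng.choice((-1.0, 1.0))                   # stand-in gradient sign
        delta = max(-eps, min(eps, eta + alpha * s))  # step, then clip
        total += delta * delta
    return math.sqrt(total)

def rms_bound(d, eps, alpha):
    """sqrt(E||delta||_2^2) for alpha in [0, 2*eps]; by Jensen's
    inequality this upper-bounds E||delta||_2."""
    per_coord = -alpha**3 / (6 * eps) + alpha**2 / 2 + eps**2 / 3
    return math.sqrt(d * per_coord)

rng = random.Random(0)
d, eps = 3072, 8 / 255          # assumed image size and radius
alpha = 1.25 * eps              # assumed FGSM-RS step size

samples = [fgsm_rs_norm(d, eps, alpha, rng) for _ in range(200)]
mean_norm = sum(samples) / len(samples)
fgsm_norm = math.sqrt(d) * eps  # plain FGSM always lands on a corner

print(mean_norm, rms_bound(d, eps, alpha), fgsm_norm)
```

With these assumed values the closed form evaluates to roughly 0.89·sqrt(d)·eps, compared with exactly sqrt(d)·eps for plain FGSM. Setting alpha = 2·eps recovers the FGSM length, since every coordinate is then clipped to the boundary.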

This review was generated by AI and reviewed by human editors.