[Paper Review] Fast is better than free: Revisiting adversarial training

Eric Wong, Leslie Rice|arXiv (Cornell University)|Jan 12, 2020
Adversarial Robustness in Machine Learning | References: 46 | Cited by: 485
One-sentence summary

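This paper shows that FGSM adversarial training, when combined with random initialization, reaches robustness comparable to PGD-based training at a far lower cost, and that standard fast-training techniques substantially speed up the learning of robust models, although a failure mode called catastrophic overfitting can occur.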

ABSTRACT

Adversarial training, a method for learning robust deep networks, is typically assumed to be more expensive than traditional training due to the necessity of constructing adversarial examples via a first-order method like projected gradient descent (PGD). In this paper, we make the surprising discovery that it is possible to train empirically robust models using a much weaker and cheaper adversary, an approach that was previously believed to be ineffective, rendering the method no more costly than standard training in practice. Specifically, we show that adversarial training with the fast gradient sign method (FGSM), when combined with random initialization, is as effective as PGD-based training but has significantly lower cost. Furthermore we show that FGSM adversarial training can be further accelerated by using standard techniques for efficient training of deep networks, allowing us to learn a robust CIFAR10 classifier with 45% robust accuracy to PGD attacks with $\epsilon=8/255$ in 6 minutes, and a robust ImageNet classifier with 43% robust accuracy at $\epsilon=2/255$ in 12 hours, in comparison to past work based on "free" adversarial training which took 10 and 50 hours to reach the same respective thresholds. Finally, we identify a failure mode referred to as "catastrophic overfitting" which may have caused previous attempts to use FGSM adversarial training to fail. All code for reproducing the experiments in this paper as well as pretrained model weights are at https://github.com/locuslab/fast_adversarial.

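Motivation and Objectives

  • Make adversarial training a cheaper and faster route to empirically robust deep networks.
  • Evaluate whether a weak adversary (FGSM) can achieve robustness comparable to a strong adversary (PGD).
  • Incorporate DAWNBench-inspired techniques (cyclic learning rates, mixed precision) to accelerate adversarial training.
  • Identify the failure modes that hinder FGSM-based robustness and propose remedies.
  • Demonstrate practical robustness and training speed on the CIFAR-10 and ImageNet benchmarks.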

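Proposed Method

  • Formulate adversarial training as a robust optimization problem under l_infty perturbations of radius epsilon.
  • Generate the adversarial examples used for training with FGSM from a random initialization.
  • Combine the random start with a tuned FGSM step size (e.g., alpha = 1.25 * epsilon) to improve robustness.
  • Apply DAWNBench-inspired training speedups: cyclic learning rates and mixed-precision arithmetic.
  • Evaluate robustness against strong PGD attacks, and validate across different epsilon values on MNIST/CIFAR-10/ImageNet.
  • Identify and analyze catastrophic overfitting as a failure mode, and propose an early-stopping-based countermeasure.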
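
The FGSM step with random initialization listed above can be illustrated with a minimal sketch. It is not the authors' implementation (that lives in the linked repository); it assumes a toy logistic-regression model in pure Python so that the input gradient has a closed form, and the function names (`fgsm_random_init`, `fgsm_train_step`) are hypothetical. Clipping the perturbed input to a valid pixel range is omitted for brevity.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)


def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def bce_loss(w, b, x, y):
    # Binary cross-entropy for a single example with label y in {0, 1}.
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def input_gradient(w, b, x, y):
    # Gradient of the loss with respect to the input x (toy model only).
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]


def fgsm_random_init(w, b, x, y, eps, alpha, rng):
    # Start from a uniformly random point in the l_inf ball of radius eps.
    delta = [rng.uniform(-eps, eps) for _ in x]
    x_start = [xi + di for xi, di in zip(x, delta)]
    grad = input_gradient(w, b, x_start, y)
    # Take one signed-gradient step of size alpha, then project back
    # onto the l_inf ball so that |delta_i| <= eps.
    return [max(-eps, min(eps, di + alpha * sign(gi)))
            for di, gi in zip(delta, grad)]


def fgsm_train_step(w, b, x, y, eps, alpha, lr, rng):
    # Inner step: build the adversarial example.
    delta = fgsm_random_init(w, b, x, y, eps, alpha, rng)
    x_adv = [xi + di for xi, di in zip(x, delta)]
    # Outer step: one SGD update of the parameters on the adversarial example.
    err = predict(w, b, x_adv) - y
    new_w = [wi - lr * err * xa for wi, xa in zip(w, x_adv)]
    new_b = b - lr * err
    return new_w, new_b, delta
```

With alpha = 1.25 * epsilon the step is deliberately larger than the ball radius, so the projection is what keeps the perturbation feasible; the random start means the final perturbation does not always land on a corner of the ball.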

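Experimental Results

Research Questions

  • RQ1: Can FGSM adversarial training with random initialization achieve empirical robustness comparable to PGD-based adversarial training?
  • RQ2: How do training speedups such as cyclic learning rates and mixed precision affect the efficiency and robustness of adversarial training?
  • RQ3: How do the initialization and the step size affect FGSM-based robustness, and which failure modes ("catastrophic overfitting") can arise?
  • RQ4: How does the fast FGSM-based method perform on CIFAR-10 and ImageNet when evaluated against strong PGD attacks?
  • RQ5: What are practical guidelines for obtaining a robust model in minimal training time?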

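Key Findings

  • On CIFAR-10, FGSM adversarial training with random initialization reaches robustness comparable to PGD-based training at a fraction of the cost.
  • Cyclic learning rates and mixed-precision training accelerate convergence, so that a robust CIFAR-10 model can be trained in minutes and a robust ImageNet model in hours.
  • On CIFAR-10 at epsilon = 8/255, robust accuracy against PGD is roughly the same as in prior PGD-based work, with substantially less training time.
  • At epsilon = 2/255, FGSM with the fast-training techniques yields a robust ImageNet model with robustness similar to prior methods in about 12 hours.
  • When FGSM perturbations are pushed to the boundary of the perturbation region or zero initialization is used, a failure mode called catastrophic overfitting can occur; early stopping based on PGD accuracy can recover robustness.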
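
The early-stopping remedy in the last finding can be sketched as a small control loop. This is an illustrative sketch rather than the authors' code: the callbacks `train_epoch` and `pgd_accuracy` are hypothetical placeholders for one epoch of FGSM training and for a PGD robust-accuracy check on a small batch, and the `max_drop` threshold of 0.2 is an assumed value, not one stated in the summary.

```python
import copy


def train_with_early_stopping(state, train_epoch, pgd_accuracy,
                              epochs, max_drop=0.2):
    """Train for up to `epochs` epochs, stopping on a sudden PGD-accuracy drop.

    state        -- any copyable object holding the model parameters
    train_epoch  -- hypothetical callback: state -> state after one epoch
    pgd_accuracy -- hypothetical callback: state -> robust accuracy in [0, 1]
    max_drop     -- assumed threshold for a "catastrophic" drop
    """
    best_state = copy.deepcopy(state)
    prev_acc = pgd_accuracy(state)
    history = [prev_acc]
    for _ in range(epochs):
        state = train_epoch(state)
        acc = pgd_accuracy(state)
        history.append(acc)
        if prev_acc - acc > max_drop:
            # PGD accuracy collapsed within a single epoch: discard this
            # epoch and return the last checkpoint that was still robust.
            return best_state, history
        best_state = copy.deepcopy(state)
        prev_acc = acc
    return best_state, history
```

The check compares consecutive epochs rather than tracking a running maximum, because catastrophic overfitting is described as an abrupt collapse and not a gradual decline.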

This review was generated by AI and checked by a human editor.