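[Paper Digest] Recent Advances in Adversarial Training for Adversarial Robustness
This survey reviews recent progress in adversarial training (AT), proposes a novel taxonomy, discusses generalization challenges, and outlines future directions.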
Adversarial training is one of the most effective approaches defending against adversarial examples for deep learning models. Unlike other defense strategies, adversarial training aims to promote the robustness of models intrinsically. During the last few years, adversarial training has been studied and discussed from various aspects. A variety of improvements and developments of adversarial training are proposed, which were, however, neglected in existing surveys. For the first time in this survey, we systematically review the recent progress on adversarial training for adversarial robustness with a novel taxonomy. Then we discuss the generalization problems in adversarial training from three perspectives. Finally, we highlight the challenges which are not fully tackled and present potential future directions.
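Research Motivation and Goals
- Provide an up-to-date overview of adversarial training methods and how they improve robustness.
- Introduce a novel taxonomy of AT methods and relate it to robustness improvements.
- Discuss the generalization gaps of AT and identify challenges and future research directions.
Proposed Method
- Review recent AT methods and organize them into a structured taxonomy (adversarial regularization, curriculum-based training, ensemble training, adaptive epsilon, semi-supervised/unsupervised training, efficient training, and other variants).
- Summarize the experimental results in Table 1 to compare robustness-performance trade-offs across methods and datasets.
- Discuss generalization with respect to standard accuracy, adversarial robustness, and unseen attacks.
- Highlight theoretical and practical challenges in min-max optimization and generalization, and propose directions beyond AT.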
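The min-max optimization mentioned in the last point is usually written as the saddle-point objective below. The notation is an assumption introduced here for illustration and does not come from this digest: $f_\theta$ is the model with parameters $\theta$, $\mathcal{D}$ the data distribution, $\mathcal{L}$ the training loss, $\delta$ the perturbation, and $\epsilon$ the perturbation budget under an $\ell_p$ norm.

```latex
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}}
\left[ \max_{\|\delta\|_{p} \le \epsilon} \mathcal{L}\bigl(f_{\theta}(x + \delta),\, y\bigr) \right]
```

The inner maximization searches for a worst-case perturbation inside the $\epsilon$-ball, and the outer minimization fits the model to those perturbed inputs. In practice the inner problem is only approximated, for instance with a few PGD steps.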
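Experimental Results
Research Questions
- RQ1: What are the key families of adversarial training methods, and how do they differ in formulation and objective?
- RQ2: How do recent AT methods perform across datasets and attacks, and which generalization gaps remain?
- RQ3: What are the main limitations of current AT methods (e.g., min-max optimization, overfitting, unseen attacks), and what are the potential directions beyond AT?
Key Findings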
| Category | Publication | Architecture | Attack | ε | Dataset | Accuracy |
|---|---|---|---|---|---|---|
| Adversarial Regularization | Qin et al. (2019) | ResNet-152 | PGD 50 | 4/255 | ImageNet | 47.00% |
| | Zhang et al. (2019b) | Wide ResNet | CW 10 | 0.031/1 | CIFAR-10 | 84.03% |
| | Wang et al. (2020) | ResNet-18 | PGD 20 | 8/255 | CIFAR-10 | 55.45% |
| | Kannan et al. (2018) | InceptionV3 | PGD 10 | 16/255 | ImageNet | 27.90% |
| | Mao et al. (2019) | Wide ResNet | PGD 20 | 8/255 | CIFAR-10 | 50.03% |
| | Zhang et al. (2020) | Wide ResNet | PGD 20 | 16/255 | CIFAR-10 | 49.86% |
| | Cai et al. (2018) | DenseNet-161 | PGD 7 | 8/255 | CIFAR-10 | 69.27% |
| | Wang et al. (2019) | 8-Layer ConvNet | PGD 20 | 8/255 | CIFAR-10 | 42.40% |
| | Pang et al. (2019) | Wide ResNet | PGD 10 | 0.005 | CIFAR-100 | 32.10% |
| | Kariyappa and Qureshi (2019) | ResNet-20 | PGD 30 | 0.09/1 | CIFAR-10 | 46.30% |
| | Yang et al. (2020a) | ResNet-20 | PGD 20 | 0.01/1 | CIFAR-10 | 52.4% |
| | Balaji et al. (2019) | ResNet-152 | PGD 1000 | 8/255 | ImageNet | 59.28% |
| | Ding et al. (2020) | Wide ResNet | PGD 100 | 8/255 | CIFAR-10 | 47.18% |
| | Cheng et al. (2020) | Wide ResNet | PGD 20 | 8/255 | CIFAR-10 | 73.38% |
| | Alayrac et al. (2019) | Wide ResNet | FGSM | 8/255 | CIFAR-10 | 62.18% |
| | Carmon et al. (2019) | Wide ResNet | PGD 10 | 8/255 | CIFAR-10 | 63.10% |
| | Zhai et al. (2019) | Customized ResNet | PGD 7 | 8/255 | CIFAR-10 | 42.48% |
| | Hendrycks et al. (2019) | Wide ResNet | PGD 20 | 0.3/1 | ImageNet | 50.40% |
| | Shafahi et al. (2019) | Wide ResNet | PGD 100 | 8/255 | CIFAR-10 | 46.19% |
| | Wong et al. (2020) | ResNet-50 | PGD 40 | 2/255 | ImageNet | 43.43% |
| | Andriushchenko and Flammarion (2020) | ResNet-50 | PGD 50 | 2/255 | ImageNet | 41.40% |
| | Kim et al. (2021) | PreActResNet-18 | FGSM | 8/255 | CIFAR-10 | 50.50% |
| | Vivek and Babu (2020b) | Wide ResNet | PGD 40 | 8/255 | MNIST | 88.51% |
| | Song et al. (2019) | Customized ConvNet | PGD 20 | 4/255 | CIFAR-10 | 58.10% |
| | Vivek and Babu (2020a) | Wide ResNet | PGD 100 | 0.3/1 | MNIST | 90.03% |
| | Huang et al. (2020) | Wide ResNet | PGD 20 | 8/255 | CIFAR-10 | 45.80% |
| | Zhang et al. (2019a) | Wide ResNet | PGD 20 | 8/255 | CIFAR-10 | 47.98% |
| | Dong et al. (2020) | Wide ResNet | PGD 20 | 8/255 | CIFAR-100 | 29.40% |
| | Wang and Zhang (2019) | Wide ResNet | CW 200 | 4/255 | CIFAR-10 | 60.30% |
| | Zhang and Wang (2019) | Wide ResNet | PGD 20 | 8/255 | CIFAR-100 | 47.20% |
| | Pang et al. (2020b) | Wide ResNet | PGD 500 | 8/255 | CIFAR-10 | 60.75% |
| | Lee et al. (2020) | PreActResNet-18 | PGD 20 | 8/255 | Tiny ImageNet | 20.31% |
| | Zhang and Xu (2020) | Wide ResNet | PGD 20 | 8/255 | CIFAR-10 | 45.11% |
| | Madry et al. (2018) | ResNet-50 | PGD 20 | 8/255 | CIFAR-10 | 45.80% |
| | Pang et al. (2020a) | Wide ResNet | PGD 500 | 8/255 | CIFAR-10 | 60.75% |
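The ε column mixes two notations. A plausible reading, stated here as an assumption rather than something the table spells out, is that `8/255` is a budget relative to 8-bit pixel values and `0.031/1` is a budget on inputs scaled to [0, 1]. The hypothetical helper `eps_on_unit_scale` below puts both on the [0, 1] scale so rows can be compared.

```python
def eps_on_unit_scale(spec):
    """Convert an epsilon written as 'a/b' (or a bare number) to a float.

    Assumption: 'a/b' means a perturbation of size a when pixel values
    span a range of b, so a/b is the budget on inputs scaled to [0, 1].
    """
    numerator, _, denominator = spec.partition("/")
    if denominator:
        return float(numerator) / float(denominator)
    return float(numerator)


print(round(eps_on_unit_scale("8/255"), 4))    # 0.0314
print(round(eps_on_unit_scale("0.031/1"), 4))  # 0.031
```

Under this reading, `8/255` and `0.031/1` describe nearly the same budget, while `16/255` is about twice as large.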
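- Adversarial training remains the most effective defense, but accuracy under adversarial evaluation is still significantly lower than clean accuracy on many datasets.
- A large number of AT methods exist (regularization, curriculum-based, ensemble, adaptive epsilon, semi-supervised/unsupervised, efficient training), each offering a different trade-off between robustness and standard accuracy.
- Generalization gaps (adversarially robust generalization and generalization to unseen attacks) persist and are not fully resolved by current AT techniques.
- Current practice often relies on PGD-based inner optimization, which provides no formal robustness certificate and can be computationally expensive.
- Semi-supervised/unsupervised data can narrow the sample-complexity gap and improve robustness, although guarantees remain limited.
- Efforts to accelerate AT (e.g., Free-AT, FAST-AT, YOPO) reduce training time but, unless mitigated, can introduce problems such as catastrophic overfitting.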
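To make the PGD-based inner optimization noted in the findings concrete, here is a minimal sketch of L∞ PGD adversarial training. It is not the implementation of any surveyed method: it assumes a toy logistic-regression model on inputs in [0, 1] so the input gradient has a closed form and no framework is needed, and all function names and default hyperparameters (`pgd_linf`, `adversarial_train`, `step_size=2/255`, and so on) are illustrative choices.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def input_gradient(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = sigmoid(logit(w, b, x))
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def pgd_linf(w, b, x, y, eps, step_size, n_steps, rng):
    """Approximate the inner maximization with projected signed-gradient ascent."""
    # Random start inside the eps-ball, kept in the valid input range [0, 1].
    x_adv = [min(1.0, max(0.0, xi + rng.uniform(-eps, eps))) for xi in x]
    for _ in range(n_steps):
        grad = input_gradient(w, b, x_adv, y)
        stepped = [ai + step_size * sign(gi) for ai, gi in zip(x_adv, grad)]
        # Project back onto the eps-ball around x, then onto [0, 1].
        x_adv = [
            min(1.0, max(0.0, min(xi + eps, max(xi - eps, si))))
            for xi, si in zip(x, stepped)
        ]
    return x_adv


def adversarial_train(data, eps=8 / 255, step_size=2 / 255, n_steps=10,
                      epochs=20, lr=0.5, seed=0):
    """Outer minimization: SGD on the loss evaluated at PGD examples."""
    rng = random.Random(seed)
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            x_adv = pgd_linf(w, b, x, y, eps, step_size, n_steps, rng)
            err = sigmoid(logit(w, b, x_adv)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x_adv)]
            b -= lr * err
    return w, b
```

Each training step here costs `n_steps` extra gradient evaluations, which is the overhead that accelerated variants try to cut. Nothing in the procedure certifies robustness: it only trains against the perturbations that PGD happens to find.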
This digest was generated by AI and reviewed by a human editor.