QUICK REVIEW

[Paper Review] On the Convergence and Robustness of Adversarial Training

Yisen Wang, Xingjun Ma|arXiv (Cornell University)|Dec 15, 2021

Adversarial Robustness in Machine Learning | References: 31 | Cited by: 182

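One-Sentence Summary

Introduces FOSC as a convergence criterion for the inner maximization in adversarial training and proposes a dynamic training strategy that gradually increases adversarial strength over the course of training to improve robustness, backed by a theoretical convergence guarantee and extensive experiments.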

ABSTRACT

Improving the robustness of deep neural networks (DNNs) to adversarial examples is an important yet challenging problem for secure deep learning. Across existing defense techniques, adversarial training with Projected Gradient Descent (PGD) is amongst the most effective. Adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial examples by maximizing the classification loss, and the outer minimization finding model parameters by minimizing the loss on adversarial examples generated from the inner maximization. A criterion that measures how well the inner maximization is solved is therefore crucial for adversarial training. In this paper, we propose such a criterion, namely First-Order Stationary Condition for constrained optimization (FOSC), to quantitatively evaluate the convergence quality of adversarial examples found in the inner maximization. With FOSC, we find that to ensure better robustness, it is essential to use adversarial examples with better convergence quality at the later stages of training. Yet at the early stages, high convergence quality adversarial examples are not necessary and may even lead to poor robustness. Based on these observations, we propose a dynamic training strategy to gradually increase the convergence quality of the generated adversarial examples, which significantly improves the robustness of adversarial training. Our theoretical and empirical results show the effectiveness of the proposed method.

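Motivation and Objectives

Motivates the need for a quantitative convergence criterion for the inner maximization in adversarial training.
Introduces FOSC as an affine-invariant measure of the convergence quality of adversarial examples within the epsilon-ball.
Shows that gradually increasing adversarial strength during training improves robustness.
Provides a theoretical convergence guarantee for the proposed dynamic training strategy.
Validates the approach empirically on MNIST and CIFAR-10 against existing methods.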
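
The objectives above refer to a min-max problem and to a convergence measure over the epsilon-ball without writing either out. The block below is a sketch of how they are commonly stated for an L-infinity ball. The notation (h_theta for the network, x^0 for the clean input, x^k for the k-th PGD iterate, f for the loss as a function of the input) is assumed for illustration and is not taken from this summary. The closed form follows directly from maximizing a linear function over the L-infinity ball.

```latex
% Adversarial training as a min-max problem (assumed notation)
\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n}
  \max_{\|x_i - x_i^0\|_\infty \le \epsilon} \ell\bigl(h_\theta(x_i), y_i\bigr)

% First-order stationarity gap of the inner maximization at iterate x^k,
% with feasible set X = { x : \|x - x^0\|_\infty \le \epsilon }
c(x^k) = \max_{x \in \mathcal{X}} \bigl\langle x - x^k, \nabla_x f(\theta, x^k) \bigr\rangle

% Closed form: split x - x^k = (x - x^0) - (x^k - x^0) and maximize the first term
c(x^k) = \epsilon \,\bigl\| \nabla_x f(\theta, x^k) \bigr\|_1
         - \bigl\langle x^k - x^0, \nabla_x f(\theta, x^k) \bigr\rangle
```

Under this form c(x^k) is never negative, and c(x^k) = 0 means no feasible direction increases the loss to first order, which is why a smaller value indicates a better-converged adversarial example.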

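Proposed Method

Formulates the adversarial training objective as a min-max problem whose inner maximization is taken over the epsilon-ball.
Proposes FOSC, the First-Order Stationary Condition for constrained optimization, as the convergence criterion for the inner maximization.
Gives a closed-form expression for FOSC and relates it to the perturbation and the gradient, noting that a smaller FOSC value corresponds to a stronger adversarial example.
Proposes a dynamic adversarial training algorithm that gradually tightens the FOSC threshold over training epochs.
Gives a convergence analysis showing a sublinear rate of convergence to a first-order stationary point, up to a precision delta that depends on the inner-maximization error.
Compares Dynamic against Standard and Curriculum adversarial training on MNIST and CIFAR-10, including a WideResNet setting.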
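
To make the criterion concrete, here is a minimal NumPy sketch of a FOSC-style value for an L-infinity ball, together with a PGD loop that stops once that value falls to a given threshold. It assumes the closed form eps * ||g||_1 - <x - x_nat, g>, where g is the loss gradient with respect to the input. The function names, the `loss_grad` callback, and the toy linear loss are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np


def fosc(x_adv, x_nat, grad, eps):
    """First-order stationarity gap for an L-infinity ball of radius eps.

    Assumed closed form: eps * ||g||_1 - <x_adv - x_nat, g>.
    The value is non-negative inside the ball; 0 means x_adv is a
    first-order stationary point of the constrained maximization.
    """
    g = grad.ravel()
    return eps * np.abs(g).sum() - np.dot((x_adv - x_nat).ravel(), g)


def pgd_with_fosc_stop(x_nat, loss_grad, eps, step, max_steps, fosc_threshold):
    """Run sign-gradient PGD, stopping early once FOSC <= fosc_threshold.

    loss_grad is a hypothetical callback returning d(loss)/d(input).
    Returns the adversarial example and its final FOSC value.
    """
    x = x_nat.copy()
    for _ in range(max_steps):
        g = loss_grad(x)
        if fosc(x, x_nat, g, eps) <= fosc_threshold:
            break  # converged well enough for the current threshold
        x = x + step * np.sign(g)                  # ascent step on the loss
        x = np.clip(x, x_nat - eps, x_nat + eps)   # project back into the ball
    return x, fosc(x, x_nat, loss_grad(x), eps)


# Toy usage: a linear loss f(x) = w . x, whose input gradient is the constant w.
w = np.array([1.0, -2.0, 0.5])
x0 = np.zeros(3)
x_loose, c_loose = pgd_with_fosc_stop(x0, lambda x: w, 0.1, 0.05, 10, 0.2)
x_tight, c_tight = pgd_with_fosc_stop(x0, lambda x: w, 0.1, 0.05, 10, 0.0)
print(c_loose, c_tight)
```

With a loose threshold the loop stops after fewer steps and returns a weaker perturbation. With a threshold of zero it keeps going until the iterate reaches a corner of the ball, where the gap vanishes up to floating-point error.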

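Experimental Results

Research Questions

RQ1: How can the convergence quality of the inner maximization in adversarial training be quantified?
RQ2: Is FOSC a reliable indicator of adversarial strength and of downstream robustness?
RQ3: Compared with fixed-strength PGD adversarial training, does a dynamic curriculum that increases adversarial strength over time improve robustness?
RQ4: What theoretical convergence guarantee does the proposed dynamic adversarial training method have?
RQ5: How does dynamic adversarial training perform against white-box and black-box attacks on MNIST and CIFAR-10, including with higher-capacity networks?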

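Key Findings

FOSC is linearly correlated with adversarial strength: as FOSC decreases, accuracy drops and loss increases.
Training on adversarial examples with higher convergence quality in the later stages of training yields better robustness; training on highly converged adversarial examples in the early stages may weaken robustness.
By gradually tightening the FOSC threshold, dynamic adversarial training improves robustness markedly over Standard PGD adversarial training, especially on CIFAR-10.
The theoretical analysis shows convergence to a first-order stationary point at a sublinear rate, up to the inner-maximization precision delta.
Empirically, Dynamic training achieves strong white-box and black-box robustness on MNIST and CIFAR-10, most notably on CIFAR-10, and also performs well with the WideResNet architecture.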
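
The idea of gradually tightening the FOSC threshold can be illustrated with a simple schedule. The sketch below assumes a linear decay from a starting value to zero, which is one plausible choice; this summary does not state the exact schedule used. The names `c_max` and `decay_epochs` are hypothetical.

```python
def fosc_threshold(epoch, c_max, decay_epochs):
    """Illustrative linear schedule for the FOSC threshold (an assumption).

    Starts at c_max (weak adversarial examples allowed early on) and decays
    linearly to 0 at decay_epochs, after which it stays at 0 so that the
    inner maximization must be solved as well as the attack budget allows.
    """
    if decay_epochs <= 0:
        return 0.0
    return max(c_max - epoch * c_max / decay_epochs, 0.0)


# Example: thresholds at a few epochs for a 100-epoch decay starting at 0.5.
for epoch in (0, 25, 50, 100, 120):
    print(epoch, fosc_threshold(epoch, c_max=0.5, decay_epochs=100))
```

In a training loop, the value returned for the current epoch would be passed as the stopping threshold of the inner attack, so early epochs see weaker adversarial examples and later epochs see better-converged ones.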

This review was generated by AI and checked by a human editor.