QUICK REVIEW

[Paper Review] Smooth Adversarial Training

Cihang Xie, Mingxing Tan|arXiv (Cornell University)|Jun 25, 2020

Adversarial Robustness in Machine Learning|References 82|Cited by 99

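One-Sentence Summary

SAT replaces ReLU with smooth activations to strengthen adversarial training, improving robustness without sacrificing accuracy or adding computation; it is validated on ImageNet with ResNet-50 and EfficientNet-L1.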

ABSTRACT

It is commonly believed that networks cannot be both accurate and robust, that gaining robustness means losing accuracy. It is also generally believed that, unless making networks larger, network architectural elements would otherwise matter little in improving adversarial robustness. Here we present evidence to challenge these common beliefs by a careful study about adversarial training. Our key observation is that the widely-used ReLU activation function significantly weakens adversarial training due to its non-smooth nature. Hence we propose smooth adversarial training (SAT), in which we replace ReLU with its smooth approximations to strengthen adversarial training. The purpose of smooth activation functions in SAT is to allow it to find harder adversarial examples and compute better gradient updates during adversarial training. Compared to standard adversarial training, SAT improves adversarial robustness for "free", i.e., no drop in accuracy and no increase in computational cost. For example, without introducing additional computations, SAT significantly enhances ResNet-50's robustness from 33.0% to 42.3%, while also improving accuracy by 0.9% on ImageNet. SAT also works well with larger networks: it helps EfficientNet-L1 to achieve 82.2% accuracy and 58.6% robustness on ImageNet, outperforming the previous state-of-the-art defense by 9.5% for accuracy and 11.6% for robustness. Models are available at https://github.com/cihangxie/SmoothAdversarialTraining.

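Motivation and Objectives

Challenge the view that robustness must come at the cost of accuracy or require larger models.
Study how the smoothness of the activation function affects the quality of adversarial training.
Propose and evaluate smooth adversarial training (SAT), which uses smooth activations.
Quantify the robustness and accuracy gains on large-scale and scalable architectures.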

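Proposed Method

Identify ReLU's non-smooth gradient as a source of weakness in adversarial training.
Use smooth activation functions (e.g., Softplus, SILU, GELU, and ELU variants) in both the forward and backward passes of SAT.
Adversarially train the model with a PGD-based attack, comparing the gradient quality seen by the attacker with that seen by the optimizer.
Evaluate SAT on ImageNet with ResNet-50 and EfficientNet-L1, and on CIFAR-10.
Run ablations that smooth only the backward pass, only the forward pass, or both (full SAT) to isolate each effect.
Compare SAT with existing methods and analyze how depth, width, and resolution affect scalability.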
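The contrast between ReLU and its smooth approximations can be made concrete with a small sketch. It compares the derivative of ReLU, which jumps from 0 to 1 at the origin, with the derivatives of a Softplus-style approximation and SILU, which vary continuously. The sharpness parameter `alpha` and its default value are assumptions made for illustration, as are all function names; this is not the authors' released code.

```python
import math

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def relu(x: float) -> float:
    return max(0.0, x)

def relu_grad(x: float) -> float:
    # Discontinuous: jumps from 0 to 1 as x crosses zero.
    return 1.0 if x > 0 else 0.0

def softplus(x: float, alpha: float = 10.0) -> float:
    # (1/alpha) * log(1 + exp(alpha * x)), written in a stable form.
    # alpha is an illustrative sharpness parameter (assumption).
    z = alpha * x
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / alpha

def softplus_grad(x: float, alpha: float = 10.0) -> float:
    # Derivative of the function above: a sigmoid, continuous everywhere.
    return sigmoid(alpha * x)

def silu(x: float) -> float:
    return x * sigmoid(x)

def silu_grad(x: float) -> float:
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))

# Compare the gradients just left and right of zero.
for x in (-1e-3, 0.0, 1e-3):
    print(x, relu_grad(x), softplus_grad(x), silu_grad(x))
```

Near zero, the ReLU gradient flips between 0 and 1 while the two smooth gradients change only slightly, which is the property the method relies on.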

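Experimental Results

Research Questions

RQ1: Does ReLU's non-smooth gradient degrade the performance of adversarial training?
RQ2: Can smooth activations improve adversarial robustness without hurting clean accuracy or adding computation?
RQ3: Compared with standard adversarial training, how does SAT perform with larger networks and different architectures (e.g., EfficientNet)?
RQ4: How does smoothing both the forward and backward passes compare with smoothing only one of them?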

Key Findings

Network	Attacker gradient quality improved	Optimizer gradient quality improved	Accuracy (%)	Robustness (%)	Notes
ResNet-50	✗	✗	68.8	33.0	Baseline with ReLU forward; standard adversarial training
ResNet-50	✓	✗	68.3	34.5	+1.5% robustness vs baseline; -0.5% accuracy
ResNet-50	✗	✓	69.4	35.8	+2.8% robustness vs baseline; +0.6% accuracy
ResNet-50	✓	✓	68.9	36.9	+3.9% robustness vs baseline; +0.1% accuracy
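The "attacker gradient quality" column can be illustrated with a toy PGD attack in which the forward pass stays ReLU and only the backward pass uses a chosen activation gradient. This is a minimal sketch under stated assumptions: the one-unit model, its weights `W`, `B`, `V`, the logistic loss, and the values of `eps`, `step`, `steps`, and `alpha` are all hypothetical and chosen only to make the effect visible; the paper's experiments use full networks on ImageNet.

```python
import math

def sigmoid(z: float) -> float:
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def relu(x: float) -> float:
    return max(0.0, x)

def relu_grad(x: float) -> float:
    return 1.0 if x > 0 else 0.0

def smooth_grad(x: float, alpha: float = 10.0) -> float:
    # Softplus-style surrogate gradient (alpha is an assumption).
    return sigmoid(alpha * x)

# Hypothetical toy model: two inputs, one hidden unit, one output.
W, B, V = [1.0, 1.0], 0.0, 1.0

def pre_activation(x):
    return sum(w * xi for w, xi in zip(W, x)) + B

def logit(x):
    return V * relu(pre_activation(x))  # forward pass always uses ReLU

def loss(x, y):
    # Logistic loss log(1 + exp(-y * logit)) for y in {-1, +1}, stable form.
    m = -y * logit(x)
    return max(m, 0.0) + math.log1p(math.exp(-abs(m)))

def input_grad(x, y, act_grad):
    # Chain rule, with act_grad standing in for the activation derivative.
    dloss_dlogit = -y * sigmoid(-y * logit(x))
    scale = dloss_dlogit * V * act_grad(pre_activation(x))
    return [scale * w for w in W]

def pgd(x, y, act_grad, eps=0.1, step=0.05, steps=3):
    """L-infinity PGD: signed gradient ascent, clipped to the eps-ball."""
    x_adv = list(x)
    for _ in range(steps):
        g = input_grad(x_adv, y, act_grad)
        for i, gi in enumerate(g):
            sign = (gi > 0) - (gi < 0)
            moved = x_adv[i] + step * sign
            x_adv[i] = min(x[i] + eps, max(x[i] - eps, moved))
    return x_adv

x, y = [-0.01, -0.01], -1
print("ReLU backward:  ", loss(pgd(x, y, relu_grad), y))
print("smooth backward:", loss(pgd(x, y, smooth_grad), y))
```

In this toy case the starting point sits just inside ReLU's flat region, so the ReLU backward pass gives the attacker a zero gradient and the input never moves, while the smooth backward pass lets the attacker cross into the active region and raise the loss.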

SAT raises ResNet-50's robustness on ImageNet from 33.0% to 42.3% with no loss of accuracy and no extra cost (accuracy improves by 0.9%).
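In the ablation table above, improving gradient quality for both the attacker and the optimizer gives the largest gain (+3.9% robustness over the ReLU baseline).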
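With SAT, EfficientNet-L1 reaches 82.2% accuracy and 58.6% robustness, outperforming the previous state-of-the-art defense by 9.5% in accuracy and 11.6% in robustness.
Across the models tested, SAT with smooth activations consistently improves robustness while keeping accuracy at a similar level.
SAT makes it possible to scale depth, width, and resolution, and with compound scaling its robustness gains exceed those of standard adversarial training.
On CIFAR-10, Softplus, GELU, and SmoothReLU improve robustness relative to ReLU; ELU can be unstable when it is not smooth (CELU smoothing is needed).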
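One way to read the ELU remark is through the standard definitions: ELU's derivative is continuous at zero only when its scale parameter equals 1, whereas CELU rescales the exponent so that the derivative is continuous for any scale. The sketch below measures the size of the gradient jump at zero; the helper name `jump_at_zero`, the probe width `h`, and the tested `alpha` values are illustrative assumptions, not settings taken from the paper.

```python
import math

def elu_grad(x: float, alpha: float) -> float:
    # ELU(x) = x if x > 0 else alpha * (exp(x) - 1)
    return 1.0 if x > 0 else alpha * math.exp(x)

def celu_grad(x: float, alpha: float) -> float:
    # CELU(x) = x if x > 0 else alpha * (exp(x / alpha) - 1)
    return 1.0 if x > 0 else math.exp(x / alpha)

def jump_at_zero(grad, alpha: float, h: float = 1e-8) -> float:
    """Gap between the gradient just right and just left of zero."""
    return abs(grad(h, alpha) - grad(-h, alpha))

for alpha in (1.0, 2.0):
    print(alpha, jump_at_zero(elu_grad, alpha), jump_at_zero(celu_grad, alpha))
```

With a scale other than 1, ELU shows a gradient jump of roughly |1 - alpha| at zero, while CELU's jump stays negligible.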


This review was generated by AI and reviewed by a human editor.