QUICK REVIEW

[Paper Review] Stochastic Activation Pruning for Robust Adversarial Defense

Guneet S. Dhillon, Kamyar Azizzadenesheli | arXiv (Cornell University) | Mar 5, 2018

Adversarial Robustness in Machine Learning | References: 20 | Cited by: 206

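One-Sentence Summary

SAP applies stochastic activation pruning to pretrained networks as a post-hoc defense against adversarial examples. It improves robustness and calibration without fine-tuning and yields additional gains when combined with adversarial training.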

ABSTRACT

Neural networks are known to be vulnerable to adversarial examples. Carefully chosen perturbations to real images, while imperceptible to humans, induce misclassification and threaten the reliability of deep learning systems in the wild. To guard against adversarial examples, we take inspiration from game theory and cast the problem as a minimax zero-sum game between the adversary and the model. In general, for such games, the optimal strategy for both players requires a stochastic policy, also known as a mixed strategy. In this light, we propose Stochastic Activation Pruning (SAP), a mixed strategy for adversarial defense. SAP prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate. We can apply SAP to pretrained networks, including adversarially trained models, without fine-tuning, providing robustness against adversarial examples. Experiments demonstrate that SAP confers robustness against attacks, increasing accuracy and preserving calibration.
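
The minimax framing in the abstract can be written compactly. The following is a sketch in assumed notation, since none of these symbols appear in this summary: \pi is the defender's distribution over model policies p, \rho is the adversary's distribution over perturbations r, M_p(\theta) is the network with parameters \theta under policy p, and J is the loss on input x with label y.

```latex
\pi^{*}, \rho^{*} := \arg\min_{\pi} \max_{\rho} \;
  \mathbb{E}_{p \sim \pi,\; r \sim \rho}
  \left[ J\big(M_{p}(\theta),\, x + r,\, y\big) \right]
```

Under this reading, SAP is one particular choice of \pi: each forward pass samples a different pruned version of the same pretrained network, so the adversary cannot optimize against a single fixed model.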

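Motivation and Goals

Use a game-theoretic perspective to motivate the robustness of neural networks to adversarial examples.
Introduce SAP as a stochastic mixed strategy for post-hoc defense of pretrained models.
Demonstrate SAP's effectiveness against adversarial perturbations and evaluate its calibration.
Compare SAP with dropout and adversarial training, and explore its applicability to reinforcement learning.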

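Proposed Method

Define SAP as a mixed strategy in a minimax game between the adversary and the defender.
At each layer, prune a random subset of activations by sampling from a multinomial distribution whose probabilities are proportional to activation magnitude; activations that are never sampled are pruned.
Scale each surviving activation by the inverse of its probability of being kept, which preserves the dynamic range and keeps the expected activation value unchanged.
Apply SAP post hoc to pretrained networks, with no fine-tuning.
Evaluate SAP on image classification (CIFAR-10 with ResNet-20) and deep reinforcement learning (DDQN on Atari).
Compare SAP with dropout, Gaussian noise baselines, and adversarial training, using Monte Carlo (MC) sampling to estimate the gradients of the stochastic models.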
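
The sampling and rescaling steps above can be sketched for a single layer as follows. This is a minimal illustration in plain Python, not the authors' implementation. The function name `sap_layer` and the parameter `num_samples` are hypothetical, and the keep probability 1 - (1 - p_i)^num_samples assumes draws are made with replacement.

```python
import random


def sap_layer(activations, num_samples, rng=None):
    """Sketch of stochastic activation pruning for one layer's activations.

    Assumptions (not stated in this summary): draws are made with
    replacement, and a survivor is divided by its probability of being
    drawn at least once in `num_samples` draws.
    """
    rng = rng or random.Random()
    magnitudes = [abs(a) for a in activations]
    total = sum(magnitudes)
    if total == 0.0:
        # Nothing to sample from: return the (all-zero) layer unchanged.
        return list(activations)

    # Sampling probability proportional to activation magnitude.
    probs = [m / total for m in magnitudes]

    # Indices drawn at least once survive; all others are pruned.
    drawn = set(rng.choices(range(len(activations)), weights=probs, k=num_samples))

    pruned = []
    for i, a in enumerate(activations):
        if i in drawn and probs[i] > 0.0:
            # Probability that index i is drawn at least once.
            keep_prob = 1.0 - (1.0 - probs[i]) ** num_samples
            # Inverse-probability scaling keeps the expectation equal to a.
            pruned.append(a / keep_prob)
        else:
            pruned.append(0.0)
    return pruned


if __name__ == "__main__":
    layer = [0.5, -2.0, 0.0, 1.5]
    print(sap_layer(layer, num_samples=4, rng=random.Random(0)))
```

Because each survivor is divided by its keep probability, averaging the output over many forward passes recovers the original activations, which is why the defense can be attached to a pretrained network without retraining. Small-magnitude activations are pruned more often, while large ones usually survive with only mild rescaling.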

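Experimental Results

Research Questions

RQ1: Can SAP improve the robustness of pretrained networks to adversarial perturbations without fine-tuning?
RQ2: In vision and reinforcement learning tasks, how does SAP affect accuracy, calibration, and robustness under FGSM and iterative attacks?
RQ3: How does SAP interact with adversarial training and other stochastic defenses?

Key Findings

SAP-based models show higher accuracy under adversarial perturbation at certain perturbation levels (for example, in the CIFAR-10 experiments, SAP-100 yields absolute improvements at lambda values such as 1, 2, and 4).
SAP maintains accuracy under moderate perturbations and improves calibration relative to the dense model.
Combining adversarial training with SAP (ADV + SAP-100) achieves higher accuracy at larger perturbation magnitudes than adversarial training alone.
In reinforcement learning, SAP-100 substantially increases relative reward under nonzero perturbations across several Atari games, with very large gains in some cases.
SAP tends to outperform dropout as a stochastic defense, and it remains effective as a post-hoc modification without retraining.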


This review was generated by AI and reviewed by a human editor.