QUICK REVIEW

[Paper Review] EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples

Pin-Yu Chen, Yash Sharma | arXiv (Cornell University) | Sep 13, 2017

Adversarial Robustness in Machine Learning | References: 29 | Cited by: 65

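One-Sentence Summary

This paper formulates adversarial example crafting as an elastic-net regularized optimization problem. The resulting attack (EAD) produces L1-oriented perturbations, matches L2/L∞ attacks in effectiveness, and improves attack transferability while complementing adversarial training.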

ABSTRACT

Recent studies have highlighted the vulnerability of deep neural networks (DNNs) to adversarial examples - a visually indistinguishable adversarial image can easily be crafted to cause a well-trained model to misclassify. Existing methods for crafting adversarial examples are based on $L_2$ and $L_\infty$ distortion metrics. However, despite the fact that $L_1$ distortion accounts for the total variation and encourages sparsity in the perturbation, little has been developed for crafting $L_1$-based adversarial examples. In this paper, we formulate the process of attacking DNNs via adversarial examples as an elastic-net regularized optimization problem. Our elastic-net attacks to DNNs (EAD) feature $L_1$-oriented adversarial examples and include the state-of-the-art $L_2$ attack as a special case. Experimental results on MNIST, CIFAR10 and ImageNet show that EAD can yield a distinct set of adversarial examples with small $L_1$ distortion and attains similar attack performance to the state-of-the-art methods in different attack scenarios. More importantly, EAD leads to improved attack transferability and complements adversarial training for DNNs, suggesting novel insights on leveraging $L_1$ distortion in adversarial machine learning and security implications of DNNs.

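Research Motivation and Goals

Motivate the study of L1-based adversarial perturbations as a way to understand robustness vulnerabilities of DNNs.
Propose a new attack (EAD) that combines L1 and L2 penalties to generate perturbations that are sparse yet leave the image visually similar to the original.
Show that EAD matches the success rate of the state-of-the-art L2 attack while producing perturbations with distinct characteristics.
Demonstrate that L1-based attacks improve transferability and complement adversarial training.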

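Proposed Method

Formulates the targeted adversarial attack as an elastic-net regularized optimization: minimize c·f(x,t) + β·||x−x0||_1 + ||x−x0||_2^2 subject to x ∈ [0,1]^p, where x0 is the original input, t is the target class, and c, β ≥ 0 are regularization parameters.
The elastic-net penalty promotes sparse perturbations through the L1 term and stability through the L2 term.
Adopts the logit-based C&W loss f(x,t), with a given confidence parameter κ, to drive the prediction toward the target label t.
Solves the non-differentiable problem with the iterative shrinkage-thresholding algorithm (ISTA) and its fast variant (FISTA).
Introduces a dedicated shrinkage-thresholding operator S_β that handles the L1 penalty under the box constraint.
Compares the EN rule (smallest elastic-net loss) with the L1 rule (smallest L1 distortion) for selecting the final adversarial example.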
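
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: a toy linear classifier (hypothetical weights `W` and bias `b`) stands in for the DNN, plain ISTA with a fixed step size is used instead of FISTA, and the function names, default values, and iteration count are chosen only for the example. The operator `shrink` keeps a pixel at its original value when the proposed change is at most β, and otherwise shrinks the change by β and clips the result to [0, 1].

```python
import numpy as np

def shrink(z, x0, beta):
    """Projected shrinkage-thresholding operator S_beta (elementwise)."""
    diff = z - x0
    upper = np.minimum(z - beta, 1.0)   # used where z - x0 >  beta
    lower = np.maximum(z + beta, 0.0)   # used where z - x0 < -beta
    # Where |z - x0| <= beta the pixel is reset to x0, which yields sparsity.
    return np.where(diff > beta, upper, np.where(diff < -beta, lower, x0))

def cw_loss_and_grad(x, t, W, b, kappa):
    """Targeted logit-based loss f(x, t) = max(max_{j != t} Z_j - Z_t, -kappa).

    Assumption: the model is linear, Z = W @ x + b, so the gradient is exact.
    """
    logits = W @ x + b
    other = logits.copy()
    other[t] = -np.inf                  # exclude the target class
    j = int(np.argmax(other))           # strongest non-target class
    margin = logits[j] - logits[t]
    if margin > -kappa:
        return margin, W[j] - W[t]
    return -kappa, np.zeros_like(x)     # loss is flat once the margin is met

def ista_attack(x0, t, W, b, c=1.0, beta=1e-3, kappa=0.0, step=0.01, iters=200):
    """Run fixed-step ISTA on c*f(x,t) + beta*||x-x0||_1 + ||x-x0||_2^2."""
    x = x0.copy()
    for _ in range(iters):
        _, grad_f = cw_loss_and_grad(x, t, W, b, kappa)
        grad_g = c * grad_f + 2.0 * (x - x0)   # gradient of the smooth part
        x = shrink(x - step * grad_g, x0, beta)
    return x                            # final iterate only; no best-so-far tracking

# Hypothetical usage on a 3-class, 4-pixel toy problem.
rng = np.random.default_rng(0)
W_toy, b_toy = rng.normal(size=(3, 4)), np.zeros(3)
x_orig = rng.uniform(size=4)
x_adv = ista_attack(x_orig, t=1, W=W_toy, b=b_toy)
print("changed pixels:", int(np.sum(x_adv != x_orig)))
```

Because the sketch returns only the last iterate, it does not guarantee that the returned point is adversarial; a fuller version would record every successful iterate and choose among them with a decision rule.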

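Experimental Results

Research Questions

RQ1: Can elastic-net regularization produce adversarial examples with small L1 distortion while remaining as effective as L2/L∞ attacks?
RQ2: How does adding the L1 penalty affect attack transferability and the attack's robustness against defenses such as defensive distillation and adversarial training?
RQ3: What is the trade-off between L1 distortion and L2/L∞ distortion when using EAD, and how does the decision rule affect it?
RQ4: Compared with earlier L2-based methods, does EAD improve attack transferability to defensively distilled models?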

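Key Findings

In the average case, EAD attains a 100% attack success rate (ASR) on MNIST, CIFAR10, and ImageNet across the settings tested.
EAD produces adversarial examples with markedly lower L1 distortion than the L1 variant of I-FGM (a reduction of roughly 47%–87% across MNIST, CIFAR10, and ImageNet).
EAD improves transferability to defensively distilled networks: with a suitably chosen transferability parameter κ, its ASR on MNIST approaches 99%, outperforming the C&W attack in some settings.
Including the L1 penalty (β > 0) yields a distinct set of adversarial examples that complements adversarial training; combining them with C&W examples improves robustness.
The L1 rule reduces L1 distortion further, but may increase L2 and L∞ distortion, while still maintaining a 100% ASR.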
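
The trade-off between the two decision rules can be illustrated with a short sketch. It assumes a list of already successful adversarial candidates for one input; the helper names `distortions` and `select_adversarial` and the toy vectors are hypothetical and chosen only for this example. The EN rule picks the candidate with the smallest elastic-net loss β·||δ||_1 + ||δ||_2^2, whereas the L1 rule picks the one with the smallest ||δ||_1.

```python
import numpy as np

def distortions(x, x0):
    """Return the L1, L2 and L-infinity norms of the perturbation x - x0."""
    d = (x - x0).ravel()
    return {
        "l1": float(np.abs(d).sum()),
        "l2": float(np.sqrt((d ** 2).sum())),
        "linf": float(np.abs(d).max()),
    }

def select_adversarial(candidates, x0, beta, rule="EN"):
    """Choose the final example among successful candidates.

    rule="EN": minimize beta * L1 + L2^2 (elastic-net loss).
    rule="L1": minimize the L1 distortion only.
    """
    def score(x):
        d = distortions(x, x0)
        if rule == "L1":
            return d["l1"]
        return beta * d["l1"] + d["l2"] ** 2
    return min(candidates, key=score)

# Toy candidates: one sparse and large, one dense and small.
x0 = np.zeros(4)
sparse = np.array([0.5, 0.0, 0.0, 0.0])   # L1 = 0.5, L2^2 = 0.25, Linf = 0.5
dense = np.array([0.2, 0.2, 0.2, 0.2])    # L1 = 0.8, L2^2 = 0.16, Linf = 0.2

by_en = select_adversarial([sparse, dense], x0, beta=0.01, rule="EN")
by_l1 = select_adversarial([sparse, dense], x0, beta=0.01, rule="L1")
print("EN rule:", distortions(by_en, x0))
print("L1 rule:", distortions(by_l1, x0))
```

In this toy case the L1 rule selects the sparse candidate, which has the lower L1 distortion but the higher L2 and L∞ distortion, mirroring the trade-off described in the last finding.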


This review was generated by AI and checked by human editors.