QUICK REVIEW

[Paper Review] Model Agnostic Dual Quality Assessment for Adversarial Machine Learning and an Analysis of Current Neural Networks and Defenses.

Danilo Vargas, Shashank Kotyan|arXiv (Cornell University)|Jun 14, 2019

Adversarial Robustness in Machine Learning | Cited by 6

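One-Sentence Summary

This paper proposes a model-agnostic dual quality assessment framework to address bias in the evaluation of adversarial machine learning, introduces robustness levels, and presents a novel $L_\infty$ black-box attack that needs only 12% of the perturbation used by the One-Pixel Attack. The results show that current models and defences remain vulnerable at every robustness level, and that $L_1$/$L_2$ metrics alone are not sufficient to avoid spurious adversarial samples.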

ABSTRACT

There exists a vast number of adversarial attacks and defences for machine learning algorithms of various types which makes assessing the robustness of algorithms a daunting task. To make matters worse, there is an intrinsic bias in these adversarial algorithms. Here, we organise the problems faced: a) Model Dependence, b) Insufficient Evaluation, c) False Adversarial Samples, and d) Perturbation Dependent Results. Based on this, we propose a model agnostic dual quality assessment method, together with the concept of robustness levels to tackle them. We validate the dual quality assessment on state-of-the-art neural networks (WideResNet, ResNet, AllConv, DenseNet, NIN, LeNet and CapsNet) as well as adversarial defences for image classification problem. We further show that current networks and defences are vulnerable at all levels of robustness. The proposed robustness assessment reveals that depending on the metric used (i.e., $L_0$ or $L_\infty$), the robustness may vary significantly. Hence, the duality should be taken into account for a correct evaluation. Moreover, a mathematical derivation, as well as a counter-example, suggest that $L_1$ and $L_2$ metrics alone are not sufficient to avoid spurious adversarial samples. Interestingly, the threshold attack of the proposed assessment is a novel $L_\infty$ black-box adversarial method which requires even less perturbation than the One-Pixel Attack (only $12\%$ of One-Pixel Attack's amount of perturbation) to achieve similar results. Code is available at this http URL.

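Motivation and Objectives

Address the intrinsic biases in adversarial machine learning evaluation: model dependence, insufficient evaluation, false adversarial samples, and perturbation-dependent results.
Develop a model-agnostic framework that enables consistent and comprehensive robustness assessment across a range of neural networks and defences.
Introduce the concept of robustness levels to assess how models behave under different adversarial conditions.
Show that current state-of-the-art models and defences remain vulnerable at all robustness levels.
Show that $L_1$ and $L_2$ metrics alone are not sufficient to avoid spurious adversarial samples, and establish the need for a dual assessment that combines $L_0$ and $L_\infty$ metrics.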

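Proposed Method

Propose a dual quality assessment that evaluates robustness under both the $L_0$ and $L_\infty$ norms, so that different types of adversarial perturbation are captured.
Introduce robustness levels to analyse model behaviour systematically across perturbation strengths.
Develop a novel $L_\infty$ black-box adversarial attack (the threshold attack) that achieves results similar to the One-Pixel Attack while requiring only 12% of its perturbation.
Show, through a mathematical derivation and a counter-example, that $L_1$ and $L_2$ metrics alone cannot rule out spurious adversarial samples.
Validate the framework on a set of state-of-the-art models: WideResNet, ResNet, AllConv, DenseNet, NIN, LeNet, and CapsNet.
Apply the assessment to adversarial defences for image classification, keeping the evaluation model-agnostic across network architectures.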
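
To make the two metrics concrete, here is a minimal sketch of how the $L_0$ and $L_\infty$ size of a perturbation could be measured. It is an illustration only, not the paper's code: the function name `dual_perturbation_size`, the toy 4 x 4 image, and the convention of counting $L_0$ in pixels (rather than in individual channel values) are all assumptions.

```python
import numpy as np

def dual_perturbation_size(original, adversarial):
    """Return (l0_pixels, linf) for a pair of H x W x C images.

    l0_pixels: number of pixels with at least one changed channel
               (an assumed convention, not taken from the paper).
    linf:      largest absolute change of any single channel value.
    """
    delta = adversarial.astype(np.int64) - original.astype(np.int64)
    l0_pixels = int(np.any(delta != 0, axis=-1).sum())
    linf = int(np.abs(delta).max())
    return l0_pixels, linf

# Toy 4 x 4 RGB image, mid-gray everywhere (hypothetical data).
image = np.full((4, 4, 3), 128, dtype=np.uint8)

# Few-pixel style perturbation: one pixel changed drastically.
few_pixel = image.copy()
few_pixel[1, 2] = [255, 0, 255]

# Threshold style perturbation: every value nudged by a small amount.
threshold = (image.astype(np.int64) + 5).astype(np.uint8)

few_pixel_size = dual_perturbation_size(image, few_pixel)
threshold_size = dual_perturbation_size(image, threshold)
print("few-pixel (L0, Linf):", few_pixel_size)
print("threshold (L0, Linf):", threshold_size)
```

The first perturbation is tiny under $L_0$ and large under $L_\infty$; the second is the reverse. A model judged under only one of the two norms is therefore tested against only one of these two kinds of change.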

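Experimental Results

Research Questions

RQ1: How does the dual quality assessment framework improve the evaluation of adversarial robustness compared with single-metric approaches?
RQ2: How vulnerable are current state-of-the-art neural networks at the different robustness levels?
RQ3: According to the mathematical analysis, are $L_1$ and $L_2$ metrics alone sufficient to prevent spurious adversarial samples?
RQ4: How does the proposed $L_\infty$ black-box attack compare with existing methods such as the One-Pixel Attack in terms of the perturbation it requires?
RQ5: Does the dual assessment reveal significant robustness differences between evaluation under the $L_0$ norm and under the $L_\infty$ norm?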

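Key Findings

The dual quality assessment shows that measured robustness can differ significantly between the $L_0$ and $L_\infty$ metrics, which underlines the need for a dual evaluation.
The novel $L_\infty$ black-box attack achieves results similar to the One-Pixel Attack with only 12% of its perturbation.
Current state-of-the-art models and defences remain vulnerable at all robustness levels, so a substantial gap in adversarial robustness remains.
The mathematical analysis and the counter-example suggest that $L_1$ and $L_2$ metrics alone are not sufficient to avoid spurious adversarial samples.
The dual assessment exposes inconsistencies in existing evaluation practice, especially when robustness is measured with a single norm.
The framework is model-agnostic and was applied to a range of architectures, including ResNet, DenseNet, and CapsNet, which supports its broad applicability.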
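
One intuition for why a single norm under-describes a perturbation can be shown with a toy calculation. The sketch below is not the paper's derivation or its counter-example; the two 16-element perturbation vectors and the helper `norms` are invented for illustration. It shows that one $L_2$ budget admits both a concentrated change and a diffuse one, which only $L_0$ and $L_\infty$ tell apart.

```python
import numpy as np

def norms(delta):
    """Return the L0, L2 and L-infinity sizes of a 1-D perturbation vector."""
    return {
        "l0": int(np.count_nonzero(delta)),
        "l2": float(np.linalg.norm(delta)),
        "linf": float(np.abs(delta).max()),
    }

# Concentrated perturbation: a single element changed by 8 (hypothetical).
sparse = np.zeros(16)
sparse[0] = 8.0

# Diffuse perturbation: every element changed by 2 (hypothetical).
dense = np.full(16, 2.0)

sparse_norms = norms(sparse)
dense_norms = norms(dense)
print("sparse:", sparse_norms)
print("dense: ", dense_norms)
```

Both vectors have the same $L_2$ norm, yet they differ by a factor of 16 in $L_0$ and a factor of 4 in $L_\infty$, so an $L_2$ bound by itself does not say which kind of change an attack was allowed to make.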

This review was generated by AI and reviewed by human editors.