QUICK REVIEW

[Paper Digest] Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization

Seungyong Moon, Gaon An|arXiv (Cornell University)|May 16, 2019

Adversarial Robustness in Machine Learning | Cited by 55

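One-Sentence Summary

This paper proposes a discrete, gradient-free surrogate for black-box adversarial attacks and solves the resulting set maximization problem with local search and lazy evaluation, achieving state-of-the-art attack performance on CIFAR-10 and ImageNet with substantially fewer queries.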

ABSTRACT

Solving for adversarial examples with projected gradient descent has been demonstrated to be highly effective in fooling the neural network based classifiers. However, in the black-box setting, the attacker is limited only to the query access to the network and solving for a successful adversarial example becomes much more difficult. To this end, recent methods aim at estimating the true gradient signal based on the input queries but at the cost of excessive queries. We propose an efficient discrete surrogate to the optimization problem which does not require estimating the gradient and consequently becomes free of the first order update hyperparameters to tune. Our experiments on Cifar-10 and ImageNet show the state of the art black-box attack performance with significant reduction in the required queries compared to a number of recently proposed methods. The source code is available at https://github.com/snu-mllab/parsimonious-blackbox-attack.

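Motivation and Objectives

Motivate black-box adversarial attacks under an ℓ∞ constraint for the setting where gradients are unavailable.
Propose a discrete surrogate that restricts perturbations to the vertices of the ℓ∞ ball, which avoids gradient estimation.
Develop an accelerated local search framework with lazy evaluation to select perturbation locations efficiently.
Exploit image structure through hierarchical block partitioning to improve query efficiency.
Demonstrate state-of-the-art attack performance with fewer queries on standard datasets.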

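Proposed Method

Formulate the attack as set maximization over the ground set V of pixel positions: maximize F(S) = f(x + ϵ(S)), where the pixels in S are perturbed by +ϵ and the pixels in V\S are perturbed by −ϵ.
Show that the objective is approximately submodular, so the problem can be treated as approximately submodular maximization and solved with greedy or local search methods.
Introduce an insertion/deletion local search for the approximately submodular F, together with its approximation bound and theoretical guarantees (Theorem 1, Corollary 1).
Apply lazy evaluation (Algorithms 1–3) to speed up marginal-gain computation and reduce the number of queries.
Use hierarchical lazy evaluation (Algorithms 4–5) to optimize over image blocks, starting from a coarse grid and refining it, and terminate early once the query budget is reached.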
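The set formulation and the insertion/deletion local search can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a first-improvement variant, omits lazy evaluation and the hierarchical block refinement, and works on a flat vector instead of an image. The names `perturb`, `local_search`, and the toy linear score `w @ z` are assumptions made for illustration; in a real attack `f` would be a query to the target model's loss.

```python
import numpy as np


def perturb(x, S, eps):
    """Return the l_inf-ball vertex with +eps on indices in S and -eps elsewhere."""
    delta = np.full(x.shape, -eps, dtype=float)
    idx = np.array(sorted(S), dtype=int)  # explicit int dtype so an empty S is safe
    delta[idx] = eps
    return x + delta


def local_search(f, x, eps, max_passes=10):
    """First-improvement insertion/deletion local search on F(S) = f(perturb(x, S, eps)).

    f is a hypothetical black-box score (higher means a stronger attack).
    Returns the selected set, its value, and the number of queries made.
    """
    n = x.size
    S = set()
    queries = 0

    def F(T):
        nonlocal queries
        queries += 1  # every evaluation of F costs one model query
        return f(perturb(x, T, eps))

    best = F(S)
    for _ in range(max_passes):
        improved = False
        # Insertion pass: keep any element whose addition increases F.
        for i in range(n):
            if i not in S:
                val = F(S | {i})
                if val > best:
                    S, best, improved = S | {i}, val, True
        # Deletion pass: drop any element whose removal increases F.
        for i in sorted(S):
            val = F(S - {i})
            if val > best:
                S, best, improved = S - {i}, val, True
        if not improved:
            break  # local optimum: no single insertion or deletion helps
    return S, best, queries


# Toy usage: a linear score, for which the best vertex takes +eps wherever w > 0.
w = np.array([1.0, -2.0, 0.5, -0.1])
x = np.zeros(4)
S, value, queries = local_search(lambda z: float(w @ z), x, eps=0.1)
print(sorted(S), value, queries)
```

For the linear toy score the search selects exactly the coordinates with positive weight. The query counter makes the cost of each marginal-gain evaluation visible, which is the cost that lazy evaluation and coarse-to-fine blocks are meant to reduce.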

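Experimental Results

Research Questions

RQ1: Can a discrete, gradient-free surrogate efficiently optimize black-box adversarial perturbations within the ℓ∞ ball?
RQ2: Can approximately submodular optimization techniques achieve competitive or better attack performance with fewer queries in the black-box setting?
RQ3: How does hierarchical, block-based evaluation affect query efficiency and attack success rate on high-resolution images?
RQ4: What are the theoretical guarantees and the practical benefits of lazy evaluation in this setting?
RQ5: How does the proposed method compare with state-of-the-art black-box attacks on CIFAR-10 and ImageNet, in both untargeted and targeted settings?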

Key Findings

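Setting	Method	Success rate	Avg. queries	Median queries	Avg. queries (NES success)
CIFAR-10, untargeted	PGD (white-box)	47.2%	20	-	-
CIFAR-10, untargeted	NES	29.5%	2872	900	2872
CIFAR-10, untargeted	Bandits	38.6%	1877	459	520
CIFAR-10, untargeted	Ours	48.0%	1261	356	247
ImageNet, untargeted	PGD (white-box)	99.9%	20	-	-
ImageNet, untargeted	NES†	77.8%	1735	-	1735
ImageNet, untargeted	NES	80.3%	1660	900	1660
ImageNet, untargeted	Bandits†	95.4%	1117	-	703
ImageNet, untargeted	Bandits	94.9%	1030	286	603
ImageNet, untargeted	Ours	98.5%	722	237	376
ImageNet, targeted	PGD (white-box)	100%	200	-	-
ImageNet, targeted	NES†	99.2%	-	11550	-
ImageNet, targeted	NES	99.7%	16284	12650	16284
ImageNet, targeted	Bandits†	92.3%	26421	18642	26421
ImageNet, targeted	Bandits	-	-	-	-
ImageNet, targeted	Ours	99.9%	7485	5373	7371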

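On CIFAR-10 and ImageNet, the success rate is higher than or comparable to that of the NES and Bandits baselines, with markedly fewer queries.
CIFAR-10, untargeted: the proposed method reaches 48.0% success with 1261 average queries; Bandits reaches 38.6% with 1877 average queries.
ImageNet, untargeted: the proposed method reaches 98.5% success with 722 average queries; Bandits reaches 95.4% with 1117 average queries.
ImageNet, targeted: the proposed method reaches 99.9% success with 7485 average queries; NES needs 16284 average queries and Bandits 26421.
On CIFAR-10, the method approaches white-box PGD performance (48.0% vs. 47.2% success) while staying within black-box constraints.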
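To put "fewer queries" into numbers, the relative reduction in average queries can be computed directly from the figures quoted above. This is only a rough illustration: each average is taken over the samples that the respective method attacked successfully, so the sets differ between methods. The helper name `query_reduction` is an assumption made for this sketch.

```python
def query_reduction(ours, baseline):
    """Fraction of average queries saved relative to a baseline."""
    return 1.0 - ours / baseline


# (proposed method, baseline) average query counts as quoted in the findings
comparisons = {
    "CIFAR-10 untargeted vs. Bandits": (1261, 1877),
    "ImageNet untargeted vs. Bandits": (722, 1117),
    "ImageNet targeted vs. NES": (7485, 16284),
}

for name, (ours, baseline) in comparisons.items():
    print(f"{name}: {query_reduction(ours, baseline):.1%} fewer average queries")
```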

This digest was generated by AI and reviewed by human editors.