QUICK REVIEW

[Paper Review] Adversarial Examples that Fool Detectors

Jiajun Lu, Hussein Sibai|arXiv (Cornell University)|Dec 7, 2017

Adversarial Robustness in Machine Learning|References: 27|Cited by: 105

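One-Sentence Summary

This paper demonstrates adversarial examples that fool the Faster RCNN and YOLO detectors, through both digital and physical attacks, and analyzes how they generalize across viewing conditions and under defenses.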

ABSTRACT

An adversarial example is an example that has been adjusted to produce a wrong label when presented to a system at test time. To date, adversarial example constructions have been demonstrated for classifiers, but not for detectors. If adversarial examples that could fool a detector exist, they could be used to (for example) maliciously create security hazards on roads populated with smart vehicles. In this paper, we demonstrate a construction that successfully fools two standard detectors, Faster RCNN and YOLO. The existence of such examples is surprising, as attacking a classifier is very different from attacking a detector, and that the structure of detectors - which must search for their own bounding box, and which cannot estimate that box very accurately - makes it quite likely that adversarial patterns are strongly disrupted. We show that our construction produces adversarial examples that generalize well across sequences digitally, even though large perturbations are needed. We also show that our construction yields physical objects that are adversarial.

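Motivation and Objectives

Motivated by real-world safety concerns such as autonomous driving systems, advance the study of adversarial examples for detectors, not only for classifiers.
Show that adversarial patterns exist that fool two standard detectors (Faster RCNN and YOLO), and that these patterns can transfer between models.
Study how adversarial perturbations generalize across viewing conditions and from the digital domain to the physical domain.
Evaluate whether simple defenses can mitigate adversarial attacks on detectors.
Evaluate the feasibility of local versus global perturbations on stop signs and faces.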

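Proposed Method

Develop a method based on registration and reconstruction to generate adversarial textures for objects (stop signs and faces) that fool detectors.
Represent the object in a root coordinate system and map it into each training frame through a view mapping and an illumination adjustment.
Optimize the adversarial texture T to minimize the detector's score for the stop sign or face across multiple frames, using updates based on the sign of the gradient.
Impose an L2 distance constraint so that the perturbed object stays visually similar to the original; this constraint also shapes the perturbation pattern.
Print the physical adversarial textures and attach them to real objects to test robustness under real-world conditions.
Test transferability by evaluating the adversarial stop signs and faces on YOLO, and test both digital and physical generalization.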
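
The optimization step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the detector is replaced by a toy linear scorer with a sigmoid output, the view and illumination mapping is reduced to a hypothetical per-frame gain and offset, and the L2 constraint is applied as a penalty weighted by an assumed coefficient `lam`. The names `score`, `render`, `attack`, `step`, and `lam` are all illustrative assumptions.

```python
import math

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def score(w, bias, x):
    """Toy stand-in for a detector score: sigmoid of a linear response."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def render(texture, gain, offset):
    """Hypothetical per-frame illumination map: scale and shift each texel."""
    return [gain * t + offset for t in texture]

def mean_score(texture, frames, w, bias):
    """Average toy detector score over all frames."""
    total = sum(score(w, bias, render(texture, g, o)) for g, o in frames)
    return total / len(frames)

def attack(original, frames, w, bias, step=0.01, lam=0.05, iters=200):
    """Lower the mean score with signed-gradient steps and an L2 penalty."""
    texture = list(original)
    n = len(frames)
    for _ in range(iters):
        grad = [0.0] * len(texture)
        for gain, offset in frames:
            s = score(w, bias, render(texture, gain, offset))
            # Chain rule through the sigmoid and the gain, averaged over frames.
            for i, wi in enumerate(w):
                grad[i] += s * (1.0 - s) * gain * wi / n
        # Gradient of the penalty lam * ||T - T0||^2.
        for i in range(len(texture)):
            grad[i] += 2.0 * lam * (texture[i] - original[i])
        # Signed-gradient descent step, clipped to the valid range [0, 1].
        texture = [min(1.0, max(0.0, t - step * sign(g)))
                   for t, g in zip(texture, grad)]
    return texture

# Assumed toy data: a 4-texel texture, a linear scorer, three lighting conditions.
original = [0.8, 0.2, 0.6, 0.4]
w = [2.0, -1.0, 1.5, -0.5]
bias = -0.5
frames = [(1.0, 0.0), (0.8, 0.05), (1.2, -0.05)]

adversarial = attack(original, frames, w, bias)
before = mean_score(original, frames, w, bias)
after = mean_score(adversarial, frames, w, bias)
print(f"mean score before: {before:.3f}, after: {after:.3f}")
```

Averaging the gradient over several frames is what pushes the texture toward a pattern that works under more than one viewing condition; with a real detector the analytic gradient here would be replaced by backpropagation through the detector and the rendering step.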

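Experimental Results

Research Questions

RQ1: Can adversarial perturbations fool detectors such as Faster RCNN and YOLO?
RQ2: Do adversarial patterns transfer across detectors and remain effective under different viewing conditions?
RQ3: Are physical adversarial examples feasible, and do they survive real-world conditions such as printing and lighting?
RQ4: How does the scale of the perturbation affect attack success and generalization?
RQ5: Are simple defense techniques effective against adversarial attacks on detectors?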

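Key Findings

Adversarial patterns fool Faster RCNN in digital images; the attacks cause stop signs and faces to be either missed or mislabeled.
These attacks generalize digitally across viewing conditions, and the same construction transfers to YOLO in some contexts.
Physical adversarial stop signs and faces can fool detectors under suitable conditions, although they typically require larger perturbations than digital attacks.
Because of the box-prediction step and thresholding, detectors show robustness in some cases; simple defenses such as downsampling or denoising do not reliably defeat attacks that generalize digitally or physically.
Local perturbations confined to a small region are less effective than global perturbations for attacking general-purpose detectors, especially in the physical world.
Generalization from Faster RCNN to YOLO is not universal across datasets; the detector used and its own generalization ability strongly affect attack success.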

This review was generated by AI and reviewed by human editors.