QUICK REVIEW

[Paper Review] DARTS: Deceiving Autonomous Cars with Toxic Signs

Chawin Sitawarin, Arjun Nitin Bhagoji|arXiv (Cornell University)|Feb 18, 2018

Adversarial Robustness in Machine Learning | References: 55 | Cited by: 179

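One-Sentence Summary

This paper proposes two physically realizable attacks on the traffic sign recognition systems of autonomous cars, signembedding (the Out-of-Distribution attack) and lenticular printing, and shows that they achieve high success rates in real-world settings while exposing the limits of defenses such as adversarial training.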

ABSTRACT

Sign recognition is an integral part of autonomous cars. Any misclassification of traffic signs can potentially lead to a multitude of disastrous consequences, ranging from a life-threatening accident to even a large-scale interruption of transportation services relying on autonomous cars. In this paper, we propose and examine security attacks against sign recognition systems for Deceiving Autonomous caRs with Toxic Signs (we call the proposed attacks DARTS). In particular, we introduce two novel methods to create these toxic signs. First, we propose Out-of-Distribution attacks, which expand the scope of adversarial examples by enabling the adversary to generate these starting from an arbitrary point in the image space compared to prior attacks which are restricted to existing training/test data (In-Distribution). Second, we present the Lenticular Printing attack, which relies on an optical phenomenon to deceive the traffic sign recognition system. We extensively evaluate the effectiveness of the proposed attacks in both virtual and real-world settings and consider both white-box and black-box threat models. Our results demonstrate that the proposed attacks are successful under both settings and threat models. We further show that Out-of-Distribution attacks can outperform In-Distribution attacks on classifiers defended using the adversarial training defense, exposing a new attack vector for these defenses.

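Research Motivation and Objectives

Assess how vulnerable the traffic sign recognition systems of autonomous cars are to adversarial manipulation.
Introduce new attack vectors that go beyond conventional in-distribution adversaries.
Demonstrate that the attacks remain physically robust under real-world transformations and conditions.
Evaluate attack effectiveness under white-box and black-box threat models, including real-world drive-by tests.
Examine the limitations of existing defenses, adversarial training in particular, against these attacks.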

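Proposed Methods

Propose the signembedding attack, which generates adversarial traffic signs starting from arbitrary, out-of-distribution images.
Propose a lenticular printing attack, which exploits an optical phenomenon to cause viewing-angle-dependent misclassification.
Develop a transformation-aware robust optimization framework that uses a mask and a set of differentiable transformations to generate physically robust perturbations.
Use an attack pipeline that includes masking, resizing, and random brightness, perspective, and size transformations to simulate real-world conditions.
Evaluate the attacks in white-box and black-box settings, including transferability studies and drive-by tests.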
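
The transformation-aware optimization described above can be illustrated with a minimal sketch. It is not the paper's implementation: the linear softmax classifier, the single brightness transformation, and all names and hyperparameters (`craft_masked_perturbation`, `eps`, `lr`, `n_transforms`) are assumptions chosen to keep the example self-contained. It shows the general idea only: restrict the perturbation to a mask and average the gradient over randomly sampled transformations, so that the result does not depend on a single viewing condition.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def craft_masked_perturbation(x, mask, W, b, target, steps=200, lr=0.05,
                              eps=0.3, n_transforms=8, seed=0):
    """Targeted, masked perturbation against a toy linear softmax classifier.

    x      : flattened image with values in [0, 1]
    mask   : same shape as x, 1 where the perturbation is allowed, else 0
    W, b   : weights and bias of the stand-in classifier (hypothetical model)
    target : class index the adversary wants the classifier to predict
    """
    rng = np.random.default_rng(seed)
    delta = np.zeros_like(x)
    onehot = np.zeros(W.shape[0])
    onehot[target] = 1.0

    for _ in range(steps):
        grad = np.zeros_like(x)
        for _ in range(n_transforms):
            # Random brightness scale: a stand-in for a richer set of
            # differentiable transformations (perspective, resizing, ...).
            s = rng.uniform(0.7, 1.3)
            x_adv = s * (x + mask * delta)
            p = softmax(W @ x_adv + b)
            # Gradient of the cross-entropy toward `target` w.r.t. delta.
            grad += s * mask * (W.T @ (p - onehot))
        # Descend on the loss averaged over the sampled transformations.
        delta = delta - lr * grad / n_transforms
        # Bound the perturbation size, then keep the image inside [0, 1].
        delta = np.clip(delta, -eps, eps)
        delta = np.clip(x + mask * delta, 0.0, 1.0) - x

    return mask * delta
```

Adding the returned perturbation to `x` gives the adversarial input for this toy model. A realistic pipeline would replace the linear model with the target network and the brightness scale with the full transformation set listed above.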

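Experimental Results

Research Questions

RQ1: Can an adversary generate robust, physically realizable adversarial traffic signs starting from arbitrary (out-of-distribution) input images?
RQ2: How effective are the signembedding and advtraffic (In-Distribution) attacks under real-world transformations and conditions?
RQ3: Can adversarial examples defeat state-of-the-art defenses, such as adversarial training, in traffic sign recognition?
RQ4: How feasible are novel physical attacks such as lenticular printing for deceiving sign recognition systems?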

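Key Findings

Adversarial signs created with signembedding and lenticular printing are misclassified with high confidence under a variety of real-world conditions.
In real-world drive-by tests, both the signembedding and advtraffic attacks achieve success rates above 90%.
signembedding can outperform the conventional advtraffic attack against classifiers defended with adversarial training, exposing a new attack vector.
Lenticular printing introduces a distinct physical attack vector that exploits viewing-angle-dependent appearance to mislead sign recognition.
Black-box, transferability-based attacks remain effective without direct access to the details of the target model.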
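
The viewing-angle dependence of a lenticular print can be pictured with an idealized model in which two images are cut into alternating vertical strips and a lens sheet shows only one set of strips from each side. The sketch below is an illustration under that assumption, not the paper's procedure; `interlace`, `view`, and `strip_width` are hypothetical names, and real prints depend on lens pitch, print resolution, and viewing geometry that are not modeled here.

```python
import numpy as np


def _strips_from_b(width, strip_width):
    """Boolean column selector: True for columns taken from the second image."""
    cols = np.arange(width)
    return (cols // strip_width) % 2 == 1


def interlace(img_a, img_b, strip_width=1):
    """Interleave vertical strips of two same-sized images (H x W or H x W x C)."""
    if img_a.shape != img_b.shape:
        raise ValueError("images must have the same shape")
    from_b = _strips_from_b(img_a.shape[1], strip_width)
    out = img_a.copy()
    out[:, from_b] = img_b[:, from_b]
    return out


def view(interlaced, which, strip_width=1):
    """Idealized lens: return only the strips visible from side 'a' or side 'b'."""
    from_b = _strips_from_b(interlaced.shape[1], strip_width)
    keep = from_b if which == "b" else ~from_b
    return interlaced[:, keep]
```

In this simplified picture, an observer on one side recovers the strips of the first image and an observer on the other side recovers the strips of the second, which is how a single printed sign could present different content to a camera and to a human standing at a different angle.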

This review was generated by AI and reviewed by a human editor.