QUICK REVIEW

[Paper Review] SLAP: Improving Physical Adversarial Examples with Short-Lived Adversarial Perturbations

Giulio Lovisotto, Henry Turner|arXiv (Cornell University)|Jul 8, 2020

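Adversarial Robustness in Machine Learning|References: 26|Cited by: 28

One-Sentence Summary

The paper proposes SLAP, a novel physical adversarial attack that uses a light projector to create short-lived, dynamic adversarial perturbations on real-world objects such as stop signs. By modelling the three-way additive relationship between the projector, the surface, and the camera's perceived output, SLAP achieves attack success rates of up to 99% against state-of-the-art models in low ambient light, evades SentiNet detection, and gives the attacker remote, on-demand, fine-grained control over the attack.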

ABSTRACT

Research into adversarial examples (AE) has developed rapidly, yet static adversarial patches are still the main technique for conducting attacks in the real world, despite being obvious, semi-permanent and unmodifiable once deployed. In this paper, we propose Short-Lived Adversarial Perturbations (SLAP), a novel technique that allows adversaries to realize physically robust real-world AE by using a light projector. Attackers can project a specifically crafted adversarial perturbation onto a real-world object, transforming it into an AE. This allows the adversary greater control over the attack compared to adversarial patches: (i) projections can be dynamically turned on and off or modified at will, (ii) projections do not suffer from the locality constraint imposed by patches, making them harder to detect. We study the feasibility of SLAP in the self-driving scenario, targeting both object detector and traffic sign recognition tasks, focusing on the detection of stop signs. We conduct experiments in a variety of ambient light conditions, including outdoors, showing how in non-bright settings the proposed method generates AE that are extremely robust, causing misclassifications on state-of-the-art networks with up to 99% success rate for a variety of angles and distances. We also demonstrate that SLAP-generated AE do not present detectable behaviours seen in adversarial patches and therefore bypass SentiNet, a physical AE detection method. We evaluate other defences including an adaptive defender using adversarial learning which is able to thwart the attack effectiveness up to 80% even in favourable attacker conditions.

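Motivation and Objectives

Address the limitations of static adversarial patches, which in real-world physical attacks are conspicuous, semi-permanent, and offer no dynamic control once deployed.
Develop a physically robust, dynamic attack vector based on a light projector that can be switched on or off, or modified, in real time.
Improve the robustness of adversarial examples across environmental conditions, including outdoor lighting and varying viewing angles.
Evaluate the effectiveness of SLAP against state-of-the-art object detectors and traffic sign recognition models in real-world settings.
Assess how SLAP fares against existing defences, including detection systems such as SentiNet and an adaptive defender based on adversarial learning.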

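Proposed Method

Propose a differentiable three-way additive colour model that captures the interaction between the projection surface, the projected colour, and the output perceived by the camera.
Optimise the adversarial perturbation by backpropagating through the projected image, accounting for real-world distortion and lighting effects.
Systematically model environmental factors such as ambient light, projector distance, throw ratio, and brightness to strengthen robustness in the real world.
Use a projector to cast the crafted adversarial pattern onto a real-world object dynamically, giving an on-demand, short-lived attack.
Incorporate context-aware features (such as the sign pole or a table top) to raise the attack success rate against object detectors.
Evaluate transferability by generating attacks on one model and testing them on others, including proprietary APIs such as Google Vision.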
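
The first two steps above (a differentiable colour model, then gradient-based optimisation through it) can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the colour model is a clipped sum of surface colour and scaled projection, the `gain` parameter stands in for how strongly the projector shows up against ambient light, and a hypothetical linear scorer with a hand-written gradient replaces backpropagation through a real detector.

```python
import numpy as np

def camera_output(surface, projection, gain):
    """Toy additive model: perceived colour = surface + gain * projection, clipped to [0, 1]."""
    return np.clip(surface + gain * projection, 0.0, 1.0)

def sign_score(image, weights):
    """Hypothetical linear 'stop sign' score standing in for a real network's logit."""
    return float(np.sum(weights * image))

def optimise_projection(surface, weights, gain, steps=200, lr=0.1):
    """Gradient descent on the projected image to lower the score of the camera output."""
    projection = np.zeros_like(surface)  # start with the projector off
    for _ in range(steps):
        pre_clip = surface + gain * projection
        # The clip passes gradient only where the pixel is not saturated.
        unsaturated = ((pre_clip > 0.0) & (pre_clip < 1.0)).astype(surface.dtype)
        grad = weights * unsaturated * gain  # d(score) / d(projection)
        # A projector can only add light, so keep the projection in [0, 1].
        projection = np.clip(projection - lr * grad, 0.0, 1.0)
    return projection

rng = np.random.default_rng(0)
surface = rng.uniform(0.2, 0.8, size=(8, 8, 3))  # made-up surface colours
weights = rng.normal(size=(8, 8, 3))             # made-up scorer weights

clean = sign_score(surface, weights)
drops = {}
for label, gain in [("dim ambient light", 0.6), ("bright ambient light", 0.1)]:
    projection = optimise_projection(surface, weights, gain)
    attacked = sign_score(camera_output(surface, projection, gain), weights)
    drops[label] = clean - attacked
    print(f"{label}: score drop = {drops[label]:.3f}")
```

In this toy setting the optimiser lowers the score more when `gain` is larger, which mirrors the qualitative claim that the attack is stronger when the projection is not washed out by ambient light. A faithful reproduction would replace the clipped sum with a colour model fitted to measurements and the linear scorer with the target network.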

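Experimental Results

Research Questions

RQ1: Can short-lived adversarial perturbations generated by light projection achieve high success rates against real-world object detectors and traffic sign recognisers under different ambient light conditions?
RQ2: How does the physical projection process affect adversarial robustness, and can it be modelled accurately enough to ensure consistent attack performance?
RQ3: Can SLAP evade SentiNet, a physical adversarial example detection system designed for static patches?
RQ4: How well does the attack generalise across models, including black-box scenarios that use proprietary APIs?
RQ5: How effective is an adaptive defence based on adversarial learning at mitigating SLAP, and what trade-off does it impose on benign accuracy?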

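Key Findings

In low ambient light (<400 lux), SLAP achieves attack success rates of up to 99% against state-of-the-art models (Yolov3, Mask-RCNN, Lisa-CNN, Gtsrb-CNN), and it performs particularly well in non-bright settings such as overcast days or the hours around sunset.
With a high-brightness projector (12,000 lumens), the attack remains effective at distances of up to 13 metres; success rates hold up as distance grows because brightness and throw ratio are tuned accordingly.
SLAP evades SentiNet, going undetected in more than 95% of cases, because it lacks the persistent, localised perturbation signature characteristic of static patches.
Adversarial examples generated with Mask-RCNN and Yolov3 transfer to the proprietary Google Vision API with success rates of up to 100%, indicating strong black-box transferability.
The adaptive defender based on adversarial learning thwarts up to 80% of the attack's effectiveness even in conditions favourable to the attacker, at the cost of reduced accuracy on benign inputs.
Car headlights have a negligible effect on attack performance because they are far dimmer than the projector's output, especially in urban environments where high beams are usually switched off.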
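
To see why projector brightness, distance, and throw ratio interact as the distance finding suggests, a back-of-the-envelope illuminance estimate helps. The sketch below assumes an idealised projector: light spread uniformly over a rectangular image on a perpendicular surface, with no lens losses. The throw ratio of 2.0 and the 16:9 aspect ratio are hypothetical values chosen for illustration, not figures from the paper.

```python
def projected_illuminance(lumens, distance_m, throw_ratio, aspect_ratio=16 / 9):
    """Approximate illuminance (lux) of a projected image on a perpendicular surface."""
    width_m = distance_m / throw_ratio    # throw ratio = distance / image width
    height_m = width_m / aspect_ratio
    return lumens / (width_m * height_m)  # lux = lumens per square metre

for distance in (4.0, 8.0, 13.0):
    lux = projected_illuminance(lumens=12_000, distance_m=distance, throw_ratio=2.0)
    print(f"{distance:>5.1f} m: about {lux:,.0f} lux added by the projector")
```

Under these assumptions the added illuminance falls with the square of the distance and rises with the square of the throw ratio, so a narrower beam (higher throw ratio) or a brighter projector is what keeps the projection visible against ambient light at longer range.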


This review was generated by AI and reviewed by a human editor.