QUICK REVIEW

[Paper Digest] PatchGuard++: Efficient Provable Attack Detection against Adversarial Patches

Chong Xiang, Prateek Mittal|arXiv (Cornell University)|Apr 26, 2021

Adversarial Robustness in Machine Learning|References: 18|Cited by: 25

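ONE-SENTENCE SUMMARY

PatchGuard++ masks regions of the feature space and checks whether the masked predictions agree, achieving high clean accuracy and provable robustness on high-resolution images.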

ABSTRACT

An adversarial patch can arbitrarily manipulate image pixels within a restricted region to induce model misclassification. The threat of this localized attack has gained significant attention because the adversary can mount a physically-realizable attack by attaching patches to the victim object. Recent provably robust defenses generally follow the PatchGuard framework by using CNNs with small receptive fields and secure feature aggregation for robust model predictions. In this paper, we extend PatchGuard to PatchGuard++ for provably detecting the adversarial patch attack to boost both provable robust accuracy and clean accuracy. In PatchGuard++, we first use a CNN with small receptive fields for feature extraction so that the number of features corrupted by the adversarial patch is bounded. Next, we apply masks in the feature space and evaluate predictions on all possible masked feature maps. Finally, we extract a pattern from all masked predictions to catch the adversarial patch attack. We evaluate PatchGuard++ on ImageNette (a 10-class subset of ImageNet), ImageNet, and CIFAR-10 and demonstrate that PatchGuard++ significantly improves the provable robustness and clean performance.

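RESEARCH MOTIVATION AND GOALS

Motivate robust defenses against physically realizable, localized adversarial patches.
Propose a detection framework that provably identifies patch attacks while preserving clean accuracy.
Use a feature extractor with small receptive fields to bound the number of corrupted features, and use feature-space masking to flag inconsistencies.
Provide provable guarantees for attack detection under white-box adaptive attacks.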
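
The bound on corrupted features can be made concrete with a short sketch. It assumes the window-size relation used in the PatchGuard framework: a square patch of side p pixels, seen through a receptive field of side r pixels with stride s, overlaps at most ceil((p + r - 1) / s) features along each axis. The function name and the numeric values are illustrative assumptions, not figures from this digest.

```python
import math


def corrupted_window(patch_px: int, receptive_field_px: int, stride_px: int) -> int:
    """Upper bound on the features per axis whose receptive field overlaps the patch."""
    return math.ceil((patch_px + receptive_field_px - 1) / stride_px)


# Illustrative values (assumptions): 32-pixel patch, stride 8.
small_rf = corrupted_window(32, 17, 8)    # small receptive field
large_rf = corrupted_window(32, 193, 8)   # much larger receptive field
print(small_rf, large_rf)
```

With these assumed values, the small receptive field confines the patch to a 6-by-6 block of features, while the large one lets it reach a 28-by-28 block, which is why the extractor's receptive field size matters for the masking step.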

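PROPOSED METHOD

Use a convolutional neural network (CNN) with small receptive fields to bound the number of features the patch can corrupt.
Apply a mask at every possible location in the feature space and obtain a prediction for each masked feature map.
Detect an attack by identifying the inconsistencies among masked predictions that arise when a patch is present.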
Return the original prediction if every non-discarded masked prediction agrees with it; otherwise raise an attack alert.
Provide a provable analysis showing that an image is provably robust against the patch if every non-discarded masked prediction is correct.
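
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: masking is assumed to zero out a square window of features, a masked prediction is assumed to be "discarded" when its confidence does not exceed the threshold τ, and `classify` is a hypothetical toy classifier that averages per-location class logits. All function names are made up for this example.

```python
import numpy as np


def classify(features):
    """Toy classifier (assumption): average per-location logits, then softmax."""
    logits = features.mean(axis=(0, 1))
    e = np.exp(logits - logits.max())
    return e / e.sum()


def masked_predictions(features, window):
    """Yield (label, confidence) for every window-by-window mask position."""
    h, w, _ = features.shape
    for i in range(h - window + 1):
        for j in range(w - window + 1):
            masked = features.copy()
            masked[i:i + window, j:j + window, :] = 0.0  # assumed masking: zero the region
            probs = classify(masked)
            yield int(np.argmax(probs)), float(np.max(probs))


def detect(features, window, tau):
    """Return (original prediction, alert flag)."""
    original = int(np.argmax(classify(features)))
    for label, conf in masked_predictions(features, window):
        # Low-confidence masked predictions are ignored (assumed meaning of "discarded").
        if conf > tau and label != original:
            return original, True
    return original, False


def certify(features, window, tau, true_label):
    """Check on a clean image: every masked prediction is correct and confident."""
    return all(label == true_label and conf > tau
               for label, conf in masked_predictions(features, window))


# Clean 6x6 feature map with 3 classes; every location votes for class 0.
clean = np.zeros((6, 6, 3))
clean[..., 0] = 4.0

# A 2x2 "patch" region whose features strongly vote for class 1.
attacked = clean.copy()
attacked[2:4, 2:4, :] = [0.0, 200.0, 0.0]

print(detect(clean, window=2, tau=0.8))
print(certify(clean, window=2, tau=0.8, true_label=0))
print(detect(attacked, window=2, tau=0.8))
```

In this toy setup the clean input passes without an alert and satisfies the certification check, while the attacked input flips the original prediction to class 1 and is flagged, because the mask that covers the corrupted region yields a confident prediction of class 0.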

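EXPERIMENTAL RESULTS

Research Questions

RQ1: Can a mask-based feature-space approach reliably detect localized adversarial patches in the white-box setting?
RQ2: Does PatchGuard++ improve both clean accuracy and provable robustness over prior defenses?
RQ3: What trade-offs exist between the detection threshold setting and robustness/accuracy?
RQ4: How well does the method scale to high-resolution datasets such as ImageNet and ImageNette?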

Key Findings

Defense	ImageNette clean (%)	ImageNette robust (%)	ImageNet clean (%)	ImageNet robust (%)	CIFAR-10 clean (%)	CIFAR-10 robust (%)
PatchGuard++ (τ=0.8)	96.9	87.7	62.9	28.0	84.8	68.9
PatchGuard++ (τ=0.7)	96.6	90.2	62.7	32.0	82.5	71.7
PatchGuard++ (τ=0.6)	96.1	91.8	62.1	35.5	80.2	74.3
PatchGuard++ (τ=0.5)	95.3	92.9	60.9	39.0	78.0	76.3
MR (McCoyd et al., 2020)	computationally infeasible	92.4	43.8	90.6	62.1	78.8	77.6

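On ImageNette at τ=0.6, PatchGuard++ achieves state-of-the-art clean accuracy and provable robust accuracy of 96.1% and 91.8%, respectively.
On ImageNet at τ=0.5, PatchGuard++ outperforms prior defenses in both clean accuracy (about 6% higher) and provable robust accuracy (about 13% higher).
On CIFAR-10, PatchGuard++ shows competitive provable robustness while requiring substantially less computation than Minority Report (MR).
Lowering the confidence threshold τ raises provable robust accuracy by more than it reduces clean accuracy on all three datasets, a favorable trade-off.
Compared with prior provable defenses, PatchGuard++ offers higher accuracy and attack detection that scales to high-resolution images.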
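
The threshold trade-off can be checked directly against the PatchGuard++ rows of the table above. The sketch below copies those (clean, provable robust) percentages and computes, for each dataset, how many points of clean accuracy are given up and how many points of provable robust accuracy are gained when τ drops from 0.8 to 0.5. The dictionary layout and the `tradeoff` helper are illustrative assumptions.

```python
# (clean %, provable robust %) for PatchGuard++, keyed by dataset and tau.
results = {
    "ImageNette": {0.8: (96.9, 87.7), 0.7: (96.6, 90.2), 0.6: (96.1, 91.8), 0.5: (95.3, 92.9)},
    "ImageNet":   {0.8: (62.9, 28.0), 0.7: (62.7, 32.0), 0.6: (62.1, 35.5), 0.5: (60.9, 39.0)},
    "CIFAR-10":   {0.8: (84.8, 68.9), 0.7: (82.5, 71.7), 0.6: (80.2, 74.3), 0.5: (78.0, 76.3)},
}


def tradeoff(dataset, tau_high=0.8, tau_low=0.5):
    """Return (clean points lost, robust points gained) when lowering tau."""
    clean_hi, robust_hi = results[dataset][tau_high]
    clean_lo, robust_lo = results[dataset][tau_low]
    return round(clean_hi - clean_lo, 1), round(robust_lo - robust_hi, 1)


for name in results:
    cost, gain = tradeoff(name)
    print(f"{name}: clean -{cost} points, robust +{gain} points")
```

The gain exceeds the cost on every dataset: 5.2 versus 1.6 points on ImageNette, 11.0 versus 2.0 on ImageNet, and 7.4 versus 6.8 on CIFAR-10.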

This digest was generated by AI and reviewed by human editors.