QUICK REVIEW

[Paper Review] Dynamic Backdoor Attacks Against Machine Learning Models

Ahmed Salem, Rui Wen|arXiv (Cornell University)|Mar 7, 2020
Adversarial Robustness in Machine Learning | Cited by 46
One-Sentence Summary

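This paper introduces dynamic backdoor attacks against DNNs, namely Random Backdoor, Backdoor Generating Network (BaN), and conditional BaN (c-BaN), and shows on MNIST, CelebA, and CIFAR-10 that they achieve near-perfect backdoor performance while evading state-of-the-art defenses.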

ABSTRACT

Machine learning (ML) has made tremendous progress during the past decade and is being adopted in various critical real-world applications. However, recent research has shown that ML models are vulnerable to multiple security and privacy attacks. In particular, backdoor attacks against ML models have recently raised a lot of awareness. A successful backdoor attack can cause severe consequences, such as allowing an adversary to bypass critical authentication systems. Current backdooring techniques rely on adding static triggers (with fixed patterns and locations) on ML model inputs which are prone to detection by the current backdoor detection mechanisms. In this paper, we propose the first class of dynamic backdooring techniques against deep neural networks (DNN), namely Random Backdoor, Backdoor Generating Network (BaN), and conditional Backdoor Generating Network (c-BaN). Triggers generated by our techniques can have random patterns and locations, which reduce the efficacy of the current backdoor detection mechanisms. In particular, BaN and c-BaN based on a novel generative network are the first two schemes that algorithmically generate triggers. Moreover, c-BaN is the first conditional backdooring technique that given a target label, it can generate a target-specific trigger. Both BaN and c-BaN are essentially a general framework which renders the adversary the flexibility for further customizing backdoor attacks. We extensively evaluate our techniques on three benchmark datasets: MNIST, CelebA, and CIFAR-10. Our techniques achieve almost perfect attack performance on backdoored data with a negligible utility loss. We further show that our techniques can bypass current state-of-the-art defense mechanisms against backdoor attacks, including ABS, Februus, MNTD, Neural Cleanse, and STRIP.

Motivation and Objectives

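  • Motivate and formalize the threat of dynamic backdoors in ML models.
  • Propose three dynamic backdoor techniques (Random Backdoor, BaN, and c-BaN).
  • Demonstrate how trigger randomness and conditioning improve evasion of defenses.
  • Evaluate the attacks on standard image datasets and analyze their ability to bypass defenses.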

Proposed Method

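  • Formalize dynamic backdoors, in which triggers can take different patterns and locations.
  • Introduce Random Backdoor, which samples triggers from a uniform distribution and places them at random locations.
  • Develop BaN, a generative network that learns triggers jointly with the backdoored model.
  • Extend BaN to c-BaN, which generates triggers conditioned on the target label.
  • Train and evaluate on MNIST, CelebA, and CIFAR-10 in both single-target-label and multiple-target-label settings.
  • Assess robustness against existing backdoor defenses by testing against ABS, Februus, MNTD, Neural Cleanse, and STRIP.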
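The Random Backdoor idea in the list above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the 4x4 patch size, the four corner locations, and the helper names (`sample_trigger`, `sample_location`, `apply_trigger`, `backdoor_sample`) are all hypothetical choices made for the example.

```python
import numpy as np

def sample_trigger(rng, size=(4, 4), channels=1):
    """Draw a trigger patch with pixel values uniform in [0, 1).
    The 4x4 default size is an assumption for illustration."""
    return rng.uniform(0.0, 1.0, size=(size[0], size[1], channels))

def sample_location(rng, candidate_locations):
    """Pick one top-left corner uniformly from a fixed set of allowed corners."""
    idx = rng.integers(len(candidate_locations))
    return candidate_locations[idx]

def apply_trigger(image, trigger, top_left):
    """Return a copy of `image` with `trigger` pasted at `top_left` (row, col)."""
    r, c = top_left
    h, w = trigger.shape[:2]
    out = image.copy()
    out[r:r + h, c:c + w, :] = trigger
    return out

def backdoor_sample(rng, image, target_label, candidate_locations):
    """Build one backdoored training pair: a fresh random trigger at a
    random allowed location, relabelled to the attacker's target label."""
    trigger = sample_trigger(rng, channels=image.shape[2])
    loc = sample_location(rng, candidate_locations)
    return apply_trigger(image, trigger, loc), target_label

rng = np.random.default_rng(0)
image = np.zeros((28, 28, 1))                     # blank MNIST-sized image
locations = [(0, 0), (0, 24), (24, 0), (24, 24)]  # hypothetical location set
poisoned, label = backdoor_sample(rng, image, target_label=7,
                                  candidate_locations=locations)
```

Because both the pattern and the position are redrawn for every backdoored sample, no single fixed patch exists for a defense to recover, which is the property that separates this setting from a static backdoor.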

Experimental Results

Research Questions

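  • RQ1: Can dynamic backdoors achieve a high attack success rate across multiple datasets with almost no loss in model utility?
  • RQ2: Can dynamic backdoors bypass current state-of-the-art backdoor defenses?
  • RQ3: How do Random Backdoor, BaN, and c-BaN compare in single-target and multi-target settings?
  • RQ4: Can triggers be generated algorithmically to adapt to defenses and target labels?
  • RQ5: How does injecting a dynamic backdoor affect model utility?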

Key Findings

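  • The backdoor success rate on backdoored test data is essentially 100% across all datasets and target labels.
  • Backdoored models keep accuracy close to that of clean models, with at most a small drop (e.g., 99% vs. 99% on MNIST, about 70% on CelebA, and a drop of roughly 0–2% on CIFAR-10).
  • BaN and c-BaN generate triggers algorithmically, offer high flexibility, and enable target-specific triggers in the multiple-target-label setting.
  • Dynamic backdoors can bypass contemporary defenses such as Neural Cleanse, ABS, and STRIP.
  • The proposed methods produce dynamic triggers with varying patterns and locations, making detection harder than for static backdoors.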
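The two quantities behind these findings, backdoor success rate and utility loss, can be computed as in the sketch below. This review does not spell out the paper's exact metric definitions, so the code assumes the common ones: success rate is the fraction of triggered inputs classified as the target label, and utility loss is the clean-data accuracy gap between a clean model and the backdoored model. The function names and the toy predictions are hypothetical.

```python
import numpy as np

def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return float(np.mean(np.asarray(predictions) == np.asarray(labels)))

def backdoor_success_rate(predictions_on_triggered, target_labels):
    """Fraction of triggered inputs that the model maps to the target label."""
    return accuracy(predictions_on_triggered, target_labels)

def utility_drop(clean_model_acc, backdoored_model_acc):
    """Clean-data accuracy lost by backdooring (positive means a drop)."""
    return clean_model_acc - backdoored_model_acc

# Toy numbers only, not results from the paper.
clean_labels = [0, 1, 2, 2]
clean_model_preds = [0, 1, 2, 3]        # clean model on clean data
backdoored_model_preds = [0, 1, 1, 3]   # backdoored model on clean data
triggered_preds = [7, 7, 7, 7]          # backdoored model on triggered data
targets = [7, 7, 7, 7]

clean_acc = accuracy(clean_model_preds, clean_labels)
backdoored_acc = accuracy(backdoored_model_preds, clean_labels)
success = backdoor_success_rate(triggered_preds, targets)
drop = utility_drop(clean_acc, backdoored_acc)
```

A successful dynamic backdoor in the sense of these findings would show `success` close to 1.0 while `drop` stays near zero.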

This review was generated by AI and reviewed by a human editor.