QUICK REVIEW

[Paper Review] Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review

Yansong Gao, Bao Gia Doan|arXiv (Cornell University)|Jul 21, 2020
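Adversarial Robustness in Machine Learning | References: 180 | Cited by: 132
One-Sentence Summary

This paper presents a systematic taxonomy of backdoor attack surfaces in deep learning and reviews existing attacks and countermeasures, assessing their strengths and limitations. It also discusses the flip side of backdoor attacks and directions for future research.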

ABSTRACT

This work provides the community with a timely comprehensive review of backdoor attacks and countermeasures on deep learning. According to the attacker's capability and affected stage of the machine learning pipeline, the attack surfaces are recognized to be wide and then formalized into six categorizations: code poisoning, outsourcing, pretrained, data collection, collaborative learning and post-deployment. Accordingly, attacks under each categorization are combed. The countermeasures are categorized into four general classes: blind backdoor removal, offline backdoor inspection, online backdoor inspection, and post backdoor removal. Accordingly, we review countermeasures, and compare and analyze their advantages and disadvantages. We have also reviewed the flip side of backdoor attacks, which are explored for i) protecting intellectual property of deep learning models, ii) acting as a honeypot to catch adversarial example attacks, and iii) verifying data deletion requested by the data contributor. Overall, the research on defense is far behind the attack, and there is no single defense that can prevent all types of backdoor attacks. In some cases, an attacker can intelligently bypass existing defenses with an adaptive attack. Drawing the insights from the systematic review, we also present key areas for future research on the backdoor, such as empirical security evaluations from physical trigger attacks, and in particular, more efficient and practical countermeasures are solicited.

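Motivation and Objectives

  • Provide a taxonomy of backdoor attack surfaces based on the attacker's capability and the affected stage of the ML pipeline.
  • Catalog and compare the backdoor attacks under each attack surface, assessing their strengths and limitations.
  • Summarize the countermeasures and classify them by deployment stage and by whether they focus on the data or the model.
  • Discuss practical implications, the potential flip-side (beneficial) applications, and future research directions.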

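Proposed Approach

  • Define and formalize the backdoor attack concept and its metrics, such as clean data accuracy (CDA) and attack success rate (ASR).
  • Systematically divide the attack surface into six categories: code poisoning, outsourcing, pretrained, data collection, collaborative learning, and post-deployment.
  • Review and summarize representative attacks for each attack surface, with a qualitative comparison.
  • Categorize countermeasures into blind backdoor removal, offline backdoor inspection, online backdoor inspection, and post backdoor removal, and compare their advantages and disadvantages.
  • Discuss broader implications, including intellectual property protection, honeypots, and data deletion verification, and outline future research directions.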
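The two metrics named above can be made concrete with a small sketch. It assumes the usual reading of the terms: CDA is the accuracy of the (possibly backdoored) model on clean inputs, and ASR is the fraction of trigger-carrying inputs that the model assigns to the attacker's target label. The function names and the convention of skipping samples whose true label already equals the target are illustrative assumptions, not definitions taken from the paper.

```python
from typing import Sequence


def clean_data_accuracy(preds_on_clean: Sequence[int],
                        true_labels: Sequence[int]) -> float:
    """Fraction of clean (trigger-free) inputs classified correctly."""
    if len(preds_on_clean) != len(true_labels) or len(true_labels) == 0:
        raise ValueError("predictions and labels must be non-empty and equally long")
    correct = sum(p == y for p, y in zip(preds_on_clean, true_labels))
    return correct / len(true_labels)


def attack_success_rate(preds_on_triggered: Sequence[int],
                        true_labels: Sequence[int],
                        target_label: int) -> float:
    """Fraction of trigger-carrying inputs classified as the target label.

    Assumed convention: samples whose true label already equals the
    target are skipped, since predicting the target for them does not
    show that the backdoor fired.
    """
    if len(preds_on_triggered) != len(true_labels):
        raise ValueError("predictions and labels must be equally long")
    eligible = [p for p, y in zip(preds_on_triggered, true_labels)
                if y != target_label]
    if not eligible:
        raise ValueError("no samples with a non-target true label")
    return sum(p == target_label for p in eligible) / len(eligible)


if __name__ == "__main__":
    labels = [0, 1, 2, 1]
    print(clean_data_accuracy([0, 1, 2, 2], labels))      # clean inputs
    print(attack_success_rate([7, 7, 3, 7], [1, 2, 3, 7], target_label=7))
```

A stealthy backdoor in this sense keeps CDA close to that of a clean model while pushing ASR toward 1.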

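Results

Research Questions

  • RQ1: Which taxonomy best captures the attack surfaces in the DL pipeline where backdoor attacks can be carried out?
  • RQ2: What are the main backdoor attack techniques under each attack surface, and how do they compare in attacker capability and performance?
  • RQ3: Which defense strategies exist, how are they categorized, and what are their limitations against adaptive attacks?
  • RQ4: What are the broader implications of backdoor research, and what are its potential beneficial (flip-side) applications?
  • RQ5: What are the key open challenges and future directions in empirical evaluation and defense development?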

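Key Findings

  • Backdoor attacks can be organized into six attack surfaces that correspond to stages of the ML pipeline and to the attacker's capability.
  • Attacks maintain normal performance on clean data while achieving a high attack success rate when the trigger is present.
  • No single defense can prevent all backdoor variants, and an adaptive attacker can bypass some defenses.
  • Defense research lags behind attack techniques, which highlights the need for more practical and efficient countermeasures.
  • The review points to broader uses of backdoors, such as protecting intellectual property, acting as a honeypot, and verifying data deletion.
  • The authors propose future work directions, including empirical security evaluations of physical trigger attacks and more effective defenses.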
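To illustrate how a model can behave normally on clean data yet misclassify triggered inputs, here is a minimal sketch of a patch-trigger data-poisoning step of the kind commonly described in the backdoor literature. It is not a specific attack from the paper: the corner patch, the `poison_rate` parameter, and both function names are assumptions made for illustration, and images are plain nested lists of floats in [0, 1].

```python
import random
from typing import List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values


def stamp_trigger(image: Image, patch_size: int = 2, value: float = 1.0) -> Image:
    """Return a copy of `image` with a square patch set in the bottom-right corner."""
    height, width = len(image), len(image[0])
    if patch_size > min(height, width):
        raise ValueError("patch does not fit inside the image")
    stamped = [row[:] for row in image]  # copy so the input is not mutated
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = value
    return stamped


def poison_dataset(images: List[Image], labels: List[int], target_label: int,
                   poison_rate: float = 0.1, seed: int = 0
                   ) -> Tuple[List[Image], List[int], List[int]]:
    """Stamp the trigger on a random subset and relabel it to `target_label`.

    Returns the new images, the new labels, and the poisoned indices.
    """
    rng = random.Random(seed)
    n_poison = int(len(images) * poison_rate)
    chosen = set(rng.sample(range(len(images)), n_poison))
    new_images = [stamp_trigger(img) if i in chosen else img
                  for i, img in enumerate(images)]
    new_labels = [target_label if i in chosen else y
                  for i, y in enumerate(labels)]
    return new_images, new_labels, sorted(chosen)
```

A model trained on such a set sees mostly clean samples, which is why clean accuracy can stay high, while the small relabeled subset ties the patch to the target label.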


This review was generated by AI and reviewed by human editors.