QUICK REVIEW

[Paper Review] Adversarial Attacks and Defences: A Survey

Anirban Chakraborty, Manaar Alam | arXiv (Cornell University) | Sep 28, 2018

Adversarial Robustness in Machine Learning | References: 23 | Citations: 455

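One-line summary

A comprehensive survey of adversarial attacks on deep learning. It covers threat models (white-box and black-box), attack surfaces, exploratory and poisoning attacks, and defenses, together with practical insights and taxonomies.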

ABSTRACT

Deep learning has emerged as a strong and efficient framework that can be applied to a broad spectrum of complex learning problems which were difficult to solve using the traditional machine learning techniques in the past. In the last few years, deep learning has advanced radically in such a way that it can surpass human-level performance on a number of tasks. As a consequence, deep learning is being extensively used in most of the recent day-to-day applications. However, security of deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify the output. In recent times, different types of adversaries based on their threat model leverage these vulnerabilities to compromise a deep learning system where adversaries have high incentives. Hence, it is extremely important to provide robustness to deep learning algorithms against these adversaries. However, there are only a few strong countermeasures which can be used in all types of attack scenarios to design a robust deep learning system. In this paper, we attempt to provide a detailed discussion on different types of adversarial attacks with various threat models and also elaborate the efficiency and challenges of recent countermeasures against them.

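Motivation and objectives

Summarize the landscape of adversarial attacks on deep neural networks and related models.
Organize attacks by threat model, by training-time versus test-time stage, and by application domain.
Discuss defenses and their limitations across attack classes.
Provide taxonomies and practical guidance for designing robust ML systems.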

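Proposed approach

Develop a qualitative taxonomy of terminology and threat models.
Classify attack surfaces and adversarial capabilities at the training and testing stages.
Systematically review exploratory, evasion, and poisoning attacks and the defenses associated with them.
Give an overview of major attacks and applications, with cross-references to key studies.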

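Experimental results

Research questions

RQ1: What are the main threat models and attack surfaces for machine learning systems?
RQ2: How do attacks differ between training-time (poisoning) and test-time (evasion) scenarios?
RQ3: What defense approaches exist, and what are their limitations across attack classes?
RQ4: Which attacks have been demonstrated against real-world systems and services, including ML APIs?
RQ5: How can adversarial threats be organized conceptually to guide robust design?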

Key findings

White-box and black-box attack models are distinguished by adversary knowledge of the target model and training process.
Evasion attacks dominate testing-time threats, while poisoning attacks affect training data and model integrity.
Exploratory attacks reveal information about models and training data without altering the training set.
GANs and generative frameworks are used both as attack tools and defense mechanisms.
Defenses often specialize to particular attack classes and may degrade model performance or efficiency.
The survey consolidates attacks and defenses into taxonomies to aid researchers and practitioners.
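
The contrast between test-time evasion and training-time poisoning in the findings above can be made concrete with a small sketch. This is an illustration only, not code from the survey: the evasion half uses a fast-gradient-sign style perturbation (a well-known white-box technique that this review does not name) on a hand-written linear classifier, and the poisoning half injects mislabeled points into a toy nearest-centroid classifier. All weights, data points, and function names (`predict_linear`, `fgsm_evasion`, `fit_centroids`, `predict_centroid`) are hypothetical and chosen for readability.

```python
import numpy as np


def predict_linear(w, b, x):
    """Class 1 if the linear score w.x + b is positive, else class 0."""
    return int(np.dot(w, x) + b > 0)


def fgsm_evasion(w, b, x, y, eps):
    """Test-time (evasion) attack: perturb the input, leave the model untouched.

    White-box assumption: the attacker knows w and b. For logistic loss on a
    linear model, the gradient with respect to x is (p - y) * w, where p is
    the predicted probability of class 1.
    """
    p = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))
    grad_x = (p - y) * w
    # Move each feature by at most eps in the direction that raises the loss.
    return x + eps * np.sign(grad_x)


def fit_centroids(X, y):
    """Toy 'training': store one mean vector per class."""
    return {int(c): X[y == c].mean(axis=0) for c in np.unique(y)}


def predict_centroid(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))


# Evasion: the model is fixed; only the test input changes.
w, b = np.array([1.0, -1.0]), 0.0          # hypothetical model parameters
x, y = np.array([0.3, 0.1]), 1             # hypothetical input, true label 1
x_adv = fgsm_evasion(w, b, x, y, eps=0.2)
print("evasion:", predict_linear(w, b, x), "->", predict_linear(w, b, x_adv))

# Poisoning: the test input is fixed; the training set changes.
X_clean = np.array([[0, 0], [0, 1], [1, 0], [4, 4], [4, 5], [5, 4]], dtype=float)
y_clean = np.array([0, 0, 0, 1, 1, 1])
# The attacker injects three points far from class 1 but labeled as class 1.
X_poison = np.vstack([X_clean, np.full((3, 2), -2.0)])
y_poison = np.concatenate([y_clean, [1, 1, 1]])

query = np.array([2.0, 2.0])
clean_model = fit_centroids(X_clean, y_clean)
poisoned_model = fit_centroids(X_poison, y_poison)
print("poisoning:", predict_centroid(clean_model, query),
      "->", predict_centroid(poisoned_model, query))
```

With these toy numbers, both attacks flip the prediction: the evasion attack changes each input feature by at most 0.2 while the model stays the same, whereas the poisoning attack leaves the query untouched and shifts the learned class-1 centroid instead. Real attacks on deep networks follow the same split but need gradients from automatic differentiation and far larger datasets.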

This review was generated by AI and checked by a human editor.