QUICK REVIEW

[Paper Review] Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey

Naveed Akhtar, Ajmal Mian | UWA Profiles and Research Repository (UWA) | Jan 2, 2018

Adversarial Robustness in Machine Learning | References: 156 | Citations: 168

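One-line summary

A survey of adversarial attacks on deep learning. It details attack methods, threat models, transferability, defenses, and real-world evaluations in computer vision.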

ABSTRACT

Deep learning is at the heart of the current rise of machine learning and artificial intelligence. In the field of Computer Vision, it has become the workhorse for applications ranging from self-driving cars to surveillance and security. Whereas deep neural networks have demonstrated phenomenal success (often beyond human capabilities) in solving complex problems, recent studies show that they are vulnerable to adversarial attacks in the form of subtle perturbations to inputs that lead a model to predict incorrect outputs. For images, such perturbations are often too small to be perceptible, yet they completely fool the deep learning models. Adversarial attacks pose a serious threat to the success of deep learning in practice. This fact has led to a large influx of contributions in this direction. This article presents the first comprehensive survey on adversarial attacks on deep learning in Computer Vision. We review the works that design adversarial attacks, analyze the existence of such attacks and propose defenses against them. To emphasize that adversarial attacks are possible in practical conditions, we separately review the contributions that evaluate adversarial attacks in the real-world scenarios. Finally, we draw on the literature to provide a broader outlook of the research direction.

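Research motivation and objectives

Summarize the overall landscape of adversarial attacks on deep learning in computer vision.
Analyze the existence and properties of adversarial perturbations across tasks.
Examine defense strategies against various attacks and their effectiveness.
Examine real-world evaluations of adversarial attacks and practical threat cases.
Provide an outlook on future directions for the field.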

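Proposed approach

Systematic literature survey of adversarial attack methods in vision tasks.
Classification of attacks by threat model (black-box, white-box) and norm (L0, L2, L_inf).
Technical explanation of representative attack algorithms (e.g., L-BFGS, FGSM, BIM, JSMA, C&W, DeepFool, universal perturbations).
Discussion of attacks beyond classification (autoencoders, VAEs, GANs, RNNs) and real-world considerations.
Integration of defense and transferability considerations, with notes on open problems.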
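
As a rough illustration of the norm-based classification above, the following sketch measures a perturbation under the three norms the review mentions. The function name `perturbation_norms` and the toy arrays are illustrative assumptions, not taken from the survey.

```python
import numpy as np

def perturbation_norms(x_clean, x_adv):
    """Return the L0, L2 and L_inf sizes of the perturbation x_adv - x_clean.

    Hypothetical helper for illustration; not part of the surveyed work.
    """
    delta = (np.asarray(x_adv, dtype=np.float64)
             - np.asarray(x_clean, dtype=np.float64)).ravel()
    return {
        "l0": int(np.count_nonzero(delta)),     # how many entries were changed
        "l2": float(np.linalg.norm(delta, 2)),  # Euclidean size of the change
        "linf": float(np.max(np.abs(delta))),   # largest change to any single entry
    }

# Toy 2x2 "image" with two modified pixels.
x = np.zeros((2, 2))
x_adv = x.copy()
x_adv[0, 0] = 3.0
x_adv[1, 1] = -4.0

norms = perturbation_norms(x, x_adv)
print(norms)
```

Roughly speaking, an attack that limits the number of modified pixels is constrained in L0, while one that caps the change to every pixel is constrained in L_inf.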

Figure 1: Example of attacks on deep learning models with ‘universal adversarial perturbations’ [16]: The attacks are shown for the CaffeNet [9], VGG-F network [17] and GoogLeNet [18]. All the networks recognized the original clean images correctly with high confidence. After small pertur

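Experimental results

Research questions

RQ1 What are the main adversarial attack methods that fool deep learning models in computer vision?
RQ2 How do attack methods differ across white-box, black-box, and universal-perturbation settings?
RQ3 What defenses exist against adversarial attacks, and how effective are they in different situations?
RQ4 How do adversarial attacks perform under real-world conditions beyond laboratory datasets?
RQ5 What are the open problems and future directions in adversarial robustness of vision systems?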

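Key findings

Adversarial perturbations can fool vision models with imperceptible changes.
Transferability of attacks enables black-box attacks using perturbations crafted for other models.
Universal perturbations generalize across images and models and achieve high fooling rates.
Many attack families exist (L-BFGS, FGSM, BIM, JSMA, C&W, DeepFool, UPSET, ANGRI, Houdini, ATNs), with varying norms and targets.
Defenses such as defensive distillation often fail against stronger, newer attacks, showing that robustness remains an open challenge.
Research extends beyond classification (autoencoders, VAEs, GANs, RNNs), and real-world evaluations indicate that practical threats are possible.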
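
To make the first two findings concrete, here is a minimal sketch of a single FGSM step, x_adv = x + eps * sign(grad_x loss), applied to a toy logistic-regression classifier rather than a deep network. The weights, input, and `eps` are made-up values, and the second model `w_other` is a hypothetical stand-in used only to illustrate transfer. None of this reproduces an experiment from the survey.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_logistic(x, y, w, b, eps):
    """One FGSM step against a logistic-regression model (toy assumption)."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w              # gradient of the cross-entropy loss w.r.t. x
    return x + eps * np.sign(grad_x)  # move each feature by eps in the loss-increasing direction

# White-box model that the attacker can differentiate (illustrative values).
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.2, -0.1, 0.4])        # classified as class 1 by this model
y = 1.0

x_adv = fgsm_logistic(x, y, w, b, eps=0.3)

# A different but similar model that the attacker never queried (hypothetical).
w_other = np.array([0.8, -1.5, 0.7])

print("source model:", sigmoid(w @ x + b) > 0.5, "->", sigmoid(w @ x_adv + b) > 0.5)
print("other model: ", sigmoid(w_other @ x) > 0.5, "->", sigmoid(w_other @ x_adv) > 0.5)
```

In this toy setup the perturbation crafted on the first model also flips the second model's prediction. This is the kind of behavior that transfer-based black-box attacks rely on, although a three-feature linear example says nothing about how often transfer succeeds on real networks.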

Figure 2: Illustration of adversarial examples generated using [22] for AlexNet [9]. The perturbations are magnified 10x for better visualization (values shifted by 128 and clamped). The predicted labels of adversarial examples are also shown.
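
The visualization described in the Figure 2 caption can be sketched as follows. The caption does not state the order of operations, so this assumes the perturbation is scaled by 10 first, then shifted by 128, then clamped to the 8-bit range. The function name and the toy pixel values are illustrative.

```python
import numpy as np

def visualize_perturbation(x_clean, x_adv, gain=10, offset=128):
    """Scale, shift and clamp a perturbation so it is visible as an 8-bit image.

    Assumed order: gain * delta + offset, then clamp to [0, 255].
    """
    delta = x_adv.astype(np.float64) - x_clean.astype(np.float64)
    vis = np.clip(gain * delta + offset, 0, 255)
    return vis.astype(np.uint8)

# Toy 2x2 grayscale image and a perturbed copy.
x_clean = np.array([[100, 100], [100, 100]], dtype=np.uint8)
x_adv = np.array([[103, 98], [100, 120]], dtype=np.uint8)

vis = visualize_perturbation(x_clean, x_adv)
print(vis)
```

Unchanged pixels map to mid-gray (128), so positive and negative changes show up as lighter and darker regions respectively.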

This review was generated by AI and checked by a human editor.