QUICK REVIEW

[Paper Review] Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning

Xinyun Chen, Chang Liu | arXiv (Cornell University) | Dec 15, 2017

Adversarial Robustness in Machine Learning | References: 55 | Citations: 1,031

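One-Sentence Summary

This paper shows that backdoor poisoning attacks can implant hidden backdoors in deep learning systems under a black-box threat model, reach high attack success rates with only a small number of poisoned samples, and produce physically realizable backdoors.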

ABSTRACT

Deep learning models have achieved high performance on many tasks, and thus have been applied to many security-critical scenarios. For example, deep learning-based face recognition systems have been used to authenticate users to access many security-sensitive applications like payment apps. Such usages of deep learning systems provide the adversaries with sufficient incentives to perform attacks against these systems for their adversarial purposes. In this work, we consider a new type of attacks, called backdoor attacks, where the attacker's goal is to create a backdoor into a learning-based authentication system, so that he can easily circumvent the system by leveraging the backdoor. Specifically, the adversary aims at creating backdoor instances, so that the victim learning system will be misled to classify the backdoor instances as a target label specified by the adversary. In particular, we study backdoor poisoning attacks, which achieve backdoor attacks using poisoning strategies. Different from all existing work, our studied poisoning strategies can apply under a very weak threat model: (1) the adversary has no knowledge of the model and the training set used by the victim system; (2) the attacker is allowed to inject only a small amount of poisoning samples; (3) the backdoor key is hard to notice even by human beings to achieve stealthiness. We conduct evaluation to demonstrate that a backdoor adversary can inject only around 50 poisoning samples, while achieving an attack success rate of above 90%. We are also the first work to show that a data poisoning attack can create physically implementable backdoors without touching the training process. Our work demonstrates that backdoor poisoning attacks pose real threats to a learning system, and thus highlights the importance of further investigation and proposing defense strategies against them.

Motivation and Objectives

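Motivates the security risk of backdoor attacks on security-critical deep learning systems such as face recognition.
Proposes backdoor poisoning strategies that need only a minimal number of poisoning samples under a weak, realistic threat model.
Introduces two broad classes of backdoor strategies, input-instance-key and pattern-key, and instantiates practical variants.
Demonstrates the feasibility and stealthiness of backdoor poisoning, including its applicability in the physical world and the robustness of the attack.
Stresses the need for defenses against covert data-poisoning backdoors in real-world deployments.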

Proposed Method

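Defines backdoor poisoning as a two-part adversarial procedure: generating poisoning samples, and creating backdoor instances from a key k through a function Σ.
Introduces two strategy classes: input-instance-key, where the backdoor key is a single input instance, and pattern-key, where the backdoor key is a pattern.
For input-instance-key, uses Σ(k) to generate backdoor variants of the single key instance and injects them as poisoning samples carrying the target label.
For pattern-key, develops three instantiations (Blended Injection, Accessory Injection, and Blended Accessory Injection) that embed a pattern into an input to produce backdoor instances.
Defines a threat model in which the attacker has no knowledge of the model architecture or the training data, injects only a small number of poisoning samples, and aims for high backdoor success while preserving accuracy on clean inputs.
Shows that a small number of poisoning samples can induce a high attack success rate on state-of-the-art face recognition systems.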
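The two strategy classes above can be illustrated with a minimal sketch. It assumes 8-bit images stored as NumPy arrays, expresses the blend as alpha * k + (1 - alpha) * x, and uses small uniform pixel noise to produce variants of a single key instance. The function names, the default noise bound of 5, and the use of NumPy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def blended_injection(key_pattern, image, alpha):
    """Blend a key pattern k into an image x as alpha * k + (1 - alpha) * x.

    Hypothetical helper for illustration; both arrays must share one shape.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    k = key_pattern.astype(np.float64)
    x = image.astype(np.float64)
    blended = alpha * k + (1.0 - alpha) * x
    # Round and clip back to the valid 8-bit pixel range.
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)


def noisy_key_variants(key_instance, n, max_delta=5, rng=None):
    """Create n variants of one key instance by adding small per-pixel noise.

    max_delta is an assumed noise bound, chosen only for this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    k = key_instance.astype(np.int64)
    # integers() excludes the upper bound, hence max_delta + 1.
    noise = rng.integers(-max_delta, max_delta + 1, size=(n,) + k.shape)
    return np.clip(k + noise, 0, 255).astype(np.uint8)


def make_poisoning_set(instances, target_label):
    """Pair every generated instance with the attacker's target label."""
    return [(x, target_label) for x in instances]
```

A small alpha keeps the pattern faint, which favors stealth; a larger alpha makes the pattern more visible, so the choice of alpha is where the stealth-versus-effectiveness trade-off shows up in this sketch.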

Experimental Results

Research Questions

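RQ1: Can backdoor poisoning create an effective backdoor under a black-box threat model with no access to the training data?
RQ2: What is the minimum number of poisoning samples needed for input-instance-key and pattern-key backdoors?
RQ3: How do pattern-key strategies balance stealthiness (how noticeable the pattern is) against attack effectiveness?
RQ4: Can a data poisoning strategy produce physically implementable backdoors?
RQ5: How does the attack affect the model's performance on clean inputs while still enabling the backdoor to succeed?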

Key Findings

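With the input-instance-key strategy, an attacker can create backdoor instances by injecting about 5 samples into a large training set (about 600,000 samples).
A pattern-key backdoor needs about 50 poisoning samples to reach an attack success rate above 90%.
Backdoor instances can be crafted to be hard to notice (subtle patterns) while still achieving a high attack success rate.
The proposed pattern-key strategies can yield physically implementable backdoors, for example by using an accessory such as glasses or a specific pattern.
The attacks work in a black-box setting and keep clean test accuracy high, which makes them hard to detect.
The study presents two broad classes of backdoor poisoning strategies and three concrete instantiations of the pattern-key strategy, demonstrating practical feasibility.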
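To put the sample counts above in perspective, the following sketch computes two quantities: the fraction of the training set that is poisoned, and the attack success rate measured as the share of backdoor instances classified as the target label. The helper names are hypothetical, and the metric definition is the usual one for this kind of evaluation rather than a formula quoted from the paper.

```python
def poisoning_ratio(num_poisoned, num_clean):
    """Fraction of the final training set that consists of poisoning samples."""
    if num_poisoned < 0 or num_clean < 0 or num_poisoned + num_clean == 0:
        raise ValueError("counts must be non-negative and not both zero")
    return num_poisoned / (num_clean + num_poisoned)


def attack_success_rate(predictions, target_label):
    """Share of backdoor instances that the model assigns to the target label."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    hits = sum(1 for p in predictions if p == target_label)
    return hits / len(predictions)


# Counts taken from the findings above: about 50 poisoning samples
# injected alongside about 600,000 clean training samples.
ratio_pattern_key = poisoning_ratio(50, 600_000)
ratio_instance_key = poisoning_ratio(5, 600_000)
```

With these counts, the poisoned share is roughly 0.008% of the training set for the pattern-key case and about a tenth of that for the input-instance-key case, which helps explain why clean test accuracy is barely affected.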


This review was generated by AI and checked by a human editor.