QUICK REVIEW

[Paper Review] Adversarial Training and Robustness for Multiple Perturbations

Florian Tramèr, Dan Boneh|arXiv (Cornell University)|Apr 30, 2019

Adversarial Robustness in Machine Learning|References: 39|Citations: 84

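One-Sentence Summary

The paper analyzes robustness trade-offs across multiple perturbation types and proposes multi-perturbation adversarial training together with a new attack. It shows that robustness to several perturbations at once falls short of the robustness achieved by single-perturbation training, and that gradient masking is observed on MNIST.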

ABSTRACT

Defenses against adversarial examples, such as adversarial training, are typically tailored to a single perturbation type (e.g., small $\ell_\infty$-noise). For other perturbations, these defenses offer no guarantees and, at times, even increase the model's vulnerability. Our aim is to understand the reasons underlying this robustness trade-off, and to train models that are simultaneously robust to multiple perturbation types. We prove that a trade-off in robustness to different types of $\ell_p$-bounded and spatial perturbations must exist in a natural and simple statistical setting. We corroborate our formal analysis by demonstrating similar robustness trade-offs on MNIST and CIFAR10. Building upon new multi-perturbation adversarial training schemes, and a novel efficient attack for finding $\ell_1$-bounded adversarial examples, we show that no model trained against multiple attacks achieves robustness competitive with that of models trained on each attack individually. In particular, we uncover a pernicious gradient-masking phenomenon on MNIST, which causes adversarial training with first-order $\ell_\infty, \ell_1$ and $\ell_2$ adversaries to achieve merely $50\%$ accuracy. Our results question the viability and computational scalability of extending adversarial robustness, and adversarial training, to multiple perturbation types.

Motivation and Objectives

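Understand why robustness to one perturbation type often reduces robustness to other perturbation types (mutually exclusive perturbations, MEPs).
Develop training schemes that make a model robust to multiple perturbation types simultaneously.
Design efficient attacks, including an $\ell_1$ attack, for evaluating multi-perturbation defenses.
Demonstrate the trade-offs on MNIST and CIFAR-10 and analyze the effect of gradient masking.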

Proposed Method

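Defines adversarial risk under multiple perturbation sets $S_1, \dots, S_n$, together with two natural metrics (Avg and Max).
Proves theoretical trade-offs (MEPs) between $\ell_\infty$, $\ell_1$, $\ell_2$ and spatial perturbations.
Proposes Max and Avg multi-perturbation adversarial training strategies, which train on adversarial examples drawn from several perturbation types.
Introduces Sparse L1 Descent (SLIDE), an efficient $\ell_1$ attack suited to adversarial training.
Develops and evaluates an analysis of affine perturbations to understand how perturbation types combine.
Evaluates empirically on MNIST and CIFAR-10, using a CNN for MNIST and a Wide-ResNet for CIFAR-10.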
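
The Avg and Max strategies listed above differ only in how the losses from the individual attacks are aggregated for each training example. The following is a minimal sketch, assuming the per-example losses under each attack have already been computed elsewhere; the function name `multi_perturbation_loss` and the array layout are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np

def multi_perturbation_loss(per_attack_losses, strategy="max"):
    """Aggregate per-example losses from several attacks into one scalar.

    per_attack_losses: array-like of shape (n_attacks, batch_size), where
    entry [i, j] is the loss on example j after attack i was applied
    (hypothetical input; the attacks themselves are not implemented here).
    """
    losses = np.asarray(per_attack_losses, dtype=float)
    if strategy == "max":
        # Max: for each example, keep only the attack with the highest loss.
        per_example = losses.max(axis=0)
    elif strategy == "avg":
        # Avg: every attack contributes equally for each example.
        per_example = losses.mean(axis=0)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Average over the batch to obtain the training objective.
    return float(per_example.mean())

# Toy losses for 3 attacks on a batch of 3 examples (made-up numbers).
toy_losses = [
    [0.2, 1.5, 0.1],
    [0.9, 0.3, 0.1],
    [0.4, 0.4, 2.0],
]
loss_max = multi_perturbation_loss(toy_losses, "max")
loss_avg = multi_perturbation_loss(toy_losses, "avg")
```

The Max objective is never smaller than the Avg objective on the same batch, since a maximum is at least the mean. In a real training loop both variants need one adversarial example per attack per input, so the cost grows with the number of perturbation types.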

Experimental Results

Research Questions

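RQ1: Can a model be robust to multiple perturbation types (e.g., $\ell_\infty$, $\ell_1$, $\ell_2$ and spatial perturbations) simultaneously?
RQ2: What are the theoretical limits of multi-perturbation robustness in a natural statistical model?
RQ3: Do multi-perturbation training strategies (Max/Avg) improve robustness across perturbation types, and at what cost?
RQ4: Can affine combinations of perturbations be stronger than either perturbation alone, so that robustness to the union of perturbation sets is insufficient against an affine adversary?
RQ5: Are current adversarial training methods affected by gradient masking when extended to multiple perturbations?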

Key Findings

Model	Acc. (%)	$\ell_\infty$	$\ell_1$	$\ell_2$	$1-\mathcal{R}_{\text{adv}}^{\max}$	$1-\mathcal{R}_{\text{adv}}^{\text{avg}}$
Nat	99.4	0.0	12.4	8.5	0.0	7.0
Adv ∞	99.1	91.1	12.1	11.3	6.8	38.2
Adv 1	98.9	0.0	78.5	50.6	0.0	43.0
Adv 2	98.5	0.4	68.0	71.8	0.4	46.7
Adv_avg	97.3	76.7	53.9	58.3	49.9	63.0
Adv_max	97.2	71.7	62.6	56.0	52.4	63.4
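
The last column of the table can be recomputed from the three per-attack columns, which is a quick check on what the Avg metric measures. The sketch below copies the table values; the dictionary keys and function names are illustrative. The Max column cannot be recomputed this way, because it requires per-example results; the minimum over the per-attack accuracies is only an upper bound on it.

```python
# Per-attack accuracies (%) copied from the table: (l_inf, l_1, l_2).
per_attack_acc = {
    "Nat":     (0.0, 12.4, 8.5),
    "Adv_inf": (91.1, 12.1, 11.3),
    "Adv_1":   (0.0, 78.5, 50.6),
    "Adv_2":   (0.4, 68.0, 71.8),
    "Adv_avg": (76.7, 53.9, 58.3),
    "Adv_max": (71.7, 62.6, 56.0),
}
# Reported values (%) copied from the last two columns of the table.
reported_avg = {"Nat": 7.0, "Adv_inf": 38.2, "Adv_1": 43.0,
                "Adv_2": 46.7, "Adv_avg": 63.0, "Adv_max": 63.4}
reported_max = {"Nat": 0.0, "Adv_inf": 6.8, "Adv_1": 0.0,
                "Adv_2": 0.4, "Adv_avg": 49.9, "Adv_max": 52.4}

def avg_robust_accuracy(accs):
    """Mean accuracy over the individual attacks (the Avg metric)."""
    return sum(accs) / len(accs)

def max_robust_upper_bound(accs):
    """Accuracy against the union of attacks cannot exceed the accuracy
    against the single most damaging attack."""
    return min(accs)

for name, accs in per_attack_acc.items():
    print(f"{name:8s} avg={avg_robust_accuracy(accs):5.1f} "
          f"(reported {reported_avg[name]:5.1f})  "
          f"max<={max_robust_upper_bound(accs):5.1f} "
          f"(reported {reported_max[name]:5.1f})")
```

The gap between the upper bound and the reported Max value (for example 11.3 versus 6.8 for the $\ell_\infty$-trained model) reflects that different attacks succeed on different test examples.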

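Robustness to multiple perturbations comes at an accuracy cost relative to single-perturbation training (typically around 5 to 10 points).
On MNIST, robustness to $\ell_1$, $\ell_2$ and $\ell_\infty$ perturbations can exhibit gradient masking, which reduces the effectiveness of first-order attacks.
Models trained on multiple perturbations (the Avg and Max strategies) improve multi-perturbation robustness but do not reach the optimal multi-perturbation performance (OPT), which indicates a trade-off.
Affine combinations of perturbations can be stronger than either perturbation alone, so robustness to the union of perturbation sets may be insufficient against an affine adversary.
The SLIDE attack provides an efficient $\ell_1$ adversary that is competitive with stronger attacks, making multi-perturbation training practical.
On CIFAR-10, Adv_avg and Adv_max improve multi-perturbation robustness but still fall short of the optimal combined-perturbation robustness.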
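
To make the SLIDE finding more concrete, here is a rough sketch of a sparse $\ell_1$ descent loop: each step moves only along the largest-magnitude gradient coordinates and then projects the perturbation back onto the $\ell_1$ ball. This is an illustration of the general idea under stated assumptions, not the authors' implementation: the function names, the `percentile` and `step_size` parameters, the sort-based projection, and the omission of pixel-range clipping are all choices made for this sketch.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of a 1-D vector onto the l1 ball (sort-based)."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]            # magnitudes, descending
    cssv = np.cumsum(u)
    ks = np.arange(1, len(u) + 1)
    rho = np.nonzero(u * ks > (cssv - radius))[0][-1]
    theta = (cssv[rho] - radius) / (rho + 1.0)
    # Soft-threshold the magnitudes and restore the signs.
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def sparse_l1_step(grad, percentile):
    """Unit-l1-norm direction supported on the largest |grad| coordinates."""
    abs_g = np.abs(grad)
    threshold = np.percentile(abs_g, percentile)
    e = np.where(abs_g >= threshold, np.sign(grad), 0.0)
    norm = np.abs(e).sum()
    return e / norm if norm > 0 else e

def slide_sketch(x, loss_grad, eps, step_size, n_steps, percentile=99.0):
    """Iteratively increase the loss while keeping ||r||_1 <= eps.

    loss_grad(z) must return the gradient of the loss w.r.t. the input z
    (hypothetical callable supplied by the user). Clipping to a valid
    input range is intentionally left out of this sketch.
    """
    r = np.zeros_like(x, dtype=float)
    for _ in range(n_steps):
        g = loss_grad(x + r)
        r = r + step_size * sparse_l1_step(g, percentile)
        r = project_l1_ball(r, eps)
    return x + r

# Toy usage: a linear "loss" w @ x, whose input gradient is the constant w.
w = np.array([0.1, -3.0, 0.5, 2.0, -0.2, 0.0, 1.0, -0.4, 0.3, 0.05])
x0 = np.zeros(10)
x_adv = slide_sketch(x0, lambda z: w, eps=1.0, step_size=0.5,
                     n_steps=10, percentile=80.0)
```

In the toy run only the two coordinates with the largest gradient magnitude are ever modified, which is the sparsity that distinguishes this style of update from a dense sign-gradient step used for $\ell_\infty$ attacks.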


This review was generated by AI and checked by a human editor.