QUICK REVIEW

[Paper Review] Geometry-aware Instance-reweighted Adversarial Training

Jingfeng Zhang, Jianing Zhu|arXiv (Cornell University)|Oct 5, 2020

Adversarial Robustness in Machine Learning | References: 55 | Citations: 31

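One-line summary

GAIRAT assigns instance-dependent weights to adversarial examples based on how close the corresponding natural data point lies to the decision boundary, improving robustness with almost no loss of accuracy; combined with FAT, it can raise both robustness and standard accuracy.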

ABSTRACT

In adversarial machine learning, there was a common belief that robustness and accuracy hurt each other. The belief was challenged by recent studies where we can maintain the robustness and improve the accuracy. However, the other direction, whether we can keep the accuracy while improving the robustness, is conceptually and practically more interesting, since robust accuracy should be lower than standard accuracy for any model. In this paper, we show this direction is also promising. Firstly, we find even over-parameterized deep networks may still have insufficient model capacity, because adversarial training has an overwhelming smoothing effect. Secondly, given limited model capacity, we argue adversarial data should have unequal importance: geometrically speaking, a natural data point closer to/farther from the class boundary is less/more robust, and the corresponding adversarial data point should be assigned with larger/smaller weight. Finally, to implement the idea, we propose geometry-aware instance-reweighted adversarial training, where the weights are based on how difficult it is to attack a natural data point. Experiments show that our proposal boosts the robustness of standard adversarial training; combining two directions, we improve both robustness and accuracy of standard adversarial training.

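Motivation and objectives

Motivate why, under limited model capacity, adversarial data should not be treated equally during training.
Propose a geometry-aware, instance-reweighted objective that emphasizes attackable data near the decision boundary.
Show that GAIRAT mitigates robust overfitting and improves robustness with minimal loss of accuracy.
Demonstrate GAIRAT's compatibility with existing AT variants and its empirical gains on standard benchmarks.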

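Proposed method

Introduce GAIRAT through an instance-weighted loss: $\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} w(x_i, y_i)\, \ell(f_{\theta}(\tilde{x}_i), y_i)$, where $\tilde{x}_i$ is the adversarial variant of $x_i$.
Approximate the geometry of the data by κ(x, y), the minimum number of PGD iterations needed to fool the current model.
Define a weight function w based on κ (for example, a monotonically decreasing function of κ) so that data near the boundary is emphasized.
During training, use GA-PGD to generate adversarial samples and compute κ (Algorithm 1).
Apply GAIRAT within existing AT frameworks (AT, FAT, TRADES) to obtain the GAIR-AT, GAIR-FAT, and GAIR-TRADES variants.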
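
The steps listed above can be sketched on a toy problem. What follows is a minimal illustration rather than the authors' implementation: it assumes a linear binary classifier in place of a deep network, an L-infinity PGD attack, a tanh-shaped decreasing weight (the parameter `lam` and the constant 5 are illustrative choices), and weights normalized to have mean 1. All function names (`ga_pgd`, `gair_weight`, `gairat_objective`) are hypothetical.

```python
import math

def score(w, b, x):
    """Linear score w.x + b; a stand-in for the model f_theta."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Predicted label in {-1, +1}."""
    return 1 if score(w, b, x) >= 0 else -1

def ga_pgd(w, b, x, y, eps, alpha, num_steps):
    """PGD inside an L-infinity ball of radius eps around x.
    kappa counts the steps at which the current adversarial point
    is still classified correctly (small kappa = easy to attack)."""
    x_adv = list(x)
    kappa = 0
    for _ in range(num_steps):
        if predict(w, b, x_adv) == y:
            kappa += 1
        # For a linear score and a margin-based loss, the sign of the
        # loss gradient with respect to x is -y * sign(w).
        step = [(-y * alpha * math.copysign(1.0, wi)) if wi != 0 else 0.0
                for wi in w]
        x_adv = [xa + s for xa, s in zip(x_adv, step)]
        # Project back into the eps-ball around the natural point.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv, kappa

def gair_weight(kappa, num_steps, lam=0.0):
    """Assumed weight: monotonically decreasing in kappa, in (0, 1)."""
    return (1.0 + math.tanh(lam + 5.0 * (1.0 - 2.0 * kappa / num_steps))) / 2.0

def logistic_loss(w, b, x, y):
    """Logistic loss of the linear model on one labeled point."""
    return math.log1p(math.exp(-y * score(w, b, x)))

def gairat_objective(w, b, data, eps, alpha, num_steps, lam=0.0):
    """Instance-weighted adversarial loss (1/n) * sum_i w_i * loss_i."""
    losses, raw, kappas = [], [], []
    for x, y in data:
        x_adv, kappa = ga_pgd(w, b, x, y, eps, alpha, num_steps)
        losses.append(logistic_loss(w, b, x_adv, y))
        raw.append(gair_weight(kappa, num_steps, lam))
        kappas.append(kappa)
    # Assumption: rescale the weights so that their mean equals 1.
    total = sum(raw)
    weights = [len(raw) * r / total for r in raw]
    objective = sum(wt * l for wt, l in zip(weights, losses)) / len(data)
    return objective, weights, kappas

# One point close to the boundary x[0] = 0 and one far from it.
data = [([0.05, 0.0], 1), ([1.0, 0.0], 1)]
objective, weights, kappas = gairat_objective(
    [1.0, 0.0], 0.0, data, eps=0.3, alpha=0.1, num_steps=5)
print(kappas, weights, objective)
```

In this toy setting the point near the boundary is fooled after a single step and receives nearly all of the weight, while the far point survives every step and contributes almost nothing to the objective.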

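Experimental results

Research questions

RQ1: Does instance-level reweighting based on geometric distance to the decision boundary improve robustness without sacrificing standard accuracy?
RQ2: Does GAIRAT mitigate the robust overfitting observed in minimax adversarial training?
RQ3: How does GAIRAT interact with, and improve on, existing adversarial training methods such as AT, FAT, and TRADES?
RQ4: How do GAIRAT's empirical gains on standard benchmarks (for example, CIFAR-10 with a Wide ResNet) compare with the baselines?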

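Key findings

Defense	Natural (best)	Diff. vs AT (best)	PGD-20 (best)	Diff. vs AT (best)	PGD+ (best)	Diff. vs AT (best)	Natural (last)	Diff. vs AT (last)	PGD-20 (last)	Diff. vs AT (last)	PGD+ (last)	Diff. vs AT (last)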
AT	86.92 ±0.24	-	51.96 ±0.21	-	51.28 ±0.23	-	86.62 ±0.22	-	46.73 ±0.08	-	46.08 ±0.07	-
FAT	89.16 ±0.15	+2.24	51.24 ±0.14	-0.72	46.14 ±0.19	-5.14	88.18 ±0.19	+1.56	46.79 ±0.34	+0.06	45.80 ±0.16	-0.28
GAIRAT	85.75 ±0.23	-1.17	57.81 ±0.54	+5.85	55.61 ±0.61	+4.33	85.49 ±0.25	-1.13	53.76 ±0.49	+7.03	50.32 ±0.48	+4.24
GAIR-FAT	88.59 ±0.12	+1.67	56.21 ±0.52	+4.25	53.50 ±0.60	+2.22	88.44 ±0.10	+1.82	50.64 ±0.56	+3.91	47.51 ±0.51	+1.43
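
The difference columns in the table are consistent with each method's mean minus the AT row's mean in the same column. A small check of that arithmetic, using only the means from the table (the dictionary keys and the helper name `diff_vs_baseline` are hypothetical), might look like this:

```python
# Mean accuracies copied from the table above (the ± spreads are omitted).
results = {
    "AT":       {"natural_best": 86.92, "pgd20_best": 51.96, "pgdplus_best": 51.28,
                 "natural_last": 86.62, "pgd20_last": 46.73, "pgdplus_last": 46.08},
    "FAT":      {"natural_best": 89.16, "pgd20_best": 51.24, "pgdplus_best": 46.14,
                 "natural_last": 88.18, "pgd20_last": 46.79, "pgdplus_last": 45.80},
    "GAIRAT":   {"natural_best": 85.75, "pgd20_best": 57.81, "pgdplus_best": 55.61,
                 "natural_last": 85.49, "pgd20_last": 53.76, "pgdplus_last": 50.32},
    "GAIR-FAT": {"natural_best": 88.59, "pgd20_best": 56.21, "pgdplus_best": 53.50,
                 "natural_last": 88.44, "pgd20_last": 50.64, "pgdplus_last": 47.51},
}

def diff_vs_baseline(results, baseline="AT"):
    """Per-column difference of each method from the baseline row."""
    base = results[baseline]
    return {
        name: {col: round(val - base[col], 2) for col, val in row.items()}
        for name, row in results.items()
        if name != baseline
    }

diffs = diff_vs_baseline(results)
for name, row in diffs.items():
    print(name, row)
```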

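GAIRAT mitigates robust overfitting and improves adversarial robustness while keeping the impact on natural accuracy minimal.
GAIR-FAT (FAT enhanced with GAIR) improves both natural accuracy and robustness relative to AT; relative to FAT it gains 4.97 points of PGD-20 robustness at the best checkpoint, with natural accuracy 0.57 points lower at the best checkpoint and 0.26 points higher at the last.
On CIFAR-10 with Wide ResNet-32-10, GAIRAT and GAIR-FAT show marked robustness gains over AT and FAT under both PGD-20 and PGD+ evaluation.
GAIRAT gives up roughly 1.1 to 1.2 points of natural accuracy relative to AT while gaining 5.85 to 7.03 points of PGD-20 robustness, challenging the usual robustness-accuracy trade-off.
GAIRAT is compatible with FAT and TRADES, which enables combined gains (GAIR-FAT, GAIR-TRADES).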

This review was generated by AI and checked by a human editor.