QUICK REVIEW

[Paper Review] Training verified learners with learned verifiers

Krishnamurthy Dvijotham, Sven Gowal|arXiv (Cornell University)|May 25, 2018

Adversarial Robustness in Machine Learning | References: 22 | Citations: 98

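One-sentence summary

This paper introduces predictor-verifier training (PVT), which jointly trains a predictor network and a verifier network that bounds worst-case specification violations, achieving state-of-the-art verified robustness on MNIST and SVHN and non-trivial bounds on CIFAR-10 while shortening training time.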

ABSTRACT

This paper proposes a new algorithmic framework, predictor-verifier training, to train neural networks that are verifiable, i.e., networks that provably satisfy some desired input-output properties. The key idea is to simultaneously train two networks: a predictor network that performs the task at hand, e.g., predicting labels given inputs, and a verifier network that computes a bound on how well the predictor satisfies the properties being verified. Both networks can be trained simultaneously to optimize a weighted combination of the standard data-fitting loss and a term that bounds the maximum violation of the property. Experiments show that not only is the predictor-verifier architecture able to train networks to achieve state of the art verified robustness to adversarial examples with much shorter training times (outperforming previous algorithms on small datasets like MNIST and SVHN), but it can also be scaled to produce the first known (to the best of our knowledge) verifiably robust networks for CIFAR-10.

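Motivation and objectives

Motivates the need for verifiable robustness in neural networks, going beyond empirical defenses.
Proposes a scalable framework that jointly trains a predictor and a verifier that certifies a specification.
Bounds worst-case violations with duality-based verification, without requiring per-example optimization during training.
Learns the dual variables, amortizing the verification cost across all training examples.
Demonstrates scalability to larger datasets and state-of-the-art verified robustness results.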

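Proposed method

Define a predictor network that performs the task (e.g., classification).
Define a verifier network that outputs dual variables bounding the worst-case violation of the specification.
Train both networks jointly with a loss that combines a data-fitting term and a dual bound term (Equation 8).
Use a dual relaxation of the verification problem to obtain an upper bound that is differentiable with respect to both the predictor and verifier parameters.
Experiment with several verifier architectures (Constant, Direct, Backward-Forward) to study their effect on verification tightness and accuracy.
Demonstrate that replacing per-example optimization with a learned verifier amortizes the verification cost.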
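
The dual-relaxation idea in the steps above can be illustrated with a minimal NumPy sketch. It assumes a toy one-hidden-layer ReLU predictor, an L_infinity ball of radius eps around the input, and a linear specification vector c; none of these details are taken from the paper. The function names (box_max, dual_bound, combined_loss) and the weight kappa are hypothetical, the dual variables mu and lam are random stand-ins for what a learned verifier would output, and the convex combination in combined_loss is only an assumed form of the weighted loss, not necessarily the paper's Equation 8. The point it shows is weak duality: any choice of dual variables yields a valid upper bound on the worst-case violation.

```python
import numpy as np


def box_max(a, lo, hi):
    """Maximum of the linear function a @ x over the box lo <= x <= hi."""
    return np.sum(np.maximum(a, 0.0) * hi + np.minimum(a, 0.0) * lo)


def dual_bound(W0, b0, W1, b1, c, x_nom, eps, mu, lam):
    """Upper bound on max over |x - x_nom|_inf <= eps of
    c @ (W1 @ relu(W0 @ x + b0) + b1), valid for ANY mu, lam (weak duality)."""
    l0, u0 = x_nom - eps, x_nom + eps
    # Interval bounds on the pre-activations z = W0 @ x + b0.
    center = W0 @ x_nom + b0
    radius = np.abs(W0) @ np.full_like(x_nom, eps)
    l1, u1 = center - radius, center + radius
    # Lagrangian: c@(W1 h + b1) + mu@(W0 x + b0 - z) + lam@(relu(z) - h),
    # maximized independently over the boxes for x, h and z.
    term_x = box_max(W0.T @ mu, l0, u0)
    term_h = box_max(W1.T @ c - lam, np.maximum(l1, 0.0), np.maximum(u1, 0.0))
    # lam*relu(z) - mu*z is piecewise linear in z, so its maximum on [l1, u1]
    # is attained at an endpoint or at the kink z = 0 (clipped into the box).
    cands = np.stack([l1, u1, np.clip(np.zeros_like(l1), l1, u1)])
    term_z = np.sum(np.max(lam * np.maximum(cands, 0.0) - mu * cands, axis=0))
    return c @ b1 + mu @ b0 + term_x + term_h + term_z


def combined_loss(logits, label, bound, kappa=0.5):
    """ASSUMED form: kappa * cross-entropy + (1 - kappa) * dual bound."""
    shifted = logits - logits.max()
    cross_entropy = np.log(np.exp(shifted).sum()) - shifted[label]
    return kappa * cross_entropy + (1.0 - kappa) * bound


# Toy usage with random weights (illustrative values only).
rng = np.random.default_rng(0)
W0, b0 = rng.normal(size=(5, 3)), rng.normal(size=5)
W1, b1 = rng.normal(size=(2, 5)), rng.normal(size=2)
x_nom, eps = rng.normal(size=3), 0.1
c = np.array([-1.0, 1.0])  # spec: logit 1 minus logit 0 should stay negative
mu, lam = rng.normal(size=5), rng.normal(size=5)  # stand-in verifier output
bound = dual_bound(W0, b0, W1, b1, c, x_nom, eps, mu, lam)
logits = W1 @ np.maximum(W0 @ x_nom + b0, 0.0) + b1
loss = combined_loss(logits, 0, bound)
```

Because the bound holds for arbitrary dual variables, a verifier network only affects how tight the bound is, never whether it is valid. Under this reading, training the verifier amounts to searching for dual variables that make the bound small without solving an optimization problem for each example.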

Experimental results

Research questions

RQ1: Can a neural verifier learn dual variables to tighten verification bounds during training?
RQ2: Does predictor-verifier training enable scalable, verifiably robust models across datasets beyond MNIST/SVHN?
RQ3: How do different verifier architectures affect verified and nominal accuracy and training efficiency?
RQ4: Can PVT produce nontrivial verifiable robustness bounds on CIFAR-10 and compare favorably to adversarial training?

Main findings

Problem	Method	ε	Test error	PGD attack error	Verified error bound
MNIST	Baseline	0.1	0.77%	52.94%	100.00%
MNIST	Kolter and Wong [16]	0.1	1.80%	4.11%	5.82%
MNIST	Madry et al. [22]	0.1	0.60%	4.66%	100.00%
MNIST	Predictor-Verifier	0.1	1.20%	2.87%	4.44%
SVHN	Baseline	0.01	6.57%	87.45%	100.00%
SVHN	Kolter and Wong [16]	0.01	20.38%	33.74%	40.67%
SVHN	Madry et al. [22]	0.01	7.04%	23.63%	100.00%
SVHN	Predictor-Verifier	0.01	16.59%	33.14%	37.56%
CIFAR-10	Baseline	0.03	26.27%	99.99%	100.00%
CIFAR-10	Madry et al. [22]	0.03	39.00%	68.08%	100.00%
CIFAR-10	Predictor-Verifier	0.03	51.36%	67.28%	73.33%
CIFAR-10	Madry et al. [22] *	0.03	12.7%	54.2%	100.00%

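PVT achieves state-of-the-art verified accuracy on MNIST and SVHN under L_infinity perturbations.
PVT scales to CIFAR-10 and provides the first non-trivial verified adversarial bounds reported for this dataset.
PVT trains much faster than previous verified-training methods, e.g., about 6 minutes to reach its MNIST performance versus roughly 5 hours for the competing method.
Depending on the dataset, the verifier architectures yield verified bounds that are competitive or better; the Constant architecture performs worst.
PVT outperforms standard adversarial training in verified robustness but can sacrifice nominal accuracy, leaving room for further improvement in clean accuracy.
Verification-time analysis shows that PVT models reach near-optimal bounds within a modest per-example verification budget (e.g., 15 ms).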
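
To make the table easier to read, the short sketch below derives two quantities from the Predictor-Verifier rows: the verified accuracy (100% minus the verified bound) and the gap between the PGD attack error and the bound. It assumes the "PGD attack" and bound columns are error rates, so the attack error is a lower bound and the verified bound an upper bound on the true worst-case error. The dictionary and function names are illustrative, and the speedup uses the approximate 6-minute and 5-hour figures quoted above.

```python
# (PGD attack error %, verified error bound %) for the Predictor-Verifier rows.
pvt_rows = {
    "MNIST": (2.87, 4.44),
    "SVHN": (33.14, 37.56),
    "CIFAR-10": (67.28, 73.33),
}


def verified_accuracy(bound_pct):
    """Fraction of test points (in %) certified robust, given the error bound."""
    return round(100.0 - bound_pct, 2)


def attack_to_bound_gap(pgd_pct, bound_pct):
    """Width (in percentage points) of the interval containing the true worst-case error."""
    return round(bound_pct - pgd_pct, 2)


summary = {
    name: {
        "verified_accuracy": verified_accuracy(bound),
        "gap": attack_to_bound_gap(pgd, bound),
    }
    for name, (pgd, bound) in pvt_rows.items()
}

# Approximate training-time ratio on MNIST: about 5 hours versus about 6 minutes.
approx_speedup = (5 * 60) / 6

for name, stats in summary.items():
    print(name, stats)
print("approximate MNIST training speedup:", approx_speedup)
```

A smaller gap means the verified bound is closer to what the attack actually finds, so the certificate is less pessimistic.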

This review was generated by AI and checked by a human editor.