QUICK REVIEW

[Paper Review] Analyzing and Improving Representations with the Soft Nearest Neighbor Loss

Nicholas Frosst, Nicolas Papernot|arXiv (Cornell University)|Feb 5, 2019

Domain Adaptation and Few-Shot Learning | Cited by 32

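One-Line Summary

This paper extends the Soft Nearest Neighbor Loss to quantify the entanglement of class manifolds in representation space, uses it as a regularization term to improve generalization, and shows that entangled hidden representations improve uncertainty calibration and out-of-distribution detection.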

ABSTRACT

We explore and expand the $\textit{Soft Nearest Neighbor Loss}$ to measure the $\textit{entanglement}$ of class manifolds in representation space: i.e., how close pairs of points from the same class are relative to pairs of points from different classes. We demonstrate several use cases of the loss. As an analytical tool, it provides insights into the evolution of class similarity structures during learning. Surprisingly, we find that $\textit{maximizing}$ the entanglement of representations of different classes in the hidden layers is beneficial for discrimination in the final layer, possibly because it encourages representations to identify class-independent similarity structures. Maximizing the soft nearest neighbor loss in the hidden layers leads not only to improved generalization but also to better-calibrated estimates of uncertainty on outlier data. Data that is not from the training distribution can be recognized by observing that in the hidden layers, it has fewer than the normal number of neighbors from the predicted class.

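Motivation and Objectives

Characterize the class similarity structure of representations in discriminative and generative models using the soft nearest neighbor loss.
Show that maximizing entanglement in the hidden layers acts as a regularizer and improves generalization.
Show that entangled representations improve uncertainty estimates on out-of-distribution and adversarial data.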

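Proposed Method

Define and extend the soft nearest neighbor loss, which includes a temperature parameter, to measure entanglement in representation space.
Add a loss bonus to the training objective to encourage entanglement, and evaluate its regularization effect.
Analyze the dynamics of entanglement across network layers during training to understand the relationship between feature extraction and discrimination.
Apply the loss to GANs to measure entanglement between real and synthetic data, and explore its use as a training objective in pixel space (MNIST example).
Evaluate uncertainty calibration with Deep k-Nearest Neighbors (DkNN) on entangled and baseline models.
Investigate how entanglement affects robustness to out-of-distribution and adversarial data, and examine the transferability of adversarial examples.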
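
As a rough illustration of the first two steps above, the following NumPy sketch computes a soft nearest neighbor loss over a batch and adds it to a task loss. For each point, it takes the log of the ratio between the exponentiated negative squared distances to same-class points and those to all other points, then averages the negative of that value over the batch. The function names, the stabilizing constant `eps`, and the negative weight `alpha` (chosen so that minimizing the total objective increases entanglement) are assumptions made for this example, not details taken from the review.

```python
import numpy as np


def soft_nearest_neighbor_loss(x, y, temperature=1.0, eps=1e-12):
    """Sketch of a soft nearest neighbor loss for a batch of representations.

    x: array of shape (batch, features); y: array of shape (batch,) with labels.
    Low values mean same-class points are close relative to other-class points
    (low entanglement); high values mean classes are intermixed.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances, shape (batch, batch).
    diff = x[:, None, :] - x[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Temperature-scaled similarity; a point is never its own neighbor.
    weights = np.exp(-sq_dist / temperature)
    np.fill_diagonal(weights, 0.0)
    same_class = y[:, None] == y[None, :]
    numerator = np.sum(weights * same_class, axis=1)
    denominator = np.sum(weights, axis=1)
    # eps is an assumed stabilizer to avoid log(0) and division by zero.
    ratio = numerator / (eps + denominator)
    return float(-np.mean(np.log(eps + ratio)))


def regularized_objective(cross_entropy, hidden_activations, labels,
                          alpha=-0.1, temperature=1.0):
    """Hypothetical combined objective: task loss plus weighted entanglement.

    With alpha < 0 (an assumption here), minimizing the objective rewards
    higher soft nearest neighbor loss in the given hidden layers.
    """
    entanglement = sum(
        soft_nearest_neighbor_loss(h, labels, temperature)
        for h in hidden_activations
    )
    return cross_entropy + alpha * entanglement


# Usage: the same four points, labeled so classes are separated or intermixed.
points = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]])
separated = soft_nearest_neighbor_loss(points, np.array([0, 0, 1, 1]))
intermixed = soft_nearest_neighbor_loss(points, np.array([0, 1, 0, 1]))
```

In this toy batch, the separated labeling gives a loss near zero and the intermixed labeling gives a much larger one. This version materializes all pairwise differences, so it is only practical for small batches.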

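Experimental Results

Research Questions

RQ1 How can the soft nearest neighbor loss quantify the entanglement of class manifolds in representation space?
RQ2 Does maximizing entanglement in the hidden layers improve generalization and calibration without harming discrimination in the final layer?
RQ3 Do entangled representations improve the detection of out-of-distribution or adversarial inputs, and do they affect the transferability of adversarial examples?
RQ4 Can the soft nearest neighbor loss serve as a standalone objective for generative models, or as a regularization term for discriminative models?
RQ5 What are the layer-wise dynamics of entanglement during training in a network such as a ResNet on CIFAR-10?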

Key Findings

Model Type	Dataset	Metric	Entangled	Baseline
CNN	MNIST	Best Test Accuracy	99.23%	98.83%
CNN	MNIST	Average Test Accuracy	99.16%	98.82%
CNN	Fashion-MNIST	Best Test Accuracy	91.48%	90.42%
CNN	Fashion-MNIST	Average Test Accuracy	91.06%	90.25%
CNN	SVHN	Best Test Accuracy	88.81%	87.63%
CNN	SVHN	Average Test Accuracy	89.90%	89.71%
ResNet	CIFAR10	Best Test Accuracy	91.220%	90.780%
ResNet	CIFAR10	Average Test Accuracy	89.900%	89.713%

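Maximizing entanglement in the hidden layers, in combination with the cross-entropy loss, improves generalization on MNIST, Fashion-MNIST, SVHN, and CIFAR-10.
Entangled representations improve the calibration of uncertainty estimates obtained with Deep k-Nearest Neighbors (DkNN).
Entanglement helps identify out-of-distribution and adversarial inputs: outliers tend to have fewer neighbors from the predicted class in the hidden layers.
Regularizing with the soft nearest neighbor loss delays overfitting and narrows the generalization gap between training and test data.
In entangled models, adversarial perturbations are less transferable, and in-distribution and out-of-distribution data are more sharply separated in activation space.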
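
The neighbor-count observation in the findings above can be sketched as follows: given hidden-layer features for the training set, look up the nearest training points to a query and measure what fraction share the predicted class. This is a simplified illustration and not the full DkNN procedure. The function name, the choice of squared Euclidean distance, and the value of `k` are assumptions made for this example.

```python
import numpy as np


def predicted_class_neighbor_fraction(train_feats, train_labels,
                                      query_feat, predicted_class, k=5):
    """Fraction of the k nearest training features that share the predicted class.

    A low fraction suggests the query may be an outlier, since outliers tend
    to have fewer hidden-layer neighbors from the predicted class.
    """
    train_feats = np.asarray(train_feats, dtype=np.float64)
    train_labels = np.asarray(train_labels)
    query_feat = np.asarray(query_feat, dtype=np.float64)
    # Squared Euclidean distance from the query to every training feature.
    sq_dist = np.sum((train_feats - query_feat) ** 2, axis=1)
    nearest = np.argsort(sq_dist)[:k]
    return float(np.mean(train_labels[nearest] == predicted_class))


# Usage with toy two-dimensional "hidden features" (illustrative values only).
feats = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 4], [4, 5]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])
supported = predicted_class_neighbor_fraction(feats, labels, [0.2, 0.2], 0, k=3)
unsupported = predicted_class_neighbor_fraction(feats, labels, [4.5, 4.5], 0, k=3)
```

A query predicted as class 0 that lies inside the class 0 cluster gets full neighbor support, while one predicted as class 0 that lies inside the class 1 cluster gets none. In practice, the threshold for flagging a low fraction would need to be calibrated on held-out data.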


This review was generated by AI and checked by a human editor.