QUICK REVIEW

[Paper Review] MEMO: Test Time Robustness via Adaptation and Augmentation

Marvin Zhang, Sergey Levine|arXiv (Cornell University)|Oct 18, 2021

Advanced Neural Network Applications|References: 38|Citations: 85

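One-sentence summary

MEMO adapts a pretrained probabilistic model at test time using augmented copies of a single test input: it minimizes the entropy of the marginal output distribution across the augmentations, which enforces invariance and confident predictions, and it improves robustness on ImageNet-C/R/A and the CIFAR-10 family of benchmarks.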

ABSTRACT

While deep neural networks can attain good accuracy on in-distribution test points, many applications require robustness even in the face of unexpected perturbations in the input, changes in the domain, or other sources of distribution shift. We study the problem of test time robustification, i.e., using the test input to improve model robustness. Recent prior works have proposed methods for test time adaptation, however, they each introduce additional assumptions, such as access to multiple test points, that prevent widespread adoption. In this work, we aim to study and devise methods that make no assumptions about the model training process and are broadly applicable at test time. We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable: when presented with a test example, perform different data augmentations on the data point, and then adapt (all of) the model parameters by minimizing the entropy of the model's average, or marginal, output distribution across the augmentations. Intuitively, this objective encourages the model to make the same prediction across different augmentations, thus enforcing the invariances encoded in these augmentations, while also maintaining confidence in its predictions. In our experiments, we evaluate two baseline ResNet models, two robust ResNet-50 models, and a robust vision transformer model, and we demonstrate that this approach achieves accuracy gains of 1-8% over standard model evaluation and also generally outperforms prior augmentation and adaptation strategies. For the setting in which only one test point is available, we achieve state-of-the-art results on the ImageNet-C, ImageNet-R, and, among ResNet-50 models, ImageNet-A distribution shift benchmarks.

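Motivation and objectives

Motivate and study test-time robustification methods that do not depend on changes to training or on access to large test batches.
Propose MEMO, a plug-and-play approach that adapts all model parameters at test time using a single test point.
Encourage predictions that are invariant across augmentations while remaining confident, by minimizing the marginal entropy.
Show that MEMO is compatible with existing robustness techniques and BN adaptation, and that combining them yields additional gains.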

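Proposed method

Given a trained probabilistic model f_theta, take a single test input x and a set of augmentations A.
Sample B augmentations of x to form a batch of augmented inputs, then average p_theta(y|a(x)) over the augmentations to estimate the marginal output distribution p̄_theta(·|x).
Define the MEMO loss as the entropy of the marginal distribution, H(p̄_theta(·|x)), and update theta by gradient descent to minimize it (one gradient step per test point).
After adaptation, predict on the original input x using the updated parameters. No ground-truth labels are needed at test time.
MEMO can also be combined with BN statistics adaptation and other robustness methods without changing the pretraining procedure.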
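
The steps above can be sketched in a few lines of standard-library Python. This is a toy illustration under stated assumptions, not the authors' implementation: the model is a hypothetical linear softmax classifier, the augmentation is Gaussian noise standing in for real image augmentations, and the helper names (predict, marginal_entropy, marginal_entropy_grad, memo_predict, noisy_copy) as well as the learning rate and sample count are invented for this example.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    """Class probabilities of a toy linear softmax model (K rows of D weights)."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])

def marginal_entropy(weights, batch):
    """Entropy of the prediction averaged over the augmented copies in batch."""
    probs = [predict(weights, a) for a in batch]
    n = len(batch)
    marginal = [sum(p[k] for p in probs) / n for k in range(len(weights))]
    entropy = -sum(q * math.log(q) for q in marginal if q > 0)
    return entropy, marginal, probs

def marginal_entropy_grad(weights, batch):
    """Analytic gradient of the marginal entropy with respect to the weights."""
    _, marginal, probs = marginal_entropy(weights, batch)
    n = len(batch)
    # dH/dq_k = -(log q_k + 1) for the marginal distribution q
    g = [-(math.log(max(q, 1e-12)) + 1.0) for q in marginal]
    grad = [[0.0] * len(weights[0]) for _ in weights]
    for p, a in zip(probs, batch):
        inner = sum(gk * pk for gk, pk in zip(g, p))
        for j, pj in enumerate(p):
            dz = pj * (g[j] - inner) / n  # gradient w.r.t. logit j of this copy
            for d, ad in enumerate(a):
                grad[j][d] += dz * ad
    return grad

def memo_predict(weights, x, augment, n_aug=8, lr=0.1):
    """One MEMO-style step: augment x, take one gradient step, predict on x."""
    batch = [augment(x) for _ in range(n_aug)]
    grad = marginal_entropy_grad(weights, batch)
    adapted = [[w - lr * gw for w, gw in zip(w_row, g_row)]
               for w_row, g_row in zip(weights, grad)]
    return predict(adapted, x), adapted, batch

# Hypothetical data: 3 classes, 2 input features, Gaussian-noise "augmentation".
rng = random.Random(0)

def noisy_copy(x, scale=0.1):
    return [v + rng.gauss(0.0, scale) for v in x]

weights = [[0.5, -0.2], [-0.3, 0.4], [0.1, 0.1]]
x = [1.0, 0.8]
probs, adapted, batch = memo_predict(weights, x, noisy_copy)
before = marginal_entropy(weights, batch)[0]
after = marginal_entropy(adapted, batch)[0]
print(f"marginal entropy before: {before:.4f}, after: {after:.4f}")
```

The sketch returns the adapted weights for inspection only. Each test point is handled by starting again from the pretrained weights, so the adapted copy would be discarded after the prediction. A real setup would replace the linear model with a deep network and compute the gradient with automatic differentiation.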

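Experimental results

Research questions

RQ1: Can test-time adaptation be effective without assumptions about the training process or access to test batches?
RQ2: Does minimizing the marginal entropy across augmented copies of a single test point improve robustness to distribution shift?
RQ3: How does MEMO interact with existing robustness techniques (e.g., BN adaptation, AugMix, MoEx) across model architectures and datasets?
RQ4: Is adaptation or augmentation the dominant source of the observed gains, and what roles do the choice of augmentations and the number of samples play?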

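Key findings

MEMO improves accuracy by 1–8% over standard evaluation on challenging distribution shift benchmarks.
In the ImageNet setting where only one test point is available, MEMO achieves state-of-the-art results on ImageNet-C and ImageNet-R and, among ResNet-50 models, on ImageNet-A.
MEMO improves the robustness of both ResNet and vision transformer models, and it outperforms prior augmentation and adaptation strategies on several benchmarks.
Ablation studies show that both invariance across augmentations and maintained confidence are important components of MEMO.
MEMO can be combined with pretrained robust models and BN adaptation for further gains, and it also yields strong gains for models trained with heavy data augmentation.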
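
To make the ablation finding about invariance and confidence concrete, the small example below compares the marginal entropy with the average per-augmentation (conditional) entropy on three hand-made sets of predictions. The probability values and function names are hypothetical, chosen only for illustration, and are not taken from the paper's experiments.

```python
import math

def entropy(dist):
    return -sum(p * math.log(p) for p in dist if p > 0)

def marginal(preds):
    """Average the per-augmentation predictions class by class."""
    n = len(preds)
    return [sum(p[k] for p in preds) / n for k in range(len(preds[0]))]

def marginal_entropy_of(preds):
    return entropy(marginal(preds))

def mean_conditional_entropy(preds):
    return sum(entropy(p) for p in preds) / len(preds)

# Each case holds the predictions for two augmented copies of one input.
cases = {
    "confident and consistent": [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]],
    "confident but conflicting": [[0.98, 0.01, 0.01], [0.01, 0.98, 0.01]],
    "consistent but unsure": [[0.40, 0.30, 0.30], [0.40, 0.30, 0.30]],
}
for name, preds in cases.items():
    print(f"{name}: marginal={marginal_entropy_of(preds):.3f}, "
          f"mean conditional={mean_conditional_entropy(preds):.3f}")
```

Only the first case gives a low marginal entropy. In the conflicting case each copy is confident on its own, so the average conditional entropy stays low, yet the marginal entropy is high because the copies disagree. This is one way to see why an objective on the marginal distribution rewards agreement and confidence together.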

This review was generated by AI and checked by a human editor.