QUICK REVIEW

[Paper Review] Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift

Zachary Nado, Shreyas Padhy|arXiv (Cornell University)|Jun 19, 2020

Domain Adaptation and Few-Shot Learning | References: 52 | Cited by: 95

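One-Sentence Summary

This paper proposes prediction-time batch normalization, which recalibrates activations using a small unlabeled batch at prediction time, improving accuracy and calibration under covariate shift and achieving strong results on CIFAR-10-C and ImageNet-C.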

ABSTRACT

Covariate shift has been shown to sharply degrade both predictive accuracy and the calibration of uncertainty estimates for deep learning models. This is worrying, because covariate shift is prevalent in a wide range of real world deployment settings. However, in this paper, we note that frequently there exists the potential to access small unlabeled batches of the shifted data just before prediction time. This interesting observation enables a simple but surprisingly effective method which we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift. Using this one line code change, we achieve state-of-the-art on recent covariate shift benchmarks and an mCE of 60.28% on the challenging ImageNet-C dataset; to our knowledge, this is the best result for any model that does not incorporate additional data augmentation or modification of the training pipeline. We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness (e.g. deep ensembles) and combining the two further improves performance. Our findings are supported by detailed measurements of the effect of this strategy on model behavior across rigorous ablations on various dataset modalities. However, the method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift, and is therefore worthy of additional study. We include links to the data in our figures to improve reproducibility, including a Python notebook that can be run to easily modify our analysis at https://colab.research.google.com/drive/11N0wDZnMQQuLrRwRoumDCrhSaIhkqjof.

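Research Motivation and Objectives

Motivate and formalize the prediction-time batch setting, in which predictions under covariate shift are made on small batches at test time.
Propose a simple, efficient method, prediction-time BN, that recalibrates activations using the statistics of the current prediction batch.
Evaluate the method on covariate shift benchmarks across image and non-image modalities, and analyze when it helps and when it fails.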

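Proposed Method

Formalize the prediction-time batch setting in terms of a batch-level loss and risk minimization.
Apply batch normalization statistics recomputed on each prediction-time batch (prediction-time BN) instead of the frozen EMA statistics from training.
Compare prediction-time BN against vanilla BN, ensembles, temperature scaling, and other normalization variants across multiple datasets.
Provide ablation studies on the role of epsilon, which BN layers should be reset, and the interaction with pre-training and natural shift.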
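
To make the difference between the two normalization modes concrete, here is a minimal sketch in plain Python. It is not the authors' code: the function names (`batch_norm_ema`, `batch_norm_prediction_time`, `column_means`), the list-of-lists layout, and the default `eps` are assumptions chosen for illustration. It only shows the core idea: normalize with the statistics of the current prediction batch instead of the frozen training EMA statistics.

```python
import math


def batch_norm_ema(batch, ema_mean, ema_var, gamma, beta, eps=1e-5):
    """Vanilla inference-time BN: normalize each feature with the given
    (frozen) per-feature mean and variance, then scale and shift."""
    return [
        [g * (x - m) / math.sqrt(v + eps) + b
         for x, m, v, g, b in zip(row, ema_mean, ema_var, gamma, beta)]
        for row in batch
    ]


def batch_norm_prediction_time(batch, gamma, beta, eps=1e-5):
    """Prediction-time BN (sketch): recompute per-feature mean and variance
    on the current unlabeled batch and normalize with those instead."""
    n = len(batch)
    num_features = len(batch[0])
    mean = [sum(row[j] for row in batch) / n for j in range(num_features)]
    var = [sum((row[j] - mean[j]) ** 2 for row in batch) / n
           for j in range(num_features)]
    return batch_norm_ema(batch, mean, var, gamma, beta, eps)


def column_means(batch):
    """Per-feature mean of a batch, used here only to inspect the outputs."""
    n = len(batch)
    return [sum(row[j] for row in batch) / n for j in range(len(batch[0]))]


# Hypothetical shifted batch: training statistics were mean 0 and variance 1,
# but the incoming activations are centered near 3 and 1.
shifted = [[3.0, 1.0], [5.0, 3.0], [1.0, -1.0], [3.0, 1.0]]
gamma, beta = [1.0, 1.0], [0.0, 0.0]

with_ema = batch_norm_ema(shifted, [0.0, 0.0], [1.0, 1.0], gamma, beta)
with_batch = batch_norm_prediction_time(shifted, gamma, beta)
```

In this toy example the EMA-normalized activations keep the shifted mean, while the batch-normalized activations are re-centered around zero. In a real network the same swap would be applied inside each BN layer, and the estimate gets noisier as the prediction batch shrinks.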

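Experimental Results

Research Questions

RQ1: Does recomputing batch normalization statistics on the prediction-time batch improve calibration and accuracy under covariate shift?
RQ2: How does prediction-time BN compare with train-time BN and other calibration methods on image and non-image modalities?
RQ3: What are the limitations and failure modes of prediction-time BN, including the effects of pre-training and natural shift?
RQ4: How sensitive is the method to batch size, the choice of BN layers, and normalization hyperparameters?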

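Key Findings

Prediction-time BN brings the support of activations on shifted data back in line with the training statistics, improving calibration under covariate shift and usually improving accuracy.
On CIFAR-10-C and ImageNet-C, prediction-time BN yields strong calibration and competitive accuracy, reaching an mCE of 60.28% on ImageNet-C without additional data augmentation.
The method is complementary to ensembles and retains its gains across a range of prediction batch sizes; even a moderate batch size (about 100) gives a significant benefit.
When combined with pre-training (e.g., Noisy Student on ImageNet-C) or applied under more natural dataset shift, prediction-time BN can underperform, which marks the boundary conditions of its effectiveness.
On the natural adversarial dataset ImageNet-A, prediction-time BN improves calibration and in some settings even outperforms train-time BN.
Ablations show that renormalizing only the layer just before the output is not enough to yield significant gains; renormalizing the internal BN layers gives better improvements.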
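
Several of the findings above concern calibration. This summary does not say which calibration metric is reported, so the sketch below uses expected calibration error (ECE) with equal-width confidence bins as one common choice. That choice is an assumption, as are the function name `expected_calibration_error` and the parameter `num_bins`.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Equal-width-bin ECE: the bin-size-weighted average of
    |accuracy - mean confidence| over non-empty confidence bins."""
    n = len(confidences)
    bins = [[] for _ in range(num_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * num_bins), num_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Hypothetical predictions: the model claims 90% confidence four times
# but is right only three times, so confidence exceeds accuracy by 0.15.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

A lower value means the predicted confidences track the observed accuracy more closely. Under this metric, "improved calibration" corresponds to a smaller gap between the two on shifted data.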

This review was generated by AI and reviewed by a human editor.