QUICK REVIEW

[Paper Review] Test-time Batch Statistics Calibration for Covariate Shift

Fuming You, Jingjing Li|arXiv (Cornell University)|Oct 6, 2021

Domain Adaptation and Few-Shot Learning | References: 61 | Cited by: 23

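One-Sentence Summary

This paper proposes α-BN, a test-time method for calibrating batch normalization statistics: it mixes source and target statistics in the batch normalization layers to mitigate covariate shift while preserving discriminative feature structures. The method achieves state-of-the-art performance on twelve datasets spanning robustness to corruptions, domain generalization for image classification, and semantic segmentation, and it raises mIoU on GTA5→Cityscapes by 15.5 percentage points without any training.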

ABSTRACT

Deep neural networks have a clear degradation when applied to unseen environments due to covariate shift. Conventional approaches like domain adaptation require pre-collected target data for iterative training, which is impractical in real-world applications. In this paper, we propose to adapt deep models to the novel environment during inference. A previous solution is test time normalization, which substitutes the source statistics in BN layers with the target batch statistics. However, we show that test time normalization may potentially deteriorate the discriminative structures due to the mismatch between target batch statistics and source parameters. To this end, we present a general formulation $α$-BN to calibrate the batch statistics by mixing up the source and target statistics for both alleviating the domain shift and preserving the discriminative structures. Based on $α$-BN, we further present a novel loss function to form a unified test time adaptation framework Core, which performs the pairwise class correlation online optimization. Extensive experiments show that our approaches achieve state-of-the-art performance on a total of twelve datasets from three topics, including model robustness to corruptions, domain generalization on image classification and semantic segmentation. Particularly, our $α$-BN improves results from 28.4\% to 43.9\% on GTA5 $\rightarrow$ Cityscapes without any training, even outperforming the latest source-free domain adaptation method.

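Motivation and Objectives

Address the limitation of test-time normalization (T-BN), which can damage discriminative feature structures because the target batch statistics do not match the source-trained parameters.
Develop a practical, training-free adaptation method for domain generalization (DG) and test-time adaptation (TTA), where target data is not collected in advance.
Maintain model robustness and performance under covariate shift by balancing source and target statistics during inference.
Propose Core, a unified online optimization framework that uses pairwise class correlations for robust test-time adaptation.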

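Proposed Method

Propose α-BN, a general formulation that linearly mixes the source statistics and the target batch statistics in batch normalization layers, controlled by a learnable or fixed hyperparameter α.
Apply α-BN during inference to calibrate the batch normalization statistics, reducing domain shift while retaining the discriminative ability of the source-pretrained model.
Design the Core framework, which refines predictions during inference by optimizing pairwise class correlations online.
Introduce a novel loss function in Core that uses inter-class relationships to encourage representations that are consistent across batches and discriminative.
Integrate α-BN into standard empirical risk minimization (ERM) models without additional training or architectural changes.
Evaluate the method on multiple datasets with very low computational overhead, confirming its efficiency and effectiveness.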
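
The mixing of statistics described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `alpha_bn`, its argument names, and the `(N, C, H, W)` layout are hypothetical, and the summary does not state which side α weights. The sketch assumes α is the weight on the source statistics, so that α = 1 reduces to ordinary source BN and α = 0 reduces to T-BN.

```python
import numpy as np


def alpha_bn(x, src_mean, src_var, gamma, beta, alpha=0.9, eps=1e-5):
    """Normalize a batch x of shape (N, C, H, W) with mixed BN statistics.

    Assumption (not confirmed by the summary): alpha weights the stored
    source statistics and (1 - alpha) weights the current target batch.
    """
    # Per-channel statistics of the current (target) test batch.
    tgt_mean = x.mean(axis=(0, 2, 3))
    tgt_var = x.var(axis=(0, 2, 3))

    # Linear mix of source running statistics and target batch statistics.
    mean = alpha * src_mean + (1.0 - alpha) * tgt_mean
    var = alpha * src_var + (1.0 - alpha) * tgt_var

    # Standard BN affine transform using the calibrated statistics.
    shape = (1, -1, 1, 1)
    x_hat = (x - mean.reshape(shape)) / np.sqrt(var.reshape(shape) + eps)
    return gamma.reshape(shape) * x_hat + beta.reshape(shape)
```

Under that assumption, calling `alpha_bn(x, src_mean, src_var, gamma, beta, alpha=0.7)` would correspond to the segmentation setting reported in the findings. The learned affine parameters `gamma` and `beta` stay fixed, which is why no training is involved.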

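Experimental Results

Research Questions

RQ1: Does mixing source and target batch statistics during inference improve generalization under covariate shift without retraining?
RQ2: Can α-BN adapt to a new domain while preserving the discriminative feature structures learned during source training?
RQ3: How does the Core framework, which optimizes pairwise class correlations, improve test-time adaptation performance compared with existing methods?
RQ4: Is α-BN robust to changes in test-time batch size and in the hyperparameter α across tasks and datasets?
RQ5: Does α-BN improve the quality of representations for downstream fine-tuning, as measured by LogME scores?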

Key Findings

On GTA5→Cityscapes, α-BN raises mIoU by 15.5 percentage points (from 28.4% to 43.9%) without any training, outperforming the latest source-free domain adaptation method.
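α-BN achieves state-of-the-art performance on twelve diverse datasets across three topics: robustness to corruptions, domain generalization for image classification, and semantic segmentation.
The method adds only 0.0158 s of inference time per image (80.94 s vs. 72.84 s), indicating low overhead.
α-BN is robust to batch size and to the hyperparameter α; α=0.7 works best for segmentation and α=0.9 for classification.
McNemar's test shows statistical significance (p < 0.05) on all tasks, supporting the improvement over the ERM baseline.
Representations from α-BN obtain higher LogME scores than those from the source model and T-BN, suggesting that α-BN yields representations better suited to fine-tuning.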
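
For readers unfamiliar with the significance check mentioned above, the following is a rough sketch of an exact (binomial) McNemar test on paired predictions. The summary does not say which variant of the test the authors used, so this is one common form rather than their procedure; the function name `mcnemar_exact` and the counts in the usage note are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    b: samples the baseline classifies correctly and the new method does not.
    c: samples the new method classifies correctly and the baseline does not.
    Samples where both methods agree do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis, each discordant pair is a fair coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

With made-up counts such as `mcnemar_exact(1, 9)`, the p-value falls below 0.05, while balanced counts such as `mcnemar_exact(5, 5)` give a p-value of 1.0.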

This review was generated by AI and checked by human editors.