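[Paper Interpretation] Test-Time Adaptation to Distribution Shift by Confidence Maximization and Input Transformation
The paper addresses fully test-time adaptation. It combines non-saturating likelihood-ratio losses (HLR/SLR), a diversity regularizer based on a running average, and a trainable input transformation module to improve robustness to distribution shift without any target labels.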
Deep neural networks often exhibit poor performance on data that is unlikely under the train-time data distribution, for instance data affected by corruptions. Previous works demonstrate that test-time adaptation to data shift, for instance using entropy minimization, effectively improves performance on such shifted distributions. This paper focuses on the fully test-time adaptation setting, where only unlabeled data from the target distribution is required. This allows adapting arbitrary pretrained networks. Specifically, we propose a novel loss that improves test-time adaptation by addressing both premature convergence and instability of entropy minimization. This is achieved by replacing the entropy by a non-saturating surrogate and adding a diversity regularizer based on batch-wise entropy maximization that prevents convergence to trivial collapsed solutions. Moreover, we propose to prepend an input transformation module to the network that can partially undo test-time distribution shifts. Surprisingly, this preprocessing can be learned solely using the fully test-time adaptation loss in an end-to-end fashion without any target domain labels or source domain data. We show that our approach outperforms previous work in improving the robustness of publicly available pretrained image classifiers to common corruptions on such challenging benchmarks as ImageNet-C.
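Motivation and Objectives
- Improve the robustness of pretrained models under distribution shift without access to source data or target labels.
- Introduce non-saturating likelihood-ratio losses that keep providing a learning signal even for high-confidence predictions.
- Prevent convergence to trivial collapsed solutions with a diversity regularizer whose class-marginal estimate is updated across batches by a running average.
- Strengthen adaptation by prepending a trainable input transformation module that partially undoes the shift.
- Demonstrate improved robustness on ImageNet-C and ImageNet-R across a range of pretrained backbones.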
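Proposed Method
- Prepend a trainable input transformation module d to the pretrained network f, giving g = f ∘ d, where d partially undoes the domain shift.
- Adapt only a subset of the parameters (for example, the affine parameters of the batch normalization layers) and re-estimate the BN statistics on target data at test time.
- Use two non-saturating, likelihood-ratio-based losses, L_hlr and L_slr, for self-supervised adaptation; they avoid vanishing gradients at high confidence.
- Add a diversity regularizer L_div, computed on a running-average estimate p_t(y) of the predicted class marginal, to prevent convergence to trivial collapsed predictions.
- Combine L_div with L_conf (one of the non-saturating variants) so that adaptation stays informative while predictions remain diverse.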
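The loss terms listed above can be sketched in a few lines. This is a minimal, framework-free sketch and not the authors' implementation. It assumes that the hard likelihood ratio takes the form -log(p_max / (1 - p_max)), that the soft variant weights each class's log-ratio by p(c|x), and that L_div is the negative entropy of the running class-marginal estimate. The function names and the values `kappa=0.9` and `delta=1.0` are illustrative assumptions, not values confirmed by this summary.

```python
import math

EPS = 1e-12  # numerical guard inside logarithms (assumption of this sketch)

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hlr_loss(probs):
    """Assumed hard likelihood ratio: -log(p_max / (1 - p_max))."""
    p_max = max(probs)
    return -(math.log(p_max + EPS) - math.log(1.0 - p_max + EPS))

def slr_loss(probs):
    """Assumed soft likelihood ratio: -sum_c p_c * log(p_c / (1 - p_c))."""
    return -sum(p * (math.log(p + EPS) - math.log(1.0 - p + EPS)) for p in probs)

def update_running_marginal(running, batch_probs, kappa=0.9):
    """Running-average update of the class marginal p_t(y) with the batch mean."""
    n = len(batch_probs)
    batch_mean = [sum(col) / n for col in zip(*batch_probs)]
    return [kappa * r + (1.0 - kappa) * b for r, b in zip(running, batch_mean)]

def diversity_loss(marginal):
    """Negative entropy of the marginal; minimizing it pushes toward diverse predictions."""
    return sum(p * math.log(p + EPS) for p in marginal)

def total_loss(batch_logits, running, delta=1.0, kappa=0.9, conf=slr_loss):
    """L_div + delta * L_conf for one batch; also returns the updated marginal."""
    batch_probs = [softmax(z) for z in batch_logits]
    running = update_running_marginal(running, batch_probs, kappa)
    l_conf = sum(conf(p) for p in batch_probs) / len(batch_probs)
    return diversity_loss(running) + delta * l_conf, running
```

In a real setup these terms would be computed on tensors in an autodiff framework, and gradients would flow only into the adapted parameters (for example the BN affine parameters and the input transformation). Note that, under the assumed forms, the confidence losses are unbounded below, which is why a regularizer such as L_div is needed to keep the predictions from collapsing onto one class.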
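Experimental Results
Research Questions
- RQ1: Can fully test-time adaptation improve accuracy on corrupted or shifted distributions without access to source data?
- RQ2: Do the non-saturating likelihood-ratio losses (HLR/SLR) provide a better gradient signal for high-confidence predictions during adaptation than entropy-based losses?
- RQ3: Does the running-average diversity regularizer stabilize adaptation and prevent collapse?
- RQ4: Can an input transformation module learned at test time further undo the distribution shift?
- RQ5: How does the proposed method perform on ImageNet-C and ImageNet-R across a range of pretrained architectures?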
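The gradient question in RQ2 can be made concrete in the two-class case. The sketch below is an illustration under assumptions, not code from the paper: it takes p = sigmoid(z) for a logit difference z > 0, assumes the hard likelihood ratio loss is -log(p / (1 - p)) (which reduces to -z here), and compares numerical gradients of that loss and of the entropy. The helper names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def entropy(z):
    """Binary prediction entropy as a function of the logit difference z."""
    p = sigmoid(z)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def hlr(z):
    """Assumed hard likelihood ratio loss for z > 0: -log(p / (1 - p)) = -z."""
    p = sigmoid(z)
    return -(math.log(p) - math.log(1.0 - p))

def grad(f, z, h=1e-4):
    """Central finite-difference derivative of f at z."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

# Print gradients at increasing confidence levels.
for z in (1.0, 4.0, 8.0):
    print(f"z={z:>4}: p={sigmoid(z):.5f}  "
          f"dH/dz={grad(entropy, z):+.5f}  dHLR/dz={grad(hlr, z):+.5f}")
```

Analytically, the entropy gradient is -z · p · (1 - p), which shrinks toward zero as the prediction becomes confident, while the gradient of the assumed likelihood-ratio loss stays at -1 regardless of confidence. This is the sense in which such a loss is non-saturating.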
Key Findings
| Method | Gaussian | Shot | Impulse | Defocus | Glass | Motion | Zoom | Snow | Frost | Fog | Bright | Contrast | Elastic | Pixel | JPEG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No Adaptation | 2.44 | 2.99 | 1.96 | 17.92 | 9.82 | 14.78 | 22.50 | 16.89 | 23.31 | 24.43 | 58.93 | 5.43 | 16.95 | 20.61 | 31.65 |
| Pseudo Labels | 2.44 | 2.99 | 1.96 | 17.92 | 9.82 | 14.78 | 22.50 | 16.89 | 23.31 | 24.43 | 58.93 | 5.43 | 16.95 | 20.61 | 31.65 |
| Epoch 1 – TENT | 32.70 | 35.34 | 35.11 | 32.79 | 31.80 | 47.22 | 53.02 | 51.82 | 43.42 | 60.44 | 68.82 | 27.53 | 58.47 | 61.63 | 55.98 |
| Epoch 1 – TENT+ | 33.96 | 36.66 | 35.75 | 33.70 | 33.33 | 47.73 | 53.22 | 52.16 | 44.79 | 60.62 | 68.91 | 35.60 | 58.81 | 61.82 | 56.23 |
| Epoch 1 – HLR (ours) | 38.39 | 41.11 | 40.28 | 38.25 | 38.18 | 51.63 | 55.55 | 55.45 | 48.96 | 62.19 | 68.17 | 49.47 | 60.34 | 62.51 | 57.42 |
| Epoch 1 – SLR (ours) | 39.51 | 42.09 | 41.58 | 39.35 | 39.02 | 52.67 | 55.80 | 55.92 | 49.64 | 62.62 | 68.47 | 50.27 | 60.80 | 63.01 | 57.80 |
| Epoch 5 – TENT | 16.04 | 23.22 | 25.85 | 19.05 | 17.40 | 49.02 | 52.78 | 52.72 | 34.31 | 61.19 | 68.54 | 1.26 | 59.26 | 62.15 | 56.17 |
| Epoch 5 – TENT+ | 33.97 | 37.95 | 36.93 | 32.69 | 33.36 | 51.42 | 54.33 | 54.55 | 45.80 | 62.09 | 69.03 | 24.08 | 60.36 | 63.10 | 57.21 |
| Epoch 5 – HLR (ours) | 41.37 | 44.04 | 43.68 | 41.74 | 41.09 | 54.26 | 56.43 | 57.03 | 50.81 | 63.05 | 68.29 | 50.98 | 61.15 | 63.08 | 58.13 |
| Epoch 5 – SLR (ours) | 41.52 | 42.90 | 44.07 | 41.69 | 40.78 | 54.76 | 56.59 | 57.35 | 51.01 | 63.53 | 68.72 | 50.65 | 61.49 | 63.46 | 58.32 |
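- HLR and SLR outperform TENT and TENT+ on ImageNet-C and ImageNet-R for several pretrained models.
- Adapting with the proposed losses raises mean corruption accuracy, especially with robust backbones such as DeepAugment+AugMix.
- A single epoch of test-time adaptation already yields substantial gains, and further epochs (up to epoch 5 in the table) still bring improvements for HLR and SLR, with L_div making training more stable.
- The input transformation (IT) module partially undoes the shift for some corruptions (for example impulse noise and contrast), which improves robustness.
- Adaptation supervised with ground-truth labels gives an upper bound; in several settings the proposed method comes close to it.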
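To compare the rows of the table at a glance, one can average each row over the 15 corruption types. The short sketch below does this for four rows, copying the values as listed in the table (which appear to be per-corruption accuracies in percent; the summary does not state the metric or the backbone explicitly). The dictionary keys and the `row_mean` helper are illustrative names.

```python
# Per-corruption values copied from the table above, in column order
# (Gaussian ... JPEG). The metric is assumed to be accuracy in percent.
rows = {
    "No Adaptation": [2.44, 2.99, 1.96, 17.92, 9.82, 14.78, 22.50, 16.89,
                      23.31, 24.43, 58.93, 5.43, 16.95, 20.61, 31.65],
    "Epoch 1 TENT": [32.70, 35.34, 35.11, 32.79, 31.80, 47.22, 53.02, 51.82,
                     43.42, 60.44, 68.82, 27.53, 58.47, 61.63, 55.98],
    "Epoch 5 TENT": [16.04, 23.22, 25.85, 19.05, 17.40, 49.02, 52.78, 52.72,
                     34.31, 61.19, 68.54, 1.26, 59.26, 62.15, 56.17],
    "Epoch 5 SLR": [41.52, 42.90, 44.07, 41.69, 40.78, 54.76, 56.59, 57.35,
                    51.01, 63.53, 68.72, 50.65, 61.49, 63.46, 58.32],
}

def row_mean(values):
    """Unweighted mean over the corruption types."""
    return sum(values) / len(values)

means = {name: row_mean(vals) for name, vals in rows.items()}
for name, value in means.items():
    print(f"{name:<15} mean = {value:.2f}")
```

On these numbers, the TENT row average drops between epoch 1 and epoch 5 (driven largely by the Contrast column), while the SLR row at epoch 5 has the highest average of the four, which matches the stability claim in the findings.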
This interpretation was generated by AI and reviewed by a human editor.