QUICK REVIEW

[Paper Review] Source-free Domain Adaptation via Distributional Alignment by Matching Batch Normalization Statistics

Masato Ishii, Masashi Sugiyama|arXiv (Cornell University)|Jan 19, 2021

Domain Adaptation and Few-Shot Learning|References: 25|Cited by: 29

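One-Sentence Summary

A source-free domain adaptation method that fine-tunes only the target encoder so that its BN statistics align with those stored in the fixed classifier, and adds information maximization to improve discriminability, achieving competitive results without access to source data.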

ABSTRACT

In this paper, we propose a novel domain adaptation method for the source-free setting. In this setting, we cannot access source data during adaptation, while unlabeled target data and a model pretrained with source data are given. Due to lack of source data, we cannot directly match the data distributions between domains unlike typical domain adaptation algorithms. To cope with this problem, we propose utilizing batch normalization statistics stored in the pretrained model to approximate the distribution of unobserved source data. Specifically, we fix the classifier part of the model during adaptation and only fine-tune the remaining feature encoder part so that batch normalization statistics of the features extracted by the encoder match those stored in the fixed classifier. Additionally, we also maximize the mutual information between the features and the classifier's outputs to further boost the classification performance. Experimental results with several benchmark datasets show that our method achieves competitive performance with state-of-the-art domain adaptation methods even though it does not require access to source data.

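Motivation and Objectives

Address domain shift without accessing source data during adaptation.
Use the BN statistics stored in the pretrained model to approximate the source feature distribution.
Fine-tune only the target feature encoder while keeping the classifier fixed.
Improve target-domain classification performance through information maximization.
Demonstrate competitive performance on standard domain adaptation benchmarks.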

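Proposed Method

Split the pretrained model into a fixed classifier and a tunable target encoder.
Define a BN-statistics matching loss under a Gaussian approximation, comparing the target BN statistics with the source BN statistics stored in the classifier.
Minimize the KL divergence between the target feature distribution and the approximated source distribution, both summarized by their BN statistics.
Add an information maximization loss to encourage discriminative and diverse target predictions.
Jointly optimize the encoder parameters on unlabeled target data to minimize L_IM + lambda * L_BNM.
Demonstrate the method's stability and efficiency under the source-free constraint.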
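
The two loss terms listed above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the single-BN-layer setup, the 2-D feature shape (samples x channels), the `eps` constants, and the direction of the KL divergence (source relative to target) are all choices made for this example. It only evaluates loss values; actual adaptation would update the encoder with an autodiff framework while the classifier stays frozen.

```python
import numpy as np


def gaussian_kl(mu_p, var_p, mu_q, var_q, eps=1e-5):
    """Per-channel KL( N(mu_p, var_p) || N(mu_q, var_q) ) for 1-D Gaussians."""
    var_p = var_p + eps  # eps guards against zero variance (assumed value)
    var_q = var_q + eps
    return 0.5 * (np.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)


def bn_matching_loss(features, src_mean, src_var):
    """L_BNM for one BN layer: average per-channel KL between the stored
    source statistics and the statistics of a target feature batch.

    features: array of shape (N, C), the inputs to the BN layer.
    src_mean, src_var: arrays of shape (C,), the stored running statistics.
    """
    tgt_mean = features.mean(axis=0)
    tgt_var = features.var(axis=0)  # biased variance, as in BN batch stats
    # KL direction (source || target) is an assumption of this sketch.
    return gaussian_kl(src_mean, src_var, tgt_mean, tgt_var).mean()


def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def information_maximization_loss(logits, eps=1e-8):
    """L_IM: mean conditional entropy minus entropy of the mean prediction.

    Low values mean each prediction is confident (low conditional entropy)
    while predictions are spread over classes (high marginal entropy).
    """
    p = softmax(logits)
    cond_entropy = -(p * np.log(p + eps)).sum(axis=1).mean()
    marginal = p.mean(axis=0)
    marg_entropy = -(marginal * np.log(marginal + eps)).sum()
    return cond_entropy - marg_entropy


def total_loss(logits, features, src_mean, src_var, lam=1.0):
    """Joint objective L_IM + lambda * L_BNM (lam is the weight lambda)."""
    return (information_maximization_loss(logits)
            + lam * bn_matching_loss(features, src_mean, src_var))
```

With several BN layers in the classifier, one plausible extension is to average `bn_matching_loss` over the layers. The matching term is zero when the target batch statistics equal the stored ones and grows as they drift apart, which is what lets the stored statistics stand in for the unavailable source data.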

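Experimental Results

Research Questions

RQ1: Without access to source data, can the BN statistics stored in the pretrained classifier approximate the source feature distribution well enough to enable domain alignment?
RQ2: In source-free adaptation, does optimizing BN-statistics alignment together with information maximization improve target-domain classification?
RQ3: How does the method perform on standard domain adaptation benchmarks compared with other source-free and typical DA methods?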

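Key Findings

The method achieves accuracy comparable to state-of-the-art source-free DA methods on benchmark datasets.
BN-statistics matching (via the KL divergence between Gaussian approximations) reduces the distribution discrepancy between domains.
Information maximization improves discriminability and helps avoid overfitting to the target data.
The method performs well across multiple Office-31 scenarios and digit recognition tasks, often outperforming some typical DA methods.
The method is stable over a wide range of the hyperparameter lambda and remains effective with smaller target datasets.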


This review was generated by AI and checked by human editors.