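[Paper Walkthrough] Adapted Deep Embeddings: A Synthesis of Methods for $k$-Shot Inductive Transfer Learning
The paper compares weight transfer, deep metric learning, and few-shot learning on $k$-shot inductive transfer, and proposes adapted embeddings (AdaptHistLoss and AdaptProtoNet), which combine an embedding-based loss with target-domain adaptation to achieve substantial improvements.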
The focus in machine learning has branched beyond training classifiers on a single task to investigating how previously acquired knowledge in a source domain can be leveraged to facilitate learning in a related target domain, known as inductive transfer learning. Three active lines of research have independently explored transfer learning using neural networks. In weight transfer, a model trained on the source domain is used as an initialization point for a network to be trained on the target domain. In deep metric learning, the source domain is used to construct an embedding that captures class structure in both the source and target domains. In few-shot learning, the focus is on generalizing well in the target domain based on a limited number of labeled examples. We compare state-of-the-art methods from these three paradigms and also explore hybrid adapted-embedding methods that use limited target-domain data to fine-tune embeddings constructed from source-domain data. We conduct a systematic comparison of methods in a variety of domains, varying the number of labeled instances available in the target domain ($k$), as well as the number of target-domain classes. We reach three principal conclusions: (1) Deep embeddings are far superior, compared to weight transfer, as a starting point for inter-domain transfer or model re-use. (2) Our hybrid methods robustly outperform every few-shot learning and every deep metric learning method previously proposed, with a mean error reduction of 34% over state-of-the-art. (3) Among loss functions for discovering embeddings, the histogram loss (Ustinova & Lempitsky, 2016) is most robust. We hope our results will motivate a unification of research in weight transfer, deep metric learning, and few-shot learning.
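Motivation and Objectives
- Evaluate three inductive transfer learning (ITL) paradigms (weight transfer, deep metric learning, and few-shot learning) across different values of k and n.
- Evaluate hybrid methods that use limited target-domain data to adapt embeddings.
- Determine which loss functions and adaptation strategies perform best for cross-domain transfer.
- Use adapted embeddings to offer guidance for unifying weight transfer, metric learning, and few-shot learning.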
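Proposed Method
- Systematically compare six methods (WeightAdapt, HistLoss, ProtoNet, AdaptHistLoss, AdaptProtoNet, Baseline) across multiple datasets and configurations.
- Vary the number of labeled target-domain examples per class (k) and the number of target classes (n) to map out the performance landscape.
- Start from embeddings learned on the source domain with HistLoss or ProtoNet, then adapt them to the target domain by fine-tuning.
- Introduce adapted embeddings by fine-tuning the embedding on the target domain while preserving the embedding structure learned on the source domain.
- Repeat each configuration 10 times with a fixed source/target class split to assess robustness.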
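To make the k and n setup concrete, the sketch below samples one n-class, k-shot target episode and classifies the remaining examples by nearest class prototype. It is a minimal illustration rather than the paper's implementation: the function names and the toy 2-D embeddings are hypothetical, the embeddings stand in for the output of a source-trained network, and the use of squared Euclidean distance assumes ProtoNet refers to prototypical networks. The fine-tuning step that distinguishes the adapted variants is not shown.

```python
import random
from collections import defaultdict


def sample_k_shot(labels, n, k, rng):
    """Pick n target classes, then k labeled (support) indices per class.

    The remaining examples of the chosen classes form the query set.
    """
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = rng.sample(sorted(by_class), n)
    support, query = [], []
    for c in classes:
        idxs = list(by_class[c])
        rng.shuffle(idxs)
        support.extend(idxs[:k])
        query.extend(idxs[k:])
    return classes, support, query


def class_prototypes(embeddings, labels, support):
    """Average the support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for i in support:
        grouped[labels[i]].append(embeddings[i])
    return {
        y: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for y, vecs in grouped.items()
    }


def nearest_prototype(vec, protos):
    """Return the class whose prototype is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda y: sq_dist(vec, protos[y]))


# Toy stand-in for embeddings from a source-trained network (hypothetical data):
# three classes, four points each, clustered near (0, 0), (10, 0), and (20, 0).
labels = [c for c in range(3) for _ in range(4)]
embeddings = [(10.0 * c + 0.1 * j, 0.1 * j) for c in range(3) for j in range(4)]

rng = random.Random(0)
classes, support, query = sample_k_shot(labels, n=2, k=2, rng=rng)
protos = class_prototypes(embeddings, labels, support)
predictions = [nearest_prototype(embeddings[i], protos) for i in query]
accuracy = sum(p == labels[i] for p, i in zip(predictions, query)) / len(query)
print(f"classes={classes} accuracy={accuracy:.2f}")
```

In an adapted-embedding setting, the same support examples would also be used to update the embedding network before prototypes are computed; here the embedding is held fixed for simplicity.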
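Experimental Results
Research Questions
- RQ1: Which ITL paradigm provides the strongest baseline performance as k and n vary?
- RQ2: Across all datasets, do adapted embeddings (AdaptHistLoss, AdaptProtoNet) outperform non-adapted embeddings and non-embedding transfer methods?
- RQ3: Is HistLoss the most robust embedding loss for small-k ITL?
- RQ4: How do weight-transfer methods compare with embedding-based methods as k ranges from very small to large?
- RQ5: Can hybrid methods that combine an embedding loss with target-domain adaptation deliver consistent improvements across diverse domains?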
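Key Findings
- For k > 1, adapted embeddings consistently outperform both non-adapted embeddings and adapted non-embedding methods across all datasets and configurations.
- AdaptHistLoss generally delivers the strongest performance among the adapted methods, ahead of AdaptProtoNet.
- WeightAdapt underperforms adapted embeddings across the tested k and n settings, with the gap between them narrowing as k increases.
- Across datasets, adapted embeddings achieve a mean error reduction of about 34% relative to the best-performing alternative.
- HistLoss is identified as the most robust embedding loss for small-k ITL, whereas ProtoNet struggles as k grows but benefits from adaptation.
- Overall, adapted embeddings provide substantial, systematic improvements over existing methods and motivate unifying these lines of research.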
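The 34% figure is a relative error reduction, which is easy to confuse with a difference in percentage points. The sketch below shows one common way such a number can be computed; the summary does not state how the paper averaged across configurations, so taking the mean of per-configuration relative reductions is an assumption, and the error rates are hypothetical placeholders rather than results from the paper.

```python
def relative_error_reduction(reference_error, new_error):
    """Fraction of the reference method's error that the new method removes."""
    if reference_error <= 0:
        raise ValueError("reference_error must be positive")
    return (reference_error - new_error) / reference_error


def mean_relative_error_reduction(pairs):
    """Average the relative reduction over (reference_error, new_error) pairs."""
    reductions = [relative_error_reduction(ref, new) for ref, new in pairs]
    return sum(reductions) / len(reductions)


# Hypothetical error rates for two configurations (NOT taken from the paper):
# 20% -> 15% is a 25% relative reduction; 10% -> 5% is a 50% relative reduction.
pairs = [(0.20, 0.15), (0.10, 0.05)]
mean_reduction = mean_relative_error_reduction(pairs)
print(f"mean relative error reduction: {mean_reduction:.1%}")
```

Note that both hypothetical configurations improve by the same 5 percentage points, yet their relative reductions differ by a factor of two, which is why the reference error matters when interpreting the reported figure.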
This walkthrough was generated by AI and reviewed by a human editor.