QUICK REVIEW

[Paper Review] Adapted Deep Embeddings: A Synthesis of Methods for $k$-Shot Inductive Transfer Learning

Tyler R. Scott, Karl Ridgeway|arXiv (Cornell University)|May 22, 2018

Domain Adaptation and Few-Shot Learning|48 citations

TL;DR

The paper compares weight transfer, deep metric learning, and few-shot learning for k-shot inductive transfer, and introduces Adapted Embeddings (AdaptHistLoss and AdaptProtoNet) which combine embedding-based losses with target-domain adaptation to achieve substantial improvements.

ABSTRACT

The focus in machine learning has branched beyond training classifiers on a single task to investigating how previously acquired knowledge in a source domain can be leveraged to facilitate learning in a related target domain, known as inductive transfer learning. Three active lines of research have independently explored transfer learning using neural networks. In weight transfer, a model trained on the source domain is used as an initialization point for a network to be trained on the target domain. In deep metric learning, the source domain is used to construct an embedding that captures class structure in both the source and target domains. In few-shot learning, the focus is on generalizing well in the target domain based on a limited number of labeled examples. We compare state-of-the-art methods from these three paradigms and also explore hybrid adapted-embedding methods that use limited target-domain data to fine-tune embeddings constructed from source-domain data. We conduct a systematic comparison of methods in a variety of domains, varying the number of labeled instances available in the target domain ($k$), as well as the number of target-domain classes. We reach three principal conclusions: (1) Deep embeddings are far superior, compared to weight transfer, as a starting point for inter-domain transfer or model re-use. (2) Our hybrid methods robustly outperform every few-shot learning and every deep metric learning method previously proposed, with a mean error reduction of 34% over state-of-the-art. (3) Among loss functions for discovering embeddings, the histogram loss (Ustinova & Lempitsky, 2016) is most robust. We hope our results will motivate a unification of research in weight transfer, deep metric learning, and few-shot learning.

Motivation & Objective

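Assess the effectiveness of three inductive transfer learning (ITL) paradigms (weight transfer, deep metric learning, and few-shot learning) as the number of labeled target examples per class (k) and the number of target classes (n) vary.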
Evaluate hybrid approaches that adapt embeddings using limited target-domain data.
Identify which losses and adaptation strategies yield the best cross-domain transfer performance.
Provide guidance toward unifying transfer, metric learning, and few-shot learning through adapted embeddings.

Proposed method

Systematic experimental comparison of six methods (WeightAdapt, HistLoss, ProtoNet, AdaptHistLoss, AdaptProtoNet, Baseline) across multiple datasets and configurations.
Vary target-domain labeled examples per class (k) and number of target classes (n) to map performance landscapes.
Learn source-domain embeddings with HistLoss or ProtoNet, then adapt them by fine-tuning on the limited target-domain data.
Introduce the resulting adapted embeddings (AdaptHistLoss, AdaptProtoNet), which are fine-tuned on the target domain while retaining the source-domain embedding structure.
Replicate experiments 10 times per configuration with fixed source/target class splits to assess robustness.
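
To make the k and n terminology above concrete, here is a minimal sketch of drawing one k-shot, n-class target task and classifying a query by its nearest class prototype, which is the decision rule prototypical networks use. It is not the authors' code: the helper names (`sample_k_shot_task`, `prototypes`, `classify`) and the 2-D toy points are hypothetical, the points stand in for the output of an already-trained embedding network, and the fine-tuning (adaptation) step is not shown.

```python
import random

def sample_k_shot_task(pool, n, k, rng):
    """Pick n target classes and k labeled support embeddings per class.

    `pool` maps a class label to a list of embedding vectors (hypothetical data).
    """
    classes = rng.sample(sorted(pool), n)
    return {c: rng.sample(pool[c], k) for c in classes}

def prototypes(support):
    """Class prototype = coordinate-wise mean of that class's support embeddings."""
    protos = {}
    for label, vecs in support.items():
        dim = len(vecs[0])
        protos[label] = [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
    return protos

def classify(x, protos):
    """Assign x to the class whose prototype is nearest in squared Euclidean distance."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(protos, key=lambda label: sqdist(x, protos[label]))

# Toy "embeddings": three classes scattered around distinct 2-D centers.
data_rng = random.Random(0)
centers = {"a": (0.0, 0.0), "b": (5.0, 5.0), "c": (-5.0, 5.0)}
pool = {
    label: [(cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
            for _ in range(20)]
    for label, (cx, cy) in centers.items()
}

# One task with n = 2 target classes and k = 5 labeled examples per class.
support = sample_k_shot_task(pool, n=2, k=5, rng=random.Random(1))
protos = prototypes(support)

# A hand-built check that does not depend on the random draw above.
fixed_protos = prototypes({"a": [(0, 0), (1, 1)], "b": [(5, 5), (4, 6)]})
fixed_prediction = classify((4.8, 5.1), fixed_protos)
```

In this sketch, raising k gives each prototype more support points to average over, and raising n adds more prototypes to choose among; those are the two axes the comparison varies.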

Experimental results

Research questions

RQ1: Which ITL paradigm yields the strongest baseline performance across k and n?
RQ2: Do adapted embeddings (AdaptHistLoss, AdaptProtoNet) outperform both non-adapted embeddings and non-embedding transfer methods across datasets?
RQ3: Is HistLoss the most robust embedding loss among deep metric learning losses for k-shot ITL?
RQ4: How do weight-transfer methods compare to embedding-based approaches when k scales from very small to large?
RQ5: Can a hybrid approach that combines embedding losses with target-domain adaptation provide consistent improvements across diverse domains?

Key findings

Adapted embeddings consistently outperform non-adapted embeddings and adapted non-embedding methods across all datasets and configurations with k>1.
AdaptHistLoss generally yields the strongest performance among adapted methods, surpassing AdaptProtoNet.
WeightAdapt is inferior to adapted embeddings across the tested k and n settings, and its advantage diminishes as k grows.
Across datasets, adapted embeddings achieve a mean error reduction of 34% over the best-performing alternative methods.
HistLoss was identified as the most robust embedding loss among those evaluated for small-k ITL, while ProtoNet struggles as k grows but benefits from adaptation.
Overall, adapted embeddings provide a substantial, systematic improvement over existing approaches and encourage unifying these research strands.
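
For readers unfamiliar with how a figure like "mean error reduction of 34%" is typically computed, the sketch below shows the arithmetic for a relative error reduction averaged over several configurations. The review does not state the exact averaging procedure, so this is an assumption, and the error rates are made-up numbers chosen only to illustrate the calculation; they are not results from the paper.

```python
def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error removed by the new method."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error

# Hypothetical (baseline error, new error) pairs, one per configuration.
pairs = [(0.20, 0.12), (0.10, 0.07), (0.50, 0.35)]

reductions = [relative_error_reduction(b, e) for b, e in pairs]
mean_reduction = sum(reductions) / len(reductions)

print(f"per-configuration reductions: {[round(r, 2) for r in reductions]}")
print(f"mean relative error reduction: {mean_reduction:.1%}")
```

Note that a relative reduction is measured against the competing method's error, so the same percentage can correspond to very different absolute gains depending on how large that error was.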

This review was created by AI and reviewed by human editors.