QUICK REVIEW

[Paper Review] An introduction to domain adaptation and transfer learning

Wouter M. Kouw, Marco Loog | arXiv (Cornell University) | Dec 31, 2018
Domain Adaptation and Few-Shot Learning | 213 references | 244 citations
TL;DR

This technical report surveys when and how classifiers trained on a source domain can generalize to a target domain, detailing simple data shifts and adaptation strategies.

ABSTRACT

In machine learning, if the training data is an unbiased sample of an underlying distribution, then the learned classification function will make accurate predictions for new samples. However, if the training data is not an unbiased sample, then there will be differences between how the training data is distributed and how the test data is distributed. Standard classifiers cannot cope with changes in data distributions between training and test phases, and will not perform well. Domain adaptation and transfer learning are sub-fields within machine learning that are concerned with accounting for these types of changes. Here, we present an introduction to these fields, guided by the question: when and how can a classifier generalize from a source to a target domain? We will start with a brief introduction into risk minimization, and how transfer learning and domain adaptation expand upon this framework. Following that, we discuss three special cases of data set shift, namely prior, covariate and concept shift. For more complex domain shifts, there are a wide variety of approaches. These are categorized into: importance-weighting, subspace mapping, domain-invariant spaces, feature augmentation, minimax estimators and robust algorithms. A number of points will arise, which we will discuss in the last section. We conclude with the remark that many open questions will have to be addressed before transfer learners and domain-adaptive classifiers become practical.

Motivation & Objective

  • Explain the general problem of domain adaptation and transfer learning and why standard classifiers fail under distribution shifts.
  • Define domains and formalize cross-domain generalization risk bounds.
  • Characterize simple data shifts (prior, covariate, concept) and discuss corresponding adaptation strategies.
  • Present a taxonomy of approaches for more complex domain shifts (e.g., importance weighting, subspace mapping, domain-invariant spaces, feature augmentation, minimax estimators, robust algorithms).
  • Highlight open questions and practical challenges in making transfer learners and domain-adaptive classifiers viable.

Proposed method

  • Ground the discussion in risk minimization and empirical risk frameworks.
  • Introduce generalization bounds for cross-domain settings using measures like the HΔH-divergence and joint error e*_{S,T}.
  • Derive target risk estimators under shifts using importance weighting and ratio adjustments of joint distributions (R_T(h) and related forms).
  • Classify data shifts into prior, covariate, and concept shifts with corresponding adaptation techniques.
  • Provide a structured overview of adaptation methods across more complex domain shifts (e.g., domain-invariant spaces, feature augmentation, minimax estimators, robust methods).
  • Discuss limitations and open questions in applying transfer learning in practice.
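
The importance-weighting step in the list above can be written out as a short derivation. This is a sketch in assumed notation, not a quotation from the paper: \ell is a loss function, p_S and p_T are the source and target joint distributions, and p_S(x, y) > 0 is assumed wherever p_T(x, y) > 0 so that the ratio is defined.

```latex
\begin{align}
R_T(h) &= \sum_{y} \int \ell(h(x), y)\, p_T(x, y)\, \mathrm{d}x \\
       &= \sum_{y} \int \ell(h(x), y)\, \frac{p_T(x, y)}{p_S(x, y)}\, p_S(x, y)\, \mathrm{d}x
\end{align}
% Each simple shift collapses the ratio of joint distributions:
\begin{align}
\text{prior shift } \big(p_T(x \mid y) = p_S(x \mid y)\big): \quad
  \frac{p_T(x, y)}{p_S(x, y)} &= \frac{p_T(y)}{p_S(y)} \\
\text{covariate shift } \big(p_T(y \mid x) = p_S(y \mid x)\big): \quad
  \frac{p_T(x, y)}{p_S(x, y)} &= \frac{p_T(x)}{p_S(x)} \\
\text{concept shift } \big(p_T(x) = p_S(x)\big): \quad
  \frac{p_T(x, y)}{p_S(x, y)} &= \frac{p_T(y \mid x)}{p_S(y \mid x)}
\end{align}
% Empirical version on n labeled source samples (x_i, y_i), with weight w equal to the ratio:
\begin{equation}
\hat{R}_W(h) = \frac{1}{n} \sum_{i=1}^{n} \ell(h(x_i), y_i)\, w(x_i, y_i)
\end{equation}
```

The covariate-shift ratio involves only inputs, so it can in principle be estimated from unlabeled target data, whereas the concept-shift ratio involves the target labeling function itself.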

Experimental results

Research questions

  • RQ1: Under what conditions can a classifier trained in a source domain generalize to a target domain?
  • RQ2: How do different types of data set shift (prior, covariate, concept) affect cross-domain generalization risk?
  • RQ3: What are the main adaptation strategies for handling domain shifts, and how do they relate to risk minimization principles?
  • RQ4: What theoretical bounds relate source-trained performance to target-domain performance in domain adaptation?
  • RQ5: What are the key open questions and practical barriers to making transfer learners viable?

Key findings

  • Cross-domain generalization bounds can be established when relations between source and target domains are specified, e.g., via the HΔH-divergence and the ideal joint error e*_{S,T}.
  • The difference between the target error of a source-trained classifier and the target error of the optimal target classifier is bounded by the sum of the joint error, domain divergence, and a complexity term.
  • Under prior, covariate, or concept shift, the ratio of joint distributions in the target risk reduces to a simpler ratio (of class priors, data marginals, or class posteriors, respectively), which determines whether target risk can be estimated without labeled target data.
  • Covariate shift and prior shift can be addressed by adjusting sampling weights or joint distribution ratios to reflect target probabilities.
  • Concept shift remains challenging without labeled target data, requiring estimation of conditional distributions that depend on labeled target observations.
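
The reweighting idea in the findings above can be illustrated with a toy covariate-shift example. This is a minimal sketch under stated assumptions, not the paper's experiment: the source and target input densities are assumed to be known 1-D Gaussians (in practice the weights must be estimated), the labeling rule is shared across domains, and every name below (`gaussian_pdf`, `importance_weighted_risk`, the thresholds) is hypothetical.

```python
import math
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def importance_weighted_risk(losses, weights):
    """Average of per-sample source losses, reweighted by p_T(x) / p_S(x)."""
    return float(np.mean(weights * losses))

rng = np.random.default_rng(0)

# Assumed setup: source inputs ~ N(0, 1), target inputs ~ N(1, 1).
x_src = rng.normal(0.0, 1.0, size=20000)

# Shared labeling rule (covariate shift: p(y | x) is the same in both domains).
y_src = (x_src > 0.5).astype(int)

# A fixed, deliberately imperfect classifier: it errs exactly when 0.5 < x <= 1.5.
pred = (x_src > 1.5).astype(int)
losses = (pred != y_src).astype(float)  # 0-1 loss on source samples

# Importance weights from the (assumed known) densities.
weights = gaussian_pdf(x_src, 1.0, 1.0) / gaussian_pdf(x_src, 0.0, 1.0)

source_risk = float(np.mean(losses))                     # ignores the shift
weighted_risk = importance_weighted_risk(losses, weights)

# Exact target risk for this toy problem: P(0.5 < x <= 1.5) under N(1, 1).
true_target_risk = math.erf(0.5 / math.sqrt(2))

print(f"unweighted source risk:  {source_risk:.3f}")
print(f"importance-weighted risk: {weighted_risk:.3f}")
print(f"true target risk:         {true_target_risk:.3f}")
```

In this setup the unweighted source risk underestimates the target risk, while the weighted estimate uses only labeled source data and should land close to the exact value. With real data the densities are unknown, so the weights themselves have to be estimated, and large weights can make the estimate high-variance.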

This review was created by AI and reviewed by human editors.