QUICK REVIEW

[Paper Review] Co-regularized Alignment for Unsupervised Domain Adaptation

Abhishek Kumar, Prasanna Sattigeri | arXiv (Cornell University) | Nov 13, 2018

Domain Adaptation and Few-Shot Learning | 118 citations

TL;DR

The paper introduces Co-DA, a co-regularized domain alignment method that uses two diverse feature spaces to align source and target distributions and enforces agreement between their target predictions to improve unsupervised domain adaptation performance.

ABSTRACT

Deep neural networks, trained with large amount of labeled data, can fail to generalize well when tested with examples from a target domain whose distribution differs from the training data distribution, referred as the source domain. It can be expensive or even infeasible to obtain required amount of labeled data in all possible domains. Unsupervised domain adaptation sets out to address this problem, aiming to learn a good predictive model for the target domain using labeled examples from the source domain but only unlabeled examples from the target domain. Domain alignment approaches this problem by matching the source and target feature distributions, and has been used as a key component in many state-of-the-art domain adaptation methods. However, matching the marginal feature distributions does not guarantee that the corresponding class conditional distributions will be aligned across the two domains. We propose co-regularized domain alignment for unsupervised domain adaptation, which constructs multiple diverse feature spaces and aligns source and target distributions in each of them individually, while encouraging that alignments agree with each other with regard to the class predictions on the unlabeled target examples. The proposed method is generic and can be used to improve any domain adaptation method which uses domain alignment. We instantiate it in the context of a recent state-of-the-art method and observe that it provides significant performance improvements on several domain adaptation benchmarks.

Motivation & Objective

Motivate and address misalignment of class-conditional distributions in unsupervised domain adaptation.
Propose co-regularized domain alignment to create multiple diverse feature spaces and enforce agreement on target predictions.
Show that co-regularization improves state-of-the-art results on standard DA benchmarks.
Provide instantiations of Co-DA within existing domain-adaptation frameworks to demonstrate performance gains.

Proposed method

Construct two diverse feature generators g1 and g2 (with corresponding classifiers h1 and h2) to form two predictions f1 = h1∘g1 and f2 = h2∘g2.
Minimize source prediction loss plus domain alignment loss for each view: Ly(fi; Ps) + Ld(gi#Ps, gi#Pt).
Encourage target prediction agreement by minimizing the L1 distance between f1 and f2 on unlabeled target data: Lp(f1,f2; Pt).
Promote diversity between g1 and g2 via a diversity term Dg(g1,g2) that pushes their source minibatch means apart within a cap ν.
Incorporate the cluster assumption through conditional entropy minimization and Virtual Adversarial Training (VAT) on both source and target to stabilize learning.
Provide an instantiation that uses Jensen-Shannon divergence for domain alignment and a VAT-driven regularization to improve robustness.
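
As a rough illustration of the regularizers listed above, the following NumPy sketch computes an L1 agreement loss between two target predictions, a capped diversity term on minibatch feature means, and a conditional entropy term. It is not the authors' implementation: the function names, the use of the squared Euclidean distance between means, and the toy batch are all assumptions made for illustration, and the alignment loss Ld and VAT are left out.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def agreement_loss(p1, p2):
    """Lp sketch: mean L1 distance between two sets of class probabilities."""
    return float(np.abs(p1 - p2).sum(axis=1).mean())


def diversity_term(z1, z2, nu):
    """Dg sketch: distance between minibatch feature means, capped at nu.

    Assumption: squared Euclidean distance is used as the distance.
    """
    gap = z1.mean(axis=0) - z2.mean(axis=0)
    return min(nu, float(gap @ gap))


def conditional_entropy(p, eps=1e-12):
    """Mean entropy of the predicted class distributions (cluster assumption)."""
    return float(-(p * np.log(p + eps)).sum(axis=1).mean())


# Toy batch (hypothetical shapes): 8 examples, 16-dim features, 10 classes.
rng = np.random.default_rng(0)
z1_src, z2_src = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
p1_tgt = softmax(rng.normal(size=(8, 10)))
p2_tgt = softmax(rng.normal(size=(8, 10)))

lp = agreement_loss(p1_tgt, p2_tgt)          # minimized during training
dg = diversity_term(z1_src, z2_src, nu=1.0)  # maximized, i.e. subtracted
ent = conditional_entropy(p1_tgt)            # minimized during training
```

In a full training loop these terms would be weighted and added to the per-view source classification and alignment losses, with the diversity term entering with a negative sign so that the two feature means are pushed apart only until the cap ν is reached.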

Experimental results

Research questions

RQ1: Can co-regularized alignments across multiple diverse feature spaces reduce erroneous class-conditional alignment in unsupervised domain adaptation?
RQ2: Does enforcing agreement between multiple target predictors improve target-domain accuracy beyond single-view domain alignment methods?
RQ3: What is the impact of explicitly encouraging diversity between the feature spaces on alignment quality and predictive performance?
RQ4: How does Co-DA perform relative to state-of-the-art methods (e.g., VADA) across standard DA benchmarks?
RQ5: Does Co-DA provide benefits when combined with refinement steps like DIRT-T?

Key findings

Co-DA yields significant improvements over VADA on challenging MNIST→SVHN, reaching about 81.7% test accuracy on the target domain.
On MNIST→SVHN without instance normalization, accuracy improves from 47.5% (VADA) to 52.0% (Co-DA) and to 55.3% with the bn variants; with DIRT-T refinement it reaches around 88% when instance normalization is used.
For SVHN→MNIST and MNIST→MNIST-M, Co-DA shows consistent gains over VADA, with larger gains when instance normalization is used.
Co-DA variants with domain-alignment diversity (Co-DA bn) and two-branch configurations (Co-DA) outperform their single-branch counterparts in several benchmarks.
Combining Co-DA with DIRT-T can yield state-of-the-art results on several tasks, especially in settings without data augmentation.

This review was created by AI and reviewed by human editors.