QUICK REVIEW

[Paper Review] Domain Generalization with MixStyle

Kaiyang Zhou, Yongxin Yang|arXiv (Cornell University)|Apr 5, 2021
Domain Adaptation and Few-Shot Learning|56 references|41 citations
TL;DR

MixStyle regularizes CNN training by probabilistically mixing instance-level feature statistics across domains, synthesizing new styles to improve generalization to unseen domains without explicit image synthesis. It yields strong improvements on DG tasks across classification, retrieval, and RL.

ABSTRACT

Though convolutional neural networks (CNNs) have demonstrated remarkable ability in learning discriminative features, they often generalize poorly to unseen domains. Domain generalization aims to address this problem by learning from a set of source domains a model that is generalizable to any unseen domain. In this paper, a novel approach is proposed based on probabilistically mixing instance-level feature statistics of training samples across source domains. Our method, termed MixStyle, is motivated by the observation that visual domain is closely related to image style (e.g., photo vs. sketch images). Such style information is captured by the bottom layers of a CNN where our proposed style-mixing takes place. Mixing styles of training instances results in novel domains being synthesized implicitly, which increases the domain diversity of the source domains, and hence the generalizability of the trained model. MixStyle fits into mini-batch training perfectly and is extremely easy to implement. The effectiveness of MixStyle is demonstrated on a wide range of tasks including category classification, instance retrieval and reinforcement learning.

Motivation & Objective

  • Address domain shift in vision by learning domain-invariant features from multiple source domains.
  • Propose a lightweight, plug-and-play module that augments training by mixing style statistics across instances.
  • Demonstrate DG improvements across classification, retrieval, and reinforcement learning tasks.
  • Show that implicit style mixing improves generalization without generating new images.

Proposed method

  • Insert MixStyle between CNN layers to perturb style statistics in bottom feature maps.
  • Pair each instance with a reference instance, either from a different domain (when domain labels are used) or via a random shuffle of the batch, and mix their statistics with a convex combination whose weight is drawn from Beta(alpha, alpha), where alpha is a hyperparameter.
  • Compute the mixed statistics gamma_mix (scale) and beta_mix (shift) from the instance-level standard deviation and mean of the original and reference instances, then apply them to the style-normalized features.
  • Activate MixStyle with probability 0.5 during training and disable it at test time; a stop-gradient is placed on the computed mean and standard deviation, so gradients are not back-propagated through the statistics.
  • Discuss where to place MixStyle across residual blocks to balance style and content information, and report ablations comparing random shuffling with domain-label shuffling.
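
The steps listed above can be sketched in a few lines. The following is a minimal, forward-pass-only illustration in NumPy, not the authors' implementation: the function name `mixstyle`, its parameters (`rng`, `alpha`, `p`, `eps`, `training`), and the default values are assumptions chosen for this example, and only the random-shuffle variant is shown. Because NumPy has no automatic differentiation, the stop-gradient on the statistics appears only as a comment.

```python
import numpy as np

def mixstyle(x, rng, alpha=0.1, p=0.5, eps=1e-6, training=True):
    """Forward-pass sketch of style mixing on features x of shape (B, C, H, W).

    Illustrative only: names and defaults are assumptions for this example.
    """
    # Disabled at test time; during training it is applied with probability p.
    if not training or rng.random() > p:
        return x
    batch_size = x.shape[0]
    # Instance-level statistics: one mean and std per sample and per channel.
    mu = x.mean(axis=(2, 3), keepdims=True)
    sig = np.sqrt(x.var(axis=(2, 3), keepdims=True) + eps)
    # In an autograd framework, mu and sig would be detached here (stop-gradient).
    x_normed = (x - mu) / sig
    # One mixing weight per instance, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha, size=(batch_size, 1, 1, 1))
    # Random shuffle: each instance gets a reference instance from the batch.
    perm = rng.permutation(batch_size)
    beta_mix = lam * mu + (1 - lam) * mu[perm]      # mixed shift (mean)
    gamma_mix = lam * sig + (1 - lam) * sig[perm]   # mixed scale (std)
    # Re-style the normalized features with the mixed statistics.
    return gamma_mix * x_normed + beta_mix

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 8, 8))
mixed = mixstyle(features, rng, p=1.0)  # p=1.0 forces mixing for the demo
print(mixed.shape)
```

The domain-label variant would differ only in how `perm` is built: instead of a uniform permutation, each index would be mapped to an instance from a different source domain. That pairing logic is omitted here.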

Experimental results

Research questions

  • RQ1: Can MixStyle improve domain generalization by augmenting style diversity at the feature level?
  • RQ2: Where in a network should MixStyle be applied for best domain generalization performance?
  • RQ3: How does MixStyle compare to pixel-level data augmentation and other DG methods on standard DG benchmarks?
  • RQ4: Is MixStyle effective across tasks beyond classification, such as instance retrieval and reinforcement learning?

Key findings

  • MixStyle consistently improves generalization over a vanilla ResNet-18 on PACS, outperforming Mixup and DropBlock baselines.
  • MixStyle with random shuffle or domain labels achieves 82.8% and 83.7% average accuracy on PACS, respectively, surpassing most prior DG methods.
  • Applying MixStyle to multiple lower-level layers yields better performance; applying it to the last residual block degrades performance, likely because features at that depth encode semantic content rather than style.
  • MixStyle outperforms pixel-level augmentation methods like L2A-OT in DG tasks while being much lighter computationally.
  • In cross-dataset person re-ID, MixStyle with random shuffle or domain label improves mAP/R1/R5/R10 over baselines across Market1501 and Duke datasets.
  • In reinforcement learning, MixStyle improves generalization to unseen environments and complements IBAC-SNI.


This review was created by AI and reviewed by human editors.