[Paper Review] Federated Adversarial Domain Adaptation
The paper introduces Federated Adversarial Domain Adaptation (FADA) for unsupervised federated domain adaptation, aligning representations across distributed source domains to a target domain using dynamic attention and representation disentanglement within a federated setting.
Federated learning improves data privacy and efficiency in machine learning performed over networks of distributed devices, such as mobile phones, IoT devices, and wearables. Yet models trained with federated learning can still fail to generalize to new devices due to the problem of domain shift. Domain shift occurs when the labeled data collected by source nodes statistically differs from the target node's unlabeled data. In this work, we present a principled approach to the problem of federated domain adaptation, which aims to align the representations learned among the different nodes with the data distribution of the target node. Our approach extends adversarial adaptation techniques to the constraints of the federated setting. In addition, we devise a dynamic attention mechanism and leverage feature disentanglement to enhance knowledge transfer. Empirically, we perform extensive experiments on several image and text classification tasks and show promising results under the unsupervised federated domain adaptation setting.
Motivation & Objective
- Motivate and formalize unsupervised federated domain adaptation (UFDA) where data cannot be shared across domains.
- Derive a generalization bound for UFDA to guide algorithm design.
- Propose FADA to minimize domain shift via adversarial alignment and feature disentanglement in a federated setup.
Proposed method
- Develop dynamic attention to weight source-domain gradients based on their contribution to the target domain.
- Implement federated adversarial alignment by training domain-specific local feature extractors and a global discriminator without sharing data.
- Apply feature disentanglement to split representations into domain-invariant and domain-specific components, aided by a mutual information estimator (MINE).
- Use a two-step adversarial objective (domain identifier and generators) to align source and target distributions in UFDA.
- Incorporate a reconstruction loss to preserve representation integrity, and train the whole model end to end with SGD (Algorithm 1).
- Leverage gap statistics to measure source contributions and compute a dynamic gradient mask for aggregation.
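To make the dynamic attention idea above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: it assumes the target features have already been clustered, that each source's contribution is scored as the drop in within-cluster dispersion after applying that source's update, and that a softmax turns those scores into the gradient mask. The function names (`within_cluster_dispersion`, `attention_weights`, `aggregate`) are hypothetical.

```python
import math


def within_cluster_dispersion(features, labels):
    """Sum, over clusters, of squared distances from each point to its centroid.

    Used here as a simple stand-in for a gap-statistic-style clustering score
    (assumption: lower dispersion means better-separated target features).
    """
    clusters = {}
    for x, c in zip(features, labels):
        clusters.setdefault(c, []).append(x)
    total = 0.0
    for points in clusters.values():
        dim = len(points[0])
        centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
        total += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in points
        )
    return total


def attention_weights(gains):
    """Softmax over per-source gains (assumed normalization for the mask)."""
    m = max(gains)  # subtract the max for numerical stability
    exps = [math.exp(g - m) for g in gains]
    z = sum(exps)
    return [e / z for e in exps]


def aggregate(gradients, weights):
    """Weighted sum of per-source gradient vectors (one flat list per source)."""
    dim = len(gradients[0])
    return [sum(w * g[d] for w, g in zip(weights, gradients)) for d in range(dim)]


# Hypothetical usage: two sources, the first one improved target clustering more.
before = within_cluster_dispersion([[0.0, 0.0], [2.0, 0.0], [5.0, 5.0]], [0, 0, 1])
gains = [before - 1.0, before - 1.8]  # dispersion drop attributed to each source
weights = attention_weights(gains)
update = aggregate([[1.0, 0.0], [0.0, 1.0]], weights)
```

A source whose update tightens the target clusters more receives a larger weight, so its gradient dominates the aggregated update; sources that contribute little are down-weighted rather than discarded.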
Experimental results
Research questions
- RQ1: How can UFDA be practically achieved when data remain on local sources and only gradients are shared?
- RQ2: Can adversarial domain alignment and representation disentanglement reduce domain shift in a federated setting?
- RQ3: What is the effect of dynamic attention on weighting diverse source domains during aggregation?
- RQ4: How does FADA perform across image and text classification tasks under UFDA?
- RQ5: What theoretical guarantees can bound performance in UFDA?
Key findings
- FADA with the full set of components (dynamic attention, adversarial alignment, and disentanglement) achieves the best average performance on Digit-Five (73.6% in Table 1).
- Dynamic attention and adversarial alignment each improve results over the baselines, and the variant that adds disentanglement (Model III) provides strong gains across tasks.
- UFDA is more challenging than multi-source DA with shared data, as shown by weaker performance under federated settings when data cannot be centralized.
- FADA learns features with lower intra-class variance and higher inter-class variance than f-DANN and f-DAN (visualized via t-SNE in Figure 3).
- Across Office-Caltech10, DomainNet, and Amazon Review datasets, FADA with disentanglement consistently improves accuracy over strong baselines (Tables 2–4).
This review was created by AI and reviewed by human editors.