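[Paper Review] KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation
KD3A performs unsupervised multi-source domain adaptation in a privacy-preserving, decentralized setting by distilling knowledge from multiple source models. It uses Knowledge Vote, Consensus Focus, and BatchNorm MMD to improve robustness and reduce communication cost while handling negative transfer.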
Conventional unsupervised multi-source domain adaptation (UMDA) methods assume all source domains can be accessed directly. This neglects the privacy-preserving policy, under which all data and computations must be kept decentralized. Three problems arise in this scenario: (1) Minimizing the domain distance requires pairwise calculation over data from the source and target domains, which are not accessible. (2) The communication cost and privacy security limit the application of UMDA methods (e.g., domain adversarial training). (3) Since users have no authority to check the data quality, irrelevant or malicious source domains are more likely to appear, which causes negative transfer. In this study, we propose a privacy-preserving UMDA paradigm named Knowledge Distillation based Decentralized Domain Adaptation (KD3A), which performs domain adaptation through knowledge distillation on models from different source domains. KD3A solves the above problems with three components: (1) A multi-source knowledge distillation method named Knowledge Vote to learn high-quality domain consensus knowledge. (2) A dynamic weighting strategy named Consensus Focus to identify both malicious and irrelevant domains. (3) A decentralized optimization strategy for domain distance named BatchNorm MMD. Extensive experiments on DomainNet demonstrate that KD3A is robust to negative transfer and brings a 100x reduction in communication cost compared with other decentralized UMDA methods. Moreover, KD3A significantly outperforms state-of-the-art UMDA approaches.
Research Motivation and Goals
- Motivate privacy-preserving unsupervised multi-source domain adaptation (UMDA) where source data cannot be accessed.
- Propose a KD-based decentralized framework to leverage multiple source models for target-domain adaptation.
- Introduce mechanisms to detect and mitigate negative transfer from irrelevant or malicious sources.
- Provide a theoretical bound for KD3A and validate its practical effectiveness and communication efficiency.
Proposed Method
- Knowledge Vote to produce high-quality consensus knowledge from multiple source models for target data.
- Consensus Focus to assign weights to source domains based on consensus quality to reduce negative transfer.
- BatchNorm MMD to decentralize the optimization of H-divergence by using BatchNorm statistics instead of raw data.
- Derivation of a decentralized generalization bound for KD3A showing improvements over the base UMDA bound.
- Algorithm 1 outlines KD3A’s training loop with three components in a privacy-preserving, decentralized setting.
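To make the Knowledge Vote idea more concrete, here is a minimal sketch of one way a per-sample vote over several source models could work. It is an illustration under stated assumptions, not the paper's implementation: the function names `argmax` and `knowledge_vote`, the `confidence_gate` value of 0.9, and the fallback used when no model is confident are all hypothetical choices made for this example.

```python
from collections import Counter


def argmax(probs):
    """Index of the largest probability in a list."""
    return max(range(len(probs)), key=probs.__getitem__)


def knowledge_vote(predictions, confidence_gate=0.9):
    """Combine K source-model predictions for ONE target sample.

    predictions: list of K probability vectors (lists of floats).
    Returns (consensus_probs, n_supporters).
    The gate value and the fallback below are assumptions for illustration.
    """
    num_classes = len(predictions[0])

    def mean_of(models):
        # Element-wise average of the given probability vectors.
        return [sum(p[c] for p in models) / len(models) for c in range(num_classes)]

    # Step 1: keep only models whose top probability passes the gate.
    confident = [p for p in predictions if max(p) >= confidence_gate]
    if not confident:
        # Assumed fallback: average all models and report zero supporters.
        return mean_of(predictions), 0

    # Step 2: the consensus class is the one most confident models predict.
    consensus_class, _ = Counter(argmax(p) for p in confident).most_common(1)[0]

    # Step 3: average only the models that support the consensus class.
    supporters = [p for p in confident if argmax(p) == consensus_class]
    return mean_of(supporters), len(supporters)


if __name__ == "__main__":
    preds = [[0.95, 0.03, 0.02], [0.92, 0.05, 0.03], [0.10, 0.85, 0.05]]
    print(knowledge_vote(preds))
```

In this sketch the supporter count is returned alongside the averaged distribution so that a training loop could, for instance, weight the distillation loss of each target sample by how many source models agreed on it.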
Experimental Results
Research Questions
- RQ1: Can knowledge distillation across multiple decentralized source models improve unsupervised domain adaptation in the absence of source data?
- RQ2: How can we detect and downweight malicious or irrelevant sources to prevent negative transfer in a decentralized setting?
- RQ3: What is the impact of consensus-based weighting and BN-based divergence minimization on generalization bounds and performance?
- RQ4: How does KD3A perform in large-scale multi-domain benchmarks compared to state-of-the-art UMDA methods and decentralized baselines?
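For RQ2, one way to picture consensus-based down-weighting is a leave-one-out score: measure how much the overall agreement among source models drops when a given source is removed, and weight each source by that drop. The sketch below is a simplified illustration under assumptions, not the paper's exact formulation. The names `consensus_quality` and `consensus_focus_weights`, the gate of 0.9, the clamping of negative contributions to zero, and the uniform fallback are all hypothetical.

```python
from collections import Counter


def _argmax(probs):
    return max(range(len(probs)), key=probs.__getitem__)


def consensus_quality(sample_preds, gate=0.9):
    """Sum over target samples of (supporter count x mean consensus confidence).

    sample_preds: for each target sample, a list of probability vectors,
    one per source model in the coalition being scored.
    """
    total = 0.0
    for preds in sample_preds:
        confident = [p for p in preds if max(p) >= gate]
        if not confident:
            continue  # no confident model: this sample contributes nothing
        top_class, n_support = Counter(_argmax(p) for p in confident).most_common(1)[0]
        supporters = [p for p in confident if _argmax(p) == top_class]
        mean_conf = sum(p[top_class] for p in supporters) / n_support
        total += n_support * mean_conf
    return total


def consensus_focus_weights(preds_by_source, gate=0.9):
    """preds_by_source[k][i] is source k's probability vector on target sample i."""
    num_sources = len(preds_by_source)
    num_samples = len(preds_by_source[0])

    def quality(sources):
        return consensus_quality(
            [[preds_by_source[k][i] for k in sources] for i in range(num_samples)], gate
        )

    full = quality(range(num_sources))
    # Leave-one-out contribution of each source, clamped at zero (assumption).
    focus = [
        max(full - quality([j for j in range(num_sources) if j != k]), 0.0)
        for k in range(num_sources)
    ]
    total = sum(focus)
    if total == 0:
        return [1.0 / num_sources] * num_sources  # assumed uniform fallback
    return [f / total for f in focus]


if __name__ == "__main__":
    agree = [[0.95, 0.05], [0.95, 0.05]]
    outlier = [[0.05, 0.95], [0.05, 0.95]]
    print(consensus_focus_weights([agree, agree, outlier]))
```

In the toy call, two sources agree on every target sample while the third confidently disagrees; removing the third does not reduce the agreement score, so it receives a weight of zero while the two agreeing sources share the weight equally.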
Key Findings
| Category | Method | Clipart | Infograph | Painting | Quickdraw | Real | Sketch | Avg |
|---|---|---|---|---|---|---|---|---|
| W/o DA | Oracle | 69.3 ±0.37 | 34.5 ±0.42 | 66.3 ±0.67 | 66.8 ±0.51 | 80.1 ±0.59 | 60.7 ±0.48 | 63.0 |
| W/o DA | Source-only | 52.1 ±0.51 | 23.1 ±0.28 | 47.7 ±0.96 | 13.3 ±0.72 | 60.7 ±0.32 | 46.5 ±0.56 | 40.6 |
| H-divergence | MDAN | 60.3 ±0.41 | 25.0 ±0.43 | 50.3 ±0.36 | 8.2 ±1.92 | 61.5 ±0.46 | 51.3 ±0.58 | 42.8 |
| H-divergence | M³SDA | 58.6 ±0.53 | 26.0 ±0.89 | 52.3 ±0.55 | 6.3 ±0.58 | 62.7 ±0.51 | 49.5 ±0.76 | 42.6 |
| Knowledge Ensemble | DAEL | 70.8 ±0.14 | 26.5 ±0.13 | 57.4 ±0.28 | 12.2 ±0.7 | 65.0 ±0.23 | 60.6 ±0.25 | 48.7 |
| Source Selection | CMSS | 64.2 ±0.18 | 28.0 ±0.2 | 53.6 ±0.39 | 16.0 ±0.12 | 63.4 ±0.21 | 53.8 ±0.35 | 46.5 |
| Decentralized UMDA | SHOT* | 61.7 | 22.2 | 52.6 | 12.2 | 67.7 | 48.6 | 44.2 |
| Decentralized UMDA | FADA* | 59.1 | 21.7 | 47.9 | 8.8 | 60.8 | 50.4 | 41.5 |
| Decentralized UMDA | KD3A | 72.5 ±0.62 | 23.4 ±0.43 | 60.9 ±0.71 | 16.4 ±0.28 | 72.7 ±0.55 | 60.6 ±0.32 | 51.1 |
- KD3A achieves 51.1% average accuracy on DomainNet, ahead of every UMDA baseline in the table (the next best, DAEL, reaches 48.7%).
- KD3A exceeds the oracle on Clipart (72.5 vs. 69.3) and nearly matches it on Sketch (60.6 vs. 60.7).
- KD3A reduces communication cost by about 100x versus other decentralized UMDA methods.
- Knowledge Vote and Consensus Focus effectively identify and downweight irrelevant/malicious domains, reducing negative transfer.
- BatchNorm MMD enables decentralized optimization of the H-divergence without accessing source data.
- KD3A demonstrates robustness to privacy leakage and negative transfer while achieving strong domain adaptation performance.
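The BatchNorm MMD finding rests on the observation that BatchNorm layers already store per-channel means and variances, so feature distributions can be compared without exchanging raw data. The sketch below illustrates that idea as simple first- and second-moment matching for a single layer. It is a hedged illustration, not the paper's exact loss: the function names `second_moment` and `bn_moment_distance`, the plain squared-difference distance, and the input layout are assumptions made for this example.

```python
def second_moment(mean, var):
    """Per-channel E[x^2], using the identity E[x^2] = Var[x] + E[x]^2."""
    return [v + m * m for m, v in zip(mean, var)]


def bn_moment_distance(target_stats, source_stats, weights):
    """Squared distance between target moments and a weighted source mixture.

    target_stats: (mean, var) lists for one BatchNorm layer on target data.
    source_stats: list of (mean, var) pairs, one per source model.
    weights: one non-negative weight per source, assumed to sum to 1.
    Only these statistics are needed, never the source samples themselves.
    """
    t_mean, t_var = target_stats
    t_sq = second_moment(t_mean, t_var)
    channels = range(len(t_mean))

    # Weighted mixture of the sources' first and second moments.
    source_sq = [second_moment(m, v) for m, v in source_stats]
    mix_mean = [sum(w * s[0][c] for w, s in zip(weights, source_stats)) for c in channels]
    mix_sq = [sum(w * sq[c] for w, sq in zip(weights, source_sq)) for c in channels]

    first = sum((t_mean[c] - mix_mean[c]) ** 2 for c in channels)
    second = sum((t_sq[c] - mix_sq[c]) ** 2 for c in channels)
    return first + second


if __name__ == "__main__":
    target = ([0.0], [1.0])
    sources = [([1.0], [1.0]), ([-1.0], [1.0])]
    print(bn_moment_distance(target, sources, [0.5, 0.5]))
```

In the toy call, the two sources' means cancel out, so the first-moment term is zero, but their second moments (2.0 each) differ from the target's (1.0), which is why the distance is nonzero. This shows why matching means alone would miss a real distribution gap.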
This review was generated by AI and checked by a human editor.