[Paper Explained] Federated Learning with Non-IID Data
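This paper attributes the accuracy loss of FedAvg on non-IID (not independent and identically distributed) client data to weight divergence, which it quantifies with the earth mover's distance (EMD), and proposes a data-sharing strategy that uses a small globally shared dataset to recover accuracy.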
Federated learning enables resource-constrained edge compute devices, such as mobile phones and IoT devices, to learn a shared model for prediction, while keeping the training data local. This decentralized approach to train models provides privacy, security, regulatory and economic benefits. In this work, we focus on the statistical challenge of federated learning when local data is non-IID. We first show that the accuracy of federated learning reduces significantly, by up to 55% for neural networks trained for highly skewed non-IID data, where each client device trains only on a single class of data. We further show that this accuracy reduction can be explained by the weight divergence, which can be quantified by the earth mover's distance (EMD) between the distribution over classes on each device and the population distribution. As a solution, we propose a strategy to improve training on non-IID data by creating a small subset of data which is globally shared between all the edge devices. Experiments show that accuracy can be increased by 30% for the CIFAR-10 dataset with only 5% globally shared data.
Motivation and Objectives
- Quantify how non-IID data across clients reduces FedAvg accuracy compared to IID settings.
- Explain weight divergence in FedAvg and bound it using earth mover’s distance (EMD) between client and population distributions.
- Propose a data-sharing strategy with a small globally shared dataset to mitigate non-IID effects and evaluate its impact on accuracy.
Proposed Method
- Use CNNs on MNIST, CIFAR-10, and a keyword spotting (KWS) dataset with FedAvg under IID, 2-class non-IID, and 1-class non-IID partitions.
- Define weight divergence as the relative distance between FedAvg and centralized SGD weights.
- Prove a bound on weight divergence that involves EMD between client distributions and the population distribution.
- Empirically correlate weight divergence with EMD and test accuracy across datasets and non-IID settings.
- Propose and evaluate a data-sharing strategy in which a globally shared dataset G (uniform across classes) is used at initialization, optionally with a warm-up model trained on G before distributed training begins.
- Demonstrate accuracy improvements (up to ~30%) on CIFAR-10 with 5% globally shared data.
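The two quantities in the list above can be made concrete with a short sketch. It assumes that EMD is computed as the sum of absolute differences between a client's class probabilities and the population's, and that weight divergence is the L2 distance between the flattened FedAvg and centralized SGD weights divided by the L2 norm of the SGD weights. The function names and toy inputs are illustrative and are not taken from the paper's code.

```python
import math
from collections import Counter


def class_distribution(labels, num_classes):
    """Empirical probability of each class in a list of integer labels."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in range(num_classes)]


def emd(client_dist, population_dist):
    """Assumed EMD form: sum of absolute differences between class probabilities."""
    return sum(abs(p - q) for p, q in zip(client_dist, population_dist))


def weight_divergence(w_fedavg, w_sgd):
    """Relative L2 distance between two flattened weight vectors."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(w_fedavg, w_sgd)))
    norm = math.sqrt(sum(b ** 2 for b in w_sgd))
    return diff / norm


# Toy example: a 10-class problem with a uniform population distribution.
population = [0.1] * 10
one_class_client = class_distribution([3] * 50, num_classes=10)
two_class_client = class_distribution([3] * 25 + [7] * 25, num_classes=10)

emd_one_class = emd(one_class_client, population)
emd_two_class = emd(two_class_client, population)
```

Under these assumptions, a client holding a single class out of ten has an EMD of about 1.8, a client holding two classes in equal parts has about 1.6, and a client whose class mix matches the population has 0.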
Experimental Results
Research Questions
- RQ1: How does non-IID data distribution across clients affect FedAvg accuracy relative to IID data?
- RQ2: Can weight divergence between FedAvg and centralized SGD be bounded by a function of EMD between client distributions and the population distribution?
- RQ3: Does introducing a small globally shared dataset mitigate non-IID induced accuracy loss, and by how much?
Key Findings
| Non-IID setting | B (batch size) | E (local epochs) | MNIST accuracy drop (%) | CIFAR-10 accuracy drop (%) | KWS accuracy drop (%) |
|---|---|---|---|---|---|
| Non-IID(1) | large | 1 | 6.52 | 37.66 | 43.64 |
| Non-IID(1) | large | 5 | 6.77 | 37.11 | 43.62 |
| Non-IID(2) | large | 1 | 2.4 | 14.51 | 12.16 |
| Non-IID(1) | small | 1 | 11.31 | 51.31 | 54.5 |
| Non-IID(2) | small | 1 | 1.77 | 15.61 | 15.07 |
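To make the row labels concrete, here is a minimal sketch of how IID, 1-class, and 2-class partitions could be built from a labeled dataset. It assumes that Non-IID(1) and Non-IID(2) refer to the 1-class and 2-class partitions, and it uses a label-sorted sharding scheme with a deterministic shard assignment for simplicity (a random assignment would also be possible). The `partition` function and its parameters are hypothetical.

```python
import random


def partition(labels, num_clients, shards_per_client=None, seed=0):
    """Split sample indices among clients.

    shards_per_client=None gives a shuffled (IID-like) split; 1 or 2 gives
    label-sorted shards, so each client sees only a few classes.
    """
    indices = list(range(len(labels)))
    if shards_per_client is None:
        random.Random(seed).shuffle(indices)
        # Round-robin over the shuffled indices.
        return [indices[k::num_clients] for k in range(num_clients)]

    # Sort by label, then cut into equal contiguous shards
    # (any remainder that does not fill a shard is dropped).
    indices.sort(key=lambda i: labels[i])
    num_shards = num_clients * shards_per_client
    shard_size = len(indices) // num_shards
    shards = [indices[s * shard_size:(s + 1) * shard_size]
              for s in range(num_shards)]

    clients = []
    for k in range(num_clients):
        own = []
        # Client k takes shards k, k + num_clients, ... which lie far apart
        # in the sorted order and so tend to cover different classes.
        for j in range(shards_per_client):
            own.extend(shards[k + j * num_clients])
        clients.append(own)
    return clients


# Toy dataset: 10 balanced classes, 100 samples each, 10 clients.
labels = [c for c in range(10) for _ in range(100)]
iid_clients = partition(labels, num_clients=10)
one_class_clients = partition(labels, num_clients=10, shards_per_client=1)
two_class_clients = partition(labels, num_clients=10, shards_per_client=2)
```

With this balanced toy dataset, every client in `one_class_clients` holds exactly one class and every client in `two_class_clients` holds exactly two.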
- FedAvg accuracy drops significantly under highly skewed non-IID data, by up to 55% when each client trains on only a single class.
- Weight divergence between FedAvg and centralized SGD grows with data skew; it can be bounded by a term involving EMD between client and population distributions.
- Higher EMD goes with larger weight divergence and lower test accuracy: stronger non-IID skew correlates with larger accuracy loss, and CIFAR-10 shows substantial drops.
- A small globally shared dataset containing a uniform class distribution can substantially recover accuracy, e.g., up to ~30% improvement on CIFAR-10 with 5% shared data.
- Data-sharing requires balancing between the amount of globally shared data (beta) and the fraction of that data distributed to clients (alpha); even partial sharing yields meaningful gains.
- Training a warm-up model on the shared dataset G gives a higher starting accuracy and reduces the amount of centrally held data needed to achieve gains.
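The effect of the data-sharing strategy on client skew can be illustrated with a small sketch. It assumes that beta is the size of the shared set G relative to the total client data, that alpha is the random fraction of G handed to each client, and that EMD is the sum of absolute differences between class probabilities. The helper names and the toy numbers (10 clients with 500 samples each) are hypothetical, and only labels are tracked, not the samples themselves.

```python
import random
from collections import Counter

NUM_CLASSES = 10


def class_distribution(labels, num_classes=NUM_CLASSES):
    """Empirical probability of each class in a list of integer labels."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in range(num_classes)]


def emd(p, q):
    """Assumed EMD form: sum of absolute differences between class probabilities."""
    return sum(abs(a - b) for a, b in zip(p, q))


def build_shared_set(total_client_samples, beta, num_classes=NUM_CLASSES):
    """Labels of a class-balanced shared set G sized as beta * total client data."""
    per_class = round(beta * total_client_samples) // num_classes
    return [c for c in range(num_classes) for _ in range(per_class)]


def distribute(client_labels, shared_labels, alpha, rng):
    """Add a random alpha fraction of G to a client's own data."""
    n = round(alpha * len(shared_labels))
    return client_labels + rng.sample(shared_labels, n)


rng = random.Random(0)
population = [1 / NUM_CLASSES] * NUM_CLASSES

client = [0] * 500  # a 1-class non-IID client
shared = build_shared_set(total_client_samples=5000, beta=0.05)
augmented = distribute(client, shared, alpha=0.5, rng=rng)

emd_before = emd(class_distribution(client), population)
emd_after = emd(class_distribution(augmented), population)
```

In this toy setup the client's EMD falls from about 1.8 to below 1.5 after it receives half of a 5% shared set, which is consistent with the idea that even partial sharing reduces skew. The sketch says nothing about the resulting model accuracy.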
This explainer was generated by AI and reviewed by a human editor.