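[Paper Digest] Federated Learning on Non-IID Data Silos: An Experimental Study
This paper proposes NIID-Bench, a comprehensive benchmark for evaluating federated learning under diverse non-IID data partitions. It empirically analyzes four FL algorithms on nine datasets and shows that no single method dominates and that non-IID distributions significantly affect performance.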
Due to the increasing privacy concerns and data regulations, training data have been increasingly fragmented, forming distributed databases of multiple "data silos" (e.g., within different organizations and countries). To develop effective machine learning services, there is a must to exploit data from such distributed databases without exchanging the raw data. Recently, federated learning (FL) has been a solution with growing interests, which enables multiple parties to collaboratively train a machine learning model without exchanging their local data. A key and common challenge on distributed databases is the heterogeneity of the data distribution among the parties. The data of different parties are usually non-independently and identically distributed (i.e., non-IID). There have been many FL algorithms to address the learning effectiveness under non-IID data settings. However, there lacks an experimental study on systematically understanding their advantages and disadvantages, as previous studies have very rigid data partitioning strategies among parties, which are hardly representative and thorough. In this paper, to help researchers better understand and study the non-IID data setting in federated learning, we propose comprehensive data partitioning strategies to cover the typical non-IID data cases. Moreover, we conduct extensive experiments to evaluate state-of-the-art FL algorithms. We find that non-IID does bring significant challenges in learning accuracy of FL algorithms, and none of the existing state-of-the-art FL algorithms outperforms others in all cases. Our experiments provide insights for future studies of addressing the challenges in "data silos".
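Research Motivation and Objectives
- Identify the key challenges that non-IID data poses for horizontal federated learning across distributed data silos.
- Develop a comprehensive non-IID data benchmark (NIID-Bench) with six partitioning strategies.
- Evaluate state-of-the-art FL algorithms (FedAvg, FedProx, SCAFFOLD, FedNova) under diverse non-IID settings.
- Offer insights and a public leaderboard to guide future FL research.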
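Proposed Method
- Introduce six non-IID partitioning strategies covering label distribution skew, feature distribution skew, and quantity skew.
- Synthesize distributed non-IID datasets by partitioning real-world datasets, so that the imbalance properties are controlled.
- Implement NIID-Bench with public code and a leaderboard (links are given in the paper).
- Run experiments on nine datasets (image and tabular) using standard CNN/MLP architectures and SGD optimization.
- Compare the four FL algorithms (FedAvg, FedProx, SCAFFOLD, FedNova) across communication rounds, using top-1 accuracy as the metric.
- Analyze how the type of data skew affects convergence, stability, and final accuracy.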
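The label-skew partitioning mentioned above can be illustrated with a minimal sketch. It assumes NumPy and follows the `p_k ~ Dir(β)` notation used in this digest: for each class k, a proportion vector over the parties is drawn from a Dirichlet distribution, and the samples of that class are split accordingly. The function name `dirichlet_label_partition` and its parameters are hypothetical and are not taken from the NIID-Bench code.

```python
import numpy as np


def dirichlet_label_partition(labels, num_parties, beta, seed=0):
    """Split sample indices across parties with label-distribution skew.

    For every class k, draw p_k ~ Dir(beta) over the parties and give
    party j roughly a p_k[j] share of the samples of class k.
    (Hypothetical helper, written for illustration only.)
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    party_indices = [[] for _ in range(num_parties)]
    for k in np.unique(labels):
        idx_k = np.flatnonzero(labels == k)
        rng.shuffle(idx_k)
        # One proportion per party for this class.
        p_k = rng.dirichlet(np.full(num_parties, beta))
        # Turn cumulative proportions into split points inside idx_k.
        cuts = (np.cumsum(p_k)[:-1] * len(idx_k)).astype(int)
        for j, part in enumerate(np.split(idx_k, cuts)):
            party_indices[j].extend(part.tolist())
    return [np.array(ix, dtype=int) for ix in party_indices]


if __name__ == "__main__":
    # Toy labels: 10 classes with 100 samples each.
    toy_labels = np.repeat(np.arange(10), 100)
    parts = dirichlet_label_partition(toy_labels, num_parties=10, beta=0.5)
    for j, ix in enumerate(parts):
        print(j, np.bincount(toy_labels[ix], minlength=10))
```

A smaller `beta` concentrates each class on fewer parties, so the per-party label histograms printed by the usage example become more unbalanced.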
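Experimental Results
Research Questions
- RQ1: How do common FL algorithms perform under a broad range of non-IID data distributions across data silos?
- RQ2: Which non-IID partitioning strategies reveal the strengths and weaknesses of each FL algorithm?
- RQ3: Is there a single algorithm that consistently outperforms the others across diverse non-IID scenarios?
- RQ4: How does data skew (label, feature, quantity) affect the stability and convergence of learning?
- RQ5: Can NIID-Bench inform future directions for robust federated learning on non-IID data partitions?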
Key Findings
| Category | Dataset | Partitioning | FedAvg | FedProx | SCAFFOLD | FedNova |
|---|---|---|---|---|---|---|
| Label distribution skew | MNIST | p_k ~ Dir(0.5) | 98.9% ± 0.1% | 98.9% ± 0.1% | 99.0% ± 0.1% | 98.9% ± 0.1% |
| Label distribution skew | FMNIST | p_k ~ Dir(0.5) | 88.1% ± 0.6% | 88.1% ± 0.9% | 88.4% ± 0.5% | 88.5% ± 0.5% |
| Label distribution skew | CIFAR-10 | p_k ~ Dir(0.5) | 68.2% ± 0.7% | 67.9% ± 0.7% | 69.8% ± 0.7% | 66.8% ± 1.5% |
| Label distribution skew | SVHN | p_k ~ Dir(0.5) | 86.1% ± 0.7% | 86.6% ± 0.9% | 86.8% ± 0.3% | 86.4% ± 0.6% |
| Quantity skew | MNIST | q ~ Dir(0.5) | 99.2% ± 0.1% | 99.2% ± 0.1% | 99.1% ± 0.1% | 99.1% ± 0.1% |
| Quantity skew | FMNIST | q ~ Dir(0.5) | 89.4% ± 0.1% | 89.7% ± 0.3% | 88.8% ± 0.4% | 86.1% ± 2.9% |
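- Non-IID data significantly reduces the learning accuracy of FL algorithms.
- No single state-of-the-art FL algorithm dominates across all non-IID settings.
- Label distribution skew is generally more challenging than quantity skew.
- Training instability caused by batch normalization and partial sampling is widespread in non-IID FL.
- NIID-Bench shows that algorithm performance varies across skew types, which underscores the need for a comprehensive benchmark.
- A public leaderboard and code repository are provided for future evaluation.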
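For contrast with label skew, the quantity-skew setting written as `q ~ Dir(0.5)` in the table can be sketched as follows. This is a minimal illustration assuming NumPy: a single proportion vector q over the parties is drawn from a Dirichlet distribution, and a random permutation of all samples is cut into chunks of those relative sizes, so parties differ in how much data they hold but not in which labels they see. The function name `dirichlet_quantity_partition` is hypothetical and is not taken from the NIID-Bench code.

```python
import numpy as np


def dirichlet_quantity_partition(num_samples, num_parties, beta, seed=0):
    """Split sample indices across parties with quantity skew.

    Draw q ~ Dir(beta) over the parties and give party j roughly a q[j]
    share of a random permutation of all samples.
    (Hypothetical helper, written for illustration only.)
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(num_samples)
    # One proportion per party for the whole dataset.
    q = rng.dirichlet(np.full(num_parties, beta))
    # Turn cumulative proportions into split points in the permutation.
    cuts = (np.cumsum(q)[:-1] * num_samples).astype(int)
    return np.split(shuffled, cuts)


if __name__ == "__main__":
    parts = dirichlet_quantity_partition(num_samples=1000, num_parties=10, beta=0.5)
    print([len(p) for p in parts])  # local dataset sizes differ across parties
```

Because the samples are shuffled before the split, each party's local label distribution stays close to the global one in expectation; only the local dataset sizes are skewed.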
This digest was generated by AI and reviewed by human editors.