QUICK REVIEW

[Paper Review] FedGroup: Accurate Federated Learning via Decomposed Similarity-Based Clustering

Moming Duan, Duo Liu | arXiv (Cornell University) | Oct 14, 2020

Privacy-Preserving Technologies in Data | References: 34 | Cited by: 6

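ONE-SENTENCE SUMMARY

FedGroup is a similarity-based client clustering framework for federated learning: it groups clients with similar optimization directions to improve model accuracy, decomposes high-dimension low-sample-size (HDLSS) direction vectors to reduce the cost of clustering, and supports cold start for newcomer clients. The proposed frameworks improve absolute test accuracy by +14.7% on FEMNIST compared to FedAvg and by +5.4% on Sentiment140 compared to FedProx.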

ABSTRACT

Federated Learning (FL) enables multiple participating devices to collaboratively contribute to a global neural network model while keeping the training data local. Unlike the centralized training setting, the non-IID and imbalanced (statistically heterogeneous) training data of FL is distributed across the federated network, which increases the divergence between the local models and the global model and further degrades performance. In this paper, we propose FedGroup, a novel clustered federated learning (CFL) framework based on a similarity-based client clustering strategy, in which we 1) group the training of clients based on the similarities between the clients' optimization directions for high training performance; 2) reduce the complexity of the client clustering algorithm by decomposing the high-dimension low-sample-size (HDLSS) direction vectors; and 3) implement a newcomer device cold start mechanism based on the auxiliary global model for framework scalability and practicality. FedGroup can achieve improvements by dividing joint optimization into groups of sub-optimization, and can be combined with FedProx, the state-of-the-art federated optimization algorithm. We evaluate FedGroup and FedGrouProx (FedGroup combined with FedProx) on several open datasets. The experimental results show that our proposed frameworks significantly improve absolute test accuracy, by +14.7% on FEMNIST compared to FedAvg and by +5.4% on Sentiment140 compared to FedProx.

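MOTIVATION AND OBJECTIVES

Address statistical heterogeneity in federated learning caused by non-IID and imbalanced data across clients.
Reduce the divergence between local models and the global model by grouping clients with similar optimization directions.
Reduce the computational complexity of client clustering in the high-dimension low-sample-size (HDLSS) setting.
Make practical deployment possible through a newcomer cold start mechanism based on an auxiliary global model.
Improve training performance by decomposing joint optimization into groups of sub-optimization.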
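
The contrast between joint optimization and grouped sub-optimization can be written out as follows. The first line is the standard FedAvg-style objective. The grouped form and the symbols m, G_j, w_j, and n_{G_j} are notation assumed here for illustration and are not taken from the paper. F_k denotes the local empirical loss of client k and n_k its number of training samples.

```latex
% Joint optimization: one global model w shared by all K clients
\min_{w} \; f(w) = \sum_{k=1}^{K} \frac{n_k}{n} F_k(w),
\qquad n = \sum_{k=1}^{K} n_k

% Grouped sub-optimization: one model w_j per client group G_j, j = 1, \dots, m
\min_{w_j} \; f_j(w_j) = \sum_{k \in G_j} \frac{n_k}{n_{G_j}} F_k(w_j),
\qquad n_{G_j} = \sum_{k \in G_j} n_k
```

When the clients inside a group have similar optimization directions, each F_k in the second sum pulls w_j in roughly the same direction, which is the intuition behind the expected reduction in divergence.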

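PROPOSED METHOD

Cluster clients by the cosine similarity of their optimization direction vectors, forming groups with similar model update patterns.
Decompose the high-dimension low-sample-size (HDLSS) direction vectors into low-dimensional components to reduce clustering complexity.
Apply sub-optimization within each client group to reduce divergence from the global model and improve convergence.
Integrate a cold start mechanism based on an auxiliary global model to initialize newcomer clients without full retraining.
Combine FedGroup with FedProx to form FedGrouProx, improving performance while remaining compatible with existing federated learning frameworks.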
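
The first two steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the decomposition is a truncated SVD of the client update matrix, that each client is then described by its cosine similarity to the top m singular directions, and that a plain k-means with a deterministic farthest-point initialisation does the grouping. The function names, the synthetic data, and the choices m = 2 and k = 2 are all hypothetical.

```python
import numpy as np


def decomposed_cosine_features(updates: np.ndarray, m: int) -> np.ndarray:
    """Describe each client by its cosine similarity to m principal directions.

    updates: (n_clients, d) matrix, one flattened update direction per row.
    Returns an (n_clients, m) matrix, much smaller than the input when d >> m.
    """
    # Thin SVD (assumed decomposition): rows of vt are orthonormal directions.
    _, _, vt = np.linalg.svd(updates, full_matrices=False)
    basis = vt[:m]                                   # (m, d), unit-norm rows
    norms = np.linalg.norm(updates, axis=1, keepdims=True)
    norms = np.where(norms == 0.0, 1.0, norms)       # guard against zero updates
    # Basis rows have unit norm, so this product is exactly the cosine similarity.
    return (updates / norms) @ basis.T


def kmeans(points: np.ndarray, k: int, iters: int = 50) -> np.ndarray:
    """Lloyd's k-means with farthest-point initialisation; returns labels."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):                  # keep old center if a cluster empties
                centers[j] = points[labels == j].mean(axis=0)
    return labels


# Synthetic HDLSS setting: 12 clients, 1000-dimensional update directions,
# drawn around two different base directions (hypothetical data).
rng = np.random.default_rng(0)
d = 1000
base_a, base_b = rng.normal(size=d), rng.normal(size=d)
updates = np.vstack([
    base_a + 0.1 * rng.normal(size=(6, d)),
    base_b + 0.1 * rng.normal(size=(6, d)),
])

features = decomposed_cosine_features(updates, m=2)   # shape (12, 2)
labels = kmeans(features, k=2)
print(features.shape, labels)
```

The clustering step here runs on a 12 x 2 feature matrix instead of 12 x 1000 raw vectors, which is the kind of saving the decomposition is meant to provide. A different initialisation (for example k-means++) or a different clustering algorithm could be substituted without changing the feature construction.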

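EXPERIMENTAL RESULTS

Research Questions

RQ1: Can clustering clients by the similarity of their optimization directions improve federated learning performance under statistical heterogeneity?
RQ2: How does HDLSS vector decomposition affect the complexity of clustering when grouping federated clients?
RQ3: Can a cold start mechanism based on an auxiliary global model improve scalability and practicality in dynamic federated networks?
RQ4: How does dividing clients into sub-optimization groups affect convergence and test accuracy compared with global joint optimization?
RQ5: How much further can combining FedGroup with FedProx improve performance on non-IID data?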

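Key Findings

On FEMNIST, the proposed frameworks improve absolute test accuracy by 14.7% compared to FedAvg.
On Sentiment140, the proposed frameworks improve absolute test accuracy by 5.4% compared to FedProx.
The similarity-based clustering strategy reduces the divergence between local models and the global model by grouping clients whose optimization directions agree.
HDLSS vector decomposition substantially reduces the computational complexity of client clustering without sacrificing clustering quality.
The cold start mechanism uses the auxiliary global model to integrate newcomer devices into the federated system.
The framework is compatible with FedProx, so it can deliver incremental gains in practical federated learning deployments.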
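
The cold start finding can be illustrated with a minimal sketch. It assumes, as an illustration and not as the paper's confirmed rule, that a newcomer first obtains an update direction by briefly training from the auxiliary global model and is then assigned to the group whose representative direction has the highest cosine similarity to that update. The names `cosine`, `assign_newcomer`, `group_directions`, and the toy vectors are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def assign_newcomer(
    pretrain_update: Sequence[float],
    group_directions: Dict[str, Sequence[float]],
) -> str:
    """Return the group whose direction is most similar to the newcomer's update."""
    return max(
        group_directions,
        key=lambda g: cosine(pretrain_update, group_directions[g]),
    )


# Toy example: two existing groups and one newcomer (hypothetical values).
group_directions = {
    "group_0": [1.0, 0.0, 0.0],
    "group_1": [0.0, 1.0, 0.2],
}
newcomer_update = [0.1, 0.9, 0.1]   # assumed result of pretraining on the auxiliary model

chosen = assign_newcomer(newcomer_update, group_directions)
print(chosen)  # group_1
```

Under this assumption the newcomer joins an existing group after a single similarity comparison per group, so the existing clients do not need to be re-clustered.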

This review was generated by AI and reviewed by human editors.