QUICK REVIEW

[Paper Review] The Best of Both Worlds: Accurate Global and Personalized Models through Federated Learning with Data-Free Hyper-Knowledge Distillation

Huancheng Chen, Johnny|arXiv (Cornell University)|Jan 21, 2023
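Privacy-Preserving Technologies in Data | Cited by 11
One-Sentence Summary

FedHKD shares hyper-knowledge (mean representations and soft predictions) to improve both personalized and global models in federated learning without requiring public data or a generative model, and it demonstrates strong performance on heterogeneous data.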

ABSTRACT

Heterogeneity of data distributed across clients limits the performance of global models trained through federated learning, especially in settings with highly imbalanced class distributions of local datasets. In recent years, personalized federated learning (pFL) has emerged as a potential solution to the challenges presented by heterogeneous data. However, existing pFL methods typically enhance the performance of local models at the expense of the global model's accuracy. We propose FedHKD (Federated Hyper-Knowledge Distillation), a novel FL algorithm in which clients rely on knowledge distillation (KD) to train local models. In particular, each client extracts and sends to the server the means of local data representations and the corresponding soft predictions, information that we refer to as "hyper-knowledge". The server aggregates this information and broadcasts it to the clients in support of local training. Notably, unlike other KD-based pFL methods, FedHKD does not rely on a public dataset, nor does it deploy a generative model at the server. We analyze the convergence of FedHKD and conduct extensive experiments on visual datasets in a variety of scenarios, demonstrating that FedHKD provides significant improvement in both personalized and global model performance compared to state-of-the-art FL methods designed for heterogeneous data settings.

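Motivation and Objectives

  • Address the performance degradation of the global model on heterogeneous client data.
  • Achieve strong personalized models for each client without sacrificing global accuracy.
  • Propose a privacy-preserving, data-free knowledge distillation mechanism.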

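Proposed Method

  • Each client computes, for every class, the mean data representation and the mean soft prediction (the hyper-knowledge).
  • The server aggregates the hyper-knowledge under differential-privacy protection and broadcasts it to the clients for the next round.
  • Local training uses a three-term loss: cross-entropy, closeness to the global soft predictions, and closeness of the local representations to the global representations.
  • Before sharing, the hyper-knowledge is privatized with the Gaussian mechanism.
  • No public dataset or server-side generative model is required.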
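
The steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`client_hyper_knowledge`, `aggregate`, `local_loss`) and the loss weights `lam` and `gamma` are hypothetical. Weighting the server average by per-class sample counts, using squared Euclidean distance for both closeness terms, evaluating the soft-prediction term on each sample's own prediction, and adding Gaussian noise only to the mean representations are likewise assumptions of the example. The noise is not calibrated (no clipping, no privacy accounting), so the sketch carries no formal differential privacy guarantee.

```python
import math
import random
from collections import defaultdict


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vectors(vectors):
    """Element-wise mean of equally sized vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance (assumed closeness measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def client_hyper_knowledge(features, logits, labels, sigma=0.0, seed=0):
    """Per-class (mean representation, mean soft prediction, sample count).

    sigma > 0 adds uncalibrated Gaussian noise to the mean representation
    (assumption: illustrative only, not a complete Gaussian mechanism).
    """
    rng = random.Random(seed)
    by_class = defaultdict(lambda: ([], []))
    for h, z, y in zip(features, logits, labels):
        by_class[y][0].append(h)
        by_class[y][1].append(softmax(z))
    knowledge = {}
    for y, (hs, qs) in by_class.items():
        h_mean = mean_vectors(hs)
        if sigma > 0:
            h_mean = [v + rng.gauss(0.0, sigma) for v in h_mean]
        knowledge[y] = (h_mean, mean_vectors(qs), len(hs))
    return knowledge


def aggregate(all_clients):
    """Server side: sample-count-weighted average per class (assumed weighting)."""
    sums = {}
    for knowledge in all_clients:
        for y, (h, q, n) in knowledge.items():
            h_sum, q_sum, total = sums.get(y, ([0.0] * len(h), [0.0] * len(q), 0))
            sums[y] = ([a + n * b for a, b in zip(h_sum, h)],
                       [a + n * b for a, b in zip(q_sum, q)],
                       total + n)
    return {y: ([v / n for v in h], [v / n for v in q])
            for y, (h, q, n) in sums.items()}


def local_loss(features, logits, labels, global_k, lam=0.05, gamma=0.05):
    """Cross-entropy + lam * soft-prediction term + gamma * representation term."""
    ce = pred = rep = 0.0
    for h, z, y in zip(features, logits, labels):
        q = softmax(z)
        ce += -math.log(q[y])
        if y in global_k:  # classes unseen by the server add no penalty
            h_global, q_global = global_k[y]
            pred += sq_dist(q, q_global)
            rep += sq_dist(h, h_global)
    return (ce + lam * pred + gamma * rep) / len(labels)


# Toy usage: two clients that both hold samples of class 0.
feats_a, logits_a, labels_a = [[1.0, 0.0], [3.0, 0.0]], [[2.0, 0.0], [2.0, 0.0]], [0, 0]
client_a = client_hyper_knowledge(feats_a, logits_a, labels_a)
client_b = client_hyper_knowledge([[5.0, 0.0]], [[2.0, 0.0]], [0])
global_k = aggregate([client_a, client_b])
loss_a = local_loss(feats_a, logits_a, labels_a, global_k)
```

In a real training loop the features and logits would come from the client's feature extractor and classifier, and the loss would be minimized by gradient descent. The sketch only shows which quantities are exchanged and how the three terms combine.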

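Experimental Results

Research Questions

  • RQ1: Can FedHKD improve both local (personalized) and global model accuracy under highly heterogeneous data?
  • RQ2: How does data-free hyper-knowledge distillation affect convergence and privacy?
  • RQ3: Under non-IID conditions, how does FedHKD perform relative to state-of-the-art KD-based and non-KD federated learning methods?

Key Findings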

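Dataset  | Scheme   | Local accuracy | Global accuracy | Params (M) | Time (s) | Public data | Clients
---------+----------+----------------+-----------------+------------+----------+-------------+--------
SVHN     | FedAvg   | 0.6766         | 0.7329          | 0.6544     | 0.4948   |             | 10
SVHN     | FedProx  | 0.6927         | 0.6717          | 0.6991     | 0.5191   |             | 10
SVHN     | Moon     | 0.6602         | 0.7085          | 0.7192     | 0.4883   |             | 10
SVHN     | FedAlign | 0.7675         | 0.7920          | 0.7656     | 0.6426   |             | 10
SVHN     | FedGen   | 0.5788         | 0.5658          | 0.4679     | 0.3622   |             | 10
SVHN     | FedMD    | 0.8038         | 0.8086          | 0.7912     | 0.6812   |             | 10
SVHN     | FedProto | 0.8071         | 0.8148          | 0.8039     | 0.6064   |             | 10
SVHN     | FedHKD*  | 0.8064         | 0.8157          | 0.8072     | 0.6405   |             | 10
SVHN     | FedHKD   | 0.8086         | 0.8381          | 0.7891     | 0.6781   |             | 10
CIFAR10  | FedAvg   | 0.5950         | 0.6261          | 0.5825     | 0.4741   |             | 10
CIFAR10  | FedProx  | 0.5981         | 0.6295          | 0.6490     | 0.4793   |             | 10
CIFAR10  | Moon     | 0.5901         | 0.6482          | 0.5513     | 0.4579   |             | 10
CIFAR10  | FedAlign | 0.5948         | 0.6023          | 0.6402     | 0.4976   |             | 10
CIFAR10  | FedGen   | 0.5879         | 0.6395          | 0.6533     | 0.4800   |             | 10
CIFAR10  | FedMD    | 0.6147         | 0.6666          | 0.6533     | 0.5088   |             | 10
CIFAR10  | FedProto | 0.6131         | 0.6505          | 0.5939     | 0.5012   |             | 10
CIFAR10  | FedHKD*  | 0.6227         | 0.6515          | 0.6675     | 0.5049   |             | 10
CIFAR10  | FedHKD   | 0.6254         | 0.6816          | 0.6671     | 0.5213   |             | 10
CIFAR100 | FedAvg   | 0.2361         | 0.2625          | 0.2658     | 0.2131   |             | 10
CIFAR100 | FedProx  | 0.2332         | 0.2814          | 0.2955     | 0.2267   |             | 10
CIFAR100 | Moon     | 0.2353         | 0.2729          | 0.2428     | 0.2141   |             | 10
CIFAR100 | FedAlign | 0.2467         | 0.2617          | 0.2854     | 0.2281   |             | 10
CIFAR100 | FedGen   | 0.2393         | 0.2701          | 0.2739     | 0.2176   |             | 10
CIFAR100 | FedMD    | 0.2681         | 0.3054          | 0.3293     | 0.2323   |             | 10
CIFAR100 | FedProto | 0.2568         | 0.3188          | 0.3170     | 0.2121   |             | 10
CIFAR100 | FedHKD*  | 0.2551         | 0.2997          | 0.3016     | 0.2286   |             | 10
CIFAR100 | FedHKD   | 0.2981         | 0.3245          | 0.3375     | 0.2369   |             | 10
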
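  • FedHKD generally outperforms the baselines in both local and global accuracy on SVHN, CIFAR10, and CIFAR100.
  • On SVHN, FedHKD improves local accuracy by up to 20 percentage points and global accuracy by up to 39 percentage points relative to FedAvg.
  • FedHKD often ranks first or second in accuracy even though, unlike FedMD and FedGen, it uses neither public data nor a generative model.
  • FedHKD adds only a small per-round training-time overhead relative to FedAvg; the overhead comes from the additional regularization terms.
  • FedHKD* (FedHKD without the feature-extractor constraint) still outperforms FedProto in global accuracy.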


This review was generated by AI and checked by a human editor.