QUICK REVIEW

[Paper Review] The Best of Both Worlds: Accurate Global and Personalized Models through Federated Learning with Data-Free Hyper-Knowledge Distillation

Huancheng Chen, Johnny|arXiv (Cornell University)|Jan 21, 2023

Privacy-Preserving Technologies in Data|Cited by 11

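One-Sentence Summary

FedHKD shares hyper-knowledge (mean representations and soft predictions) to achieve both personalization and global-model improvement in federated learning without a public dataset or a generative model, and it demonstrates strong performance on heterogeneous data.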

ABSTRACT

Heterogeneity of data distributed across clients limits the performance of global models trained through federated learning, especially in settings with highly imbalanced class distributions of local datasets. In recent years, personalized federated learning (pFL) has emerged as a potential solution to the challenges presented by heterogeneous data. However, existing pFL methods typically enhance the performance of local models at the expense of the global model's accuracy. We propose FedHKD (Federated Hyper-Knowledge Distillation), a novel FL algorithm in which clients rely on knowledge distillation (KD) to train local models. In particular, each client extracts and sends to the server the means of local data representations and the corresponding soft predictions, information that we refer to as "hyper-knowledge". The server aggregates this information and broadcasts it to the clients in support of local training. Notably, unlike other KD-based pFL methods, FedHKD does not rely on a public dataset, nor does it deploy a generative model at the server. We analyze convergence of FedHKD and conduct extensive experiments on visual datasets in a variety of scenarios, demonstrating that FedHKD provides significant improvement in both personalized and global model performance compared to state-of-the-art FL methods designed for heterogeneous data settings.

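Research Motivation and Objectives

Address the performance degradation of the global model on heterogeneous client data.
Obtain a strong personalized model for each client without sacrificing global accuracy.
Propose a privacy-preserving, data-free knowledge distillation mechanism.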

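Proposed Method

Each client computes, for every class, the mean data representation and the mean soft prediction (the hyper-knowledge).
The server aggregates the hyper-knowledge under differential-privacy protection and broadcasts it for use in the next round.
Local training uses a three-term loss: cross-entropy, proximity to the global soft predictions, and proximity of the local representations to the global representations.
Before sharing, the hyper-knowledge is privatized with the Gaussian mechanism.
No public dataset or server-side generative model is required.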
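
The steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the sample-count weighting at the server, the squared-L2 form of the two proximity terms, the coefficients `lam` and `gamma`, and the choice to add noise only to the mean representations (with no clipping or noise calibration) are all assumptions made for illustration. The sketch only evaluates the loss; it does not perform gradient updates.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def client_hyper_knowledge(features, logits, labels, num_classes, sigma=0.0, rng=None):
    """Per-class mean representation and mean soft prediction for one client.

    Assumption: Gaussian noise with standard deviation `sigma` is added to the
    mean representation only; clipping and privacy calibration are omitted.
    """
    if rng is None:
        rng = np.random.default_rng()
    probs = softmax(logits)
    hk = {}
    for c in range(num_classes):
        mask = labels == c
        n = int(mask.sum())
        if n == 0:
            continue  # this client holds no samples of class c
        mean_rep = features[mask].mean(axis=0)
        mean_rep = mean_rep + rng.normal(0.0, sigma, size=mean_rep.shape)
        hk[c] = (mean_rep, probs[mask].mean(axis=0), n)
    return hk

def server_aggregate(client_hks, num_classes):
    """Combine client hyper-knowledge per class (assumed: weighted by sample count)."""
    global_hk = {}
    for c in range(num_classes):
        entries = [hk[c] for hk in client_hks if c in hk]
        if not entries:
            continue
        weights = np.array([n for _, _, n in entries], dtype=float)
        weights /= weights.sum()
        rep = sum(w * r for w, (r, _, _) in zip(weights, entries))
        pred = sum(w * q for w, (_, q, _) in zip(weights, entries))
        global_hk[c] = (rep, pred)
    return global_hk

def local_loss(features, logits, labels, global_hk, lam=0.05, gamma=0.05):
    """Cross-entropy plus two proximity terms (assumed squared-L2 distances)."""
    probs = softmax(logits)
    n = len(labels)
    ce = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    pred_term, rep_term = 0.0, 0.0
    for c, (g_rep, g_pred) in global_hk.items():
        mask = labels == c
        if not mask.any():
            continue
        pred_term += np.sum((probs[mask].mean(axis=0) - g_pred) ** 2)
        rep_term += np.sum((features[mask].mean(axis=0) - g_rep) ** 2)
    return ce + lam * pred_term + gamma * rep_term
```

In this sketch a client would call `client_hyper_knowledge` on the features and logits of its local data, the server would pass the collected dictionaries to `server_aggregate`, and each client would then evaluate `local_loss` against the broadcast result during the next round.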

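Experimental Results

Research Questions

RQ1: Can FedHKD improve both local (personalized) and global model accuracy under highly heterogeneous data?
RQ2: How does data-free hyper-knowledge distillation affect convergence and privacy?
RQ3: Under non-IID conditions, how does FedHKD perform compared with state-of-the-art KD-based and non-KD federated learning methods?

Main Findings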

Dataset	Scheme	Local accuracy	Global accuracy	Parameters (M)	Time (s)	Public data	Number of clients
SVHN	FedAvg	0.6766	0.7329	0.6544	0.4948	No	10
SVHN	FedProx	0.6927	0.6717	0.6991	0.5191	No	10
SVHN	Moon	0.6602	0.7085	0.7192	0.4883	No	10
SVHN	FedAlign	0.7675	0.7920	0.7656	0.6426	No	10
SVHN	FedGen	0.5788	0.5658	0.4679	0.3622	Yes	10
SVHN	FedMD	0.8038	0.8086	0.7912	0.6812	Yes	10
SVHN	FedProto	0.8071	0.8148	0.8039	0.6064	No	10
SVHN	FedHKD*	0.8064	0.8157	0.8072	0.6405	No	10
SVHN	FedHKD	0.8086	0.8381	0.7891	0.6781	No	10
CIFAR10	FedAvg	0.5950	0.6261	0.5825	0.4741	No	10
CIFAR10	FedProx	0.5981	0.6295	0.6490	0.4793	No	10
CIFAR10	Moon	0.5901	0.6482	0.5513	0.4579	No	10
CIFAR10	FedAlign	0.5948	0.6023	0.6402	0.4976	No	10
CIFAR10	FedGen	0.5879	0.6395	0.6533	0.4800	No	10
CIFAR10	FedMD	0.6147	0.6666	0.6533	0.5088	Yes	10
CIFAR10	FedProto	0.6131	0.6505	0.5939	0.5012	No	10
CIFAR10	FedHKD*	0.6227	0.6515	0.6675	0.5049	No	10
CIFAR10	FedHKD	0.6254	0.6816	0.6671	0.5213	No	10
CIFAR100	FedAvg	0.2361	0.2625	0.2658	0.2131	No	10
CIFAR100	FedProx	0.2332	0.2814	0.2955	0.2267	No	10
CIFAR100	Moon	0.2353	0.2729	0.2428	0.2141	No	10
CIFAR100	FedAlign	0.2467	0.2617	0.2854	0.2281	No	10
CIFAR100	FedGen	0.2393	0.2701	0.2739	0.2176	No	10
CIFAR100	FedMD	0.2681	0.3054	0.3293	0.2323	Yes	10
CIFAR100	FedProto	0.2568	0.3188	0.3170	0.2121	No	10
CIFAR100	FedHKD*	0.2551	0.2997	0.3016	0.2286	No	10
CIFAR100	FedHKD	0.2981	0.3245	0.3375	0.2369	No	10

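FedHKD generally outperforms the baselines in both local and global accuracy on SVHN, CIFAR10, and CIFAR100.
On SVHN, FedHKD improves local accuracy by up to 20 percentage points and global accuracy by up to 39 percentage points relative to FedAvg.
Without a public dataset or a generative model (unlike FedMD and FedGen), FedHKD typically ranks first or second in accuracy.
FedHKD adds only a small per-round training-time overhead relative to FedAvg, attributable to the additional regularization terms.
FedHKD* (without the feature-extractor constraint) still outperforms FedProto in global accuracy.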


This review was generated by AI and reviewed by a human editor.