QUICK REVIEW

[Paper Review] The Best of Both Worlds: Accurate Global and Personalized Models through Federated Learning with Data-Free Hyper-Knowledge Distillation

Huancheng Chen, Johnny|arXiv (Cornell University)|Jan 21, 2023

Privacy-Preserving Technologies in Data|Citations: 11

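One-Line Summary

FedHKD shares hyper-knowledge (mean representations and soft predictions) without public data or a generative model, which improves both personalized and global models in federated learning and yields strong performance on heterogeneous data.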

ABSTRACT

Heterogeneity of data distributed across clients limits the performance of global models trained through federated learning, especially in the settings with highly imbalanced class distributions of local datasets. In recent years, personalized federated learning (pFL) has emerged as a potential solution to the challenges presented by heterogeneous data. However, existing pFL methods typically enhance performance of local models at the expense of the global model's accuracy. We propose FedHKD (Federated Hyper-Knowledge Distillation), a novel FL algorithm in which clients rely on knowledge distillation (KD) to train local models. In particular, each client extracts and sends to the server the means of local data representations and the corresponding soft predictions -- information that we refer to as "hyper-knowledge". The server aggregates this information and broadcasts it to the clients in support of local training. Notably, unlike other KD-based pFL methods, FedHKD does not rely on a public dataset, nor does it deploy a generative model at the server. We analyze convergence of FedHKD and conduct extensive experiments on visual datasets in a variety of scenarios, demonstrating that FedHKD provides significant improvement in both personalized as well as global model performance compared to state-of-the-art FL methods designed for heterogeneous data settings.

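Motivation and Objectives

Address the degradation of global model performance caused by heterogeneous client data.
Achieve strong personalized models without sacrificing the accuracy of the global model.
Propose a privacy-preserving, data-free knowledge distillation mechanism.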

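Proposed Method

Each client computes the class-wise mean data representation and mean soft prediction (the hyper-knowledge).
The server aggregates the hyper-knowledge received from the clients and broadcasts the result for use in the next round.
Local training uses a three-term loss: cross-entropy, proximity to the global soft predictions, and proximity between local and global representations.
Hyper-knowledge is privatized with the Gaussian mechanism (differential privacy) before it is shared.
No public dataset or server-side generative model is required.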
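
The steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the sample-count weighting at the server, the squared-distance penalties, and the weights `lam` and `gamma` are all hypothetical choices made for the example. The noise step only shows where Gaussian perturbation would be applied; it omits the clipping and sensitivity calibration that a real differential-privacy guarantee requires.

```python
import math
import random


def class_means(vectors, labels):
    """Average the vectors that share a label: {label: mean_vector}."""
    sums, counts = {}, {}
    for vec, lab in zip(vectors, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for j, v in enumerate(vec):
            acc[j] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def add_gaussian_noise(means, sigma, rng):
    """Client side: perturb every class mean with N(0, sigma^2) noise before sharing."""
    return {lab: [v + rng.gauss(0.0, sigma) for v in vec] for lab, vec in means.items()}


def aggregate(client_means, client_counts):
    """Server side: per-class average across clients.

    Weighting by each client's per-class sample count is an assumption.
    """
    totals, weights = {}, {}
    for means, counts in zip(client_means, client_counts):
        for lab, vec in means.items():
            acc = totals.setdefault(lab, [0.0] * len(vec))
            for j, v in enumerate(vec):
                acc[j] += counts[lab] * v
            weights[lab] = weights.get(lab, 0) + counts[lab]
    return {lab: [s / weights[lab] for s in acc] for lab, acc in totals.items()}


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def local_loss(logits, rep, label, global_reps, global_preds, lam=0.05, gamma=0.05):
    """Three-term loss for one sample (illustrative form, hypothetical weights).

    Cross-entropy, plus a pull of the soft prediction toward the global soft
    prediction of the sample's class, plus a pull of the representation
    toward the global mean representation of that class.
    """
    probs = softmax(logits)
    ce = -math.log(probs[label])
    pred_term = sq_dist(probs, global_preds[label])
    rep_term = sq_dist(rep, global_reps[label])
    return ce + lam * pred_term + gamma * rep_term
```

In a round, each client would call `class_means` once on its representations and once on its soft predictions, pass both through `add_gaussian_noise`, and upload the results with its per-class sample counts. The server would then call `aggregate` on each and broadcast the two dictionaries that `local_loss` consumes.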

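Experimental Results

Research Questions

RQ1: Can FedHKD simultaneously improve local (personalized) and global model accuracy under highly heterogeneous data?
RQ2: How does data-free hyper-knowledge distillation affect convergence and privacy?
RQ3: How does FedHKD perform under non-IID conditions compared with state-of-the-art KD-based and non-KD FL methods?

Key Findings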

Dataset	Scheme	Local Acc	Global Acc	Params (M)	Time (s)	Pub Data	# Clients
SVHN	FedAvg	0.6766	0.7329	0.6544	0.4948	No	10
SVHN	FedProx	0.6927	0.6717	0.6991	0.5191	No	10
SVHN	Moon	0.6602	0.7085	0.7192	0.4883	No	10
SVHN	FedAlign	0.7675	0.7920	0.7656	0.6426	No	10
SVHN	FedGen	0.5788	0.5658	0.4679	0.3622	No	10
SVHN	FedMD	0.8038	0.8086	0.7912	0.6812	Yes	10
SVHN	FedProto	0.8071	0.8148	0.8039	0.6064	No	10
SVHN	FedHKD*	0.8064	0.8157	0.8072	0.6405	No	10
SVHN	FedHKD	0.8086	0.8381	0.7891	0.6781	No	10
CIFAR10	FedAvg	0.5950	0.6261	0.5825	0.4741	No	10
CIFAR10	FedProx	0.5981	0.6295	0.6490	0.4793	No	10
CIFAR10	Moon	0.5901	0.6482	0.5513	0.4579	No	10
CIFAR10	FedAlign	0.5948	0.6023	0.6402	0.4976	No	10
CIFAR10	FedGen	0.5879	0.6395	0.6533	0.4800	No	10
CIFAR10	FedMD	0.6147	0.6666	0.6533	0.5088	Yes	10
CIFAR10	FedProto	0.6131	0.6505	0.5939	0.5012	No	10
CIFAR10	FedHKD*	0.6227	0.6515	0.6675	0.5049	No	10
CIFAR10	FedHKD	0.6254	0.6816	0.6671	0.5213	No	10
CIFAR100	FedAvg	0.2361	0.2625	0.2658	0.2131	No	10
CIFAR100	FedProx	0.2332	0.2814	0.2955	0.2267	No	10
CIFAR100	Moon	0.2353	0.2729	0.2428	0.2141	No	10
CIFAR100	FedAlign	0.2467	0.2617	0.2854	0.2281	No	10
CIFAR100	FedGen	0.2393	0.2701	0.2739	0.2176	No	10
CIFAR100	FedMD	0.2681	0.3054	0.3293	0.2323	Yes	10
CIFAR100	FedProto	0.2568	0.3188	0.3170	0.2121	No	10
CIFAR100	FedHKD*	0.2551	0.2997	0.3016	0.2286	No	10
CIFAR100	FedHKD	0.2981	0.3245	0.3375	0.2369	No	10

FedHKD generally outperforms the baselines in both local and global accuracy on SVHN, CIFAR10, and CIFAR100.
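On SVHN, FedHKD improves on the weakest baseline (FedGen) by about 23 points in local accuracy (0.8086 vs. 0.5788) and about 27 points in global accuracy (0.8381 vs. 0.5658), per the Local Acc and Global Acc columns of the table.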
FedHKD often achieves the best or second-best accuracy without using public data or a generative model (in contrast to FedMD and FedGen).
The additional regularization terms in FedHKD add only a modest increase in per-round training time.
FedHKD* (without the feature extractor constraint) still edges out FedProto in global accuracy on SVHN and CIFAR10, though not on CIFAR100.
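
The margins behind these findings can be recomputed directly from the table. The sketch below copies the SVHN rows (Local Acc and Global Acc columns only) and reports how far FedHKD sits above each other scheme, in percentage points. The helper name `gaps_in_points` is hypothetical, and the same dictionary layout would work for the CIFAR10 and CIFAR100 rows.

```python
# (Local Acc, Global Acc) for SVHN, copied from the table above.
svhn = {
    "FedAvg": (0.6766, 0.7329),
    "FedProx": (0.6927, 0.6717),
    "Moon": (0.6602, 0.7085),
    "FedAlign": (0.7675, 0.7920),
    "FedGen": (0.5788, 0.5658),
    "FedMD": (0.8038, 0.8086),
    "FedProto": (0.8071, 0.8148),
    "FedHKD*": (0.8064, 0.8157),
    "FedHKD": (0.8086, 0.8381),
}


def gaps_in_points(results, method="FedHKD"):
    """Return {scheme: (local_gap, global_gap)} of `method` over every other scheme.

    Gaps are expressed in percentage points (accuracy difference times 100).
    """
    ref_local, ref_global = results[method]
    return {
        name: (round((ref_local - loc) * 100, 2), round((ref_global - glob) * 100, 2))
        for name, (loc, glob) in results.items()
        if name != method
    }


if __name__ == "__main__":
    for name, (local_gap, global_gap) in gaps_in_points(svhn).items():
        print(f"{name:9s} local +{local_gap:.2f}  global +{global_gap:.2f}")
```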


This review was generated by AI and checked by a human editor.