QUICK REVIEW

[Paper Review] A Survey of What to Share in Federated Learning: Perspectives on Model Utility, Privacy Leakage, and Communication Efficiency

Jiawei Shao, Zijian Li | arXiv (Cornell University) | Jul 20, 2023

Privacy-Preserving Technologies in Data | Citations: 12

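One-sentence summary

A comprehensive taxonomy of what FL methods share (models, synthetic data, or knowledge), an analysis of the associated privacy risks and defenses, and an empirical comparison of utility and communication cost across methods.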

ABSTRACT

Federated learning (FL) has emerged as a secure paradigm for collaborative training among clients. Without data centralization, FL allows clients to share local information in a privacy-preserving manner. This approach has gained considerable attention, promoting numerous surveys to summarize the related works. However, the majority of these surveys concentrate on FL methods that share model parameters during the training process, while overlooking the possibility of sharing local information in other forms. In this paper, we present a systematic survey from a new perspective of what to share in FL, with an emphasis on the model utility, privacy leakage, and communication efficiency. First, we present a new taxonomy of FL methods in terms of three sharing methods, which respectively share model, synthetic data, and knowledge. Second, we analyze the vulnerability of different sharing methods to privacy attacks and review the defense mechanisms. Third, we conduct extensive experiments to compare the learning performance and communication overhead of various sharing methods in FL. Besides, we assess the potential privacy leakage through model inversion and membership inference attacks, while comparing the effectiveness of various defense approaches. Finally, we identify future research directions and conclude the survey.

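Motivation and objectives

Propose a new taxonomy of FL sharing methods based on the type of information shared (model, synthetic data, or knowledge).
Analyze privacy leakage risks and defense mechanisms for each category of sharing method.
Empirically compare performance and communication overhead across sharing methods, and evaluate vulnerability to privacy attacks (model inversion and membership inference).
Discuss the shortcomings of, and future directions for, privacy-preserving and efficient FL systems.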

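Proposed approach

Introduce a taxonomy that divides FL sharing into model sharing, synthetic data sharing, and knowledge sharing.
Review and categorize representative FL methods by what they share (full models, quantized models, sparse models, generated synthetic data, distillation signals, proxy datasets).
For each sharing type, survey the privacy attack surface (gradient and inversion attacks, membership inference, attribute inference) and the defense mechanisms.
Run experiments comparing the performance and communication overhead of representative sharing methods.
Evaluate privacy leakage risk under attack and assess the effectiveness of defenses.
Synthesize insights and future directions from the literature and the empirical results.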
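
The model-sharing variants named above (full, sparse, and quantized models) can be illustrated with a minimal sketch. It assumes FedAvg-style weighted averaging for the full model, top-k magnitude sparsification for the sparse model, and uniform quantization for the quantized model. These are common choices picked here for illustration, not necessarily the exact methods the survey evaluates, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple


def fedavg(client_weights: List[List[float]], client_sizes: List[int]) -> List[float]:
    """Full-model sharing: average client parameter vectors,
    weighted by each client's local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def top_k_sparsify(update: List[float], k: int) -> Dict[int, float]:
    """Sparse-model sharing: keep only the k largest-magnitude entries,
    returned as an {index: value} mapping."""
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return {i: update[i] for i in sorted(ranked[:k])}


def quantize_uniform(update: List[float], bits: int) -> Tuple[List[int], float, float]:
    """Quantized-model sharing: map each value to one of 2**bits evenly
    spaced levels. Returns (integer codes, minimum value, step size)."""
    lo, hi = min(update), max(update)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in update]
    return codes, lo, step


def dequantize(codes: List[int], lo: float, step: float) -> List[float]:
    """Server-side reconstruction of a quantized update."""
    return [lo + c * step for c in codes]
```

In this sketch a client would send either the full vector, the sparse mapping, or the integer codes plus two floats; the server reconstructs an approximate update before averaging.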

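Experimental results

Research questions

RQ1: What are the three main sharing modalities in federated learning, and how do they affect model utility, privacy leakage, and communication efficiency?
RQ2: How do privacy attacks (e.g., gradient inversion, membership inference, attribute inference) affect the different sharing methods, and which defenses mitigate these risks?
RQ3: Under non-IID data and privacy constraints, how do the sharing methods compare empirically in model performance and communication cost?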
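
To make the membership inference threat in RQ2 concrete, here is a minimal sketch of a loss-threshold attack: a sample is guessed to be a training member when the model's loss on it falls below a threshold. This is a generic baseline chosen for illustration, not necessarily the attack used in the survey's experiments; the function names and the idea of a fixed threshold are assumptions.

```python
import math
from typing import List


def cross_entropy(probs: List[float], label: int) -> float:
    """Per-sample cross-entropy loss from predicted class probabilities."""
    return -math.log(max(probs[label], 1e-12))


def loss_threshold_attack(losses: List[float], threshold: float) -> List[bool]:
    """Guess 'member' (True) when the loss is below the threshold,
    since models tend to fit their training samples more closely."""
    return [loss < threshold for loss in losses]


def attack_accuracy(
    member_losses: List[float], nonmember_losses: List[float], threshold: float
) -> float:
    """Fraction of correct membership guesses over both groups."""
    member_guesses = loss_threshold_attack(member_losses, threshold)
    nonmember_guesses = loss_threshold_attack(nonmember_losses, threshold)
    correct = sum(member_guesses) + sum(not g for g in nonmember_guesses)
    return correct / (len(member_losses) + len(nonmember_losses))
```

An accuracy near 0.5 on balanced member and non-member sets would suggest little leakage under this particular attack, while values well above 0.5 suggest the shared artifact reveals membership.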

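Key findings

The three-category taxonomy clarifies the trade-offs among model utility, privacy, and efficiency across sharing types.
Knowledge sharing and synthetic data sharing can reduce communication cost and improve privacy, but may hurt model performance when the data are strongly non-IID.
Privacy attacks can leak information under every sharing method, and the effectiveness of defense mechanisms varies by method.
The empirical results show marked differences in communication overhead and convergence behavior among model sharing, synthetic data sharing, and knowledge sharing strategies.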
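
The communication-overhead differences noted above follow from what each method puts on the wire. The sketch below estimates per-round upload size for each sharing type under simple assumptions: dense 32-bit parameters for model sharing, index-value pairs for sparse updates, per-class logits on a shared proxy set for knowledge sharing, and raw 8-bit values for synthetic samples. All function names and default bit widths are hypothetical, and real systems add protocol and encoding overhead that is ignored here.

```python
def model_payload_bytes(num_params: int, bits_per_param: int = 32) -> int:
    """Model sharing: every parameter is uploaded."""
    return num_params * bits_per_param // 8


def sparse_payload_bytes(k: int, value_bits: int = 32, index_bits: int = 32) -> int:
    """Sparse model sharing: k values plus their indices."""
    return k * (value_bits + index_bits) // 8


def knowledge_payload_bytes(
    num_proxy_samples: int, num_classes: int, bits_per_logit: int = 32
) -> int:
    """Knowledge sharing: one logit vector per proxy sample."""
    return num_proxy_samples * num_classes * bits_per_logit // 8


def synthetic_payload_bytes(
    num_samples: int, values_per_sample: int, bits_per_value: int = 8
) -> int:
    """Synthetic data sharing: the generated samples themselves."""
    return num_samples * values_per_sample * bits_per_value // 8
```

For example, under these assumptions a 1,000,000-parameter model costs 4,000,000 bytes per upload, while logits for 5,000 proxy samples over 10 classes cost 200,000 bytes. The model size and proxy-set size are illustrative numbers, not figures from the survey.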

This review was generated by AI and checked by a human editor.