QUICK REVIEW

[Paper Review] Eavesdrop the Composition Proportion of Training Labels in Federated Learning

Lixu Wang, Shichao Xu|arXiv (Cornell University)|Oct 14, 2019

Privacy-Preserving Technologies in Data | References: 52 | Citations: 34

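One-Sentence Summary

This paper proposes three inference attacks (Class Sniffing, Quantity Inference, and Whole Determination) that reveal the composition proportion of training labels in federated learning without observing individual client updates, and argues that the attacks remain feasible even under secure aggregation and differential privacy.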

ABSTRACT

Federated learning (FL) has recently emerged as a new form of collaborative machine learning, where a common model can be learned while keeping all the training data on local devices. Although it is designed for enhancing the data privacy, we demonstrated in this paper a new direction in inference attacks in the context of FL, where valuable information about training data can be obtained by adversaries with very limited power. In particular, we proposed three new types of attacks to exploit this vulnerability. The first type of attack, Class Sniffing, can detect whether a certain label appears in training. The other two types of attacks can determine the quantity of each label, i.e., Quantity Inference attack determines the composition proportion of the training label owned by the selected clients in a single round, while Whole Determination attack determines that of the whole training process. We evaluated our attacks on a variety of tasks and datasets with different settings, and the corresponding results showed that our attacks work well generally. Finally, we analyzed the impact of major hyper-parameters to our attacks and discussed possible defenses.

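Motivation and Objectives

Identify a new class of privacy vulnerability in federated learning: inferring the quantitative composition of training labels.
Propose three attacks (Class Sniffing, Quantity Inference, Whole Determination) that do not depend on observing individual gradient updates.
Demonstrate effectiveness across a variety of tasks and datasets, and discuss the impact of hyper-parameters and possible defenses.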

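Proposed Method

Class Sniffing: infers whether a given label appears in a training round by analyzing the updates to the incoming connections of the output neurons.
Quantity Inference: compares the magnitudes of positive and negative weight updates and removes the offset effect to estimate the number of clients that own a given label.
Whole Determination: uses a ratio metric and clustering of derived features to assess the label composition proportion over the whole training process.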
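The intuition behind Class Sniffing can be sketched with a toy model. This is a simplified illustration, not the paper's exact procedure: it assumes a softmax output layer trained with cross-entropy, non-negative input features (as after a ReLU), zero-initialized weights, single-label clients, and plain averaging of one local SGD step per client. The names `local_update` and `sniff_classes` are hypothetical. Under these assumptions, the incoming weights of an output neuron increase on net only when its label appears in the round, so the sign of the aggregated update leaks label presence.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def local_update(W, X, y, lr=0.1):
    """One local SGD step of softmax regression; returns the weight delta."""
    num_classes = W.shape[0]
    probs = softmax(X @ W.T)                 # shape (n, K)
    onehot = np.eye(num_classes)[y]          # shape (n, K)
    grad = (probs - onehot).T @ X / len(y)   # shape (K, d)
    return -lr * grad

def sniff_classes(global_delta):
    """Flag labels whose output-neuron incoming weights increased on net."""
    return np.flatnonzero(global_delta.sum(axis=1) > 0)

rng = np.random.default_rng(0)
num_classes, dim = 5, 16
W_global = np.zeros((num_classes, dim))

# Hypothetical round: four clients, each holding samples of a single label.
client_labels = [1, 1, 3, 3]
deltas = []
for label in client_labels:
    X = rng.uniform(0.5, 1.0, size=(8, dim))  # non-negative features
    y = np.full(8, label)
    deltas.append(local_update(W_global, X, y))

# The attacker observes only the averaged update, never a single client's delta.
global_delta = np.mean(deltas, axis=0)
detected = sniff_classes(global_delta)
print(detected.tolist())  # [1, 3]
```

In this toy setting the row sums of the aggregated update also grow with the number of clients holding each label, which hints at why a quantity estimate is possible once the negative contribution from clients holding other labels (the offset) is accounted for.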

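Experimental Results

Research Questions

RQ1: Can an attacker determine whether a given label is present in a single FL training round without observing individual updates?
RQ2: Can an attacker, without access to plaintext updates, estimate how many clients own each label (the quantity composition), both in a single round and over the whole training process?
RQ3: Can the composition proportion of training labels be estimated robustly against aggregation protections such as secure aggregation and differential privacy?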

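Key Findings

Three new label-quantity inference attacks achieve high success rates in detecting label presence and estimating label counts.
The attacks remain effective under secure aggregation and differential privacy because they rely only on global model updates and auxiliary data.
The quantitative methods expose label composition both in a single round and across all training rounds, revealing a new dimension of privacy risk in FL.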
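The finding about secure aggregation can be illustrated with a small sketch. It assumes a pairwise-masking style of secure aggregation, which is one common construction and not necessarily the scheme evaluated in the paper; the client updates here are random stand-ins. The masks hide each individual update from the server but cancel in the sum, so the aggregated update, which is all these attacks need, is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
num_clients, shape = 4, (5, 16)

# Hypothetical per-client updates (stand-ins for real local weight deltas).
updates = [rng.normal(size=shape) for _ in range(num_clients)]

# Pairwise masking: for each pair (i, j), client i adds a shared mask
# and client j subtracts the same mask.
masked = [u.copy() for u in updates]
for i in range(num_clients):
    for j in range(i + 1, num_clients):
        mask = rng.normal(size=shape)
        masked[i] += mask
        masked[j] -= mask

plain_sum = np.sum(updates, axis=0)
masked_sum = np.sum(masked, axis=0)

# Individual masked updates look nothing like the originals,
# yet the aggregate matches up to floating-point error.
print(np.allclose(masked[0], updates[0]))  # False
print(np.allclose(plain_sum, masked_sum))  # True
```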


This review was generated by AI and checked by a human editor.