QUICK REVIEW

[Paper Review] Data Poisoning Attacks Against Federated Learning Systems

Vale Tolpegin, Stacey Truex|arXiv (Cornell University)|Jul 16, 2020

Adversarial Robustness in Machine Learning | References: 56 | Cited by: 51

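One-Sentence Summary

This paper demonstrates targeted data poisoning in federated learning via label flipping, shows that global model accuracy and source-class recall drop significantly, and proposes a PCA-based defense to identify malicious updates.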

ABSTRACT

Federated learning (FL) is an emerging paradigm for distributed training of large-scale deep neural networks in which participants' data remains on their own devices with only model updates being shared with a central server. However, the distributed nature of FL gives rise to new threats caused by potentially malicious participants. In this paper, we study targeted data poisoning attacks against FL systems in which a malicious subset of the participants aim to poison the global model by sending model updates derived from mislabeled data. We first demonstrate that such data poisoning attacks can cause substantial drops in classification accuracy and recall, even with a small percentage of malicious participants. We additionally show that the attacks can be targeted, i.e., they have a large negative impact only on classes that are under attack. We also study attack longevity in early/late round training, the impact of malicious participant availability, and the relationships between the two. Finally, we propose a defense strategy that can help identify malicious participants in FL to circumvent poisoning attacks, and demonstrate its effectiveness.

Motivation and Objectives

Motivate the study of privacy-preserving distributed learning and its vulnerability to malicious participants.
Characterize targeted data poisoning attacks in federated learning using label flipping.
Evaluate attack impact under varying attacker presence, timing, and availability.
Propose and empirically validate a defense strategy to identify malicious participants in FL.

Proposed Method

Formalize a federated learning setup with an honest aggregator and multiple participants holding local data.
Use label flipping as the poisoning strategy: malicious participants relabel their local examples of a source class (src) as a target class (target).
Evaluate impact on CIFAR-10 and Fashion-MNIST using CNN architectures under varying malicious-participant rates.
Analyze attack timing (early vs late poisoning) and malicious-participant availability on attack efficacy.
Propose a defense where the aggregator detects malicious updates by extracting update components, applying PCA for dimensionality reduction, and identifying outliers.
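
The label-flipping step in the list above can be illustrated with a minimal sketch. It assumes labels are plain integer class ids held in a Python list; the function name `flip_labels` and the example class ids are hypothetical and are not taken from the paper. Only the malicious participant's local labels change; the features and the training procedure stay as they are.

```python
def flip_labels(labels, src, target):
    """Return a copy of `labels` with every `src` label replaced by `target`.

    Hypothetical helper for illustration; honest participants never call it.
    """
    return [target if y == src else y for y in labels]


# Toy local label set for one malicious participant (class ids are illustrative).
local_labels = [0, 5, 3, 5, 9, 5, 1]
poisoned_labels = flip_labels(local_labels, src=5, target=3)

print(poisoned_labels)  # [0, 3, 3, 3, 9, 3, 1]
```

A malicious participant would then train locally on `poisoned_labels` and send the resulting model update to the aggregator like any other participant.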

Experimental Results

Research Questions

RQ1: How effective are targeted label flipping attacks in federated learning at reducing global model utility?
RQ2: Do label flipping attacks preferentially degrade specific source/target classes versus remaining classes?
RQ3: How do attack timing and malicious-participant availability affect attack impact in FL?
RQ4: Can a PCA-based defense reliably distinguish malicious updates from honest ones to mitigate poisoning?
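
RQ1 and RQ2 are framed in terms of overall accuracy and per-class recall. The sketch below shows how the two metrics can diverge under a targeted attack. The predictions are made-up toy values, not results from the paper, and `class_recall` is a hypothetical helper name.

```python
def class_recall(y_true, y_pred, cls):
    """Fraction of examples whose true label is `cls` that are predicted as `cls`."""
    hits = [p == cls for t, p in zip(y_true, y_pred) if t == cls]
    return sum(hits) / len(hits) if hits else float("nan")


# Toy predictions from a poisoned global model (illustrative values only):
# most true-5 examples (source class) are misclassified as 3 (target class).
y_true = [5, 5, 5, 5, 3, 3, 1, 1]
y_pred = [3, 3, 5, 3, 3, 3, 1, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
source_recall = class_recall(y_true, y_pred, cls=5)
other_recall = class_recall(y_true, y_pred, cls=1)

print(accuracy, source_recall, other_recall)  # 0.625 0.25 1.0
```

In this toy case the source class loses most of its recall while an unrelated class is untouched, which is the pattern RQ2 asks about.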

Key Findings

Label flipping attacks can substantially reduce global model accuracy and source-class recall even with a minority of malicious participants.
Attacks tend to be targeted, causing large declines in source and target class recalls while leaving other classes relatively preserved.
Late-round poisoning and higher malicious-participant availability increase attack effectiveness, while the model can recover after poisoning ends in many scenarios.
A defense using PCA on processed update components can separate malicious updates from honest ones, enabling the aggregator to identify and block attackers.
Attack impact is more pronounced on the targeted source class recall than on other metrics, indicating a focused poisoning effect.
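
The PCA-based separation of malicious and honest updates mentioned in the findings can be sketched as follows. This is an illustration under several assumptions, not the authors' implementation: each row stands in for the components extracted from one participant's update, the numbers are made up, only the first principal component is computed (via power iteration), and the rule of flagging the smaller side of the largest gap is a simplification that presumes attackers are a minority. The function names are hypothetical.

```python
import math


def first_pc_scores(rows, iters=100):
    """Project each row onto the first principal component of `rows`."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d) of the centered updates.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration, started from the centered row with the largest norm.
    v = max(centered, key=lambda c: sum(x * x for x in c))
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all updates identical: nothing to separate
            return [0.0] * n
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


def flag_minority_cluster(scores):
    """Split the scores at their largest gap and return the smaller side's indices."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    gaps = [scores[order[k + 1]] - scores[order[k]] for k in range(len(order) - 1)]
    cut = gaps.index(max(gaps)) + 1
    low, high = order[:cut], order[cut:]
    return sorted(low if len(low) < len(high) else high)


# Made-up update components: rows 0-5 play honest participants, rows 6-7 attackers.
updates = [
    [0.10, 0.21, -0.05, 0.02],
    [0.12, 0.19, -0.04, 0.01],
    [0.09, 0.20, -0.06, 0.03],
    [0.11, 0.22, -0.05, 0.02],
    [0.10, 0.18, -0.03, 0.02],
    [0.13, 0.20, -0.05, 0.00],
    [-0.40, 0.60, -0.05, 0.02],
    [-0.38, 0.63, -0.04, 0.01],
]

scores = first_pc_scores(updates)
flagged = flag_minority_cluster(scores)
print(flagged)  # [6, 7]
```

Note that this toy rule always flags something, even when every participant is honest, so a real aggregator would need a threshold or a proper clustering step before blocking anyone.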


This review was generated by AI and reviewed by human editors.