QUICK REVIEW

[Paper Review] Analyzing Federated Learning through an Adversarial Lens

Arjun Nitin Bhagoji, Supriyo Chakraborty|arXiv (Cornell University)|Nov 29, 2018

Adversarial Robustness in Machine Learning|References: 30|Cited by: 385

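One-Sentence Summary

The paper shows that a single malicious federated learning agent can carry out targeted model poisoning, causing specific misclassifications even under stealth constraints and Byzantine-resilient aggregation, while the global model still converges well.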

ABSTRACT

Federated learning distributes model training among a multitude of agents, who, guided by privacy concerns, perform training using their local data but share only model parameter updates, for iterative aggregation at the server. In this work, we explore the threat of model poisoning attacks on federated learning initiated by a single, non-colluding malicious agent where the adversarial objective is to cause the model to misclassify a set of chosen inputs with high confidence. We explore a number of strategies to carry out this attack, starting with simple boosting of the malicious agent's update to overcome the effects of other agents' updates. To increase attack stealth, we propose an alternating minimization strategy, which alternately optimizes for the training loss and the adversarial objective. We follow up by using parameter estimation for the benign agents' updates to improve on attack success. Finally, we use a suite of interpretability techniques to generate visual explanations of model decisions for both benign and malicious models and show that the explanations are nearly visually indistinguishable. Our results indicate that even a highly constrained adversary can carry out model poisoning attacks while simultaneously maintaining stealth, thus highlighting the vulnerability of the federated learning setting and the need to develop effective defense strategies.

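Motivation and Objectives

Motivate and quantify the vulnerability of federated learning to model poisoning by a single, non-colluding malicious agent.
Show that targeted misclassification of chosen inputs can be achieved while the global model still converges.
Explore attack strategies based on boosting, stealth adaptation, and alternating minimization under various aggregation schemes.
Evaluate detectability through accuracy checks and weight update statistics, and analyze how well Byzantine-resilient aggregation holds up.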

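Proposed Method

Formalize a threat model for targeted model poisoning in federated learning with a single malicious agent.
Develop explicit boosting to amplify the malicious update's influence relative to the benign updates.
Introduce stealth-oriented loss terms that keep the malicious update consistent with validation accuracy and weight update statistics.
Propose an alternating minimization strategy that decouples the adversarial and stealth objectives.
Study the attack under Byzantine-resilient aggregation mechanisms such as Krum and coordinate-wise median.
Incorporate an estimation method that better predicts the other agents' updates when the malicious agent is not selected in every round.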
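
The explicit boosting step listed above can be illustrated with a minimal NumPy sketch. It assumes the server applies a plain weighted average with equal weights 1/K and that the malicious agent scales its desired change by the inverse of its own weight. The function names, the four-dimensional updates, and the adversarial direction are hypothetical and chosen only for illustration.

```python
import numpy as np


def federated_average(updates, weights):
    """Weighted average of per-agent updates (assumed server rule)."""
    updates = np.asarray(updates, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights @ updates


def boosted_update(desired_delta, own_weight):
    """Scale the malicious update by 1 / own_weight so that it
    survives the server's averaging step at full strength."""
    return np.asarray(desired_delta, dtype=float) / own_weight


num_agents = 10
weights = np.full(num_agents, 1.0 / num_agents)  # assumption: equal weights

rng = np.random.default_rng(0)
# Hypothetical benign updates: small, zero-mean perturbations.
benign = rng.normal(scale=0.01, size=(num_agents - 1, 4))

# Hypothetical change the adversary wants applied to the global model.
desired = np.array([1.0, -2.0, 0.5, 3.0])

# The malicious agent is placed last and submits the boosted update.
updates = np.vstack([benign, boosted_update(desired, weights[-1])])
global_delta = federated_average(updates, weights)

print("desired change:  ", desired)
print("aggregated change:", global_delta)
```

Under these assumptions the aggregated change equals the desired change plus the weighted sum of the benign updates, so the adversary's direction dominates whenever the benign updates are small. The same scaling is what makes a boosted update stand out in distance-based checks.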

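Experimental Results

Research Questions

RQ1: Can a single malicious federated learning agent induce targeted misclassification in the global model while preserving overall convergence?
RQ2: How effective are the boosting, stealth, and alternating minimization strategies at achieving targeted poisoning under standard and Byzantine-resilient aggregation?
RQ3: Can Byzantine-resilient mechanisms (Krum, coordinate-wise median) withstand targeted model poisoning by a single adversary?
RQ4: Can the adversary estimate the other agents' updates to raise its attack success rate when it is not selected in every round?
RQ5: Are the proposed stealth metrics (validation accuracy checks and weight update statistics) effective at detecting malicious updates?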
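
For reference, the two Byzantine-resilient aggregators named in RQ3 can be sketched as follows. Krum is written here according to its usual description: select the update with the smallest sum of squared distances to its n - f - 2 nearest neighbours, where f is the assumed number of Byzantine agents. The toy updates and the value of f are hypothetical. The example only shows both rules discarding a crude outlier and does not reproduce the paper's attack.

```python
import numpy as np


def krum(updates, num_byzantine):
    """Return the single update with the lowest Krum score."""
    updates = np.asarray(updates, dtype=float)
    n = len(updates)
    k = n - num_byzantine - 2  # neighbours counted in each score
    if k < 1:
        raise ValueError("need n - f - 2 >= 1 updates to score against")

    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = updates[:, None, :] - updates[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)

    scores = []
    for i in range(n):
        others = np.delete(sq_dists[i], i)  # drop the distance to itself
        scores.append(np.sort(others)[:k].sum())
    return updates[int(np.argmin(scores))]


def coordinate_wise_median(updates):
    """Aggregate by taking the median of each coordinate independently."""
    return np.median(np.asarray(updates, dtype=float), axis=0)


# Hypothetical two-dimensional updates: five benign, one extreme outlier.
benign = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [1.2, 1.0]])
outlier = np.array([[100.0, -100.0]])
all_updates = np.vstack([benign, outlier])

chosen = krum(all_updates, num_byzantine=1)
median = coordinate_wise_median(all_updates)

print("Krum selects:          ", chosen)
print("coordinate-wise median:", median)
```

Both rules ignore the outlier in this toy case. Updates crafted to stay close to the benign ones are a different matter, which is the situation RQ3 asks about.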

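Key Findings

Targeted model poisoning by a single malicious agent can force the global model to misclassify chosen inputs with high confidence while the model still converges to good test performance.
Explicit boosting lets the malicious update dominate the benign updates and achieves targeted misclassification (for example, 100% confidence on a Fashion-MNIST example).
Stealth metrics based on validation accuracy and weight update statistics can expose malicious updates, but stealth-enhanced attacks evade detection in many rounds.
Attacks that combine stealth with alternating minimization stay close to the distribution of benign updates and achieve a high attack success rate without triggering accuracy-based or distance-based alarms in most rounds.
Byzantine-resilient aggregation such as Krum and coordinate-wise median does not fully defend against targeted model poisoning; the attack remains effective under these schemes.
Estimating the other agents' updates (previous-step estimation) raises the attack success rate, especially when the malicious agent is not selected in every round.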
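
The distance-based side of the weight-update-statistics check mentioned in the findings can be approximated as below. This is a simplified stand-in and not the paper's exact criterion: it assumes the server flags an update whose largest L2 distance to the other updates exceeds the largest pairwise distance among those other updates, plus a tolerance. The function name, the tolerance parameter, and the toy vectors are hypothetical.

```python
import numpy as np


def flag_by_distance(candidate, others, tolerance=0.0):
    """Return True if the candidate lies farther from the other updates
    than those updates lie from one another (simplified check)."""
    candidate = np.asarray(candidate, dtype=float)
    others = np.asarray(others, dtype=float)

    # Largest pairwise L2 distance among the other updates.
    pairwise = [
        np.linalg.norm(others[i] - others[j])
        for i in range(len(others))
        for j in range(i + 1, len(others))
    ]
    benign_spread = max(pairwise)

    # Largest L2 distance from the candidate to any other update.
    candidate_spread = np.linalg.norm(others - candidate, axis=1).max()
    return bool(candidate_spread > benign_spread + tolerance)


# Hypothetical benign updates clustered around (1, 1).
others = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [1.2, 1.0]])

boosted = 10.0 * np.array([1.0, 1.0])   # crude boosted update
stealthy = np.array([1.05, 1.05])       # update kept near the benign cluster

boosted_flagged = flag_by_distance(boosted, others)
stealthy_flagged = flag_by_distance(stealthy, others)

print("boosted update flagged: ", boosted_flagged)
print("stealthy update flagged:", stealthy_flagged)
```

A check of this kind catches the crude boosted update but not one that has been constrained to stay near the benign cluster, which matches the qualitative picture in the findings above.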


This review was generated by AI and reviewed by a human editor.