QUICK REVIEW

[Paper Review] Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning

Briland Hitaj, Giuseppe Ateniese | arXiv (Cornell University) | Feb 24, 2017

Privacy-Preserving Technologies in Data | References: 67 | Cited by: 224

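One-Sentence Summary

The paper proposes a GAN-based active inference attack that lets an insider in collaborative (distributed or federated) deep learning reconstruct a victim's private training data, even when the shared parameters are obfuscated or protected with differential privacy.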

ABSTRACT

Deep Learning has recently become hugely popular in machine learning, providing significant improvements in classification accuracy in the presence of highly-structured and large databases. Researchers have also considered privacy implications of deep learning. Models are typically trained in a centralized manner with all the data being processed by the same training algorithm. If the data is a collection of users' private data, including habits, personal pictures, geographical positions, interests, and more, the centralized server will have access to sensitive information that could potentially be mishandled. To tackle this problem, collaborative deep learning models have recently been proposed where parties locally train their deep learning structures and only share a subset of the parameters in the attempt to keep their respective training sets private. Parameters can also be obfuscated via differential privacy (DP) to make information extraction even more challenging, as proposed by Shokri and Shmatikov at CCS'15. Unfortunately, we show that any privacy-preserving collaborative deep learning is susceptible to a powerful attack that we devise in this paper. In particular, we show that a distributed, federated, or decentralized deep learning approach is fundamentally broken and does not protect the training sets of honest participants. The attack we developed exploits the real-time nature of the learning process that allows the adversary to train a Generative Adversarial Network (GAN) that generates prototypical samples of the targeted training set that was meant to be private (the samples generated by the GAN are intended to come from the same distribution as the training data). Interestingly, we show that record-level DP applied to the shared parameters of the model, as suggested in previous work, is ineffective (i.e., record-level DP is not designed to address our attack).
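The abstract describes participants who train locally and share only a subset of their parameters. The sketch below illustrates one plausible form of that idea: each participant uploads only its largest-magnitude gradient coordinates, and a parameter server applies them. The function names, the `fraction` parameter, and the largest-magnitude selection rule are assumptions made for illustration; they are not taken from the paper.

```python
def select_gradients_to_share(gradients, fraction):
    """Return {index: value} for the largest-magnitude gradient coordinates.

    `fraction` is a hypothetical knob: the share of coordinates to upload.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Always upload at least one coordinate.
    count = max(1, int(len(gradients) * fraction))
    ranked = sorted(range(len(gradients)),
                    key=lambda i: abs(gradients[i]),
                    reverse=True)
    return {i: gradients[i] for i in sorted(ranked[:count])}


def apply_shared_update(global_params, shared, learning_rate):
    """Parameter-server step: apply only the uploaded coordinates."""
    updated = list(global_params)
    for index, grad in shared.items():
        updated[index] -= learning_rate * grad
    return updated


# One participant's local gradients (made-up numbers).
local_gradients = [0.02, -0.9, 0.4, -0.05, 0.0]
shared = select_gradients_to_share(local_gradients, fraction=0.4)
new_params = apply_shared_update([1.0] * 5, shared, learning_rate=0.5)
print(shared)
print(new_params)
```

Even in this toy, the uploaded coordinates are a function of the local data. That dependence is the channel the attack exploits.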

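Motivation and Objectives

Show that collaborative deep learning can leak training data to an insider adversary.
Demonstrate that a GAN can generate prototypical samples of the private training data while learning is in progress.
Argue that record-level differential privacy is not sufficient to prevent this attack in a collaborative setting.
Highlight the risks of relying on differential privacy in collaborative deep learning, and present centralized learning as a privacy-preserving alternative.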

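Proposed Method

Develop an active adversary who poses as an honest collaborator in the distributed learning protocol.
Train a GAN that uses feedback from the victim's continuously evolving model to generate samples resembling the victim's private data.
Exploit the real-time learning dynamics to reconstruct prototypical training samples from the victim's class distribution.
Show that the attack remains effective even when local parameters are obfuscated with differential privacy.
Argue that the attack also works with white-box access to the model and with parameter sharing in which only a subset of the gradients is exchanged.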
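The adversary's side of the steps above can be sketched as a single round against a downloaded snapshot of the shared model. This is a heavily simplified stand-in, not the paper's implementation: the "generator" is a single sample vector updated by gradient ascent rather than a neural network, the shared model is a linear softmax classifier, and all names (`generator_step`, `adversary_round`, `fake_class`) are hypothetical. It shows only the control flow: push a generated sample toward the victim's class, then mislabel it so that local training on it influences the shared model.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(weights, biases, x):
    """Shared linear-softmax model: one weight row per class."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


def generator_step(weights, biases, sample, target_class, step_size):
    """Gradient ascent on log p(target_class | sample) w.r.t. the sample."""
    probs = class_probabilities(weights, biases, sample)
    # For a linear-softmax model: d/dx log p_t(x) = w_t - sum_k p_k * w_k.
    gradient = [
        weights[target_class][d] - sum(p * row[d] for p, row in zip(probs, weights))
        for d in range(len(sample))
    ]
    return [xi + step_size * g for xi, g in zip(sample, gradient)]


def adversary_round(weights, biases, sample, victim_class, fake_class,
                    step_size=0.1, steps=20):
    """Refine the generated sample, then mislabel it for local training."""
    for _ in range(steps):
        sample = generator_step(weights, biases, sample, victim_class, step_size)
    # The adversary would train locally on this mislabeled example and
    # upload the resulting gradients like any honest participant.
    injected_example = (sample, fake_class)
    return sample, injected_example


# Hypothetical snapshot: class 0 = victim's class, class 1 = adversary's
# own class, class 2 = the extra label the adversary assigns to fakes.
weights = [[1.0, 0.5], [-1.0, 0.2], [0.0, 0.0]]
biases = [0.0, 0.0, 0.0]
start = [0.0, 0.0]
before = class_probabilities(weights, biases, start)[0]
generated, injected = adversary_round(weights, biases, start,
                                      victim_class=0, fake_class=2)
after = class_probabilities(weights, biases, generated)[0]
print(before, after)
```

With a linear model the generated sample simply drifts along one direction, so this toy says nothing about reconstruction quality. In the attack summarized here, the generator is a neural network and the shared model keeps changing between rounds, which is what the adversary exploits.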

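Experimental Results

Research Questions

RQ1: Can an insider adversary use a GAN to reconstruct other participants' private training data in collaborative deep learning?
RQ2: Does applying differential privacy to the shared parameters prevent this kind of GAN-based information leakage in a collaborative setting?
RQ3: Does the attack remain effective on CNNs and other architectures against which model inversion performs poorly?
RQ4: Does secure aggregation in federated learning reduce the risk, or can an insider adversary still exploit the learning dynamics?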

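Key Findings

The GAN-based attack can generate samples that are distributionally indistinguishable from the victim's private data, without any access to the raw data.
The attack remains effective even when the parameters are obfuscated with record-level differential privacy.
The adversary can influence the learning process and induce the victim to reveal more detailed information.
The threat applies to federated or decentralized learning as well as to centralized learning, because insider participants can also cause leakage.
Compared with traditional model inversion, the GAN leaks more information, especially for CNNs.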
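To make the record-level obfuscation in the findings above concrete, here is a minimal sketch of one common way to perturb gradients before upload: clip each coordinate, then add Laplace noise. The function name, the per-coordinate treatment of `epsilon`, and the sensitivity bound of `2 * clip_bound` are illustrative assumptions. Privacy accounting across coordinates and training rounds is omitted.

```python
import random


def obfuscate_gradients(gradients, clip_bound, epsilon, rng):
    """Clip each coordinate to [-clip_bound, clip_bound], then add Laplace noise.

    Assumption: epsilon is spent per coordinate and per upload.
    """
    if clip_bound <= 0 or epsilon <= 0:
        raise ValueError("clip_bound and epsilon must be positive")
    # After clipping, one coordinate can change by at most 2 * clip_bound.
    scale = 2.0 * clip_bound / epsilon
    noisy = []
    for g in gradients:
        clipped = max(-clip_bound, min(clip_bound, g))
        # The difference of two i.i.d. exponentials with rate 1/scale
        # is a Laplace(0, scale) sample.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy.append(clipped + noise)
    return noisy


rng = random.Random(0)
raw = [0.3, -2.5, 0.01]
private_release = obfuscate_gradients(raw, clip_bound=1.0, epsilon=1.0, rng=rng)
# A huge epsilon means almost no noise, so only the clipping is visible.
nearly_exact = obfuscate_gradients(raw, clip_bound=1.0, epsilon=1e9, rng=rng)
print(private_release)
print(nearly_exact)
```

Noise of this kind is calibrated to hide the contribution of an individual record. The attack instead targets what the model learns about a class as a whole, which matches the abstract's remark that record-level DP is not designed to address it.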

This review was generated by AI and reviewed by a human editor.