QUICK REVIEW

[Paper Review] Evaluating Differentially Private Machine Learning in Practice

Bargav Jayaraman, David Evans | arXiv (Cornell University) | Feb 24, 2019

Privacy-Preserving Technologies in Data | References: 80 | Cited by: 85

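One-Sentence Summary

This paper empirically evaluates differential privacy variants (advanced composition (AC), zero-concentrated DP (zCDP), and Rényi DP (RDP)) for machine learning with gradient perturbation, measuring the actual privacy leakage of logistic regression and neural network models through membership inference and attribute inference attacks.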

ABSTRACT

Differential privacy is a strong notion for privacy that can be used to prove formal guarantees, in terms of a privacy budget, $ε$, about how much information is leaked by a mechanism. However, implementations of privacy-preserving machine learning often select large values of $ε$ in order to get acceptable utility of the model, with little understanding of the impact of such choices on meaningful privacy. Moreover, in scenarios where iterative learning procedures are used, differential privacy variants that offer tighter analyses are used which appear to reduce the needed privacy budget but present poorly understood trade-offs between privacy and utility. In this paper, we quantify the impact of these choices on privacy in experiments with logistic regression and neural network models. Our main finding is that there is a huge gap between the upper bounds on privacy loss that can be guaranteed, even with advanced mechanisms, and the effective privacy loss that can be measured using current inference attacks. Current mechanisms for differentially private machine learning rarely offer acceptable utility-privacy trade-offs with guarantees for complex learning tasks: settings that provide limited accuracy loss provide meaningless privacy guarantees, and settings that provide strong privacy guarantees result in useless models. Code for the experiments can be found here: https://github.com/bargavj/EvaluatingDPML

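Research Motivation and Objectives

Evaluate how different epsilon settings affect the privacy-utility trade-off of DP ML.
Compare how advanced composition, zCDP, and RDP affect cumulative privacy loss in practice.
Quantify the actual privacy leakage exposed by membership inference and attribute inference attacks.
Evaluate DP ML implementations on logistic regression and neural networks to identify practical risks.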

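Proposed Method

Reviews the definition of differential privacy and its variants (AC, zCDP, RDP), along with their theoretical composition properties.
Focuses on gradient perturbation as the DP mechanism for ML, using gradient clipping to bound sensitivity.
Uses the moments accountant or an equivalent analysis to bound cumulative privacy loss under each DP notion.
Empirically evaluates privacy leakage on deep learning (DL) and logistic regression (LR) models using membership inference and attribute inference attacks.
Applies the DP mechanisms to learning tasks based on empirical risk minimization and to non-convex deep learning settings to study the effect on utility.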
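
The gradient-perturbation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a logistic regression model with labels in {0, 1}, and the names `clip`, `noisy_gradient`, `clip_norm`, and `noise_multiplier` are hypothetical. Each per-example gradient is clipped to a fixed L2 norm (which bounds the sensitivity of the sum), and Gaussian noise proportional to that bound is added before averaging.

```python
import math
import random

def clip(vec, clip_norm):
    """Scale vec down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * factor for v in vec]

def noisy_gradient(X, y, w, clip_norm, noise_multiplier, rng):
    """Average of clipped per-example logistic-loss gradients plus Gaussian noise."""
    total = [0.0] * len(w)
    for xi, yi in zip(X, y):
        z = sum(a * b for a, b in zip(xi, w))
        pred = 1.0 / (1.0 + math.exp(-z))
        # Gradient of the logistic loss for one example: (pred - y) * x
        grad = clip([(pred - yi) * a for a in xi], clip_norm)
        total = [t + g for t, g in zip(total, grad)]
    # Noise standard deviation is tied to the clipping bound (the sensitivity)
    sigma = noise_multiplier * clip_norm
    return [(t + rng.gauss(0.0, sigma)) / len(X) for t in total]

# Usage on a tiny, made-up dataset (values are illustrative only)
rng = random.Random(0)
X = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]
y = [1.0, 0.0, 1.0]
w = [0.0, 0.0]
for _ in range(10):
    g = noisy_gradient(X, y, w, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
    w = [wi - 0.1 * gi for wi, gi in zip(w, g)]
```

Every such step spends privacy budget, so the total guarantee depends on how the per-step losses are composed across iterations.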

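Experimental Results

Research Questions

RQ1: How do different epsilon values and DP variants (AC, zCDP, RDP) affect the privacy-utility trade-off of DP ML?
RQ2: How large is the gap in practice between formal DP guarantees and the privacy leakage actually observed through attacks?
RQ3: How do the DP variants perform in terms of leakage when applied to logistic regression and neural networks?
RQ4: Do tighter composition analyses translate into meaningful privacy protection in real-world ML tasks?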
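
To make the composition question concrete, the sketch below compares the total epsilon reported by three analyses for the same training run. It is an illustration under stated assumptions, not the paper's accounting code: it assumes `k` steps that are each `eps0`-DP (pure DP), uses the standard advanced composition theorem and the standard conversion from pure DP to zCDP and back to (epsilon, delta)-DP, and omits RDP and subsampling amplification. All function names are hypothetical.

```python
import math

def naive_composition(eps0, k):
    """Basic composition: privacy losses simply add up."""
    return k * eps0

def advanced_composition(eps0, k, delta):
    """Advanced composition: total epsilon of k eps0-DP steps, at failure probability delta."""
    return eps0 * math.sqrt(2 * k * math.log(1 / delta)) + k * eps0 * (math.exp(eps0) - 1)

def zcdp_composition(eps0, k, delta):
    """zCDP: eps0-DP implies (eps0^2 / 2)-zCDP, rho adds over steps, then convert back."""
    rho = k * eps0 ** 2 / 2
    return rho + 2 * math.sqrt(rho * math.log(1 / delta))

# Usage: same per-step budget and step count, three different totals
eps0, k, delta = 0.01, 1000, 1e-5
for name, eps in [
    ("naive", naive_composition(eps0, k)),
    ("advanced", advanced_composition(eps0, k, delta)),
    ("zCDP", zcdp_composition(eps0, k, delta)),
]:
    print(f"{name:>8}: epsilon = {eps:.3f}")
```

The tighter analyses report a smaller epsilon for the identical mechanism and noise level, which is why RQ4 asks whether that smaller number corresponds to any change in measurable leakage.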

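Key Findings

There is a huge gap between the upper bounds guaranteed by the DP variants and the leakage that known attacks can actually observe.
At acceptable utility levels, the formal guarantees are essentially meaningless regardless of which DP variant is used.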
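Even in settings prone to leakage, the leakage observed through membership inference and attribute inference attacks remains relatively low.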
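Sensitivity to epsilon differs across DP variants, and the trade-off between utility and leakage depends on the task and the model.
Gradient perturbation with clipping is a practical way to achieve DP in deep learning, but composition over many iterations increases the privacy budget consumed.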


This review was generated by AI and reviewed by a human editor.