QUICK REVIEW

[Paper Review] Privacy Loss in Apple's Implementation of Differential Privacy on MacOS 10.12

Tang Jun, Aleksandra Korolova|arXiv (Cornell University)|Sep 8, 2017

Privacy-Preserving Technologies in Data|References 11|Cited by 186

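One-Sentence Summary

This paper analyzes Apple's deployment of local differential privacy on MacOS 10.12 and uncovers the privacy parameter applied to each datum, but shows that the total daily privacy loss is far larger because of how the budget is managed and its two-tier structure.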

ABSTRACT

In June 2016, Apple announced that it will deploy differential privacy for some user data collection in order to ensure privacy of user data, even from Apple. The details of Apple's approach remained sparse. Although several patents have since appeared hinting at the algorithms that may be used to achieve differential privacy, they did not include a precise explanation of the approach taken to privacy parameter choice. Such choice and the overall approach to privacy budget use and management are key questions for understanding the privacy protections provided by any deployment of differential privacy. In this work, through a combination of experiments, static and dynamic code analysis of macOS Sierra (Version 10.12) implementation, we shed light on the choices Apple made for privacy budget management. We discover and describe Apple's set-up for differentially private data processing, including the overall data pipeline, the parameters used for differentially private perturbation of each piece of data, and the frequency with which such data is sent to Apple's servers. We find that although Apple's deployment ensures that the (differential) privacy loss per each datum submitted to its servers is $1$ or $2$, the overall privacy loss permitted by the system is significantly higher, as high as $16$ per day for the four initially announced applications of Emojis, New words, Deeplinks and Lookup Hints. Furthermore, Apple renews the privacy budget available every day, which leads to a possible privacy loss of 16 times the number of days since user opt-in to differentially private data collection for those four applications. We advocate that in order to claim the full benefits of differentially private data collection, Apple must give full transparency of its implementation, enable user choice in areas related to privacy loss, and set meaningful defaults on the privacy loss permitted.
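
The growth in privacy loss described in the abstract follows from basic sequential composition, under which the privacy losses of successive releases add up. As a rough sketch, with $d$ denoting the number of days since opt-in and $\varepsilon_{\text{day}} = 16$ the daily loss the abstract reports for the four initial applications (both symbols are introduced here for illustration and are not notation taken from the paper), the worst-case bound is:

```latex
\varepsilon_{\text{total}}(d) \;\le\; \sum_{t=1}^{d} \varepsilon_{\text{day}} \;=\; 16\,d,
\qquad \text{e.g. } \varepsilon_{\text{total}}(30) \;\le\; 16 \times 30 = 480 .
```

This is an upper bound on the loss the system permits, not a measurement of what any particular device actually incurs.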

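Motivation and Objectives

Understand how Apple implements local differential privacy in MacOS 10.12.
Identify the privacy budget management scheme and the privacy parameter applied to each datum.
Assess how often data is reported and how privacy loss accumulates over time.
Assess the transparency and configurability of the privacy budget system, and its potential abuse vectors.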

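Proposed Method

Static and dynamic code analysis of the macOS Sierra (10.12) implementation.
Decompiling and tracing DifferentialPrivacy.framework and the dprivacyd daemon.
Inspecting database tables, configuration files, and report files to map the privacy budget and the data flow.
Experimentally modifying configuration parameters to observe the effect on privacy parameters and budget behavior.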

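Experimental Results

Research Questions

RQ1: What per-datum privacy parameter does each data type use for privatization?
RQ2: How often are records selected for a report, and what is the maximum privacy loss per report and per day?
RQ3: Is the total privacy loss per device bounded or unbounded over time?
RQ4: How robust is the system to manipulation of its parameters and timing, and what are the potential abuse vectors?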

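Key Findings

The per-datum privacy parameter is defined per data type (for example, Emoji, NewWords) and matches the values in the configuration files.
The system tracks a budget balance (ZBALANCE) per BudgetKeyName and controls budget growth by adding SessionAmount to it every SessionSeconds.
For each KeyName, the report generator selects at most min(SessionAmount, 40) records, subject to the available budget balance; for an active data type, the daily privacy loss equals PrivacyParameter × SessionAmount.
The privacy budget is replenished every SessionSeconds and unused budget carries over, so the total privacy loss of the four initial applications can grow without bound over time.
The permitted daily privacy loss of the four initial applications can reach 16, and because of the budget replenishment mechanism the total per-device loss scales with the number of days since opt-in.
The implementation includes safeguards (hard-coded limits, configuration that is difficult to change), but it could still be abused by anyone with root access, or by Apple changing budgets or parameters in the future.
The macOS 10.12.1 and 10.12.3 configurations differ: notably, SessionAmount for NewWords increased, and budgets for health and local words, as well as SubmissionPriority, were added.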
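
The budget mechanics in the findings above can be sketched as a small simulation. This is a minimal illustration and not Apple's code: the class and function names are hypothetical, the two data types and their parameter values are made up for the example (they are not the values from Apple's configuration files), and it assumes that each selected record consumes one unit of balance and that one report is generated per session.

```python
from dataclasses import dataclass

# Cap from the finding above: at most min(SessionAmount, 40) records per report.
MAX_RECORDS_PER_REPORT = 40


@dataclass
class BudgetKey:
    """Hypothetical model of one budget key; field names mirror the config keys."""
    privacy_parameter: float  # per-datum epsilon (PrivacyParameter)
    session_amount: int       # units credited each session (SessionAmount)
    balance: int = 0          # running balance, an analogue of ZBALANCE

    def replenish(self) -> None:
        # Called once per SessionSeconds; unused balance carries over.
        self.balance += self.session_amount

    def select_for_report(self, pending: int) -> int:
        # ASSUMPTION: each selected record costs one unit of balance.
        n = min(self.session_amount, MAX_RECORDS_PER_REPORT, self.balance, pending)
        self.balance -= n
        return n


def simulate(keys, sessions, pending_per_session):
    """Return the summed privacy loss over the given number of sessions."""
    total = 0.0
    for _ in range(sessions):
        for key in keys.values():
            key.replenish()
            sent = key.select_for_report(pending_per_session)
            total += sent * key.privacy_parameter
    return total


# Illustrative values only: two made-up data types, ten sessions of heavy use.
demo_keys = {"TypeA": BudgetKey(1.0, 2), "TypeB": BudgetKey(2.0, 1)}
print(simulate(demo_keys, sessions=10, pending_per_session=100))  # 40.0
```

With these made-up values each session costs 1 × 2 + 2 × 1 = 4, so the total grows linearly in the number of sessions, which is the same pattern the paper reports for the four initial applications.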


This review was generated by AI and reviewed by a human editor.