QUICK REVIEW

[Paper Review] Privacy Odometers and Filters: Pay-as-you-Go Composition

Ryan Rogers, Aaron Roth|arXiv (Cornell University)|May 26, 2016

Privacy-Preserving Technologies in Data | References: 9 | Cited by: 40

One-Sentence Summary

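This paper introduces privacy filters and privacy odometers for adaptive composition in differential privacy, where the privacy parameters (ε, δ) can be chosen dynamically during the analysis. It proves that privacy filters achieve bounds comparable to standard composition theorems, while privacy odometers lose only a small asymptotic factor, which formally separates the two use cases.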

ABSTRACT

In this paper we initiate the study of adaptive composition in differential privacy when the length of the composition, and the privacy parameters themselves can be chosen adaptively, as a function of the outcome of previously run analyses. This case is much more delicate than the setting covered by existing composition theorems, in which the algorithms themselves can be chosen adaptively, but the privacy parameters must be fixed up front. Indeed, it isn't even clear how to define differential privacy in the adaptive parameter setting. We proceed by defining two objects which cover the two main use cases of composition theorems. A privacy filter is a stopping time rule that allows an analyst to halt a computation before his pre-specified privacy budget is exceeded. A privacy odometer allows the analyst to track realized privacy loss as he goes, without needing to pre-specify a privacy budget. We show that unlike the case in which privacy parameters are fixed, in the adaptive parameter setting, these two use cases are distinct. We show that there exist privacy filters with bounds comparable (up to constants) with existing privacy composition theorems. We also give a privacy odometer that nearly matches non-adaptive private composition theorems, but is sometimes worse by a small asymptotic factor. Moreover, we show that this is inherent, and that any valid privacy odometer in the adaptive parameter setting must lose this factor, which shows a formal separation between the filter and odometer use-cases.

Motivation and Objectives

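Address the challenge of composing differentially private analyses when the privacy parameters (ε, δ) are chosen adaptively rather than fixed up front.
Define and formalize two new primitives: the privacy filter (which halts the computation before the privacy budget is exceeded) and the privacy odometer (which tracks realized privacy loss in real time).
Investigate whether existing composition theorems extend to the setting where both the number of queries and the privacy parameters are chosen adaptively.
Establish a formal separation between privacy filters and privacy odometers in the adaptive parameter setting, exposing an inherent limitation of odometers.
Give nearly optimal, tight bounds for both filters and odometers, with the odometer losing only a small asymptotic factor relative to non-adaptive composition.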

Proposed Method

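Define a privacy filter as a stopping-time rule that halts the computation before a pre-specified privacy budget is exceeded, ensuring (ε_total, δ_total)-differential privacy.
Define a privacy odometer as a mechanism that continuously tracks realized privacy loss without a pre-specified budget, providing a running estimate of the cumulative privacy cost.
Use concentration inequalities and martingale analysis to bound the tail probability of the privacy loss when ε_i and δ_i are chosen adaptively.
Apply Theorem 4.5 (a concentration inequality for bounded martingales) with carefully chosen parameters (c = 1/n, t = n) to derive a high-probability bound on the privacy loss.
Derive two main bounds: one for a privacy odometer when ∑ε_i² ∈ [1/n², 1], and another for the case where ∑ε_i² falls outside this range, refined through a parameter γ.
Use a translation technique to convert results for pure differential privacy (δ = 0) into results for general (ε, δ)-DP, preserving the privacy guarantee while controlling the blow-up in constants.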
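The filter and odometer roles described above can be illustrated with a minimal Python sketch. It assumes simple additive (basic) composition, in which the ε_i and δ_i of each adaptively chosen step are summed, and it does not implement the martingale-based bounds the paper derives. The class names BasicOdometer and BasicFilter, and their methods, are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BasicOdometer:
    """Running (eps, delta) total; no budget is fixed in advance.

    Assumption: privacy loss is accounted for by plain summation.
    """
    eps_spent: float = 0.0
    delta_spent: float = 0.0

    def record(self, eps: float, delta: float = 0.0) -> Tuple[float, float]:
        """Charge one step and return the cumulative (eps, delta) so far."""
        if eps < 0 or delta < 0:
            raise ValueError("privacy parameters must be non-negative")
        self.eps_spent += eps
        self.delta_spent += delta
        return self.eps_spent, self.delta_spent


@dataclass
class BasicFilter:
    """Stopping rule against a pre-specified (eps, delta) budget.

    Assumption: same additive accounting as BasicOdometer.
    """
    eps_budget: float
    delta_budget: float
    eps_spent: float = 0.0
    delta_spent: float = 0.0

    def try_spend(self, eps: float, delta: float = 0.0) -> bool:
        """Return True and charge the step if it fits the budget.

        Return False (the halt signal) without charging if the step
        would push either running total past its budget.
        """
        if eps < 0 or delta < 0:
            raise ValueError("privacy parameters must be non-negative")
        over_eps = self.eps_spent + eps > self.eps_budget
        over_delta = self.delta_spent + delta > self.delta_budget
        if over_eps or over_delta:
            return False
        self.eps_spent += eps
        self.delta_spent += delta
        return True
```

In this sketch the analyst may pick each eps and delta after seeing earlier outputs; the filter decides whether to halt before the step runs, while the odometer only reports the total after the fact.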

Results

Research Questions

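RQ1: Can differential privacy composition theorems be extended to the setting where the privacy parameters (ε, δ) are chosen adaptively during the analysis?
RQ2: Do privacy filters, which halt a computation before the privacy budget is exceeded, remain valid in the adaptive parameter setting?
RQ3: Can privacy odometers, which track cumulative privacy loss in real time, achieve bounds comparable to non-adaptive composition theorems?
RQ4: Is there a fundamental difference in power between privacy filters and privacy odometers when parameters are chosen adaptively?
RQ5: In the adaptive parameter setting, what is the smallest asymptotic loss in the odometer's privacy bound, and is that loss unavoidable?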

Key Findings

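Even when ε and δ are chosen adaptively, privacy filters can be constructed whose bounds are comparable to existing non-adaptive composition theorems, differing only by constant factors.
A privacy odometer is constructed whose bound nearly matches non-adaptive composition theorems but carries an additional asymptotic factor of √(log log n) in the deviation term.
This asymptotic factor is proven to be inherent: in the adaptive parameter setting, any valid privacy odometer must incur this loss, which establishes a formal separation from filters.
The odometer bound is ∑ε_i(e^{ε_i}−1)/2 + √(2∑ε_i²(log(110e) + 2log(log n / δ_g))), and it holds with probability at least 1−δ_g.
When ∑ε_i² falls outside [1/n², 1], a refinement parameter γ adjusts the bound, replacing log(log n) with log(1/γ), which keeps the bound robust.
The analysis shows that algorithms with ε_i < 1/(10n) contribute negligibly to both utility and privacy, which supports using 1/n² as the lower limit for meaningful accumulated privacy loss.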
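To make the odometer bound quoted above concrete, the following Python sketch evaluates it for a list of pure-DP parameters ε_i. It is an illustration under stated assumptions, not the paper's implementation: logarithms are taken to be natural, log(log n / δ_g) is read as log((log n) / δ_g), and the function name odometer_bound is hypothetical. The sketch only covers the case ∑ε_i² ∈ [1/n², 1] and rejects inputs outside it.

```python
import math
from typing import Sequence


def odometer_bound(eps: Sequence[float], n: int, delta_g: float) -> float:
    """Evaluate the quoted odometer bound for pure-DP steps.

    Assumptions (not confirmed by the source): natural logarithms, and
    log(log n / delta_g) parsed as log((log n) / delta_g).
    """
    if n < 2:
        raise ValueError("n must be at least 2 so that log(n) > 0")
    if not 0 < delta_g < 1:
        raise ValueError("delta_g must lie strictly between 0 and 1")
    if any(e < 0 for e in eps):
        raise ValueError("each eps_i must be non-negative")

    sum_sq = sum(e * e for e in eps)
    # The quoted bound is stated only for sum(eps_i^2) in [1/n^2, 1].
    if not (1.0 / n**2 <= sum_sq <= 1.0):
        raise ValueError("sum of eps_i^2 must lie in [1/n^2, 1]")

    # First term: sum of eps_i * (e^{eps_i} - 1) / 2.
    expectation_term = sum(e * math.expm1(e) / 2 for e in eps)

    # Second term: sqrt(2 * sum(eps_i^2) * (log(110e) + 2 log(log n / delta_g))).
    log_factor = math.log(110 * math.e) + 2 * math.log(math.log(n) / delta_g)
    deviation_term = math.sqrt(2 * sum_sq * log_factor)

    return expectation_term + deviation_term
```

For example, `odometer_bound([0.1] * 10, n=100, delta_g=1e-6)` evaluates the bound for ten steps with ε_i = 0.1, where ∑ε_i² is about 0.1 and therefore inside the required range.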


This review was generated by AI and reviewed by human editors.