QUICK REVIEW

[Paper Digest] Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising

Borja Balle, Yu-Xiang Wang|arXiv (Cornell University)|May 16, 2018

Privacy-Preserving Technologies in Data | Cited by 129

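One-Sentence Summary

This paper proposes an analytic Gaussian mechanism for differential privacy whose noise variance is calibrated exactly through the Gaussian CDF, and adds a post-processing denoising step that improves accuracy in high dimensions.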

ABSTRACT

The Gaussian mechanism is an essential building block used in multitude of differentially private data analysis algorithms. In this paper we revisit the Gaussian mechanism and show that the original analysis has several important limitations. Our analysis reveals that the variance formula for the original mechanism is far from tight in the high privacy regime ($\varepsilon \to 0$) and it cannot be extended to the low privacy regime ($\varepsilon \to \infty$). We address these limitations by developing an optimal Gaussian mechanism whose variance is calibrated directly using the Gaussian cumulative density function instead of a tail bound approximation. We also propose to equip the Gaussian mechanism with a post-processing step based on adaptive estimation techniques by leveraging that the distribution of the perturbation is known. Our experiments show that analytical calibration removes at least a third of the variance of the noise compared to the classical Gaussian mechanism, and that denoising dramatically improves the accuracy of the Gaussian mechanism in the high-dimensional regime.

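Motivation and Objectives

Identify the limitations of the classical Gaussian mechanism in the high-privacy and low-privacy regimes.
Develop an analytic Gaussian mechanism with exact CDF-based calibration for (ε,δ)-DP.
Show that post-processing denoising can improve utility without weakening the privacy guarantee.
Provide practical algorithms and guidance for implementing analytic calibration and denoising.
Demonstrate the utility gains through experiments on synthetic data and a real-data application.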

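Proposed Method

Derive an exact, CDF-based calibration condition for the Gaussian noise that guarantees (ε,δ)-DP (Theorem 8).
Replace the tail-bound analysis with an explicit expression in the Gaussian CDF to compute the optimal σ (Algorithm 1).
Characterize (ε,δ)-DP through a necessary and sufficient condition on the privacy loss random variable (Theorem 5).
Give an implementable algorithm that computes σ using Φ and handles numerical stability (Theorem 9).
Propose post-processing denoising strategies (Bayesian and minimax) that exploit the known noise distribution (Theorem 10, Theorem 11, Theorem 12).
Demonstrate the benefits of denoising on mean estimation and NYC taxi heat maps (Section 5).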
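
The CDF-based calibration can be illustrated with a short sketch. It assumes the condition usually quoted for the analytic Gaussian mechanism: noise N(0, σ²) on a query with L2 sensitivity Δ is (ε,δ)-DP exactly when Φ(Δ/(2σ) − εσ/Δ) − e^ε·Φ(−Δ/(2σ) − εσ/Δ) ≤ δ. The function names and the plain bisection search are illustrative choices, not the paper's Algorithm 1, which the digest says takes extra care with numerical stability. The classical formula σ = Δ·sqrt(2 ln(1.25/δ))/ε is included only for comparison and is normally stated for ε in (0, 1).

```python
import math


def std_normal_cdf(t):
    """Standard normal CDF Phi(t), computed with the complementary error function."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def privacy_delta(sigma, eps, sensitivity=1.0):
    """Left-hand side of the assumed calibration condition for a given sigma."""
    a = sensitivity / (2.0 * sigma)
    b = eps * sigma / sensitivity
    # math.exp overflows for very large eps; this sketch targets moderate eps only.
    return std_normal_cdf(a - b) - math.exp(eps) * std_normal_cdf(-a - b)


def analytic_sigma(eps, delta, sensitivity=1.0, iterations=200):
    """Smallest sigma (up to float precision) satisfying the condition, by bisection."""
    if eps < 0 or not 0.0 < delta < 1.0:
        raise ValueError("need eps >= 0 and 0 < delta < 1")
    # The left-hand side decreases as sigma grows, so bracket the crossing point.
    hi = sensitivity
    while privacy_delta(hi, eps, sensitivity) > delta:
        hi *= 2.0
    lo = hi
    while privacy_delta(lo, eps, sensitivity) <= delta:
        lo /= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if privacy_delta(mid, eps, sensitivity) <= delta:
            hi = mid
        else:
            lo = mid
    return hi  # hi always satisfies the condition


def classical_sigma(eps, delta, sensitivity=1.0):
    """Classical tail-bound calibration, shown for comparison."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / eps


if __name__ == "__main__":
    eps, delta = 1.0, 1e-5
    print("analytic :", analytic_sigma(eps, delta))
    print("classical:", classical_sigma(eps, delta))
```

Running the script prints both noise scales for one (ε, δ) pair, and the analytic value should come out smaller. A production implementation would need to handle large ε, where e^ε overflows in this naive form.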

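Experimental Results

Research Questions

RQ1: Can calibrating the Gaussian mechanism through the Gaussian CDF give a tighter result than the classical tail-bound calibration in ε?
RQ2: When the full DP condition is taken into account, what exact value of σ is needed to achieve (ε,δ)-DP?
RQ3: Does post-processing denoising of the Gaussian-perturbed output improve utility without harming privacy, especially in high dimensions?
RQ4: How do analytic calibration and denoising perform on synthetic data and on real datasets such as NYC taxi heat maps?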

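Key Findings

The analytic Gaussian mechanism calibrates σ with the Gaussian CDF, which lowers the noise variance and gives a tighter privacy-utility trade-off; the variance reduction is substantial as ε→0.
Denoising the Gaussian-perturbed output with Bayesian or minimax strategies yields significant accuracy gains in high dimensions and for mean estimation.
Soft-thresholding denoising adapts well to unknown function classes, achieves near-optimal risk over Lp balls, and improves DP releases.
The application experiments show that adding post-processing significantly improves utility for private mean estimation and for NYC taxi heat maps.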
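
The denoising findings can be made concrete with two textbook shrinkage estimators. This is a minimal sketch that assumes the positive-part James-Stein estimator and coordinate-wise soft-thresholding at the universal threshold σ·sqrt(2 ln d); the paper's exact estimators and constants may differ, and all function names here are hypothetical. Both functions use only the released noisy vector and the known σ, so they are pure post-processing and leave the DP guarantee unchanged.

```python
import math
import random


def james_stein(y, sigma):
    """Positive-part James-Stein shrinkage toward the origin (intended for d >= 3)."""
    d = len(y)
    norm_sq = sum(v * v for v in y)
    if norm_sq == 0.0:
        return list(y)
    shrink = max(0.0, 1.0 - (d - 2) * sigma ** 2 / norm_sq)
    return [shrink * v for v in y]


def soft_threshold(y, sigma):
    """Soft-thresholding at the universal threshold sigma * sqrt(2 ln d); needs d >= 1."""
    lam = sigma * math.sqrt(2.0 * math.log(len(y)))
    # Shrink each coordinate toward zero by lam and zero out the small ones.
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in y]


def squared_error(estimate, truth):
    """Total squared error between an estimate and the true vector."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))


if __name__ == "__main__":
    random.seed(0)
    d, sigma = 1000, 1.0
    # Synthetic sparse signal: 10 large coordinates, the rest zero (illustrative data).
    theta = [10.0] * 10 + [0.0] * (d - 10)
    y = [t + random.gauss(0.0, sigma) for t in theta]  # Gaussian mechanism output
    print("raw error           :", squared_error(y, theta))
    print("James-Stein error   :", squared_error(james_stein(y, sigma), theta))
    print("soft-threshold error:", squared_error(soft_threshold(y, sigma), theta))
```

On a sparse high-dimensional signal like this synthetic one, both estimators should cut the squared error well below that of the raw noisy output, with soft-thresholding benefiting most from sparsity.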

This digest was generated by AI and reviewed by a human editor.