QUICK REVIEW

[Paper Review] A General Approach to Adding Differential Privacy to Iterative Training Procedures

H. Brendan McMahan, Galen Andrew|arXiv (Cornell University)|Dec 15, 2018

Privacy-Preserving Technologies in Data | Cited by 56

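One-Sentence Summary

This paper proposes a modular framework that applies differential privacy to iterative training by decoupling the training procedure, the privacy mechanism configuration, and the privacy accounting, and it generalizes the Moments Accountant to heterogeneous vector queries.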

ABSTRACT

In this work we address the practical challenges of training machine learning models on privacy-sensitive datasets by introducing a modular approach that minimizes changes to training algorithms, provides a variety of configuration strategies for the privacy mechanism, and then isolates and simplifies the critical logic that computes the final privacy guarantees. A key challenge is that training algorithms often require estimating many different quantities (vectors) from the same set of examples --- for example, gradients of different layers in a deep learning architecture, as well as metrics and batch normalization parameters. Each of these may have different properties like dimensionality, magnitude, and tolerance to noise. By extending previous work on the Moments Accountant for the subsampled Gaussian mechanism, we can provide privacy for such heterogeneous sets of vectors, while also structuring the approach to minimize software engineering challenges.

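Motivation and Goals

Promote the practical integration of DP into iterative training on privacy-sensitive data.
Decouple the training procedure, the privacy mechanism configuration, and the privacy accounting to reduce errors and engineering effort.
Generalize DP accounting to heterogeneous groups of vectors collected during training.
Provide modular mechanisms for grouping vectors and composing privacy guarantees across groups.
Provide implementation guidance and a reference to TensorFlow Privacy for practical use.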

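Proposed Method

Express training updates as Gaussian sum queries with per-group clipping and noise parameters.
Introduce joint and separate clipping strategies for vector groups to manage heterogeneity in scale and dimensionality.
Show how privacy guarantees from multiple vector groups combine into a single equivalent Gaussian sum query for accounting.
Describe hyperparameter strategies that balance the privacy-utility trade-off through the sampling probability q, the clipping norm S_g, and the noise sigma_g.
Propose a privacy ledger and post hoc privacy accounting (RDP) that decouple privacy cost computation from the training implementation.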
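The clipping and noising steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's or TensorFlow Privacy's implementation: the function names (clip_to_norm, separate_clip, joint_clip, gaussian_sum_query) are hypothetical, each per-example record is assumed to be a list of arrays with one array per vector group, and the joint variant omits any per-group rescaling before clipping.

```python
import numpy as np

def clip_to_norm(v, max_norm):
    """Scale v down so that its L2 norm is at most max_norm."""
    norm = np.linalg.norm(v)
    return v * min(1.0, max_norm / max(norm, 1e-12))

def separate_clip(record, clips):
    """Separate clipping: bound each group g independently by S_g."""
    return [clip_to_norm(v, s) for v, s in zip(record, clips)]

def joint_clip(record, total_clip):
    """Joint clipping: bound the norm of all groups concatenated together."""
    flat = np.concatenate([v.ravel() for v in record])
    scale = min(1.0, total_clip / max(np.linalg.norm(flat), 1e-12))
    return [v * scale for v in record]

def gaussian_sum_query(records, clips, sigmas, rng):
    """Sum separately clipped per-example vectors for each group, then
    add independent Gaussian noise with standard deviation sigma_g."""
    clipped = [separate_clip(r, clips) for r in records]
    sums = [sum(c[g] for c in clipped) for g in range(len(clips))]
    return [s + rng.normal(0.0, sigma, size=s.shape)
            for s, sigma in zip(sums, sigmas)]

# Two examples, two groups of very different scale (hypothetical values).
rng = np.random.default_rng(0)
record = [np.array([3.0, 4.0]), np.array([0.1])]
noisy_sums = gaussian_sum_query([record, record],
                                clips=[1.0, 1.0], sigmas=[0.5, 0.5], rng=rng)
```

Separate clipping leaves the small group untouched while the large group is scaled to S_g, whereas joint clipping shrinks every group by the same factor. This is one way to see why the choice between the two matters when groups differ in magnitude.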

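Experimental Results

Research Questions

RQ1: How can differential privacy be applied to iterative training when each round queries multiple heterogeneous vectors (such as layer gradients, batch normalization statistics, and metrics)?
RQ2: How can privacy guarantees be composed across multiple vector groups that may have different norms and noise levels?
RQ3: Which practical strategies and tools (such as TensorFlow Privacy) support robust, configurable DP training without compromising the integrity of the training code?
RQ4: In large-scale iterative training, which hyperparameter strategies (sampling, clipping, noise) achieve the desired privacy-utility trade-off?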

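Key Findings

The modular approach protects privacy over heterogeneous sets of vectors by extending the Moments Accountant to multi-vector queries.
In some scenarios with widely varying scales, joint clipping with per-group scaling may outperform per-vector clipping.
Equivalence to a single Gaussian sum query allows one unified privacy accountant to be applied to complex multi-group DP mechanisms.
The hyperparameter strategies (q, S_g, sigma_g) give guidance for reaching a target privacy guarantee (ε, δ) while preserving utility.
TensorFlow Privacy implements these ideas, providing the DPQuery abstraction and a privacy ledger for post hoc accounting.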
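The single-query equivalence and the ledger idea can be illustrated as follows. Rescaling each separately clipped group by 1/sigma_g gives unit-variance noise and a joint L2 sensitivity of sqrt(sum_g (S_g / sigma_g)^2), so the groups behave like one Gaussian query with noise multiplier z = 1 / sqrt(sum_g (S_g / sigma_g)^2). The sketch below is an assumption-laden simplification rather than the paper's accountant: LedgerEntry, effective_noise_multiplier, and epsilon_from_ledger are hypothetical names, and the accounting uses the standard RDP bound for the Gaussian mechanism without subsampling (it records q but treats every step as if q = 1), so it does not implement the privacy amplification that the subsampled Moments Accountant provides.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class LedgerEntry:
    """One training step as recorded by a (hypothetical) privacy ledger."""
    q: float        # sampling probability, recorded but unused below
    groups: tuple   # tuple of (S_g, sigma_g) pairs, one per vector group

def effective_noise_multiplier(groups):
    """Noise multiplier z of the equivalent single Gaussian sum query
    for separately clipped groups."""
    return 1.0 / math.sqrt(sum((s / sigma) ** 2 for s, sigma in groups))

def epsilon_from_ledger(ledger, delta, orders=range(2, 129)):
    """Post hoc accounting: compose Gaussian-mechanism RDP over all steps
    and convert to (epsilon, delta)-DP. Ignores subsampling (assumes q = 1)."""
    best = math.inf
    for alpha in orders:
        # RDP of order alpha for noise multiplier z is alpha / (2 z^2);
        # RDP composes additively across steps.
        rdp = sum(alpha / (2.0 * effective_noise_multiplier(e.groups) ** 2)
                  for e in ledger)
        # Standard RDP to (epsilon, delta) conversion.
        best = min(best, rdp + math.log(1.0 / delta) / (alpha - 1))
    return best

# The training loop only appends entries; epsilon is computed afterwards.
ledger = [LedgerEntry(q=0.01, groups=((1.0, 4.0), (0.5, 4.0)))
          for _ in range(100)]
epsilon = epsilon_from_ledger(ledger, delta=1e-5)
```

Keeping the ledger as plain data means the training code never computes a privacy guarantee itself. A separate, independently testable accountant can replay the ledger later, which is the decoupling the review describes.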

This review was generated by AI and reviewed by human editors.