QUICK REVIEW

[Paper Review] PrivyNet: A Flexible Framework for Privacy-Preserving Deep Neural Network Training

Meng Li, Liangzhen Lai|arXiv (Cornell University)|Sep 18, 2017

Privacy-Preserving Technologies in Data | References: 17 | Cited by: 28

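One-Sentence Summary

PrivyNet is a privacy-preserving deep learning framework that splits a DNN into a local feature extractor and a cloud-based classifier, with the local part derived from a pre-trained network and used to generate intermediate representations. It optimizes the local network topology with LDA-based pruning, which, compared with random selection, improves accuracy by 1.1% and lowers PSNR by 1.25 dB (indicating less privacy leakage), giving a strong utility-privacy trade-off on resource-constrained devices.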

ABSTRACT

Massive data exist among user local platforms that usually cannot support deep neural network (DNN) training due to computation and storage resource constraints. Cloud-based training schemes provide beneficial services but suffer from potential privacy risks due to excessive user data collection. To enable cloud-based DNN training while protecting the data privacy simultaneously, we propose to leverage the intermediate representations of the data, which is achieved by splitting the DNNs and deploying them separately onto local platforms and the cloud. The local neural network (NN) is used to generate the feature representations. To avoid local training and protect data privacy, the local NN is derived from pre-trained NNs. The cloud NN is then trained based on the extracted intermediate representations for the target learning task. We validate the idea of DNN splitting by characterizing the dependency of privacy loss and classification accuracy on the local NN topology for a convolutional NN (CNN) based image classification task. Based on the characterization, we further propose PrivyNet to determine the local NN topology, which optimizes the accuracy of the target learning task under the constraints on privacy loss, local computation, and storage. The efficiency and effectiveness of PrivyNet are demonstrated with the CIFAR-10 dataset.
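
The splitting idea in the abstract can be illustrated with a minimal sketch. It is a toy stand-in, not the paper's implementation: the dense ReLU layers, the hand-picked weights, and the names `make_dense_relu`, `derive_local_nn`, and `extract_features` are all assumptions made for illustration. Only the local side is shown; the cloud NN would be trained on the uploaded features.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_dense_relu(weights: Sequence[Sequence[float]]) -> Layer:
    """Return a frozen layer computing ReLU(W x) with fixed weights."""
    def layer(x: Vector) -> Vector:
        return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]
    return layer


def derive_local_nn(pretrained: Sequence[Layer], depth: int) -> List[Layer]:
    """Keep the first `depth` layers of a pre-trained stack, unchanged."""
    if not 0 < depth <= len(pretrained):
        raise ValueError("depth must be between 1 and the number of layers")
    return list(pretrained[:depth])


def extract_features(local_nn: Sequence[Layer], x: Vector) -> Vector:
    """Run the frozen local layers; no weights are updated on the device."""
    for layer in local_nn:
        x = layer(x)
    return x


# Toy stand-in for a pre-trained network (hand-picked weights, illustration only).
pretrained_layers = [
    make_dense_relu([[1.0, -1.0, 0.0], [0.5, 0.5, 0.5]]),  # 3 inputs -> 2 features
    make_dense_relu([[1.0, 1.0], [-1.0, 2.0]]),            # 2 -> 2
]

local_nn = derive_local_nn(pretrained_layers, depth=1)
raw_sample = [0.2, 0.9, 0.4]                       # never leaves the device
uploaded = extract_features(local_nn, raw_sample)  # what the cloud NN trains on
print(uploaded)
```

In this toy run the ReLU zeroes the first feature, so the uploaded vector no longer determines the raw sample uniquely. That is the kind of lossy transformation the framework relies on, shown here at a much smaller scale.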

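Motivation and Objectives

Address the challenge of training deep neural networks when local devices are resource-constrained, while protecting user data privacy.
Enable cloud-based DNN training without transmitting raw user data, reducing the privacy risks of excessive data collection.
Provide a flexible, fine-grained trade-off between utility and privacy leakage through a configurable local network topology.
Design a lightweight, deployable framework suited to diverse platforms with varying computation and storage capabilities.
Ensure the local network needs no retraining, which protects privacy and reduces computational overhead.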

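Proposed Method

Split the DNN into two parts: a local neural network (NN) for feature extraction and a cloud-based NN for task-specific training.
Derive the local NN from a pre-trained model (e.g., VGG16), which avoids local training and embeds general-purpose features.
Use nonlinear, lossy operations (convolution, pooling) in the local NN to transform data into privacy-preserving intermediate representations.
Apply a supervised pruning strategy based on linear discriminant analysis (LDA) to select the best channels in the local NN, improving utility and reducing privacy leakage.
Apply channel selection at both the output layer and intermediate layers to hide the local NN's structure from potential attackers.
Introduce a characterization framework that models how privacy loss and accuracy depend on the local NN topology, enabling topology optimization under constraints.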
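
A rough sketch of what supervised, LDA-style channel selection can look like. It ranks channels by a per-channel Fisher ratio (between-class scatter over within-class scatter), which is a simplification and may differ from the paper's exact criterion. The assumptions are that each channel is reduced to one scalar per sample (for example, a spatial average) and that the toy data, `fisher_scores`, and `select_channels` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def fisher_scores(features: Sequence[Sequence[float]],
                  labels: Sequence[int]) -> List[float]:
    """Per-channel ratio of between-class to within-class scatter."""
    by_class: Dict[int, List[Sequence[float]]] = defaultdict(list)
    for row, label in zip(features, labels):
        by_class[label].append(row)

    scores = []
    for c in range(len(features[0])):
        overall_mean = sum(row[c] for row in features) / len(features)
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            values = [row[c] for row in rows]
            class_mean = sum(values) / len(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        # Small constant guards against division by zero for constant channels.
        scores.append(between / (within + 1e-12))
    return scores


def select_channels(scores: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` highest-scoring channels, in order."""
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return sorted(ranked[:keep])


# Toy data: 4 samples, 3 channels, 2 classes (values invented for illustration).
toy_features = [
    [1.0, 5.0, 0.30],
    [1.2, 1.0, 0.20],
    [3.0, 4.8, 0.25],
    [3.2, 1.2, 0.30],
]
toy_labels = [0, 0, 1, 1]

scores = fisher_scores(toy_features, toy_labels)
kept = select_channels(scores, keep=2)
print(kept)
```

Channel 0 separates the two classes cleanly and channel 1 does not separate them at all, so the selection keeps channels 0 and 2. In the framework, a ranking of this kind decides which channels of the pre-trained layers stay in the local NN.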

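Experimental Results

Research Questions

RQ1: How does the local neural network topology affect the trade-off between privacy leakage and classification accuracy in split DNN training?
RQ2: Can a pre-trained network be effectively reused as a fixed feature extractor, avoiding local training while maintaining high utility?
RQ3: How much does LDA-based supervised pruning improve the utility-privacy trade-off compared with random pruning or characterization-based pruning?
RQ4: How does channel selection at intermediate layers strengthen the anonymity of the local network without degrading performance?
RQ5: How does pruning affect local computation cost, and can it cut that cost significantly without sacrificing utility or increasing privacy leakage?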
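
For a sense of scale on RQ5, a back-of-the-envelope count of multiply-accumulate operations (MACs) shows how channel pruning shrinks local work. The shapes are assumptions: VGG16's first two 3x3 convolution layers (3 to 64, then 64 to 64 channels) applied to 32x32 CIFAR-10 inputs with stride 1 and same padding, with only the first layer's width changed. MAC counts are a proxy and do not predict measured runtime exactly.

```python
def conv_macs(height: int, width: int, kernel: int, in_ch: int, out_ch: int) -> int:
    """Multiply-accumulate count for a stride-1, same-padded 2-D convolution."""
    return height * width * kernel * kernel * in_ch * out_ch


def first_two_layers(first_layer_channels: int, size: int = 32) -> int:
    """MACs for conv(3 -> c) followed by conv(c -> 64), both 3x3."""
    layer1 = conv_macs(size, size, 3, 3, first_layer_channels)
    layer2 = conv_macs(size, size, 3, first_layer_channels, 64)
    return layer1 + layer2


full = first_two_layers(64)    # unpruned first layer
pruned = first_two_layers(16)  # first layer cut to 16 channels
print(f"64 channels: {full:,} MACs")
print(f"16 channels: {pruned:,} MACs")
print(f"reduction:   {full / pruned:.1f}x")
```

Under these assumptions, cutting the first layer from 64 to 16 channels reduces the two-layer MAC count by a factor of 4, because both the first layer's output and the second layer's input shrink with it.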

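Key Findings

Compared with random selection without pruning, the LDA-based pruning strategy achieves 1.1% higher classification accuracy and 1.25 dB lower PSNR (indicating less privacy leakage).
Compared with characterization-based pruning, the LDA-based method reduces privacy loss by 0.45 dB at similar accuracy (within 0.5%), demonstrating higher efficiency.
Reducing the number of channels in the local NN's first convolutional layer from 64 to 16 significantly shortens runtime, with negligible impact on accuracy or privacy.
Progressively pruning all convolutional layers (halving their depth) keeps accuracy and privacy at similar levels while substantially reducing local computation.
Channel selection at intermediate layers markedly improves the anonymity of the local network: even an attacker who knows the pre-trained model has difficulty inferring its structure.
The framework balances utility and privacy across a range of settings, and the empirical results confirm the effectiveness of topology optimization under resource and privacy constraints.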
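
To read the dB figures above, it helps to recall the standard PSNR formula. The sketch below assumes PSNR is computed between an original image and an attacker's reconstruction from the uploaded features, so a lower value means a worse reconstruction and less leakage. The pixel values and the helper `mse_ratio_for_psnr_drop` are illustrative and not taken from the paper.

```python
import math
from typing import Sequence


def mse(original: Sequence[float], reconstructed: Sequence[float]) -> float:
    """Mean squared error between two equally sized pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def psnr(original: Sequence[float], reconstructed: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


def mse_ratio_for_psnr_drop(drop_db: float) -> float:
    """Factor by which MSE grows when PSNR falls by `drop_db` dB."""
    return 10.0 ** (drop_db / 10.0)


original = [0.0, 64.0, 128.0, 255.0]
close_guess = [2.0, 62.0, 130.0, 253.0]   # accurate reconstruction
poor_guess = [10.0, 54.0, 138.0, 245.0]   # degraded reconstruction

print(psnr(original, close_guess) > psnr(original, poor_guess))
print(round(mse_ratio_for_psnr_drop(1.25), 2))
```

By this formula, a 1.25 dB drop in PSNR corresponds to a reconstruction error (MSE) about 1.33 times larger, and a 0.45 dB drop to an error roughly 11% larger.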

This review was generated by AI and reviewed by human editors.