QUICK REVIEW

[Paper Review] When Machine Learning Meets Privacy: A Survey and Outlook

Bo Liu, Ming Ding|arXiv (Cornell University)|Nov 24, 2020

Privacy-Preserving Technologies in Data|References: 170|Cited by: 96

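ONE-SENTENCE SUMMARY

A comprehensive survey of privacy issues in machine learning that classifies the role of ML as protection target, protection tool, or attack tool, and outlines future research directions.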

ABSTRACT

The newly emerged machine learning (e.g. deep learning) methods have become a strong driving force to revolutionize a wide range of industries, such as smart healthcare, financial technology, and surveillance systems. Meanwhile, privacy has emerged as a big concern in this machine learning-based artificial intelligence era. It is important to note that the problem of privacy preservation in the context of machine learning is quite different from that in traditional data privacy protection, as machine learning can act as both friend and foe. Currently, the work on the preservation of privacy and machine learning (ML) is still in an infancy stage, as most existing solutions only focus on privacy problems during the machine learning process. Therefore, a comprehensive study on the privacy preservation problems and machine learning is required. This paper surveys the state of the art in privacy issues and solutions for machine learning. The survey covers three categories of interactions between privacy and machine learning: (i) private machine learning, (ii) machine learning aided privacy protection, and (iii) machine learning-based privacy attack and corresponding protection schemes. The current research progress in each category is reviewed and the key challenges are identified. Finally, based on our in-depth analysis of the area of privacy and machine learning, we point out future research directions in this field.

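MOTIVATION AND OBJECTIVES

Survey recent progress in machine learning privacy and identify the key challenges.
Divide the interactions between privacy and ML into three categories (private ML, ML-aided privacy protection, and ML-based privacy attacks).
Analyze privacy attacks against ML, the corresponding protection schemes, and the privacy implications of collaborative learning methods.
Provide guidance and future research directions for privacy-preserving ML.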

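PROPOSED APPROACH

Classify existing work by the role ML plays in privacy (private ML, ML-aided privacy protection, ML-based privacy attacks).
Review the attack models and protection schemes in private ML, covering model and data privacy under different threat settings (white-box and black-box).
Discuss encryption, obfuscation/differential privacy, and secure computation as privacy-preserving techniques for ML.
Explain distributed and collaborative learning frameworks (federated learning, split learning) and their privacy implications.
Summarize ML-aided privacy protection methods and ML-based privacy attacks to provide insights for future research.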
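
As a rough illustration of the obfuscation/differential privacy family listed above, the following is a minimal Python sketch of the Laplace mechanism applied to a counting query. It is not taken from the survey: the function names `laplace_noise` and `private_count`, the sample data, and the chosen epsilon are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon, rng):
    """Release a noisy count using the Laplace mechanism (hypothetical helper).

    A counting query has L1 sensitivity 1: adding or removing one record
    changes the count by at most 1, so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Usage: noisy count of people aged 40 or older (made-up data).
ages = [23, 35, 41, 58, 62, 19, 77]
noisy = private_count(ages, lambda a: a >= 40, epsilon=1.0, rng=random.Random())
print(f"noisy count: {noisy:.2f}")
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of accuracy. This sketch covers a single query only; training an ML model under differential privacy additionally requires tracking the cumulative privacy loss over many steps.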

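RESULTS

Research Questions

RQ1: What are the main privacy threats in ML systems, and what roles does ML play with respect to privacy (model/data privacy, ML as a protection tool, ML as an attack tool)?
RQ2: Which privacy-preserving techniques and architectures (encryption, differential privacy, secure multi-party computation, federated/split learning) can effectively protect ML models and data?
RQ3: How does existing work classify and compare attacks and protections across private ML, ML-aided privacy protection, and ML-based privacy attacks?
RQ4: What are the open challenges and future directions for privacy in ML settings with unstructured data and complex models?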

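Key Findings

Privacy in ML involves three interacting roles: ML as the protection target, ML as a protection tool, and ML as an attack tool.
Attacks on private ML focus on training data privacy and model privacy; the threats include model extraction, feature estimation, membership inference, and model memorization.
Differential privacy and the associated privacy accounting methods (moments accountant, Rényi DP) are central, but they have limitations in ML, especially for unstructured data.
Encryption (including homomorphic encryption) and secure multi-party computation can protect both data and models, but they incur significant computation and communication overhead.
Collaborative learning frameworks (federated learning, split learning) can reduce data exposure, but they also introduce their own privacy risks and defense requirements.
ML-aided privacy protection uses ML to identify privacy risks and adjust sharing policies, whereas ML-based privacy attacks exploit the capabilities of ML to infer sensitive data.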
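
To make the secure multi-party computation finding and its communication cost concrete, here is a minimal Python sketch of a secure sum built on additive secret sharing, run in a single process for illustration. It is an assumed textbook-style construction, not a protocol described in the survey; `MODULUS`, `share`, and `secure_sum` are hypothetical names, and the inputs are assumed to be non-negative integers whose sum is below `MODULUS`.

```python
import secrets
from typing import List

# Hypothetical modulus; it only needs to exceed the largest possible true sum.
MODULUS = 2**61 - 1


def share(value: int, n_parties: int) -> List[int]:
    """Split value into n_parties additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values: List[int]) -> int:
    """Simulate n parties computing the sum of their inputs without revealing them."""
    n = len(private_values)
    # Party i splits its value and sends share j to party j: n * n shares in total.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds the shares it received and publishes only this partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # The published partial sums reconstruct the total.
    return sum(partial_sums) % MODULUS


# Usage: three parties learn the total of their inputs, not each other's values.
print(secure_sum([12, 7, 30]))
```

Each individual share is uniformly random, so on its own it reveals nothing about the underlying input. The quadratic number of shares exchanged illustrates, at toy scale, the communication overhead noted above.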


This review was generated by AI and reviewed by a human editor.