QUICK REVIEW

[Paper Review] Membership Inference Attacks on Machine Learning: A Survey

Hongsheng Hu, Zoran Salčić | arXiv (Cornell University) | Mar 14, 2021

Adversarial Robustness in Machine Learning | Cited by 31

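One-Sentence Summary

This paper presents the first comprehensive survey of membership inference attacks (MIAs) on machine learning models and their defenses, covering taxonomies, datasets, evaluation metrics, and open resources.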

ABSTRACT

Machine learning (ML) models have been widely applied to various applications, including image classification, text generation, audio recognition, and graph data analysis. However, recent studies have shown that ML models are vulnerable to membership inference attacks (MIAs), which aim to infer whether a data record was used to train a target model or not. MIAs on ML models can directly lead to a privacy breach. For example, via identifying the fact that a clinical record that has been used to train a model associated with a certain disease, an attacker can infer that the owner of the clinical record has the disease with a high chance. In recent years, MIAs have been shown to be effective on various ML models, e.g., classification models and generative models. Meanwhile, many defense methods have been proposed to mitigate MIAs. Although MIAs on ML models form a newly emerging and rapidly growing research area, there has been no systematic survey on this topic yet. In this paper, we conduct the first comprehensive survey on membership inference attacks and defenses. We provide the taxonomies for both attacks and defenses, based on their characterizations, and discuss their pros and cons. Based on the limitations and gaps identified in this survey, we point out several promising future research directions to inspire the researchers who wish to follow this area. This survey not only serves as a reference for the research community but also provides a clear description for researchers outside this research domain. To further help the researchers, we have created an online resource repository, which we will keep updated with future relevant work. Interested readers can find the repository at https://github.com/HongshengHu/membership-inference-machine-learning-literature.

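Research Motivation and Objectives

Provide a comprehensive review of membership inference attacks on machine learning models and of the defenses against them.
Establish taxonomies of MIAs and defenses based on adversarial knowledge, target models, and methodology.
Summarize datasets, evaluation metrics, and open-source resources to support empirical research and benchmarking.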

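Proposed Approach

Divides MIAs into binary-classifier-based and metric-based approaches, and details the white-box and black-box settings.
Describes shadow training as a technique for constructing the training data of the attack model.
Explains binary-classifier attack models in the white-box and black-box settings in terms of their input representations.
Outlines metric-based MIAs that rely on prediction correctness, loss, confidence, and entropy.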
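The shadow-training step described above can be illustrated with a small, self-contained sketch. It is not the survey's implementation: the memorizing toy model, the one-feature threshold "attack model", and every name and parameter (`train_model`, `predict_proba`, `tau`, `smoothing`, `n_shadow`, `n_per_split`) are assumptions chosen so that the example runs with the Python standard library alone. In practice the attack model is usually a trained classifier over the full prediction vector rather than a single threshold.

```python
import math
import random

N_CLASSES = 2

def sample_records(rng, n):
    """Draw n toy records (x, y): x uniform in [0, 1), y a random binary label."""
    return [(rng.random(), rng.randrange(N_CLASSES)) for _ in range(n)]

def train_model(records):
    """Hypothetical stand-in for an overfit classifier: it memorizes its training set."""
    return list(records)

def predict_proba(model, x, tau=1e-6, smoothing=1e-3):
    """Kernel-weighted class probabilities; inputs close to a training record inherit its label."""
    scores = [smoothing] * N_CLASSES
    for xi, yi in model:
        scores[yi] += math.exp(-((x - xi) ** 2) / tau)
    total = sum(scores)
    return [s / total for s in scores]

def attack_feature(model, record):
    """Input to the attack model: probability assigned to the record's true label."""
    x, y = record
    return predict_proba(model, x)[y]

def build_attack_dataset(rng, n_shadow=4, n_per_split=40):
    """Shadow training: each shadow model yields labeled member (1) / non-member (0) examples."""
    features, labels = [], []
    for _ in range(n_shadow):
        members = sample_records(rng, n_per_split)
        non_members = sample_records(rng, n_per_split)
        shadow = train_model(members)  # the attacker knows exactly what this model saw
        for rec in members:
            features.append(attack_feature(shadow, rec))
            labels.append(1)
        for rec in non_members:
            features.append(attack_feature(shadow, rec))
            labels.append(0)
    return features, labels

def fit_attack_model(features, labels):
    """Simplest possible binary attack classifier: a threshold that maximizes training accuracy."""
    best_t, best_acc = 0.0, 0.0
    for t in sorted(set(features)):
        acc = sum((f >= t) == (m == 1) for f, m in zip(features, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def run_demo(seed=0, n_per_split=40):
    """Train the attack on shadow models, then evaluate it against a separate target model."""
    rng = random.Random(seed)
    features, labels = build_attack_dataset(rng, n_per_split=n_per_split)
    threshold = fit_attack_model(features, labels)
    target_members = sample_records(rng, n_per_split)
    target_non_members = sample_records(rng, n_per_split)
    target = train_model(target_members)  # its training data is never shown to the attacker
    correct = sum(attack_feature(target, r) >= threshold for r in target_members)
    correct += sum(attack_feature(target, r) < threshold for r in target_non_members)
    return correct / (2 * n_per_split)

if __name__ == "__main__":
    print("attack accuracy on the target model:", run_demo())
```

The attacker never sees the target model's training set. The member/non-member labels come only from shadow models trained on data drawn from the same distribution, which is the essence of shadow training.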

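Experimental Results

Research Questions

RQ1: What are the main categories and mechanisms of membership inference attacks on ML models?
RQ2: How do attackers construct effective MIAs under white-box and black-box access?
RQ3: What defenses exist, which datasets and metrics are used to study MIAs, and what are the future directions?
RQ4: Which open resources can facilitate benchmarking and reproduction of MIAs and defenses?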

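Key Findings

MIAs infer membership by exploiting differences in model behavior between training members and non-members.
Shadow training makes binary-classifier-based MIAs more effective by creating shadow models together with labeled member/non-member data.
White-box access gives the attacker more information than black-box access, but black-box MIAs can still be highly dangerous even when the attacker's knowledge is limited.
Metric-based MIAs decide membership by thresholding prediction correctness, loss, confidence, or entropy.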
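The four metric-based decision rules mentioned above can be sketched as follows. This is a minimal illustration, assuming the attacker can read the target model's probability vector for a record. The function names and the threshold values in the usage lines are hypothetical; in practice thresholds would be calibrated, for example on shadow-model outputs.

```python
import math

def correctness_attack(probs, label):
    """Predict 'member' when the model classifies the record correctly."""
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted == label

def loss_attack(probs, label, threshold):
    """Predict 'member' when the cross-entropy loss on the true label is at most the threshold."""
    loss = -math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)
    return loss <= threshold

def confidence_attack(probs, threshold):
    """Predict 'member' when the highest predicted probability reaches the threshold."""
    return max(probs) >= threshold

def entropy_attack(probs, threshold):
    """Predict 'member' when the prediction entropy is at most the threshold."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy <= threshold

if __name__ == "__main__":
    # Hypothetical model outputs: a confident, correct prediction versus an uncertain, wrong one.
    confident, confident_label = [0.97, 0.02, 0.01], 0
    uncertain, uncertain_label = [0.40, 0.35, 0.25], 1
    for probs, label in ((confident, confident_label), (uncertain, uncertain_label)):
        print(
            correctness_attack(probs, label),
            loss_attack(probs, label, threshold=0.5),
            confidence_attack(probs, threshold=0.9),
            entropy_attack(probs, threshold=0.5),
        )
```

All four rules rest on the same intuition: a model tends to be more accurate, more confident, and less uncertain on records it was trained on than on unseen records.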

This review was generated by AI and reviewed by human editors.