QUICK REVIEW

[Paper Review] Towards Demystifying Membership Inference Attacks

Stacey Truex, Ling Liu | arXiv (Cornell University) | Jun 28, 2018

Adversarial Robustness in Machine Learning | References: 35 | Cited by: 86

One-Sentence Summary

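This paper formalizes black-box membership inference attacks, builds a general attack framework based on shadow datasets and shadow models, and empirically analyzes transferability across models and datasets as well as data-driven vulnerability, including insider risks in federated learning.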

ABSTRACT

Membership inference attacks seek to infer membership of individual training instances of a model to which an adversary has black-box access through a machine learning-as-a-service API. In providing an in-depth characterization of membership privacy risks against machine learning models, this paper presents a comprehensive study towards demystifying membership inference attacks from two complementary perspectives. First, we provide a generalized formulation of the development of a black-box membership inference attack model. Second, we characterize the importance of model choice on model vulnerability through a systematic evaluation of a variety of machine learning models and model combinations using multiple datasets. Through formal analysis and empirical evidence from extensive experimentation, we characterize under what conditions a model may be vulnerable to such black-box membership inference attacks. We show that membership inference vulnerability is data-driven and corresponding attack models are largely transferable. Though different model types display different vulnerabilities to membership inference, so do different datasets. Our empirical results additionally show that (1) using the type of target model under attack within the attack model may not increase attack effectiveness and (2) collaborative learning exposes vulnerabilities to membership inference risks when the adversary is a participant. We also discuss countermeasure and mitigation strategies.

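Motivation and Objectives

Characterize membership privacy risks for machine learning services under black-box access.
Develop a general attack model framework built on shadow datasets and shadow models.
Evaluate how the target model type and the training data affect vulnerability to membership inference attacks.
Explore insider membership inference risks in federated learning settings.
Discuss countermeasures and mitigation strategies.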

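Proposed Method

A general attack model that formalizes black-box membership inference as a binary classification task.
Shadow dataset generation through API probing, to mimic the structure of the target's training data.
Attack-model training data created from shadow models, used to train a binary membership classifier.
Ensemble approaches to shadow model generation, explored to improve the generality and robustness of the attack.
A demonstration of data-driven vulnerability and of attack-model transferability across target models and datasets.
An examination of insider threats in federated learning as a membership inference risk.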
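
The shadow-model pipeline listed above can be illustrated with a small, self-contained sketch. It is not the paper's implementation: the toy nearest-neighbor "model", the synthetic Gaussian data, the threshold-based attack classifier, and every name in the code (`make_data`, `NearestNeighborModel`, `attack_records`, `fit_threshold`, `run_demo`) are assumptions chosen for illustration. The sketch also assumes the adversary can sample shadow data from the same distribution as the target's data, rather than synthesizing it through API probing, and it uses only the confidence assigned to the true label instead of the full prediction vector.

```python
import math
import random


def make_data(rng, n, dim=5):
    """Draw n labelled points from two overlapping Gaussian classes (synthetic)."""
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        # Class 1 is shifted by 1.0 along the first coordinate only.
        x = [rng.gauss(1.0 if (y == 1 and j == 0) else 0.0, 1.0) for j in range(dim)]
        data.append((x, y))
    return data


class NearestNeighborModel:
    """Toy stand-in for a target or shadow model that memorizes its training set."""

    def __init__(self, train):
        self.train = train

    def predict_proba(self, x):
        # Distance from x to the nearest training point of each class.
        nearest = [float("inf"), float("inf")]
        for xt, yt in self.train:
            nearest[yt] = min(nearest[yt], math.dist(x, xt))
        weights = [math.exp(-d) for d in nearest]
        total = sum(weights)
        return [w / total for w in weights]


def attack_records(model, members, non_members):
    """Query the model (black-box) and label each output 'in' (1) or 'out' (0)."""
    records = []
    for label, split in ((1, members), (0, non_members)):
        for x, y in split:
            records.append((model.predict_proba(x)[y], label))
    return records


def fit_threshold(records):
    """Train the simplest binary attack model: a threshold on confidence."""
    best_t, best_acc = 0.0, 0.0
    for t, _ in records:
        acc = sum((conf >= t) == bool(lbl) for conf, lbl in records) / len(records)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def run_demo(seed=0, n=100):
    rng = random.Random(seed)
    # Shadow side: the adversary's own data and model, with known membership.
    shadow_in, shadow_out = make_data(rng, n), make_data(rng, n)
    shadow_model = NearestNeighborModel(shadow_in)
    threshold = fit_threshold(attack_records(shadow_model, shadow_in, shadow_out))
    # Target side: disjoint data; membership is used only to score the attack.
    target_in, target_out = make_data(rng, n), make_data(rng, n)
    target_model = NearestNeighborModel(target_in)
    test = attack_records(target_model, target_in, target_out)
    accuracy = sum((conf >= threshold) == bool(lbl) for conf, lbl in test) / len(test)
    return threshold, accuracy


if __name__ == "__main__":
    threshold, accuracy = run_demo()
    print(f"learned threshold: {threshold:.3f}, attack accuracy on target: {accuracy:.3f}")
```

The attack classifier here is fit only on shadow-model outputs and then applied unchanged to the target model, which is the transfer step the framework relies on. A model that memorizes less than this toy one would be expected to leak less, so the accuracy printed by the sketch says nothing about the results reported in the paper.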

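Experimental Results

Research Questions

RQ1: Under what conditions is a model vulnerable to black-box membership inference attacks?
RQ2: How do the target model type, the training data, and attack data generation affect attack effectiveness and transferability?
RQ3: Do shadow datasets and shadow models reflect target model behavior accurately enough to enable an effective attack?
RQ4: What are the membership inference risks in federated learning settings, including insider threats?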

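Key Findings

Membership inference vulnerability is data-driven, and attack models are largely transferable across settings.
Different datasets and different target models yield different vulnerabilities, indicating that no single vulnerability pattern applies in every case.
Using the target model's type within the attack model does not necessarily improve attack effectiveness.
Collaborative or federated learning environments expose membership inference vulnerabilities when an insider participates.
Attacks constructed from shadow datasets and shadow models can be effective even under black-box access.
Countermeasures and mitigation strategies for these privacy risks are discussed.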

This review was generated by AI and reviewed by human editors.