QUICK REVIEW

[Paper Summary] MemGuard: Defending against Black-Box Membership Inference Attacks via Adversarial Examples

Jinyuan Jia, Ahmed Salem|arXiv (Cornell University)|Sep 23, 2019
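Adversarial Robustness in Machine Learning|Cited by 34
One-Sentence Summary

MemGuard defends against black-box membership inference attacks by adding carefully crafted adversarial noise to each confidence score vector, using a two-phase optimization with a formal utility-loss guarantee. It requires no retraining of the target classifier and achieves a better privacy-utility tradeoff than existing defenses.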

ABSTRACT

In a membership inference attack, an attacker aims to infer whether a data sample is in a target classifier's training dataset or not. Specifically, given a black-box access to the target classifier, the attacker trains a binary classifier, which takes a data sample's confidence score vector predicted by the target classifier as an input and predicts the data sample to be a member or non-member of the target classifier's training dataset. Membership inference attacks pose severe privacy and security threats to the training dataset. Most existing defenses leverage differential privacy when training the target classifier or regularize the training process of the target classifier. These defenses suffer from two key limitations: 1) they do not have formal utility-loss guarantees of the confidence score vectors, and 2) they achieve suboptimal privacy-utility tradeoffs. In this work, we propose MemGuard, the first defense with formal utility-loss guarantees against black-box membership inference attacks. Instead of tampering the training process of the target classifier, MemGuard adds noise to each confidence score vector predicted by the target classifier. Our key observation is that attacker uses a classifier to predict member or non-member and classifier is vulnerable to adversarial examples. Based on the observation, we propose to add a carefully crafted noise vector to a confidence score vector to turn it into an adversarial example that misleads the attacker's classifier. Our experimental results on three datasets show that MemGuard can effectively defend against membership inference attacks and achieve better privacy-utility tradeoffs than existing defenses. Our work is the first one to show that adversarial examples can be used as defensive mechanisms to defend against membership inference attacks.

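Motivation and Objectives

  • Establish the threat that membership inference attacks pose to black-box classifiers and the resulting privacy risk to their training datasets.
  • Propose MemGuard, a defense that adds noise to confidence scores and comes with a formal utility-loss guarantee.
  • Provide a two-phase method that designs and applies the noise while keeping the predicted label intact.
  • Show that MemGuard offers a better privacy-utility tradeoff than prior defenses on real-world datasets.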

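Proposed Method

  • MemGuard adds a noise vector to the predicted confidence score vector without retraining the target classifier.
  • Phase I designs a noise vector r that turns the confidence score vector into an adversarial example: it pushes the defender's own classifier, which stands in for the attacker's unknown classifier, toward random guessing while minimizing distortion subject to the constraints.
  • Phase II adds the noise vector with a probability chosen so that the expected distortion stays within the budget epsilon, with the predicted label left unchanged.
  • The method is cast as an optimization problem: minimize the attacker's inference accuracy while bounding the utility loss, measured as L1 distance, and keeping the output a valid probability distribution.
  • The defense relies on the transferability of adversarial examples to affect a black-box attacker without knowing the exact attack classifier.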
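The Phase II step can be illustrated with a minimal Python sketch. It assumes a noise vector `r` has already been produced by Phase I and that `g` is a hypothetical callable returning the defender's classifier output (the estimated probability of "member") for a score vector. The rule `p = min(epsilon / d, 1)` is an assumption chosen so that the expected L1 distortion `p * d` never exceeds `epsilon`; it is not presented here as the paper's exact solution, and all function names are illustrative.

```python
import numpy as np


def phase2_probability(g_s, g_noisy, distortion, epsilon):
    """Probability of releasing the noisy vector (illustrative rule, an assumption)."""
    # If the noise does not move the defender's classifier closer to 0.5
    # (random guessing), releasing it only costs utility, so never use it.
    if abs(g_s - 0.5) <= abs(g_noisy - 0.5):
        return 0.0
    if distortion == 0.0:
        return 1.0
    # Expected distortion is p * distortion, so this keeps it <= epsilon.
    return min(epsilon / distortion, 1.0)


def release_scores(s, r, g, epsilon, rng):
    """Return either s or s + r, chosen at random under the distortion budget."""
    s = np.asarray(s, dtype=float)
    noisy = s + np.asarray(r, dtype=float)

    # Utility constraints: still a probability distribution, same predicted label.
    valid = (
        np.all(noisy >= 0.0)
        and np.isclose(noisy.sum(), 1.0)
        and noisy.argmax() == s.argmax()
    )
    if not valid:
        return s

    distortion = np.abs(noisy - s).sum()  # L1 distance between s and s + r
    p = phase2_probability(g(s), g(noisy), distortion, epsilon)
    return noisy if rng.random() < p else s


if __name__ == "__main__":
    # Toy stand-in for the defender's classifier: more confident => "member".
    toy_g = lambda v: float(np.max(v))
    rng = np.random.default_rng(0)
    print(release_scores([0.9, 0.1], [-0.3, 0.3], toy_g, epsilon=0.3, rng=rng))
```

With `epsilon=0.3` and an L1 distortion of 0.6, the noisy vector is released about half the time, so the average distortion per query stays at the budget. A noise vector that would change the predicted label or leave the probability simplex is rejected outright and the original scores are returned.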

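Experimental Results

Research Questions

  • RQ1: Can MemGuard defend against black-box membership inference attacks while providing a formal utility-loss guarantee?
  • RQ2: How can adversarial examples be designed to satisfy the utility constraints on confidence scores and preserve the predicted label?
  • RQ3: Does MemGuard improve the privacy-utility tradeoff over existing defenses on real data?
  • RQ4: Is the defense still effective when the attacker mounts a black-box membership inference attack and may use adversarial training?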

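Key Findings

  • MemGuard effectively defends against state-of-the-art black-box membership inference attacks.
  • When larger noise is allowed, MemGuard lowers the attacker's inference accuracy more than existing defenses do at the same average distortion.
  • On the datasets tested, MemGuard provides a better privacy-utility tradeoff than prior defenses.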

This summary was generated by AI and reviewed by human editors.