QUICK REVIEW

[Paper Review] Are Your Sensitive Attributes Private? Novel Model Inversion Attribute Inference Attacks on Classification Models

Shagufta Mehnaz, Sayanton V. Dibbo|arXiv (Cornell University)|Jan 23, 2022
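Adversarial Robustness in Machine Learning | Cited by 20
One-Sentence Summary

This paper proposes two new black-box model inversion attribute inference attacks (one confidence score-based, one label-only) that outperform prior work, extends them to partial knowledge of non-sensitive attributes and to multiple sensitive attributes, and studies disparate vulnerability across groups.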

ABSTRACT

Increasing use of machine learning (ML) technologies in privacy-sensitive domains such as medical diagnoses, lifestyle predictions, and business decisions highlights the need to better understand if these ML technologies are introducing leakage of sensitive and proprietary training data. In this paper, we focus on model inversion attacks where the adversary knows non-sensitive attributes about records in the training data and aims to infer the value of a sensitive attribute unknown to the adversary, using only black-box access to the target classification model. We first devise a novel confidence score-based model inversion attribute inference attack that significantly outperforms the state-of-the-art. We then introduce a label-only model inversion attack that relies only on the model's predicted labels but still matches our confidence score-based attack in terms of attack effectiveness. We also extend our attacks to the scenario where some of the other (non-sensitive) attributes of a target record are unknown to the adversary. We evaluate our attacks on two types of machine learning models, decision tree and deep neural network, trained on three real datasets. Moreover, we empirically demonstrate the disparate vulnerability of model inversion attacks, i.e., specific groups in the training dataset (grouped by gender, race, etc.) could be more vulnerable to model inversion attacks.

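Motivation and Goals

  • Investigate whether black-box access to a classification model is enough to infer sensitive attributes of records in its training data.
  • Propose two new model inversion attribute inference (MIAI) attacks, one confidence score-based and one label-only, that outperform prior methods.
  • Extend the attacks to settings where the non-sensitive attributes are only partially known and where multiple sensitive attributes are inferred.
  • Evaluate the attacks against decision trees and deep neural networks trained on real tabular datasets, to assess privacy risk and differences across groups.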

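Proposed Methods

  • Design and implement a confidence score-based MIAI attack (CSMIA) that uses the model's confidence scores to infer the value of the sensitive attribute.
  • Develop a label-only MIAI attack (LOMIA) that relies only on predicted labels, not confidence scores, and show that its effectiveness is comparable to that of CSMIA.
  • Extend the attacks to handle partial knowledge of the non-sensitive attributes and to infer multiple sensitive attributes.
  • Propose evaluation metrics beyond accuracy (G-mean, MCC) to better assess inversion vulnerability.
  • Compare against baseline attacks (NaiveA, RandGA, FJRMIA), and evaluate performance on decision trees and deep networks trained on the GSS, Adult, and FiveThirtyEight datasets.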
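
The idea behind a confidence score-based attack can be illustrated with a minimal sketch. This is a simplified illustration, not the authors' implementation: the `infer_sensitive_value` function, the `(label, confidence)` query interface, the `toy_model` stand-in, and the fallback rule for the case where no candidate reproduces the true label are all assumptions made for this example. The adversary fills in each candidate value of the sensitive attribute, queries the black-box model, and keeps the candidate that best explains the record's known label.

```python
from typing import Callable, Dict, Hashable, Sequence, Tuple

# Hypothetical black-box interface: full record in, (predicted label, confidence) out.
Model = Callable[[Dict[str, Hashable]], Tuple[Hashable, float]]


def infer_sensitive_value(
    query_model: Model,
    known_attrs: Dict[str, Hashable],
    true_label: Hashable,
    sensitive_name: str,
    candidates: Sequence[Hashable],
) -> Hashable:
    """Guess the sensitive attribute by querying the model once per candidate value."""
    results = []
    for value in candidates:
        record = dict(known_attrs)          # copy so the caller's dict is untouched
        record[sensitive_name] = value      # plug in one candidate value
        label, confidence = query_model(record)
        results.append((value, label == true_label, confidence))

    correct = [(v, c) for v, ok, c in results if ok]
    if correct:
        # Among candidates that reproduce the true label, keep the most confident one.
        return max(correct, key=lambda item: item[1])[0]
    # Assumed fallback: every prediction is wrong, so keep the least confident one.
    return min(results, key=lambda item: item[2])[0]


def toy_model(record: Dict[str, Hashable]) -> Tuple[Hashable, float]:
    """Hypothetical stand-in for the target classifier, used only for the demo."""
    if record["married"] == "yes":
        if record["age"] > 30:
            return ("high_income", 0.9)
        return ("low_income", 0.6)
    return ("low_income", 0.8)


guess = infer_sensitive_value(
    toy_model, {"age": 40}, "high_income", "married", ["yes", "no"]
)
print(guess)  # yes
```

A label-only attack cannot use the confidence values at all, so it has to recover the same signal from predicted labels alone; this sketch does not attempt that.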

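Experimental Results

Research Questions

  • RQ1: Does black-box access to the target classifier significantly improve inference of a target individual's sensitive attribute, compared with baselines that do not use the model?
  • RQ2: Are the confidence score-based and label-only MIAI strategies comparable in effectiveness?
  • RQ3: How do partial knowledge of the non-sensitive attributes and multiple sensitive attributes affect attack performance?
  • RQ4: Do model inversion attacks show disparate vulnerability across demographic groups?
  • RQ5: Do the attacks transfer to data drawn from the same distribution but not in the training set (distributional privacy)?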

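Key Findings

  • CSMIA and LOMIA significantly outperform the existing state-of-the-art attacks on the tested datasets and models.
  • LOMIA uses only predicted labels, yet matches CSMIA in attack effectiveness.
  • The attacks remain effective when some non-sensitive attributes are unknown.
  • Disparate vulnerability is observed: some groups (for example, grouped by gender or race) may be more vulnerable to inversion attacks.
  • The attacks can compromise not only the privacy of the training data but also the distributional privacy of data drawn from the same distribution.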
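
One way to make disparate vulnerability concrete is to score the attack separately for each group, using the G-mean and MCC metrics named in the methods. The sketch below is an assumption-laden illustration for a binary sensitive attribute: the function names, the toy arrays, and the choice to return 0.0 when a metric is undefined are all hypothetical and do not come from the paper.

```python
import math
from collections import defaultdict


def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, tn, fn) for a binary sensitive attribute."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def g_mean(tp, fp, tn, fn):
    """Geometric mean of true positive rate and true negative rate."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def per_group_scores(y_true, y_pred, groups, positive):
    """Score the attack's guesses separately for each group label."""
    buckets = defaultdict(lambda: ([], []))
    for truth, guess, group in zip(y_true, y_pred, groups):
        buckets[group][0].append(truth)
        buckets[group][1].append(guess)
    scores = {}
    for group, (truths, guesses) in buckets.items():
        counts = confusion_counts(truths, guesses, positive)
        scores[group] = {"g_mean": g_mean(*counts), "mcc": mcc(*counts)}
    return scores


# Toy data (made up): true sensitive values, attack guesses, and group labels.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

scores = per_group_scores(y_true, y_pred, groups, positive=1)
print(scores["A"])  # {'g_mean': 1.0, 'mcc': 1.0}
print(scores["B"])  # {'g_mean': 0.5, 'mcc': 0.0}
```

In this toy data the attack recovers every sensitive value in group A and does no better than chance in group B, which is the kind of gap the disparate-vulnerability finding refers to.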


This review was generated by AI and checked by human editors.