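[Paper Interpretation] You are who you know and how you behave: attribute inference attacks via users' social friends and behaviors
This paper proposes a new attribute inference attack that combines a user's social friends with their behavioral records (for example, pages they liked or apps they reviewed) to infer private attributes such as location, occupation, and interests. By modeling social and behavioral data jointly, the attack correctly infers the city a user lived in for 57% of users in a real-world dataset of 1.1 million users; when the attacker targets only the half of users with the highest prediction confidence, the success rate exceeds 90%.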
We propose new privacy attacks to infer attributes (e.g., locations, occupations, and interests) of online social network users. Our attacks leverage seemingly innocent user information that is publicly available in online social networks to infer missing attributes of targeted users. Given the increasing availability of (seemingly innocent) user information online, our results have serious implications for Internet privacy - private attributes can be inferred from users' publicly available data unless we take steps to protect users from such inference attacks. To infer attributes of a targeted user, existing inference attacks leverage either the user's publicly available social friends or the user's behavioral records (e.g., the webpages that the user has liked on Facebook, the apps that the user has reviewed on Google Play), but not both. As we will show, such inference attacks achieve limited success rates. However, the problem becomes qualitatively different if we consider both social friends and behavioral records. To address this challenge, we develop a novel model to integrate social friends and behavioral records and design new attacks based on our model. We theoretically and experimentally demonstrate the effectiveness of our attacks. For instance, we observe that, in a real-world large-scale dataset with 1.1 million users, our attack can correctly infer the cities a user lived in for 57% of the users; via confidence estimation, we are able to increase the attack success rate to over 90% if the attacker selectively attacks a half of the users. Moreover, we show that our attack can correctly infer attributes for significantly more users than previous attacks.
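Research Motivation and Goals
- Address the limitation of existing attribute inference attacks, which rely on either social friends or behavioral data but not both.
- Investigate whether combining social network structure with user behavior improves the accuracy of attribute inference.
- Design a unified model that integrates social and behavioral data to enable more effective inference attacks.
- Evaluate the real-world feasibility and success rate of such joint attacks on large-scale social network data.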
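Proposed Method
- Develop a novel model that jointly encodes a user's social friends and behavioral records (e.g., liked pages and apps) into a unified representation.
- Use the integrated representation to predict a targeted user's private attributes, such as location, occupation, and interests.
- Apply confidence estimation to identify users whose predictions are most confident and attack those users selectively.
- Train and evaluate the attack on a large-scale real-world dataset of 1.1 million users.
- Compare the joint model against existing methods that use only social data or only behavioral data.
- Measure attack success with standard evaluation metrics, such as accuracy and AUC.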
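To make the idea of integrating friends and behaviors concrete, here is a minimal sketch. It is not the authors' model, which this summary does not specify in detail. It assumes a single graph whose nodes are users, behavior items, and attribute values, and it scores candidate attributes with a simple random walk with restart from the targeted user. All node names, the toy edges, and the parameters `restart` and `iters` are hypothetical.

```python
from collections import defaultdict


def build_graph(friend_edges, behavior_edges, attribute_edges):
    """Merge all three edge types into one undirected adjacency map."""
    adj = defaultdict(set)
    for a, b in [*friend_edges, *behavior_edges, *attribute_edges]:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def infer_attribute(adj, target, candidates, restart=0.15, iters=50):
    """Rank candidate attribute nodes by random walk with restart from target."""
    if target not in adj:
        raise ValueError("target has no friends or behaviors to walk from")
    scores = {node: 0.0 for node in adj}
    scores[target] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, mass in scores.items():
            # Spread this node's probability mass evenly over its neighbors.
            share = (1.0 - restart) * mass / len(adj[node])
            for neighbor in adj[node]:
                nxt[neighbor] += share
        nxt[target] += restart  # the walker jumps back to the target
        scores = nxt
    best = max(candidates, key=lambda c: scores.get(c, 0.0))
    return best, scores


# Toy data (made up for illustration): the target's city is unknown.
friends = [("user:target", "user:1"), ("user:target", "user:2"),
           ("user:2", "user:3")]
behaviors = [("user:target", "page:giants"), ("user:1", "page:giants"),
             ("user:2", "page:giants"), ("user:3", "page:broadway")]
attributes = [("user:1", "city:SF"), ("user:2", "city:SF"),
              ("user:3", "city:NYC")]

graph = build_graph(friends, behaviors, attributes)
best, scores = infer_attribute(graph, "user:target", ["city:SF", "city:NYC"])
print(best)
```

In this toy graph the target reaches `city:SF` through both friends and a shared liked page, so that candidate collects more of the walk's probability mass than `city:NYC`, which is reachable only through a friend of a friend.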
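Experimental Results
Research Questions
- RQ1: Does combining social friends and behavioral data significantly raise the success rate of attribute inference attacks compared with using either data type alone?
- RQ2: How effective is the proposed model at inferring private attributes such as the city a user lived in, occupation, and interests?
- RQ3: To what extent does confidence estimation improve attack performance by focusing on the users who are easiest to predict?
- RQ4: How well does the attack scale to real-world, large-scale social network data in terms of accuracy and coverage?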
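Key Findings
- In a real-world dataset of 1.1 million users, the proposed attack correctly infers the city a user lived in for 57% of users.
- With confidence estimation, selectively attacking the half of users with the highest prediction confidence raises the success rate to over 90%.
- The attack correctly infers attributes for significantly more users than previous attacks that rely on only social friends or only behavioral records.
- Integrating social and behavioral data brings a qualitative improvement in inference performance, showing the value of multimodal data in privacy attacks.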
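The effect of selective attacks can be illustrated with a short calculation. The sketch below assumes each prediction comes with a confidence score and a flag saying whether it was correct; the function name `selective_success_rate` and the four sample predictions are hypothetical and are not taken from the paper's data.

```python
def selective_success_rate(predictions, fraction=0.5):
    """Success rate when attacking only the top `fraction` of users by confidence.

    `predictions` is a list of (confidence, is_correct) pairs.
    """
    if not predictions:
        raise ValueError("no predictions to evaluate")
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))  # attack at least one user
    attacked = ranked[:k]
    return sum(1 for _, correct in attacked if correct) / k


# Made-up predictions: (confidence, whether the inferred city was right).
sample = [(0.9, True), (0.8, True), (0.4, False), (0.3, True)]

overall = selective_success_rate(sample, fraction=1.0)   # attack everyone
top_half = selective_success_rate(sample, fraction=0.5)  # attack the confident half
print(overall, top_half)
```

The trade-off is coverage: the attacker gives up on the less confident users in exchange for a higher success rate among the users who are attacked.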
This interpretation was generated by AI and reviewed by human editors.