QUICK REVIEW

[Paper Review] Adversarial Image Perturbation for Privacy Protection -- A Game Theory Perspective

Seong Joon Oh, Mario Fritz|arXiv (Cornell University)|Mar 28, 2017

Adversarial Robustness in Machine Learning | References: 35 | Cited by: 22

One-Sentence Summary

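This paper proposes a game-theoretic framework that models adversarial image perturbations (AIPs) as a privacy protection technique against automated recognition systems: treating the user and the recogniser as strategic players with opposing goals, it derives an optimal AIP strategy that guarantees an upper bound on the recognition rate whatever countermeasure the recogniser chooses, and it introduces selective AIPs that disrupt recognition by malicious models while preserving recognition accuracy for benign ones.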

ABSTRACT

Users like sharing personal photos with others through social media. At the same time, they might want to make automatic identification in such photos difficult or even impossible. Classic obfuscation methods such as blurring are not only unpleasant but also not as effective as one would expect. Recent studies on adversarial image perturbations (AIP) suggest that it is possible to confuse recognition systems effectively without unpleasant artifacts. However, in the presence of counter measures against AIPs, it is unclear how effective AIP would be in particular when the choice of counter measure is unknown. Game theory provides tools for studying the interaction between agents with uncertainties in the strategies. We introduce a general game theoretical framework for the user-recogniser dynamics, and present a case study that involves current state of the art AIP and person recognition techniques. We derive the optimal strategy for the user that assures an upper bound on the recognition rate independent of the recogniser's counter measure. Code is available at https://goo.gl/hgvbNK.

Research Motivation and Objectives

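Address the limitations of classic obfuscation methods such as blurring, which are visually unpleasant and less effective than expected against deep-learning-based recognition systems.
Analyse the strategic interaction between social media users seeking privacy and recognition systems attempting to identify the individuals in their images.
Use adversarial image perturbations (AIPs) to build a robust privacy protection mechanism that remains effective even when the recogniser applies countermeasures.
Model the user-recogniser interaction as a two-player game with uncertainty over strategies, and derive a mechanism with a provable privacy guarantee for the user.
Design selective AIPs that impair recognition by malicious models while preserving recognition accuracy for authorised (benign) models.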

Proposed Method

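Formalise the user-recogniser interaction as a two-player zero-sum game with incomplete information, in which the user applies AIPs and the recogniser applies countermeasures.
Introduce a game-theoretic payoff matrix that records the recognition rate for each combination of user AIP strategy and recogniser countermeasure.
Use game-theoretic equilibrium concepts to derive the user's optimal mixed strategy, which guarantees an upper bound on the recognition rate independent of the recogniser's choice.
Propose selective AIPs through a joint optimisation objective: the perturbation is optimised to degrade performance on a set of malicious models ($\mathcal{M}$) while preserving performance on a set of benign models ($\mathcal{B}$).
Adopt a multi-model optimisation framework with regularisation terms ($\lambda_k$) that balance the perturbation's effect across models, yielding selective robustness.
Validate the framework empirically with state-of-the-art models (e.g. AlexNet, VGG, GoogleNet, ResNet152) and standard image-processing countermeasures (e.g. blurring, noise, resizing).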
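
The payoff-matrix and mixed-strategy steps above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `solve_user_strategy`, the toy payoff values, and the choice of solver are all assumptions. An exact solution would normally come from a linear program; here the multiplicative-weights (Hedge) update against a best-responding recogniser is swapped in because it needs only the standard library and converges to the game value for payoffs in [0, 1].

```python
import math


def solve_user_strategy(payoff, rounds=5000):
    """Approximate the user's minimax mixed strategy for a zero-sum matrix game.

    payoff[i][j] is the recognition rate (in [0, 1]) when the user plays AIP
    strategy i and the recogniser plays countermeasure j. The user wants the
    rate low, the recogniser wants it high. Hypothetical helper, for illustration.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Standard Hedge step size for losses in [0, 1].
    eta = math.sqrt(8.0 * math.log(n_rows) / rounds)
    cum_loss = [0.0] * n_rows      # cumulative recognition rate per user strategy
    avg_strategy = [0.0] * n_rows  # running average of the user's mixed strategies

    for _ in range(rounds):
        # Hedge weights; subtracting the minimum keeps exp() numerically stable.
        base = min(cum_loss)
        weights = [math.exp(-eta * (c - base)) for c in cum_loss]
        total = sum(weights)
        p = [w / total for w in weights]

        # Recogniser best-responds to the user's current mixed strategy.
        col_rates = [sum(p[i] * payoff[i][j] for i in range(n_rows))
                     for j in range(n_cols)]
        j_star = max(range(n_cols), key=lambda j: col_rates[j])

        for i in range(n_rows):
            cum_loss[i] += payoff[i][j_star]
            avg_strategy[i] += p[i] / rounds

    # Worst-case recognition rate of the averaged strategy over all countermeasures.
    guarantee = max(sum(avg_strategy[i] * payoff[i][j] for i in range(n_rows))
                    for j in range(n_cols))
    return avg_strategy, guarantee


# Toy payoff matrix with made-up numbers (NOT results from the paper).
toy_payoff = [
    [0.10, 0.40, 0.30],
    [0.35, 0.05, 0.30],
    [0.25, 0.25, 0.20],
]
strategy, bound = solve_user_strategy(toy_payoff)
print("user mixed strategy:", [round(x, 3) for x in strategy])
print("guaranteed recognition-rate bound:", round(bound, 3))
```

The returned `guarantee` is the highest recognition rate any single countermeasure achieves against the averaged strategy, so it plays the role of the upper bound that holds independent of the recogniser's choice. With more rounds it approaches the exact game value.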

Experimental Results

Research Questions

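RQ1: Can adversarial image perturbations serve as a privacy protection technique that is robust against deep-learning-based recognition systems and free of unpleasant visual artifacts?
RQ2: How can the user ensure that the recognition rate stays bounded whichever countermeasure the recogniser chooses?
RQ3: How does limited knowledge of the recogniser's strategy space affect the user's privacy guarantee?
RQ4: Can selective AIPs be designed to impair recognition by malicious models while preserving functionality for benign models?
RQ5: How effective are selective AIPs under real-world image-processing countermeasures such as blurring and added noise?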

Key Findings

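The game-theoretic framework protects the user's privacy: even when the recogniser uses its best countermeasure, the recognition rate is guaranteed not to exceed 5.8%.
When the recogniser randomises over its strategy space, the user's optimal strategy can reduce the recognition rate to 3.4%, indicating stronger privacy protection under uncertainty.
With limited knowledge of the recogniser's strategy space, the user faces a higher recognition rate (e.g. 8.6%) if the recogniser exploits a countermeasure unknown to the user.
Selective AIPs reduce the recognition rate of a malicious model (e.g. GoogleNet) to 8.7% while a benign model (e.g. AlexNet) retains 97.9% accuracy (after image processing).
In a multi-model setting with two malicious and two benign models, selective AIPs reduce the malicious models' recognition rate to 17.7% while the benign models retain 97.7% accuracy (after processing).
Raising the perturbation budget from 1000 to 2000 improves robustness, reducing the malicious models' recognition rate to 3.8% in the multi-model case.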
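
The difference between a guaranteed bound and the rate observed against one particular recogniser strategy follows from the standard minimax formulation of a zero-sum matrix game. The notation here is an assumption introduced for illustration and is not taken from the paper: $A_{ij}$ is the recognition rate when the user plays AIP strategy $i$ and the recogniser plays countermeasure $j$, $p$ is the user's mixed strategy over the probability simplex $\Delta$, and $q$ is the recogniser's mixed strategy.

$$
v = \min_{p \in \Delta} \max_{j} \sum_{i} p_i A_{ij},
\qquad
\sum_{i,j} p^*_i A_{ij} q_j \le v \quad \text{for every mixed strategy } q,
$$

where $p^*$ attains the minimum. A guaranteed bound of this kind plays the role of $v$, and any rate measured against a specific recogniser strategy $q$ can only be equal to it or lower.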

This review was generated by AI and checked by a human editor.