QUICK REVIEW

[Paper Review] Not My Deepfake: Towards Plausible Deniability for Machine-Generated Media

Baiwu Zhang, Jin Zhou|arXiv (Cornell University)|Aug 20, 2020

Generative Adversarial Networks and Image Synthesis | Cited by 10

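One-Sentence Summary

This paper proposes a two-part framework that gives developers of generative models plausible deniability when a deepfake is attributed to them. It combines probabilistic attribution over each model's source of entropy (97.62% accuracy) with cryptographic logging, so that a developer accused of generating malicious media has transparent, verifiable evidence against a false attribution.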

ABSTRACT

Progress in generative modelling, especially generative adversarial networks, has made it possible to efficiently synthesize and alter media at scale. Malicious individuals now rely on these machine-generated media, or deepfakes, to manipulate social discourse. In order to ensure media authenticity, existing research is focused on deepfake detection. Yet, the very nature of frameworks used for generative modeling suggests that progress towards detecting deepfakes will enable more realistic deepfake generation. Therefore, it comes as no surprise that developers of generative models are under the scrutiny of stakeholders dealing with misinformation campaigns. As such, there is a clear need to develop tools that ensure the transparent use of generative modeling, while minimizing the harm caused by malicious applications. We propose a framework to provide developers of generative models with plausible deniability. We introduce two techniques to provide evidence that a model developer did not produce media that they are accused of producing. The first optimizes over the source of entropy of each generative model to probabilistically attribute a deepfake to one of the models. The second involves cryptography to maintain a tamper-proof and publicly broadcast record of all legitimate uses of the model. We evaluate our approaches on the seminal example of face synthesis, demonstrating that our first approach achieves 97.62% attribution accuracy, and is less sensitive to perturbations and adversarial examples. In cases where a machine learning approach is unable to provide plausible deniability, we find that involving cryptography as done in our second approach is required. We also discuss the ethical implications of our work, and highlight that a more meaningful legislative framework is required for a more transparent and ethical use of generative modeling.

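Motivation and Objectives

Address the growing threat of malicious deepfakes produced by advanced generative models.
Give developers verifiable tools to defend against false accusations of having produced manipulated media.
Reduce the incentive for malicious actors to exploit generative models by making plausible deniability possible.
Establish a transparent, tamper-proof mechanism for tracking legitimate model use.
Inform ethical and legislative frameworks for the responsible deployment of generative AI.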

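Proposed Method

The first technique optimizes over each generative model's source of entropy to attribute a deepfake probabilistically to one specific model.
The second technique uses cryptography to maintain a publicly broadcast, tamper-proof log of legitimate model outputs.
The entropy-based attribution is designed to be less sensitive to perturbations and adversarial examples.
The cryptographic log ensures that any tampering with the usage record can be detected and verified by third parties.
The framework is evaluated on face synthesis, a canonical deepfake application, to demonstrate its practical feasibility.
The system combines machine learning and cryptography to give model developers a defense-in-depth strategy.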
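
The attribution idea in the first technique can be illustrated with a minimal sketch. It assumes toy linear generators (`G(z) = W z`) in place of real GANs, plain gradient descent over the latent input `z`, and a softmax over reconstruction errors as the probabilistic score. All of these, including the names `invert` and `attribute`, are illustrative assumptions and not the paper's actual procedure.

```python
import numpy as np

def invert(W, x, steps=500):
    """Search the latent space of a toy generator G(z) = W @ z for the z
    that best reconstructs x, using gradient descent on ||W z - x||^2."""
    z = np.zeros(W.shape[1])
    # Step size 1/L, where L = 2 * sigma_max(W)^2 is the gradient's Lipschitz constant.
    lr = 1.0 / (2.0 * np.linalg.norm(W, 2) ** 2)
    for _ in range(steps):
        grad = 2.0 * W.T @ (W @ z - x)
        z -= lr * grad
    return z, float(np.sum((W @ z - x) ** 2))

def attribute(generators, x, temperature=1.0):
    """Score each candidate generator by how well it can reproduce x.
    The softmax over negative errors is an assumed scoring rule."""
    errors = np.array([invert(W, x)[1] for W in generators])
    logits = -errors / temperature
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return int(np.argmin(errors)), probs, errors

# Three hypothetical generators; the disputed sample comes from generator 1.
rng = np.random.default_rng(0)
generators = [rng.normal(size=(64, 8)) for _ in range(3)]
sample = generators[1] @ rng.normal(size=8)

best, probs, errors = attribute(generators, sample)
print("attributed to generator", best)
```

In this toy setting only the true source can reproduce the sample almost exactly, so its reconstruction error is far lower than the others. A developer whose model yields a high error for a disputed image would, under these assumptions, have some evidence that the image did not come from that model.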

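Experimental Results

Research Questions

RQ1: Can a probabilistic attribution system reliably identify which generative model produced a given deepfake?
RQ2: How robust is the attribution system to input perturbations and adversarial examples?
RQ3: When machine-learning-based deniability is unavailable, can a cryptographic log provide verifiable proof of legitimate use?
RQ4: What trade-offs does such a plausible-deniability framework make among accuracy, robustness, and practical deployment?
RQ5: How can such a system support the ethical and lawful use of generative AI models?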

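Key Findings

The entropy-based attribution method identifies the source model of a deepfake with 97.62% accuracy.
The attribution method is less sensitive to perturbations and adversarial examples than standard detection models.
When machine-learning-based attribution fails, the cryptographic log is a necessary fallback for plausible deniability.
The cryptographic component ensures that usage records cannot be altered undetected, which lets the public verify legitimate use.
The framework highlights the limits of purely machine-learning-based detection and the need for hybrid solutions that combine technical and legal measures.
The study stresses the importance of stronger legislative and ethical frameworks to govern the use of generative AI.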
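
To make the tamper-evidence finding concrete, here is a minimal sketch of a log in which each entry commits to the one before it. This review does not specify the paper's cryptographic construction, so the SHA-256 hash chain below is an assumed stand-in, and the helpers `append`, `verify`, and `was_logged` are hypothetical names used only for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(prev_hash, media_digest):
    """Hash an entry together with its predecessor's hash."""
    payload = json.dumps({"prev": prev_hash, "media": media_digest}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append(log, media_bytes):
    """Record a legitimate model output by its digest, chained to the last entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    media_digest = hashlib.sha256(media_bytes).hexdigest()
    log.append({"prev": prev_hash, "media": media_digest,
                "hash": entry_hash(prev_hash, media_digest)})

def verify(log):
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["prev"], entry["media"]):
            return False
        prev_hash = entry["hash"]
    return True

def was_logged(log, media_bytes):
    """Check whether a piece of media appears in the published log."""
    digest = hashlib.sha256(media_bytes).hexdigest()
    return any(entry["media"] == digest for entry in log)

log = []
append(log, b"legitimate output 1")
append(log, b"legitimate output 2")
print(verify(log), was_logged(log, b"disputed deepfake"))
```

A third party holding a published copy of the chain can rerun `verify` and `was_logged` independently. The sketch only detects edits to entries already recorded. It does not show that the log is complete or prevent the whole chain from being rebuilt, so a real deployment would also need the public broadcast step described in the abstract.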


This review was generated by AI and checked by a human editor.