QUICK REVIEW

[Paper Review] Not My Deepfake: Towards Plausible Deniability for Machine-Generated Media

Baiwu Zhang, Jin Zhou|arXiv (Cornell University)|Aug 20, 2020

Generative Adversarial Networks and Image Synthesis | Cited by 10

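One-Line Summary

This paper proposes a two-part framework that gives developers of generative models plausible deniability in the context of deepfake manipulation. It combines probabilistic attribution over each model's source of entropy (97.62% attribution accuracy) with cryptographic logging, so that a developer accused of generating malicious media can present evidence that they did not produce it. The result is a transparent, verifiable defense against false attribution.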

ABSTRACT

Progress in generative modelling, especially generative adversarial networks, has made it possible to efficiently synthesize and alter media at scale. Malicious individuals now rely on these machine-generated media, or deepfakes, to manipulate social discourse. In order to ensure media authenticity, existing research is focused on deepfake detection. Yet, the very nature of frameworks used for generative modeling suggests that progress towards detecting deepfakes will enable more realistic deepfake generation. Therefore, it comes as no surprise that developers of generative models are under the scrutiny of stakeholders dealing with misinformation campaigns. As such, there is a clear need to develop tools that ensure the transparent use of generative modeling, while minimizing the harm caused by malicious applications. We propose a framework to provide developers of generative models with plausible deniability. We introduce two techniques to provide evidence that a model developer did not produce media that they are being accused of. The first optimizes over the source of entropy of each generative model to probabilistically attribute a deepfake to one of the models. The second involves cryptography to maintain a tamper-proof and publicly-broadcasted record of all legitimate uses of the model. We evaluate our approaches on the seminal example of face synthesis, demonstrating that our first approach achieves 97.62% attribution accuracy, and is less sensitive to perturbations and adversarial examples. In cases where a machine learning approach is unable to provide plausible deniability, we find that involving cryptography as done in our second approach is required. We also discuss the ethical implications of our work, and highlight that a more meaningful legislative framework is required for a more transparent and ethical use of generative modeling.

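Motivation and Objectives

Address the growing threat of malicious deepfakes produced with advanced generative models.
Give developers verifiable tools to defend against false accusations of having created manipulated media.
Reduce the incentive for malicious actors to misuse generative models by enabling plausible deniability.
Establish a transparent, tamper-proof mechanism for tracking legitimate model use.
Outline the ethical and legislative frameworks needed to support responsible deployment of generative AI.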

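Proposed Method

The first technique optimizes over each generative model's source of entropy to attribute a deepfake probabilistically, and with high confidence, to the specific model that generated it.
The second technique uses cryptography to maintain a tamper-proof, publicly broadcast log of all legitimate model outputs.
The entropy-based attribution is less sensitive to perturbations and adversarial examples.
Cryptographic logging ensures that any tampering with the usage record is detectable and verifiable by third parties.
The framework is evaluated on face synthesis, the seminal deepfake application, which demonstrates its practical relevance.
The system combines machine learning and cryptography to give model developers a layered defense.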
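
The first technique can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes each "generative model" is a small fixed random linear map (the hypothetical `make_generator`), and it attributes a sample by searching each model's latent input for the best reconstruction, then picking the model with the smallest residual. The names `reconstruction_error` and `attribute`, the step count, and the learning rate are all illustrative assumptions. Turning the residuals into calibrated probabilities is not shown.

```python
import random


def make_generator(seed, latent_dim=2, out_dim=6):
    """Toy stand-in for a generative model (assumption): a fixed random linear map."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(latent_dim)] for _ in range(out_dim)]

    def generate(z):
        return [sum(w * zj for w, zj in zip(row, z)) for row in weights]

    return generate


def reconstruction_error(generate, target, latent_dim=2, steps=1000, lr=0.02):
    """Gradient descent over the latent input z to minimise ||generate(z) - target||^2."""
    eps = 1e-4
    z = [0.0] * latent_dim

    def loss(zv):
        return sum((a - b) ** 2 for a, b in zip(generate(zv), target))

    for _ in range(steps):
        grad = []
        for j in range(latent_dim):
            up = list(z)
            down = list(z)
            up[j] += eps
            down[j] -= eps
            # Central finite difference, so the generator is treated as a black box.
            grad.append((loss(up) - loss(down)) / (2 * eps))
        z = [zj - lr * g for zj, g in zip(z, grad)]
    return loss(z)


def attribute(generators, target):
    """Return the name of the model that reconstructs target best, plus all residuals."""
    errors = {name: reconstruction_error(g, target) for name, g in generators.items()}
    best = min(errors, key=errors.get)
    return best, errors


if __name__ == "__main__":
    models = {name: make_generator(seed) for name, seed in
              [("model_a", 1), ("model_b", 2), ("model_c", 3)]}
    sample = models["model_b"]([0.7, -1.2])
    # A small perturbation, to mimic a lightly edited copy of the sample.
    perturbed = [x + 0.01 * ((i % 3) - 1) for i, x in enumerate(sample)]
    print(attribute(models, perturbed))
```

A sample that a model actually produced should be reconstructable almost exactly from some latent input, while other models leave a large residual. In this toy setting a small perturbation only adds a small residual for the true model, which is the intuition behind the reduced sensitivity the review reports.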

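Experimental Results

Research Questions

RQ1: Can the probabilistic attribution system reliably identify the generative model that produced a given deepfake?
RQ2: How robust is the attribution system to input perturbations and adversarial examples?
RQ3: When machine-learning-based deniability fails, can cryptographic logging still prove legitimate model use?
RQ4: What trade-offs exist among accuracy, robustness, and practical deployment in such a deniability framework?
RQ5: How can such a system support ethically and legally appropriate use of generative AI models?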

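Key Findings

The entropy-based attribution method achieved 97.62% accuracy in identifying the source model of a deepfake.
The attribution approach was less sensitive to perturbations and adversarial examples than standard detection models.
When machine-learning-based attribution fails, cryptographic logging provides the necessary fallback for plausible deniability.
With cryptography in place, the usage record cannot be altered undetected, and legitimate use can be publicly verified by third parties.
The framework exposes the limits of purely ML-based detection and points to the need for hybrid technical and legal solutions.
The study stresses the need for stronger legislative and ethical frameworks governing the use of generative AI.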
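
The tamper-evident record described in the findings can be sketched as a hash chain. This is an illustrative stand-in, not the paper's actual protocol: the entry layout, the `GENESIS` constant, and the function names `append_entry`, `verify_log`, and `was_logged` are assumptions. It also assumes the latest entry hash is published somewhere the developer cannot rewrite, because a chain held only by the developer could be regenerated from scratch.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder for the hash before the first entry


def _entry_hash(index, prev_hash, media_digest):
    """Hash the entry fields in a canonical JSON form."""
    payload = json.dumps(
        {"index": index, "prev": prev_hash, "media": media_digest}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, media_bytes):
    """Record one legitimate output by storing its digest, chained to the previous entry."""
    index = len(log)
    prev_hash = log[-1]["hash"] if log else GENESIS
    media_digest = hashlib.sha256(media_bytes).hexdigest()
    entry = {
        "index": index,
        "prev": prev_hash,
        "media": media_digest,
        "hash": _entry_hash(index, prev_hash, media_digest),
    }
    log.append(entry)
    return entry


def verify_log(log):
    """Recompute every link; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for index, entry in enumerate(log):
        if entry["index"] != index or entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(index, entry["prev"], entry["media"]):
            return False
        prev_hash = entry["hash"]
    return True


def was_logged(log, media_bytes):
    """Check whether a disputed file matches any recorded output digest."""
    digest = hashlib.sha256(media_bytes).hexdigest()
    return any(entry["media"] == digest for entry in log)
```

Under these assumptions, a developer accused of producing a file would point a third party at the published log: the third party runs `verify_log` to confirm the record is intact and `was_logged` to confirm the disputed file is absent from it. An exact-digest match only covers byte-identical files, so re-encoded or edited copies would need a different matching scheme.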

This review was generated by AI and checked by a human editor.