QUICK REVIEW

[Paper Review] Exposing DeepFake Videos By Detecting Face Warping Artifacts

Yuezun Li, Siwei Lyu | arXiv (Cornell University) | Nov 1, 2018

Digital Media Forensic Detection | References: 36 | Cited by: 571

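One-Sentence Summary

This paper proposes a CNN-based method that detects DeepFake videos by exploiting the artifacts left by affine face warping. It trains on synthetic negative examples produced with simple image processing rather than on images generated by a trained DeepFake model.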

ABSTRACT

In this work, we describe a new deep learning based method that can effectively distinguish AI-generated fake videos (referred to as DeepFake videos hereafter) from real videos. Our method is based on the observation that current DeepFake algorithms can only generate images of limited resolution, which need to be further warped to match the original faces in the source video. Such transforms leave distinctive artifacts in the resulting DeepFake videos, and we show that they can be effectively captured by convolutional neural networks (CNNs). Compared to previous methods, which use a large amount of real and DeepFake-generated images to train a CNN classifier, our method does not need DeepFake-generated images as negative training examples, since we target the artifacts in affine face warping as the distinctive feature to distinguish real and fake images. The advantages of our method are two-fold: (1) Such artifacts can be simulated directly using simple image processing operations on an image to make it a negative example. Since training a DeepFake model to generate negative examples is time-consuming and resource-demanding, our method saves plenty of time and resources in training data collection; (2) Since such artifacts generally exist in DeepFake videos from different sources, our method is more robust compared to others. Our method is evaluated on two sets of DeepFake video datasets for its effectiveness in practice.

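Motivation and Objectives

Make DeepFake detection more robust by targeting artifacts introduced by the face synthesis pipeline.
Exploit the insight that DeepFake synthesis produces fixed-size face images, which must then be warped to match the target face.
Simulate the warping artifacts with image processing, removing the need for actual DeepFake data as negative examples.
Demonstrate robustness across different DeepFake sources by focusing on warping artifacts that they share.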

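Proposed Method

Detect faces and extract the face region using facial landmarks, which are used to determine the affine transformation matrix.
Simulate negative examples by aligning the face to multiple scales, applying Gaussian blur, and then affine-warping it back to the original size.
Make the data more realistic by varying color, brightness, contrast, distortion, and polygon-based face shapes.
Crop regions of interest (RoIs) covering the face and its surrounding area, rescale them to 224x224, and train CNNs (VGG16, ResNet50/101/152).
At inference time, sample the RoI 10 times per image and average the CNN outputs to obtain the final fake probability.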
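The negative-example simulation described above (shrink the face, blur it, warp it back) can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the authors' implementation: the face is a grayscale crop, the affine transform is reduced to a pure scaling (the paper derives it from facial landmarks), and the helper names `gaussian_blur`, `warp_affine`, and `simulate_warped_face` are hypothetical.

```python
import numpy as np


def gaussian_blur(image, size=5, sigma=1.0):
    """Separable Gaussian blur of a 2-D float image, with edge padding."""
    offsets = np.arange(size) - size // 2
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image, size // 2, mode="edge")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def warp_affine(image, matrix, out_shape):
    """Warp with a 2x3 matrix that maps OUTPUT (x, y) to INPUT (x, y).

    Pixels are read with bilinear interpolation; coordinates are clipped
    to the image border.
    """
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out].astype(float)
    src_x = matrix[0, 0] * xs + matrix[0, 1] * ys + matrix[0, 2]
    src_y = matrix[1, 0] * xs + matrix[1, 1] * ys + matrix[1, 2]
    src_x = np.clip(src_x, 0, image.shape[1] - 1)
    src_y = np.clip(src_y, 0, image.shape[0] - 1)
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, image.shape[1] - 1)
    y1 = np.minimum(y0 + 1, image.shape[0] - 1)
    fx, fy = src_x - x0, src_y - y0
    top = image[y0, x0] * (1 - fx) + image[y0, x1] * fx
    bottom = image[y1, x0] * (1 - fx) + image[y1, x1] * fx
    return top * (1 - fy) + bottom * fy


def simulate_warped_face(face, scale=0.5, blur_size=5, sigma=1.0):
    """Shrink a face crop, blur it, and warp it back to the original size."""
    h, w = face.shape
    small_shape = (max(1, int(h * scale)), max(1, int(w * scale)))
    # Assumption: pure scaling stands in for the landmark-based affine transform.
    to_small = np.array([[1.0 / scale, 0.0, 0.0], [0.0, 1.0 / scale, 0.0]])
    to_full = np.array([[scale, 0.0, 0.0], [0.0, scale, 0.0]])
    small = warp_affine(face, to_small, small_shape)
    blurred = gaussian_blur(small, blur_size, sigma)
    return warp_affine(blurred, to_full, (h, w))


rng = np.random.default_rng(0)
face = rng.random((64, 64))  # stand-in for an aligned grayscale face crop
negative = simulate_warped_face(face, scale=0.5)
```

Calling `simulate_warped_face` with several values of `scale` would mimic the "multiple scales" step. The remaining steps in the list (blending the warped face back into the frame with a face-shaped mask, and the color and brightness augmentation) are left out of this sketch.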
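The inference step above (10 RoI samples per image, averaged) can be illustrated as follows. This is only a sketch: the summary does not say how the RoIs are sampled, so the random margin scheme in `sample_roi` is an assumption, and `classify_roi` is a hypothetical stand-in for cropping the RoI, rescaling it to 224x224, and running the trained CNN.

```python
import random
from statistics import mean


def sample_roi(face_box, image_size, max_margin=0.3, rng=random):
    """Expand a face box (x0, y0, x1, y1) by random margins, clipped to the image.

    Assumption: each side grows by a uniform fraction of the face size so the
    RoI also covers some of the area surrounding the face.
    """
    x0, y0, x1, y1 = face_box
    width, height = image_size
    w, h = x1 - x0, y1 - y0
    left, top, right, bottom = (rng.uniform(0.0, max_margin) for _ in range(4))
    return (
        max(0, int(x0 - left * w)),
        max(0, int(y0 - top * h)),
        min(width, int(x1 + right * w)),
        min(height, int(y1 + bottom * h)),
    )


def averaged_fake_probability(classify_roi, face_box, image_size,
                              num_samples=10, rng=random):
    """Average the classifier's fake probability over several sampled RoIs."""
    rois = [sample_roi(face_box, image_size, rng=rng) for _ in range(num_samples)]
    return mean([classify_roi(roi) for roi in rois])


def dummy_classifier(roi):
    """Hypothetical stand-in for the CNN: returns a fixed fake probability."""
    return 0.8


score = averaged_fake_probability(
    dummy_classifier, face_box=(40, 50, 140, 170), image_size=(256, 256),
    rng=random.Random(0),
)
```

Averaging over several crops makes the final score less sensitive to exactly where the RoI boundary falls around the face.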

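Experimental Results

Research Questions

RQ1: Can a CNN reliably detect the affine face-warping artifacts introduced by the DeepFake pipeline?
RQ2: Are synthetic (non-DeepFake) negative examples sufficient to train a robust detector?
RQ3: Which CNN architectures best exploit warping-artifact cues to achieve high detection performance on public DeepFake datasets?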

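Key Findings

ResNet50 performs best on UADFV, with an AUC of 97.4% on the image-level benchmark and 98.7% on the video-level benchmark.
ResNet101 and ResNet152 also perform well, with AUCs of roughly 95% to 99% on UADFV images and 97% to 99% in the video tests.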
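On DeepfakeTIMIT LQ, ResNet50 reaches an image-level AUC of 99.9%, clearly ahead of the other methods compared.
On DeepfakeTIMIT HQ, ResNet50 reaches an AUC of 93.2% and ResNet152 reaches 91.2%, showing robust performance across quality settings.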
The method outperforms Two-stream NN, the MesoNet variants, and HeadPose on both datasets, underscoring its robustness to DeepFake variants.
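Because the findings above are reported as image-level and video-level AUC, a small worked example may help show how the two differ. The sketch below computes AUC by pairwise comparison of fake and real scores. The summary does not state how frame scores are aggregated per video, so averaging them is an assumption, and the scores, labels, and video ids are made-up toy values, not results from the paper.

```python
from statistics import mean


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5).

    labels: 1 for fake, 0 for real; scores: predicted fake probability.
    """
    fake = [s for s, y in zip(scores, labels) if y == 1]
    real = [s for s, y in zip(scores, labels) if y == 0]
    if not fake or not real:
        raise ValueError("need at least one fake and one real sample")
    wins = sum(
        1.0 if f > r else 0.5 if f == r else 0.0
        for f in fake
        for r in real
    )
    return wins / (len(fake) * len(real))


def video_level_scores(frame_scores, frame_video_ids):
    """Average per-frame fake probabilities within each video (assumed rule)."""
    grouped = {}
    for score, video_id in zip(frame_scores, frame_video_ids):
        grouped.setdefault(video_id, []).append(score)
    return {video_id: mean(vals) for video_id, vals in grouped.items()}


# Toy data: three frames from a fake video "a", three from a real video "b".
frame_scores = [0.9, 0.8, 0.4, 0.2, 0.3, 0.6]
frame_labels = [1, 1, 1, 0, 0, 0]
frame_videos = ["a", "a", "a", "b", "b", "b"]

image_auc = auc(frame_scores, frame_labels)  # 8 of 9 fake/real pairs ranked correctly

per_video = video_level_scores(frame_scores, frame_videos)
video_auc = auc([per_video["a"], per_video["b"]], [1, 0])
```

In this toy case one misranked frame pair lowers the image-level AUC to 8/9, while the averaged video scores still separate the two videos. That is one plausible reason video-level numbers can come out higher than image-level ones.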

This review was generated by AI and reviewed by human editors.