QUICK REVIEW

[Paper Review] Unmasking DeepFakes with simple Features

Ricard Durall, Margret Keuper | arXiv (Cornell University) | Nov 2, 2019
Digital Media Forensic Detection | References: 27 | Cited by: 174
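One-Sentence Summary

The paper detects DeepFakes with a simple frequency-domain feature (the 1D power spectrum obtained by azimuthally averaging the DFT power spectrum) and lightweight classifiers, reaching very high accuracy on high- and medium-resolution images and remaining robust in the unsupervised setting.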

ABSTRACT

Deep generative models have recently achieved impressive results for many real-world applications, successfully generating high-resolution and diverse samples from complex datasets. Due to this improvement, fake digital contents have proliferated growing concern and spreading distrust in image content, leading to an urgent need for automated ways to detect these AI-generated fake images. Despite the fact that many face editing algorithms seem to produce realistic human faces, upon closer examination, they do exhibit artifacts in certain domains which are often hidden to the naked eye. In this work, we present a simple way to detect such fake face images - so-called DeepFakes. Our method is based on a classical frequency domain analysis followed by basic classifier. Compared to previous systems, which need to be fed with large amounts of labeled data, our approach showed very good results using only a few annotated training samples and even achieved good accuracies in fully unsupervised scenarios. For the evaluation on high resolution face images, we combined several public datasets of real and fake faces into a new benchmark: Faces-HQ. Given such high-resolution images, our approach reaches a perfect classification accuracy of 100% when it is trained on as little as 20 annotated samples. In a second experiment, in the evaluation of the medium-resolution images of the CelebA dataset, our method achieves 100% accuracy supervised and 96% in an unsupervised setting. Finally, evaluating a low-resolution video sequences of the FaceForensics++ dataset, our method achieves 91% accuracy detecting manipulated videos. Source Code: https://github.com/cc-hpc-itwm/DeepFakeDetection

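Motivation and Objectives

  • Motivate a lightweight, data-efficient method for detecting AI-generated fake faces.
  • Exploit frequency-domain artifacts to separate real from fake images without large amounts of labeled data.
  • Introduce Faces-HQ, a high-resolution dataset of real and fake faces for evaluation.
  • Demonstrate robustness on high-, medium-, and low-resolution data (images and video).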

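Proposed Method

  • Compute the discrete Fourier transform of the grayscale face image.
  • Azimuthally average the FFT power spectrum to obtain a 1D feature vector (722 features).
  • Fit simple models on the 1D power-spectrum features: an SVM with an RBF kernel and logistic regression as classifiers, and K-Means for the unsupervised setting.
  • Evaluate in supervised and unsupervised settings on several datasets (Faces-HQ, CelebA, FaceForensics++).
  • For video data, interpolate the 1D spectrum to a fixed size before classification.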
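
The first two steps and the final interpolation step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' released code: the function names `azimuthal_power_spectrum` and `resample_spectrum` are hypothetical, the radius is binned to integers, linear interpolation is assumed for the resampling, and any scaling or normalization of the spectrum is left out.

```python
import numpy as np


def azimuthal_power_spectrum(gray: np.ndarray) -> np.ndarray:
    """Return the azimuthally averaged power spectrum of a 2D grayscale image."""
    # 2D DFT with the zero-frequency component shifted to the center.
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    power = np.abs(spectrum) ** 2

    # Integer radial distance of every frequency bin from the center
    # (assumption: bins are grouped by the floor of their radius).
    h, w = power.shape
    y, x = np.indices((h, w))
    radius = np.hypot(y - h // 2, x - w // 2).astype(np.int64)

    # Mean power over all bins that share the same integer radius.
    total = np.bincount(radius.ravel(), weights=power.ravel())
    count = np.bincount(radius.ravel())
    return total / np.maximum(count, 1)


def resample_spectrum(profile: np.ndarray, size: int) -> np.ndarray:
    """Linearly interpolate a 1D spectrum to a fixed number of points."""
    src = np.linspace(0.0, 1.0, num=profile.size)
    dst = np.linspace(0.0, 1.0, num=size)
    return np.interp(dst, src, profile)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((128, 128))  # random stand-in for a grayscale face crop
    profile = azimuthal_power_spectrum(image)
    fixed = resample_spectrum(profile, 64)  # 64 is an arbitrary target length
    print(profile.shape, fixed.shape)
```

The resulting 1D vector is what a classifier would consume. Its length depends on the image size and on the binning convention, so this sketch will not necessarily reproduce the exact feature count quoted above.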

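Experimental Results

Research Questions

  • RQ1: Can simple frequency-domain features reveal artifacts of GAN-generated faces across different resolutions?
  • RQ2: How data-efficient and accurate are lightweight classifiers built on 1D power-spectrum features?
  • RQ3: How does the method perform on high-, medium-, and low-resolution data (images and video) relative to earlier deep-learning detectors?

Key Findings

  • The high-resolution Faces-HQ evaluation reaches 100% accuracy with only 20 annotated samples.
  • The medium-resolution CelebA results reach 100% accuracy in the supervised setting and 96% in the unsupervised setting.
  • The low-resolution FaceForensics++ video evaluation reaches 90% accuracy in frame-by-frame detection (the abstract reports 91% for detecting manipulated videos).
  • SVM and logistic regression consistently achieve near-perfect performance when enough samples are available; K-Means performs worse but remains competitive in some settings.
  • Grouping the frequency components into sub-bands shows that certain higher-frequency bands drive the discrimination (for example, the 100–300 range reaches 0.86–1.00 accuracy in some settings).
  • The method remains robust across data sources and GAN types, relying on frequency-domain artifacts rather than on large amounts of labeled training data.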
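
The sub-band finding can be illustrated with a small experiment: train on one contiguous slice of the 1D spectrum at a time and compare accuracies. The sketch below is only an illustration under loud assumptions. The data are synthetic (class 1 simply gets extra energy above index 200, an arbitrary choice), a nearest-centroid rule stands in for the SVM and logistic regression used in the paper, and all function names are hypothetical. Its numbers say nothing about the paper's actual results.

```python
import numpy as np


def nearest_centroid_accuracy(train_x, train_y, test_x, test_y) -> float:
    """Fit one mean vector per class and score test samples by closest centroid."""
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in (0, 1)])
    # Squared Euclidean distance from each test sample to each centroid.
    dists = ((test_x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((dists.argmin(axis=1) == test_y).mean())


def band_accuracies(x, y, band_width, n_train, seed=0):
    """Accuracy per contiguous frequency band of the 1D spectrum features."""
    rng = np.random.default_rng(seed)
    # Stratified split: n_train // 2 labeled samples per class, the rest for testing.
    train = np.concatenate(
        [rng.permutation(np.flatnonzero(y == c))[: n_train // 2] for c in (0, 1)]
    )
    test = np.setdiff1d(np.arange(len(y)), train)
    results = {}
    for start in range(0, x.shape[1], band_width):
        stop = min(start + band_width, x.shape[1])
        band = slice(start, stop)
        results[(start, stop)] = nearest_centroid_accuracy(
            x[train, band], y[train], x[test, band], y[test]
        )
    return results


def make_toy_spectra(n_per_class=100, n_features=300, seed=0):
    """Synthetic stand-in data: class 1 differs from class 0 only above index 200."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, size=(2 * n_per_class, n_features))
    y = np.repeat([0, 1], n_per_class)
    x[y == 1, 200:] += 3.0
    return x, y


if __name__ == "__main__":
    x, y = make_toy_spectra()
    for band, acc in band_accuracies(x, y, band_width=100, n_train=20).items():
        print(band, round(acc, 2))
```

On this toy data only the band that contains the planted difference separates the classes, while the other bands stay near chance, which is the kind of pattern a per-band analysis is meant to expose.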


This review was generated by AI and checked by a human editor.