QUICK REVIEW

[Paper Review] Learning Rich Features for Image Manipulation Detection

Peng Zhou, Xintong Han|arXiv (Cornell University)|May 13, 2018

Digital Media Forensic Detection | References: 24 | Cited by: 82

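One-Sentence Summary

A two-stream Faster R-CNN detects tampered regions by combining RGB tampering artifacts with SRM-based noise features, achieving state-of-the-art results on multiple datasets while remaining robust to resizing and compression.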

ABSTRACT

Image manipulation detection is different from traditional semantic object detection because it pays more attention to tampering artifacts than to image content, which suggests that richer features need to be learned. We propose a two-stream Faster R-CNN network and train it end-to-end to detect the tampered regions given a manipulated image. One of the two streams is an RGB stream whose purpose is to extract features from the RGB image input to find tampering artifacts like strong contrast difference, unnatural tampered boundaries, and so on. The other is a noise stream that leverages the noise features extracted from a steganalysis rich model filter layer to discover the noise inconsistency between authentic and tampered regions. We then fuse features from the two streams through a bilinear pooling layer to further incorporate spatial co-occurrence of these two modalities. Experiments on four standard image manipulation datasets demonstrate that our two-stream framework outperforms each individual stream, and also achieves state-of-the-art performance compared to alternative methods with robustness to resizing and compression.

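Motivation and Objectives

Learn rich features that detect tampering artifacts rather than relying on image content alone.
Propose a two-stream architecture that integrates RGB visual cues with noise-based features for manipulation localization.
Train end-to-end to localize tampered regions and classify the manipulation type.
Demonstrate robustness to common post-processing such as resizing and JPEG compression.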

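Proposed Method

A two-stream Faster R-CNN consisting of an RGB stream and a noise stream based on SRM filtering.
RPN proposals are generated from the RGB features to localize regions that may have been tampered with.
The noise stream passes the RGB input through an SRM filter layer to extract local noise features.
Bilinear pooling, f_RGB^T f_N, combines the RoI features from the two streams for manipulation classification.
Compact bilinear pooling is used to reduce memory usage while preserving feature interactions.
The loss function combines the RPN loss, the manipulation classification loss, and the bounding-box regression loss.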
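The noise-stream filtering and the bilinear fusion described above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the authors' implementation: the 3x3 kernel is one second-order high-pass filter of the kind used in SRM (this review does not list the exact kernels), the signed square root and L2 normalization are a common post-processing step for bilinear features and are assumed here, and the function names `noise_residual`, `bilinear_pool`, and `normalize` are hypothetical.

```python
import math

# Assumed example kernel: a second-order high-pass filter of the kind used in SRM.
# Its coefficients sum to zero, so flat image regions produce a zero residual.
SRM_KERNEL = [
    [-1,  2, -1],
    [ 2, -4,  2],
    [-1,  2, -1],
]
SRM_SCALE = 4.0


def noise_residual(image):
    """Valid-mode 2D correlation of a grayscale image (list of rows) with SRM_KERNEL."""
    height, width = len(image), len(image[0])
    k = len(SRM_KERNEL)
    out = []
    for y in range(height - k + 1):
        row = []
        for x in range(width - k + 1):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    acc += SRM_KERNEL[dy][dx] * image[y + dy][x + dx]
            row.append(acc / SRM_SCALE)
        out.append(row)
    return out


def bilinear_pool(f_rgb, f_noise):
    """Sum over RoI locations of the outer product f_rgb^T f_noise.

    f_rgb:   list of locations, each a list of C_rgb channel values.
    f_noise: list of the same locations, each a list of C_n channel values.
    Returns a flattened vector of length C_rgb * C_n.
    """
    c_rgb, c_n = len(f_rgb[0]), len(f_noise[0])
    pooled = [0.0] * (c_rgb * c_n)
    for rgb_vec, noise_vec in zip(f_rgb, f_noise):
        for i in range(c_rgb):
            for j in range(c_n):
                pooled[i * c_n + j] += rgb_vec[i] * noise_vec[j]
    return pooled


def normalize(vec):
    """Signed square root followed by L2 normalization (assumed post-processing)."""
    rooted = [math.copysign(math.sqrt(abs(v)), v) for v in vec]
    norm = math.sqrt(sum(v * v for v in rooted))
    return [v / norm for v in rooted] if norm > 0 else rooted


# Usage: one RoI with two spatial locations and two channels per stream.
fused = normalize(bilinear_pool([[1, 0], [0, 1]], [[1, 1], [2, 2]]))
```

The pooled vector grows as C_rgb * C_n, which is why a compact approximation becomes attractive once each stream has hundreds of channels.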

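Experimental Results

Research Questions

RQ1: Can a two-stream architecture that exploits RGB tampering artifacts and local noise inconsistency outperform single-stream methods in image manipulation detection?
RQ2: What are the benefits of fusing RGB and noise features through bilinear pooling for localization and manipulation classification?
RQ3: How robust is the proposed method to common post-processing such as resizing and JPEG compression?
RQ4: Can the model distinguish between different manipulation techniques (splicing, removal, copy-move) across datasets?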

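Key Findings

The two-stream RGB-N network outperforms each individual stream on four standard datasets.
In this setting, RGB features are better suited than noise features for generating RPN proposals.
Fusion via bilinear pooling outperforms the late-fusion baseline in manipulation classification and localization.
In synthetic pre-training (the COCO-based pre-training setup), RGB-N reaches an AP of 0.627, compared with 0.445 for the RGB-only stream and 0.461 for the noise-only stream.
Across datasets, RGB-N generally achieves higher pixel-level F1 and AUC than several baselines, with notable gains on the NIST16, Columbia, COVER, and CASIA datasets.
Compared with the baselines, the method is robust to changes in JPEG quality and to resizing attacks.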
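For readers unfamiliar with the pixel-level F1 metric cited in these findings, the sketch below shows one way it could be computed from a predicted score map and a binary ground-truth mask. It is an illustration under stated assumptions rather than the paper's evaluation code: the fixed threshold of 0.5, the convention of returning 0.0 when neither mask contains tampered pixels, and the function name `pixel_f1` are all assumptions made here.

```python
def pixel_f1(pred_scores, gt_mask, threshold=0.5):
    """Pixel-level F1 between a score map and a binary ground-truth mask.

    pred_scores: list of rows of per-pixel tampering scores in [0, 1].
    gt_mask:     list of rows of 0/1 labels (1 = tampered pixel).
    threshold:   assumed cut-off for calling a pixel tampered.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred_scores, gt_mask):
        for score, label in zip(pred_row, gt_row):
            predicted = score >= threshold
            actual = bool(label)
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    denom = 2 * tp + fp + fn
    # Assumed convention: no tampered pixels in either mask gives 0.0.
    return 2 * tp / denom if denom else 0.0


# Usage: two true positives, one false positive, one false negative.
score = pixel_f1([[0.9, 0.1], [0.8, 0.7]], [[1, 1], [0, 1]])
```

A robustness study would repeat this measurement after re-encoding or resizing the test images and compare the scores across settings.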

This review was generated by AI and reviewed by a human editor.