QUICK REVIEW

[Paper Review] Progressive Feedback-Enhanced Transformer for Image Forgery Localization

Haochen Zhu, Gang Cao|arXiv (Cornell University)|Nov 15, 2023

Digital Media Forensic Detection | Cited by 13

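One-Line Summary

ProFact performs coarse-to-fine image forgery localization with a progressive, feedback-driven Transformer framework and, supported by realistic MBH-generated training data, achieves state-of-the-art results on nine datasets.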

ABSTRACT

Blind detection of the forged regions in digital images is an effective authentication means to counter the malicious use of local image editing techniques. Existing encoder-decoder forensic networks overlook the fact that detecting complex and subtle tampered regions typically requires more feedback information. In this paper, we propose a Progressive FeedbACk-enhanced Transformer (ProFact) network to achieve coarse-to-fine image forgery localization. Specifically, the coarse localization map generated by an initial branch network is adaptively fed back to the early transformer encoder layers, which can enhance the representation of positive features while suppressing interference factors. The cascaded transformer network, combined with a contextual spatial pyramid module, is designed to refine discriminative forensic features for improving the forgery localization accuracy and reliability. Furthermore, we present an effective strategy to automatically generate large-scale forged image samples close to real-world forensic scenarios, especially in realistic and coherent processing. Leveraging on such samples, a progressive and cost-effective two-stage training protocol is applied to the ProFact network. The extensive experimental results on nine public forensic datasets show that our proposed localizer greatly outperforms the state-of-the-art on the generalization ability and robustness of image forgery localization. Code will be publicly available at https://github.com/multimediaFor/ProFact.

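Motivation and Objectives

Motivate robust localization of forged regions in tampered images whose subtle traces are hard to detect.
Develop a coarse-to-fine localization framework that uses feedback to refine intermediate representations.
Strengthen feature learning with a Contextual Spatial Pyramid Module that captures multi-scale cues.
Close the training-data gap by generating realistic, large-scale forged images and adopting a two-stage progressive training protocol.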

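Proposed Method

ProFact uses two sequential branches, a coarse localization branch (CLB) and a feedback-enhanced branch (FEB), connected through a progressive feedback mechanism.
The CLB builds on SegFormer (MiT blocks) to produce the coarse map Mc and incorporates the Contextual Spatial Pyramid Module (CSPM) to strengthen its features.
The FEB receives Mc together with the CLB features, applies a holistic attention module (HAM) to refine the representation, and predicts the final map Mp.
The CSPM combines Contextual Transformer (CoT) blocks with a multi-scale dilated-convolution pyramid to enrich local and contextual features.
Training data is generated with MBH (Matting, Blending, Harmonization), which produces large-scale, realistic forged images, including the MBH-COCO and MBH-RAISE datasets.
The two-stage training protocol first trains on MBH-COCO and then fine-tunes on MBH-RAISE at a larger input size to improve generalization.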
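The feedback idea described above, in which the coarse map Mc steers later feature refinement, can be pictured as a spatial gate applied to a feature volume. The following is a minimal sketch under stated assumptions, not the paper's HAM: the function name `feedback_reweight`, the gate form `2 * sigmoid(logit)`, and the use of raw logits for Mc are all hypothetical choices made only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def feedback_reweight(features, coarse_logits):
    """Re-weight a C x H x W feature volume with an H x W coarse map.

    Hypothetical gate (an assumption, not taken from the paper):
    2 * sigmoid(logit). It equals 1 where the coarse map is undecided,
    approaches 2 where forgery looks likely, and approaches 0 where the
    region looks pristine.
    """
    gate = [[2.0 * sigmoid(v) for v in row] for row in coarse_logits]
    return [
        [[f * g for f, g in zip(f_row, g_row)]
         for f_row, g_row in zip(channel, gate)]
        for channel in features
    ]


# One channel of constant features and a 2 x 2 coarse map of logits.
features = [[[1.0, 1.0], [1.0, 1.0]]]
coarse_logits = [[0.0, 10.0], [-10.0, 0.0]]
refined = feedback_reweight(features, coarse_logits)
```

In this toy case the feature at the confidently forged location is roughly doubled, the one at the confidently pristine location is driven toward zero, and undecided locations are left unchanged. A learned attention module would replace this fixed gate in a real network.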

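Experimental Results

Research Questions

RQ1: Can a feedback-enhanced transformer improve forgery localization accuracy beyond conventional encoder-decoder networks?
RQ2: How does a coarse-to-fine strategy with intermediate feature refinement affect detection robustness across diverse forgery types and resolutions?
RQ3: Do realistic forged training samples (MBH) improve the cross-dataset generalization and robustness of forgery localization methods?
RQ4: What effect do multi-scale contextual features (CSPM) have on the detection of subtle tampering traces?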

Key Findings

Dataset	Noiseprint	ManTra-Net	DFCN	MVSS-Net	PSCC-Net	OSN	CAT-Net	ProFact	Average
Columbia	36.4 (7)	35.6 (8)	38.1 (6)	68.4 (4)	61.5 (5)	71.3 (3)	79.3 (2)	84.5 (1)	55.2 (1)
CASIAv1	12.9 (7)	13.0 (6)	8.3 (8)	45.1 (5)	46.3 (4)	50.9 (3)	71.0 (1)	56.4 (2)	54.7 (3)
NIST16	12.2 (6)	9.2 (7)	-	29.4 (4)	18.7 (5)	33.1 (2)	30.2 (3)	43.1 (1)	28.9 (6)
DSO-1	33.9 (6)	33.2 (7)	68.4 (1)	27.1 (8)	41.1 (5)	44.5 (4)	47.9 (2)	46.4 (3)	40.4 (7)
IMD	17.9 (5)	18.3 (4)	17.3 (6)	26.0 (3)	15.8 (7)	49.1 (2)	-	53.8 (1)	25.8 (5)
Korus	14.7 (4)	17.9 (3)	10.8 (5)	9.5 (7)	10.2 (6)	29.9 (2)	6.1 (8)	31.5 (1)	16.2 (6)
Coverage	14.7 (8)	27.5 (5)	-	44.5 (2)	44.4 (3)	26.0 (6)	28.9 (4)	51.1 (1)	25.0 (8)
In the Wild	16.7 (6)	15.6 (7)	-	-	10.8 (8)	50.5 (2)	34.1 (3)	64.5 (1)	25.6 (7)
AutoSplice	33.0 (7)	18.2 (8)	-	64.6 (3)	60.4 (4)	50.9 (5)	86.2 (1)	65.5 (2)	39.0 (5)
Average	21.4 (7)	20.9 (8)	31.2 (6)	34.8 (4)	34.3 (5)	45.1 (3)	48.0 (2)	55.2 (1)
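The Average row for the two strongest methods can be cross-checked against the per-dataset rows. The sketch below copies the CAT-Net and ProFact columns from the table and assumes that a "-" entry is skipped when averaging and that the scores are F1 in percent; both are assumptions, though they are consistent with the 48.0 and 55.2 averages shown. The helper name `column_average` is hypothetical.

```python
from statistics import mean

# Scores per dataset in table order (Columbia ... AutoSplice).
# None marks a "-" entry in the table.
scores = {
    "CAT-Net": [79.3, 71.0, 30.2, 47.9, None, 6.1, 28.9, 34.1, 86.2],
    "ProFact": [84.5, 56.4, 43.1, 46.4, 53.8, 31.5, 51.1, 64.5, 65.5],
}


def column_average(values):
    """Mean over the reported entries only, rounded to one decimal."""
    present = [v for v in values if v is not None]
    return round(mean(present), 1)


cat_net_avg = column_average(scores["CAT-Net"])
profact_avg = column_average(scores["ProFact"])
gap = round(profact_avg - cat_net_avg, 1)
print(cat_net_avg, profact_avg, gap)  # prints: 48.0 55.2 7.2
```

The recomputed gap of 7.2 points matches the F1 margin over CAT-Net quoted in the findings.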

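ProFact achieves the highest average localization performance across the nine datasets, exceeding the runner-up CAT-Net by 7.2% in F1 and 5.6% in IoU.
The method ranks in the top two on eight of the nine datasets, the exception being DSO-1, where it ranks third. The benchmark includes high-resolution data and the unseen AutoSplice data, which indicates strong generalization.
Two-stage training with MBH-generated data and a larger test size improves robustness to scale and to boundary realism.
On challenging datasets such as DSO-1, ProFact remains competitive, ranking third (46.4) close behind CAT-Net (47.9).
Qualitative results show fewer false detections after feedback refinement, yielding a cleaner final map Mp.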
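The findings are reported in terms of F1 and IoU. As a reminder of how these two scores relate for a binary localization mask, here is a minimal pixel-level sketch. The function name `f1_and_iou`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions for illustration. This review does not state the paper's exact evaluation protocol (binarization threshold, per-image versus pooled averaging).

```python
def f1_and_iou(pred, truth):
    """Pixel-level F1 and IoU for two binary masks of equal shape.

    Both masks are nested lists of 0/1 values, where 1 marks a pixel
    labeled as forged.
    """
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1          # forged pixel correctly flagged
            elif p and not t:
                fp += 1          # pristine pixel flagged as forged
            elif t and not p:
                fn += 1          # forged pixel missed
    denom = tp + fp + fn
    if denom == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / denom
    return f1, iou


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
f1, iou = f1_and_iou(pred, truth)  # tp=2, fp=1, fn=1
```

For this example F1 is 2/3 and IoU is 0.5. F1 is never lower than IoU on the same mask pair, which is one reason margins in the two metrics are not directly comparable.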

This review was generated by AI and checked by a human editor.