QUICK REVIEW

[Paper Review] Application of DenseNet in Camera Model Identification and Post-processing Detection

Abdul Muntakim Rafi, Uday Kamal|arXiv (Cornell University)|Sep 3, 2018

Digital Media Forensic Detection | References: 34 | Citations: 35

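One-line summary

This paper proposes a DenseNet-201-based pipeline that combines multi-scale patch extraction with Empirical Mode Decomposition (EMD). The pipeline identifies the source camera model and detects post-processing robustly, even when images lack metadata or have undergone post-processing such as JPEG compression, resizing, or gamma correction. It achieves 98.37% accuracy on the IEEE SP Cup 2018 dataset and over 99% on the Dresden Database, demonstrating state-of-the-art single-model performance and strong generalization in image forensics.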

ABSTRACT

Camera model identification has earned paramount importance in the field of image forensics with an upsurge of digitally altered images which are constantly being shared through websites, media, and social applications. But, the task of identification becomes quite challenging if metadata are absent from the image and/or if the image has been post-processed. In this paper, we present a DenseNet pipeline to solve the problem of identifying the source camera-model of an image. Our approach is to extract patches of 256*256 from a labeled image dataset and apply augmentations, i.e., Empirical Mode Decomposition (EMD). We use this extended dataset to train a Neural Network with the DenseNet-201 architecture. We concatenate the output features for 3 different sizes (64*64, 128*128, 256*256) and pass them to a secondary network to make the final prediction. This strategy proves to be very robust for identifying the source camera model, even when the original image is post-processed. Our model has been trained and tested on the Forensic Camera-Model Identification Dataset provided for the IEEE Signal Processing (SP) Cup 2018. During testing we achieved an overall accuracy of 98.37%, which is the current state-of-the-art on this dataset using a single model. We used transfer learning and tested our model on the Dresden Database for Camera Model Identification, with an overall test accuracy of over 99% for 19 models. In addition, we demonstrate that the proposed pipeline is suitable for other image-forensic classification tasks, such as, detecting the type of post-processing applied to an image with an accuracy of 96.66% -- which indicates the generality of our approach.

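Motivation and objectives

Identify the source camera model even when the image lacks metadata or has undergone post-processing such as JPEG compression, resizing, or gamma correction.
Develop a deep learning pipeline that generalizes across multiple datasets and is robust to image manipulation.
Demonstrate the transferability of the learned features and validate their use in related forensic tasks, such as identifying the type of post-processing applied to an image.
Achieve state-of-the-art performance on the IEEE SP Cup 2018 camera model identification benchmark with a single model.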

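Proposed method

Extract 256×256 patches from labeled images and apply Empirical Mode Decomposition (EMD) as data augmentation to enlarge the training dataset.
Train a DenseNet-201 model on the augmented dataset so that it learns camera-specific features from patches at multiple scales (64×64, 128×128, 256×256).
Concatenate the output features obtained for the three patch sizes and pass them to a secondary classifier for the final prediction.
Fine-tune the model with transfer learning to carry out cross-dataset evaluation on the Dresden Database.
Use the same trained model to classify four image-manipulation classes (unaltered, JPEG compression, gamma correction, resizing).
Integrate Squeeze-and-Excitation (SE) modules into the DenseNet-201 architecture to strengthen feature representation learning.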
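The patch-extraction and multi-scale concatenation steps can be illustrated with a minimal, standard-library-only sketch. Several things here are assumptions made purely for illustration and are not taken from the paper: the smaller scales are obtained as center crops of each 256×256 patch, the stride is 256, images are grayscale lists of rows, and `toy_features` is a hypothetical stand-in for the DenseNet-201 feature extractor.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def extract_patches(image: Image, size: int, stride: int) -> List[Image]:
    """Slide a size x size window over the image and keep every full patch."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            patches.append([row[left:left + size] for row in image[top:top + size]])
    return patches


def center_crop(patch: Image, size: int) -> Image:
    """Cut the central size x size region out of a square patch."""
    offset = (len(patch) - size) // 2
    return [row[offset:offset + size] for row in patch[offset:offset + size]]


def toy_features(patch: Image) -> List[float]:
    """Hypothetical stand-in for a CNN feature extractor: mean and max pixel."""
    flat = [value for row in patch for value in row]
    return [sum(flat) / len(flat), max(flat)]


def multiscale_features(
    patch: Image,
    scales: Sequence[int] = (64, 128, 256),
    extractor: Callable[[Image], List[float]] = toy_features,
) -> List[float]:
    """Concatenate the features computed at each scale into one vector."""
    features: List[float] = []
    for size in scales:
        features.extend(extractor(center_crop(patch, size)))
    return features


# Synthetic 512 x 512 image, used only to exercise the functions.
image = [[(x + y) % 256 for x in range(512)] for y in range(512)]
patches = extract_patches(image, size=256, stride=256)
vector = multiscale_features(patches[0])
print(len(patches), len(vector))  # 4 patches, 3 scales x 2 features = 6
```

In a real pipeline the concatenated vector would be the input to the secondary classifier; swapping `toy_features` for a trained network's embedding function leaves the rest of the sketch unchanged.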

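Experimental results

Research questions

RQ1: Can a single deep learning model achieve state-of-the-art accuracy in camera model identification without relying on metadata?
RQ2: Is Empirical Mode Decomposition (EMD) an effective data augmentation technique for improving robustness to post-processing in camera model identification?
RQ3: How well do features learned on IEEE SP Cup 2018 generalize to another dataset such as the Dresden Database?
RQ4: Can the same model architecture detect different types of image manipulation with high accuracy?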
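For RQ4, a post-processing detector needs training examples labeled with the manipulation that produced them. The sketch below shows one way such labeled variants could be generated for two of the manipulation types, gamma correction and resizing. It is an assumption-laden illustration: the gamma value, the resize factor, nearest-neighbour sampling, and the function names are all hypothetical, and JPEG compression is omitted because it needs an image library outside the standard library.

```python
from typing import List, Tuple

Image = List[List[int]]  # 8-bit grayscale image stored as rows of pixel values


def gamma_correct(image: Image, gamma: float) -> Image:
    """Apply out = 255 * (in / 255) ** gamma to every pixel."""
    return [[round(255 * (value / 255) ** gamma) for value in row] for row in image]


def resize_nearest(image: Image, factor: float) -> Image:
    """Rescale the image by `factor` using nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    new_height = max(1, round(height * factor))
    new_width = max(1, round(width * factor))
    return [
        [
            image[min(height - 1, int(y / factor))][min(width - 1, int(x / factor))]
            for x in range(new_width)
        ]
        for y in range(new_height)
    ]


def make_labelled_variants(
    image: Image, gamma: float = 0.8, factor: float = 0.5
) -> List[Tuple[str, Image]]:
    """Return (label, image) pairs; the parameter values are illustrative only."""
    return [
        ("unaltered", image),
        ("gamma", gamma_correct(image, gamma)),
        ("resize", resize_nearest(image, factor)),
    ]


# Small synthetic image, used only to exercise the functions.
image = [[(16 * x + 4 * y) % 256 for x in range(8)] for y in range(8)]
variants = make_labelled_variants(image)
for label, variant in variants:
    print(label, len(variant), "x", len(variant[0]))
```

Each labeled variant would then go through the same patch-extraction and classification steps as in camera model identification, with the manipulation label replacing the camera label.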

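Key findings

The proposed pipeline achieved 98.37% test accuracy on the IEEE SP Cup 2018 dataset, a new state of the art for a single model.
On the Dresden Database, it achieved over 99% accuracy across 19 camera models and 100% accuracy in identifying the camera manufacturer.
The model generalized well, detecting four image-manipulation classes (unaltered, JPEG compression, gamma correction, resizing) with 96.66% accuracy.
Misclassifications occurred mainly between models from the same manufacturer, suggesting that shared interpolation methods or CFA patterns may cause the confusion.
With transfer learning, the model performed well even on the Dresden dataset, whose training set is considerably smaller, indicating effective feature transfer.
Adopting EMD as a new data augmentation technique contributed to improved robustness, particularly under post-processing.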
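The observation about same-manufacturer confusions can be checked on any set of predictions by splitting errors into within-manufacturer and cross-manufacturer cases. The following is a minimal sketch of that bookkeeping. The label format `Brand-Model`, the `brand` helper, and the toy prediction pairs are hypothetical and are not results from the paper.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def split_errors(
    pairs: Iterable[Tuple[str, str]],
    manufacturer_of: Callable[[str], str],
) -> Counter:
    """Count correct predictions and errors within or across manufacturers."""
    counts: Counter = Counter()
    for true_label, predicted_label in pairs:
        if true_label == predicted_label:
            counts["correct"] += 1
        elif manufacturer_of(true_label) == manufacturer_of(predicted_label):
            counts["same_manufacturer_error"] += 1
        else:
            counts["cross_manufacturer_error"] += 1
    return counts


def brand(label: str) -> str:
    """Assumes labels look like 'Brand-Model'; returns the part before the dash."""
    return label.split("-")[0]


# Invented (true, predicted) pairs, used only to exercise the function.
toy_pairs = [
    ("BrandA-1", "BrandA-1"),
    ("BrandA-1", "BrandA-2"),
    ("BrandB-1", "BrandB-1"),
    ("BrandB-1", "BrandA-2"),
]
summary = split_errors(toy_pairs, brand)
accuracy = summary["correct"] / len(toy_pairs)
print(dict(summary), accuracy)
```

A high share of `same_manufacturer_error` relative to `cross_manufacturer_error` would be consistent with the pattern described above, while manufacturer-level accuracy corresponds to counting both `correct` and `same_manufacturer_error` as hits.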

This review was generated by AI and checked by a human editor.