QUICK REVIEW

[Paper Review] RealStats: A Rigorous Real-Only Statistical Framework for Fake Image Detection

Haim Zisman, Uri Shaham|arXiv (Cornell University)|Jan 26, 2026

Generative Adversarial Networks and Image Synthesis | Citations: 0

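One-Line Summary

RealStats presents a training-free framework that aggregates p-values from multiple real-image-only statistics to test consistency with the real-image distribution, yielding statistically calibrated and interpretable outputs.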

ABSTRACT

As generative models continue to evolve, detecting AI-generated images remains a critical challenge. While effective detection methods exist, they often lack formal interpretability and may rely on implicit assumptions about fake content, potentially limiting robustness to distributional shifts. In this work, we introduce a rigorous, statistically grounded framework for fake image detection that focuses on producing a probability score interpretable with respect to the real-image population. Our method leverages the strengths of multiple existing detectors by combining training-free statistics. We compute p-values over a range of test statistics and aggregate them using classical statistical ensembling to assess alignment with the unified real-image distribution. This framework is generic, flexible, and training-free, making it well-suited for robust fake image detection across diverse and evolving settings.

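Motivation and Objectives

Motivate the need for interpretable, adaptable fake image detection as generative models evolve.
Develop a training-free framework built on statistical hypothesis testing against the real-image distribution.
Select multiple statistics through independence-aware aggregation and produce calibrated p-values.
Ensure robustness to distribution shift and modularity, so that new statistics can be incorporated.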

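Proposed Method

Extract diverse scalar statistics from real images using frozen feature extractors.
Map each statistic to a two-sided p-value via an empirical distribution function (empirical CDF) estimated from real images.
Select a subset of independent statistics by building an independence graph and extracting a maximum clique under a uniformity constraint.
Aggregate the selected p-values with methods such as Stouffer's test or min-p to obtain a unified p-value under the null hypothesis.
At inference, compute p-values using only the selected statistics and make the decision at a predefined significance level.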
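The p-value mapping, aggregation, and decision steps above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the +1 smoothing of the empirical CDF, the Šidák-style form of min-p, and the toy reference values are all assumptions made for the example, and the statistics are assumed to be independent.

```python
import bisect
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used by Stouffer's method
EPS = 1e-12                # keeps p-values strictly inside (0, 1)


def make_two_sided_pvalue(reference_values):
    """Build a two-sided empirical p-value function from real-image statistics."""
    ref = sorted(reference_values)
    n = len(ref)

    def p_value(x):
        # Smoothed empirical tail probabilities; the +1 terms avoid p = 0.
        lower = (bisect.bisect_right(ref, x) + 1) / (n + 1)     # ~ P(T <= x)
        upper = (n - bisect.bisect_left(ref, x) + 1) / (n + 1)  # ~ P(T >= x)
        return min(1.0, 2.0 * min(lower, upper))

    return p_value


def stouffer(p_values):
    """Combine independent p-values by summing their z-scores."""
    clipped = [min(max(p, EPS), 1.0 - EPS) for p in p_values]
    z = [STD_NORMAL.inv_cdf(1.0 - p) for p in clipped]
    z_combined = sum(z) / len(z) ** 0.5
    return 1.0 - STD_NORMAL.cdf(z_combined)


def min_p(p_values):
    """Combine independent p-values through their minimum (assumed Sidak form)."""
    k = len(p_values)
    return 1.0 - (1.0 - min(p_values)) ** k


def is_flagged_fake(p_values, alpha=0.05, combine=min_p):
    """Reject the 'image is real' null when the combined p-value is below alpha."""
    return combine(p_values) < alpha


# Hypothetical reference values of two statistics measured on real images.
ref_a = [float(i) for i in range(1, 100)]
ref_b = [0.1 * i for i in range(1, 100)]
p_a = make_two_sided_pvalue(ref_a)
p_b = make_two_sided_pvalue(ref_b)

typical = [p_a(50.0), p_b(5.0)]     # values near the centre of both references
extreme = [p_a(1000.0), p_b(-3.0)]  # values far outside both references
print(is_flagged_fake(typical), is_flagged_fake(extreme))
```

In this sketch the reference samples would come from the frozen feature extractors applied to real images only, so no fake images are needed to build the test.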

Figure 1: Illustration of the score interpretability gap between a supervised classifier (Wang et al., 2020) and our statistical method. Top: A supervised model outputs scores that can separate real from fake images, but these scores are not inherently interpretable, as they lack clear statistical

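Experimental Results

Research Questions

RQ1: Can a training-free framework based only on real images produce calibrated p-values that meaningfully quantify whether an image is real or fake?
RQ2: Does aggregating multiple independent real-only statistics improve robustness to distribution shift as generative models evolve?
RQ3: How does RealStats balance interpretability and competitive detection performance relative to training-free baselines?
RQ4: Can new statistics be incorporated without retraining to improve performance on challenging generators?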

Key Findings

Model	AUC	AP
Manifold Bias	0.761 ± 0.179	0.753 ± 0.169
RIGID	0.769 ± 0.194	0.765 ± 0.189
AEROBLADE	0.697 ± 0.161	0.697 ± 0.163
Ours (Stouffer)	0.756 ± 0.135	0.743 ± 0.133
Ours (Min-p)	0.775 ± 0.126	0.756 ± 0.119

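The method achieves AUC and AP competitive with state-of-the-art training-free detectors (e.g., the min-p ensemble reaches AUC 0.775 and AP 0.756, with lower variance across generators).
Per-generator analysis shows more balanced performance and robustness than several baselines, and incorporating additional diverse statistics improves results (e.g., adding ManifoldBias to the min-p ensemble raises AUC on GauGAN, CycleGAN, and SAN).
The framework returns a calibrated p-value for each inference, giving an interpretable output and enabling principled decisions at standard significance levels.
The method is fast, scalable, and memory-efficient; the overhead of independence testing is small relative to the forward passes.
It is robust to common corruptions (e.g., Gaussian blur; JPEG compression causes a moderate drop) and adapts to shifts in the reference distribution while retaining a discriminative signal.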
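To illustrate what a calibrated p-value means in practice, the toy simulation below checks that thresholding an empirical two-sided p-value at a significance level alpha rejects roughly a fraction alpha of held-out real samples. It is only a sketch under stated assumptions: a single scalar statistic, Gaussian stand-ins for the real and fake populations, and hypothetical helper names. None of these come from the paper.

```python
import bisect
import random


def two_sided_pvalue(sorted_ref, x):
    """Two-sided empirical p-value of x against a sorted real-only reference."""
    n = len(sorted_ref)
    lower = (bisect.bisect_right(sorted_ref, x) + 1) / (n + 1)
    upper = (n - bisect.bisect_left(sorted_ref, x) + 1) / (n + 1)
    return min(1.0, 2.0 * min(lower, upper))


def rejection_rate(sorted_ref, samples, alpha):
    """Fraction of samples whose p-value falls below alpha."""
    rejected = sum(two_sided_pvalue(sorted_ref, x) < alpha for x in samples)
    return rejected / len(samples)


rng = random.Random(0)  # fixed seed so the toy experiment is repeatable
alpha = 0.05

# Toy stand-ins: the statistic is N(0, 1) on "real" images and shifted on "fake" ones.
reference = sorted(rng.gauss(0.0, 1.0) for _ in range(2000))
held_out_real = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted_fake = [rng.gauss(1.5, 1.0) for _ in range(5000)]

false_positive_rate = rejection_rate(reference, held_out_real, alpha)
detection_rate = rejection_rate(reference, shifted_fake, alpha)
print(f"false positive rate at alpha={alpha}: {false_positive_rate:.3f}")
print(f"detection rate on shifted samples: {detection_rate:.3f}")
```

The false positive rate should land near alpha because the reference and held-out real samples share a distribution, while the detection rate depends entirely on how far the fake population departs from it.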


This review was generated by AI and checked by a human editor.