QUICK REVIEW

[Paper Digest] Your AI-Generated Image Detector Can Secretly Achieve SOTA Accuracy, If Calibrated

Muli Yang, Gabriel James Goenawan|arXiv (Cornell University)|Feb 2, 2026

Adversarial Robustness in Machine Learning | Cited by 0

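One-Sentence Summary

This paper shows that AI-generated image detectors become biased under test-time distribution shift, and proposes a lightweight post-hoc logit calibration (usable with or without labels) that restores the Bayes-optimal decision boundary without retraining.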

ABSTRACT

Despite being trained on balanced datasets, existing AI-generated image detectors often exhibit systematic bias at test time, frequently misclassifying fake images as real. We hypothesize that this behavior stems from distributional shift in fake samples and implicit priors learned during training. Specifically, models tend to overfit to superficial artifacts that do not generalize well across different generation methods, leading to a misaligned decision threshold when faced with test-time distribution shift. To address this, we propose a theoretically grounded post-hoc calibration framework based on Bayesian decision theory. In particular, we introduce a learnable scalar correction to the model's logits, optimized on a small validation set from the target distribution while keeping the backbone frozen. This parametric adjustment compensates for distributional shift in model output, realigning the decision boundary even without requiring ground-truth labels. Experiments on challenging benchmarks show that our approach significantly improves robustness without retraining, offering a lightweight and principled solution for reliable and adaptive AI-generated image detection in the open world. Code is available at https://github.com/muliyangm/AIGI-Det-Calib.

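Research Motivation and Goals

Identify the systematic test-time bias that AI-generated image detectors exhibit under distribution shift.
Provide a theoretically grounded post-hoc calibration framework that corrects biased logit outputs.
Achieve robust detection on unseen generative models with minimal data and compute overhead.
Show that a simple scalar logit correction can approach Bayes-optimal performance under shift.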

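Proposed Method

Model the detection problem under the training and test distributions, accounting for both class-conditional input shift and label prior shift.
Show that under shift the default zero threshold is no longer Bayes-optimal, and that a global scalar logit correction can realign the decision boundary.
Propose two calibration strategies: supervised logit calibration that applies KDE to a small amount of labeled target data, and unsupervised calibration based on the symmetry of the logit distribution (under a bimodality assumption).
Give a practical procedure for estimating the scalar alpha (equivalently, the calibrated logit f(x) - alpha) without retraining the backbone.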
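The supervised strategy above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that a larger logit means "more fake", fits a Gaussian KDE with Silverman's rule-of-thumb bandwidth to each class, and picks alpha by grid search over the KDE-estimated error. The function names (`supervised_alpha`, `kde_cdf`, and so on) and the synthetic logits are hypothetical and exist only for this example.

```python
import math
import random
import statistics


def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth for a 1-D Gaussian KDE (assumed choice)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-1 / 5)


def kde_cdf(samples, bandwidth, t):
    """CDF at t of a Gaussian KDE fitted to `samples`."""
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((t - s) / scale)) for s in samples) / len(samples)


def supervised_alpha(real_logits, fake_logits, prior_fake=0.5, grid_size=400):
    """Grid-search the scalar alpha that minimises the KDE-estimated error
    of the rule 'predict fake if f(x) - alpha > 0'."""
    h_real = silverman_bandwidth(real_logits)
    h_fake = silverman_bandwidth(fake_logits)
    lo = min(real_logits + fake_logits)
    hi = max(real_logits + fake_logits)
    best_alpha, best_err = 0.0, float("inf")
    for i in range(grid_size + 1):
        alpha = lo + (hi - lo) * i / grid_size
        false_fake = 1.0 - kde_cdf(real_logits, h_real, alpha)  # real above alpha
        false_real = kde_cdf(fake_logits, h_fake, alpha)        # fake below alpha
        err = (1.0 - prior_fake) * false_fake + prior_fake * false_real
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha


def accuracy(real_logits, fake_logits, alpha):
    """Accuracy of the calibrated rule on labeled logits."""
    correct = sum(x - alpha <= 0 for x in real_logits)
    correct += sum(x - alpha > 0 for x in fake_logits)
    return correct / (len(real_logits) + len(fake_logits))


# Synthetic logits (illustrative only): fakes from an unseen generator sit
# mostly below zero, so the default threshold labels most of them as real.
rng = random.Random(0)
real_val = [rng.gauss(-6.0, 1.5) for _ in range(50)]
fake_val = [rng.gauss(-2.0, 1.5) for _ in range(50)]
real_test = [rng.gauss(-6.0, 1.5) for _ in range(2000)]
fake_test = [rng.gauss(-2.0, 1.5) for _ in range(2000)]

alpha = supervised_alpha(real_val, fake_val)
acc_before = accuracy(real_test, fake_test, 0.0)
acc_after = accuracy(real_test, fake_test, alpha)
print(f"alpha={alpha:.2f}  acc@0={acc_before:.3f}  acc@alpha={acc_after:.3f}")
```

Only the scalar alpha is estimated here; the detector that produced the logits is never touched, which is what keeps the approach cheap. In practice the logits would come from a frozen detector run on a small labeled validation set from the target distribution.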

Figure 1: Logit distributions of a popular AI-generated image detector, CNNSpot (wang2020cnn), pretrained on ProGAN-generated fake images and evaluated on previously unseen fake images from StyleGAN2, WhichFaceIsReal (WFIR), and Midjourney, reveal a tendency to misclassify these unfamiliar fake s

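Experimental Results

Research Questions

RQ1: Does test-time distribution shift cause systematic bias in AI-generated image detectors, so that fake images are misclassified as real?
RQ2: Can a lightweight post-hoc scalar logit correction restore Bayes-optimal decisions without retraining?
RQ3: How do the supervised and unsupervised calibration methods perform across different generators and benchmarks?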
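For RQ2, the label-prior part of the shift has a standard closed form, which gives some intuition for why a single scalar can suffice. The derivation below is a sketch under assumptions that are not spelled out in this summary: $f(x)$ approximates the training log posterior odds of fake versus real, the class-conditional densities $p(x \mid y)$ are unchanged between training and test, and $\pi_s$, $\pi_t$ denote the training and test class priors. It does not cover input shift, which the paper also treats.

```latex
\log\frac{p_t(\text{fake}\mid x)}{p_t(\text{real}\mid x)}
  = f(x)
  + \log\frac{\pi_t(\text{fake})}{\pi_t(\text{real})}
  - \log\frac{\pi_s(\text{fake})}{\pi_s(\text{real})}

% Balanced training, \pi_s(\text{fake}) = \pi_s(\text{real}), removes the last term:
\text{predict fake} \iff f(x) - \alpha > 0,
\qquad
\alpha = \log\frac{\pi_t(\text{real})}{\pi_t(\text{fake})}
```

For example, if 90% of test images are real, then $\alpha = \log 9 \approx 2.197$, so the zero threshold is optimal only when the test priors are balanced as well.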

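Key Findings

Without retraining the backbone, calibration significantly improves detection accuracy on unseen generators.
Under realistic test-time conditions, a constant logit shift can compensate for both label prior shift and input shift.
Supervised KDE calibration with a small number of labeled target samples yields significant gains over the baseline.
Unsupervised, symmetry-based calibration on unlabeled target logits recovers a robust threshold when the logits are bimodal.
The proposed calibration improves robustness across multiple detectors and generators on AIGCDetectBenchmark and GenImage.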
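To make the unsupervised finding concrete, the sketch below estimates a threshold from unlabeled logits alone. It is not the paper's symmetry-based estimator, whose details are not given in this summary. As a stand-in it uses one-dimensional two-cluster k-means and takes the midpoint between the two cluster centres, which relies on the same bimodality assumption and on the two modes being roughly symmetric. The function name `two_means_midpoint` and the synthetic logits are hypothetical.

```python
import random


def two_means_midpoint(logits, iters=100):
    """Split unlabeled logits into two clusters (1-D k-means, k=2) and
    return the midpoint between the cluster centres as alpha."""
    lo_c, hi_c = min(logits), max(logits)  # initial centres at the extremes
    for _ in range(iters):
        boundary = (lo_c + hi_c) / 2.0
        low = [x for x in logits if x <= boundary]
        high = [x for x in logits if x > boundary]
        if not low or not high:
            break  # degenerate case: all logits fall on one side
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if new_lo == lo_c and new_hi == hi_c:
            break  # converged
        lo_c, hi_c = new_lo, new_hi
    return (lo_c + hi_c) / 2.0


# Synthetic unlabeled target logits (illustrative only): two modes, both
# below zero, mimicking a detector biased towards predicting "real".
rng = random.Random(1)
real_logits = [rng.gauss(-6.0, 1.5) for _ in range(1000)]
fake_logits = [rng.gauss(-2.0, 1.5) for _ in range(1000)]
unlabeled = real_logits + fake_logits
rng.shuffle(unlabeled)

alpha_unsup = two_means_midpoint(unlabeled)

# Labels are used below only to evaluate the result, never to fit alpha.
def accuracy_at(alpha):
    correct = sum(x - alpha <= 0 for x in real_logits)
    correct += sum(x - alpha > 0 for x in fake_logits)
    return correct / len(unlabeled)

acc_zero = accuracy_at(0.0)
acc_unsup = accuracy_at(alpha_unsup)
print(f"alpha={alpha_unsup:.2f}  acc@0={acc_zero:.3f}  acc@alpha={acc_unsup:.3f}")
```

If the logits are not clearly bimodal, or the two modes differ a lot in size or spread, a midpoint like this can land far from the optimal threshold, which matches the caveat in the finding above.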

Figure 2: Conceptual illustration of our proposed (a) supervised and (b) unsupervised calibration methods, both designed to identify an optimal scalar $\alpha$ that achieves an ideal separation between real and fake distributions, with or without access to ground-truth labels.

This digest was generated by AI and reviewed by a human editor.