QUICK REVIEW

[Paper Review] On the Computation of PSNR for a Set of Images or Video

Onur Keleş, M. Akın Yılmaz|arXiv (Cornell University)|Apr 30, 2021

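Image and Signal Denoising Methods | Cited by 4

One-Sentence Summary

This paper examines and compares different ways of computing PSNR over sets of images and video, showing that the choice between the arithmetic mean and the geometric mean of mean-square error (MSE) values significantly affects the reported PSNR, especially in image restoration tasks where the MSE is exponentially distributed. Its main contribution is a clear framework for consistent PSNR reporting, together with a call for the research community to document the computation method explicitly so that learned image and video processing methods can be compared fairly.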

ABSTRACT

When comparing learned image/video restoration and compression methods, it is common to report peak-signal to noise ratio (PSNR) results. However, there does not exist a generally agreed upon practice to compute PSNR for sets of images or video. Some authors report average of individual image/frame PSNR, which is equivalent to computing a single PSNR from the geometric mean of individual image/frame mean-square error (MSE). Others compute a single PSNR from the arithmetic mean of frame MSEs for each video. Furthermore, some compute the MSE/PSNR of Y-channel only, while others compute MSE/PSNR for RGB channels. This paper investigates different approaches to computing PSNR for sets of images, single video, and sets of video and the relation between them. We show the difference between computing the PSNR based on arithmetic vs. geometric mean of MSE depends on the distribution of MSE over the set of images or video, and that this distribution is task-dependent. In particular, these two methods yield larger differences in restoration problems, where the MSE is exponentially distributed and smaller differences in compression problems, where the MSE distribution is narrower. We hope this paper will motivate the community to clearly describe how they compute reported PSNR values to enable consistent comparison.

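Motivation and Objectives

Investigate and clarify the inconsistencies in how PSNR is computed over image and video datasets in research on learned image/video restoration and compression.
Compare the two main approaches: averaging individual PSNR values (equivalent to a PSNR computed from the geometric mean of the MSEs) versus computing a single PSNR from the arithmetic mean of the MSEs.
Show that the difference between the two approaches depends on the distribution of the MSE values, and that this distribution varies with the task type (for example, restoration versus compression).
Advocate standardized, transparent reporting of the PSNR computation method so that evaluations are fair and consistent across models and datasets.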
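
The equivalence between averaging per-image PSNR values and computing one PSNR from the geometric mean of the MSEs follows directly from the logarithm in the PSNR definition. A short worked derivation is given here; the symbols $N$ (number of images or frames), $P$ (peak signal value, e.g. 255 for 8-bit data), and $\mathrm{MSE}_i$ are notation chosen for this sketch, not necessarily the paper's.

```latex
\begin{aligned}
\frac{1}{N}\sum_{i=1}^{N} \mathrm{PSNR}_i
  &= \frac{1}{N}\sum_{i=1}^{N} 10\log_{10}\frac{P^2}{\mathrm{MSE}_i} \\
  &= 10\log_{10} P^2 \;-\; 10\log_{10}\Bigl(\prod_{i=1}^{N}\mathrm{MSE}_i\Bigr)^{1/N} \\
  &= 10\log_{10}\frac{P^2}{\mathrm{GM}(\mathrm{MSE}_1,\dots,\mathrm{MSE}_N)},
\end{aligned}
\qquad
\mathrm{GM}(\mathrm{MSE}_1,\dots,\mathrm{MSE}_N) = \Bigl(\prod_{i=1}^{N}\mathrm{MSE}_i\Bigr)^{1/N}.
```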

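Proposed Method

Present two PSNR estimates: a PSNR based on the geometric mean of the individual image/frame MSEs (equivalent to averaging the individual PSNR values), and a PSNR based on the arithmetic mean of the MSEs.
Use Jensen's inequality to prove that the PSNR based on the geometric mean is always greater than or equal to the PSNR based on the arithmetic mean of the MSEs.
Quantify the difference between the two PSNR estimates through the ratio of the arithmetic mean to the geometric mean of the MSEs.
Model the MSE distribution for different tasks: exponential in restoration/super-resolution tasks (e.g., EDSR, EDVR), and more concentrated in compression tasks (e.g., H.264, H.265).
Validate the theoretical findings on real datasets (UVG, MPEG), comparing PSNR-1 (arithmetic mean of PSNR), PSNR-2 (arithmetic mean of MSE), and PSNR-3 (geometric mean of MSE) across several tasks.
Recommend computing the MSE of each image or video sample first and then reporting the PSNR based on the arithmetic mean of the MSEs to ensure consistency, especially in multi-video evaluation.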
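
The two estimates described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the peak value of 255, and the list of per-frame MSEs are all hypothetical choices made for the example.

```python
import math
from statistics import fmean


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for one MSE value; peak=255 assumes 8-bit data."""
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_mean_of_psnrs(mses, peak=255.0):
    """Average of the per-frame PSNR values."""
    return fmean([psnr_from_mse(m, peak) for m in mses])


def psnr_of_geometric_mean_mse(mses, peak=255.0):
    """Single PSNR computed from the geometric mean of the MSEs."""
    log_mean = fmean([math.log(m) for m in mses])
    return psnr_from_mse(math.exp(log_mean), peak)


def psnr_of_arithmetic_mean_mse(mses, peak=255.0):
    """Single PSNR computed from the arithmetic mean of the MSEs."""
    return psnr_from_mse(fmean(mses), peak)


# Hypothetical per-frame MSEs, deliberately spread over two orders of magnitude.
mses = [4.0, 25.0, 100.0, 400.0]

print("mean of PSNRs:        ", psnr_mean_of_psnrs(mses))
print("PSNR of geometric MSE: ", psnr_of_geometric_mean_mse(mses))
print("PSNR of arithmetic MSE:", psnr_of_arithmetic_mean_mse(mses))
```

The first two printed values agree up to floating-point error, and both are at least as large as the third, which is the ordering that the Jensen's inequality argument predicts. The wider the spread of the MSEs, the larger the gap.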

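Experimental Results

Research Questions

RQ1: How do the PSNR computation methods (arithmetic mean vs. geometric mean of MSE) compare in practice over sets of images and video?
RQ2: How does the MSE distribution (e.g., exponential vs. narrow) affect the difference between the two PSNR estimates?
RQ3: Why do the reported PSNR values for the same model differ significantly when different computation methods are used?
RQ4: For a video sequence, how does a PSNR computed from per-frame PSNR values compare with a PSNR computed from the average MSE?
RQ5: What is the best practice for reporting PSNR in learned image and video processing tasks so that comparisons are fair?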

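Key Findings

The difference between the PSNR computed from the arithmetic mean and from the geometric mean of the MSEs grows as the variance of the MSE values increases, which is especially common in image restoration and super-resolution tasks.
In restoration tasks (e.g., EDSR, EDVR), the MSE values are exponentially distributed, and the difference between PSNR-1 (arithmetic mean of PSNR) and PSNR-3 (geometric mean of MSE) can be substantial, reaching 1.3 dB in some cases.
In video compression tasks (e.g., H.264, H.265), the MSE distribution is more concentrated, and the difference between the two PSNR estimates is small, typically below 0.2 dB.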
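In the next-frame prediction task on the MPEG dataset, PSNR-1 (arithmetic mean of PSNR) is 32.88 dB, PSNR-2 (arithmetic mean of MSE) is 30.03 dB, and PSNR-3 (geometric mean of MSE) is 29.08 dB, a gap of 3.80 dB between PSNR-1 and PSNR-3.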
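The PSNR difference predicted theoretically from the MSE distribution (exponential) closely matches the empirical results, which supports the accuracy of the model.
The paper concludes that averaging the per-frame PSNR over all frames of a set of videos is technically unsound, because it does not account for the different PSNR ranges of videos with different motion characteristics.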
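
The dependence of the gap on the MSE distribution can be illustrated with synthetic data. The sketch below is an assumption-laden simulation, not a reproduction of the paper's experiments: the exponential scale of 50, the uniform range 45 to 55 standing in for a "narrow" distribution, the sample size, and the function name `psnr_gap_db` are all hypothetical. For an ideal exponential distribution the expected log of the MSE equals the log of its mean minus the Euler-Mascheroni constant, so the gap tends to 10γ/ln 10, about 2.5 dB, regardless of scale; real restoration data need not match this idealized value.

```python
import math
import random
from statistics import fmean

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def psnr_gap_db(mses):
    """Mean-of-PSNRs minus PSNR-of-mean-MSE, in dB (the peak value cancels)."""
    log10_arith = math.log10(fmean(mses))
    log10_geo = fmean([math.log(m) for m in mses]) / math.log(10)
    return 10.0 * (log10_arith - log10_geo)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

# Exponentially distributed MSEs with mean 50 (tiny floor avoids log(0)).
exp_mses = [max(rng.expovariate(1 / 50.0), 1e-12) for _ in range(n)]

# Narrow distribution: MSEs confined to a small band around 50.
narrow_mses = [rng.uniform(45.0, 55.0) for _ in range(n)]

gap_exp = psnr_gap_db(exp_mses)
gap_narrow = psnr_gap_db(narrow_mses)
theory_exp = 10.0 * EULER_GAMMA / math.log(10)

print("gap, exponential MSEs:", gap_exp)
print("gap, narrow MSEs:     ", gap_narrow)
print("theory, exponential:  ", theory_exp)
```

The simulated gap for the exponential case lands close to the theoretical value, while the narrow case yields a gap of only a few thousandths of a dB, mirroring the restoration-versus-compression contrast reported above.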

This review was generated by AI and reviewed by a human editor.