QUICK REVIEW

[Paper Digest] A Probabilistic Quality Representation Approach to Deep Blind Image Quality Prediction

Hui Zeng, Lei Zhang|arXiv (Cornell University)|Aug 28, 2017

Image and Video Quality Assessment | References: 41 | Cited by: 61

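One-Sentence Summary

Introduces the probabilistic quality representation (PQR), which models the distribution of subjective image quality scores, making deep BIQA training more robust and more accurate than scalar score regression. A CNN is trained on PQR targets built from quality anchors and a soft mapping, using a KL divergence loss, and its output is then mapped back to a scalar score.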

ABSTRACT

Blind image quality assessment (BIQA) remains a very challenging problem due to the unavailability of a reference image. Deep learning based BIQA methods have been attracting increasing attention in recent years, yet it remains a difficult task to train a robust deep BIQA model because of the very limited number of training samples with human subjective scores. Most existing methods learn a regression network to minimize the prediction error of a scalar image quality score. However, such a scheme ignores the fact that an image will receive divergent subjective scores from different subjects, which cannot be adequately represented by a single scalar number. This is particularly true on complex, real-world distorted images. Moreover, images may broadly differ in their distributions of assigned subjective scores. Recognizing this, we propose a new representation of perceptual image quality, called probabilistic quality representation (PQR), to describe the image subjective score distribution, whereby a more robust loss function can be employed to train a deep BIQA model. The proposed PQR method is shown to not only speed up the convergence of deep model training, but to also greatly improve the achievable level of quality prediction accuracy relative to scalar quality score regression methods. The source code is available at https://github.com/HuiZeng/BIQA_Toolbox.

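Research Motivation and Objectives

Motivate the need in blind IQA for an image quality representation richer than a single scalar score.
Propose PQR, which describes the distribution of subjective quality using quality anchors and a probabilistic mapping.
Develop a training strategy in which a CNN learns to output PQR vectors using KL divergence (softmax cross-entropy).
Show that PQR speeds up convergence and improves prediction accuracy on multiple IQA datasets.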

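Proposed Method

Define M quality anchors over the score range, using either uniform or Lloyd-Max quantization.
Convert each image's MOS y into a PQR vector q through a soft mapping: q^m = exp(-β||y - c^m||^2) / Σ_i exp(-β||y - c^i||^2), where c^m is the m-th anchor and β is the smoothing parameter.
Learn an inverse mapping h(q) that maps a PQR back to a scalar score by minimizing the squared error on the training data.
Train the CNN to output PQR vectors, aligning the network output with the target q via KL divergence (cross-entropy with a softmax output).
Average-pool the patch-level predictions to obtain the quality score of the whole image.
Experiment with pretrained (fine-tuned) AlexNet and ResNet50, and with a shallow S-CNN for patch-based input.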
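The target construction and loss described in the list above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the anchors are placed at the centers of M equal-width bins on [0, 1] (one possible reading of "uniform quantization"), and the network logits are made-up numbers standing in for a CNN output.

```python
import math

def uniform_anchors(m, lo=0.0, hi=1.0):
    """Assumed placement: centers of m equal-width bins on [lo, hi]."""
    step = (hi - lo) / m
    return [lo + step * (i + 0.5) for i in range(m)]

def mos_to_pqr(y, anchors, beta):
    """Soft mapping q^m = exp(-beta*(y-c^m)^2) / sum_i exp(-beta*(y-c^i)^2)."""
    logits = [-beta * (y - c) ** 2 for c in anchors]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def softmax(z):
    """Turn raw network outputs into a probability vector."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [v / total for v in exps]

def kl_divergence(q, p, eps=1e-12):
    """KL(q || p) = sum_m q^m * log(q^m / p^m); zero-mass terms contribute 0."""
    return sum(qi * math.log(qi / max(pi, eps)) for qi, pi in zip(q, p) if qi > 0)

def cross_entropy(q, p, eps=1e-12):
    """Cross-entropy with soft targets; equals KL(q || p) plus the entropy of q."""
    return -sum(qi * math.log(max(pi, eps)) for qi, pi in zip(q, p))

def average_pool(patch_scores):
    """Image-level score as the mean of patch-level scores."""
    return sum(patch_scores) / len(patch_scores)

anchors = uniform_anchors(5)                    # M = 5
target = mos_to_pqr(0.62, anchors, beta=64.0)   # PQR target for MOS 0.62
pred = softmax([0.1, 0.5, 1.5, 2.0, 0.3])       # hypothetical CNN logits
loss = kl_divergence(target, pred)
image_score = average_pool([0.58, 0.64, 0.61])  # hypothetical patch scores
```

Because the entropy of the target q does not depend on the network, minimizing the cross-entropy and minimizing the KL divergence give the same gradients with respect to the network output, which is why the two are mentioned together in the list.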

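Experimental Results

Research Questions

RQ1: Compared with scalar regression, does an anchor-based probabilistic representation of perceptual image quality improve learning stability and accuracy in deep BIQA?
RQ2: How should the anchors and the smoothing parameter β be chosen, and how does PQR perform on authentic versus synthetic distortions and across multiple databases?
RQ3: Across different CNN architectures, how does PQR affect convergence speed and final prediction performance?
RQ4: How well does the inverse mapping h(·) recover the scalar MOS from PQR predictions?
RQ5: Is patch-based PQR training combined with average pooling competitive with, or better than, conventional scalar regression BIQA methods?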

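Key Findings

PQR-based models consistently outperform scalar regression baselines in SRCC and PLCC on all datasets (LIVE Challenge, LIVE IQA, CSIQ, TID2013).
β = 64 is robust across databases; PQR is relatively insensitive to the anchor density M (M = 5 is usually sufficient).
On all reported databases, AlexNet and ResNet50 trained with PQR achieve higher SRCC/PLCC than their scalar (SQR) counterparts.
Thanks to the richer supervisory signal and the KL-divergence-based loss, PQR converges faster and generalizes better.
For reasonable values of β and M, the inverse mapping h(q) from PQR to scalar MOS is accurate, with a mean error below 0.01 when MOS is rescaled to [0, 1].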
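To make the inverse-mapping finding more concrete, the sketch below fits h(q) by least squares on synthetic MOS values and compares it with simply taking the anchor-weighted mean of q. Several things here are assumptions rather than details from the paper: h is taken to be linear, h(q) = w · q; the anchors are the five bin centers on [0, 1]; the "training data" is an evenly spaced grid of scores; a tiny ridge term is added only as a numerical safeguard; and all function names are hypothetical. The error it produces is not meant to reproduce the reported figure.

```python
import math

def mos_to_pqr(y, anchors, beta):
    """Soft mapping from a scalar MOS to a PQR vector."""
    weights = [math.exp(-beta * (y - c) ** 2) for c in anchors]
    total = sum(weights)
    return [w / total for w in weights]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_inverse_map(pqrs, scores, ridge=1e-8):
    """Weights w minimizing sum (w . q - y)^2 (plus a tiny ridge term)."""
    m = len(pqrs[0])
    gram = [[sum(q[i] * q[j] for q in pqrs) + (ridge if i == j else 0.0)
             for j in range(m)] for i in range(m)]
    rhs = [sum(q[i] * y for q, y in zip(pqrs, scores)) for i in range(m)]
    return solve(gram, rhs)

def h(q, w):
    """Assumed linear inverse mapping from a PQR vector to a scalar score."""
    return sum(wi * qi for wi, qi in zip(w, q))

def mean_sq_error(pred, true):
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)

anchors = [0.1, 0.3, 0.5, 0.7, 0.9]         # M = 5, assumed bin centers
scores = [i / 100 for i in range(101)]      # synthetic MOS values in [0, 1]
pqrs = [mos_to_pqr(y, anchors, beta=64.0) for y in scores]

w = fit_inverse_map(pqrs, scores)
learned_mse = mean_sq_error([h(q, w) for q in pqrs], scores)
naive_mse = mean_sq_error([h(q, anchors) for q in pqrs], scores)  # w = anchors
```

The anchor-weighted mean is the special case w = anchors, so on the data used for fitting the least-squares weights can do no worse in squared error; how close either comes to the true MOS depends on β and M.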

This digest was generated by AI and reviewed by a human editor.