QUICK REVIEW

[Paper Review] Towards High-quality HDR Deghosting with Conditional Diffusion Models

Qingsen Yan, Tao Hu|arXiv (Cornell University)|Nov 2, 2023

Image Enhancement Techniques | Cited by 17

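One-Sentence Summary

This paper formulates HDR deghosting as conditional diffusion-based image generation, using learned LDR features to guide HDR reconstruction and introducing sliding window noise estimation to reduce artifacts.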

ABSTRACT

High Dynamic Range (HDR) images can be recovered from several Low Dynamic Range (LDR) images by existing Deep Neural Networks (DNNs) techniques. Despite the remarkable progress, DNN-based methods still generate ghosting artifacts when LDR images have saturation and large motion, which hinders potential applications in real-world scenarios. To address this challenge, we formulate the HDR deghosting problem as an image generation that leverages LDR features as the diffusion model's condition, consisting of the feature condition generator and the noise predictor. Feature condition generator employs attention and Domain Feature Alignment (DFA) layer to transform the intermediate features to avoid ghosting artifacts. With the learned features as conditions, the noise predictor leverages a stochastic iterative denoising process for diffusion models to generate an HDR image by steering the sampling process. Furthermore, to mitigate semantic confusion caused by the saturation problem of LDR images, we design a sliding window noise estimator to sample smooth noise in a patch-based manner. In addition, an image space loss is proposed to avoid the color distortion of the estimated HDR results. We empirically evaluate our model on benchmark datasets for HDR imaging. The results demonstrate that our approach achieves state-of-the-art performances and well generalization to real-world images.

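Motivation and Objectives

Motivation: reconstruct HDR images from multi-exposure LDR images when saturation and motion cause ghosting.
Propose a DDPM-based architecture in which LDR features condition the reverse process to guide HDR generation.
Mitigate semantic confusion and color distortion through feature alignment, patch-based noise estimation, and an image space loss.
Demonstrate state-of-the-art performance on standard HDR datasets and robustness on real-world images.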
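
For orientation, the conditioning of the reverse process can be written in the standard DDPM form, with the LDR condition y added as an extra input to the noise predictor. This is a sketch in the usual DDPM notation; the symbols \beta_t, \bar{\alpha}_t, and \epsilon_\theta are assumptions borrowed from that convention and may not match the paper's exact notation. The image space loss the paper adds on top is not shown.

```latex
% Forward (noising) process on the HDR target x_0 = x
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,\mathbf{I}\right),
\qquad \bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s)

% Equivalent sampling form
x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,
\qquad \epsilon \sim \mathcal{N}(0, \mathbf{I})

% Noise-prediction objective with the LDR condition y
\mathcal{L}_{\text{noise}} = \mathbb{E}_{x_0,\,y,\,t,\,\epsilon}
\left[\ \left\lVert \epsilon - \epsilon_\theta(x_t, y, t) \right\rVert_2^2\ \right]
```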

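Proposed Method

Learn p(x|y) with a diffusion model, where y comprises three LDR images and x is the HDR target.
Introduce a Feature Condition Generator (FCG) that outputs affine modulation parameters (η, γ); these apply Domain Feature Alignment (DFA) to intermediate features in the noise predictor.
An Attention Network supplies implicitly aligned LDR features, which guide the noise predictor through affine transformations of its feature maps.
Use a sliding window noise estimator (SWNE) to sample smooth noise patch by patch, reducing semantic confusion.
Add an image space loss that aligns the denoised output of the diffusion model with the ground-truth image, improving color fidelity.
Train in the tone-mapped HDR domain with gamma-corrected inputs; the diffusion backbone is a modified UNet with WideResNet blocks and self-attention.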
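
The patch-based noise estimation step can be illustrated with a small NumPy sketch. It assumes one common way to realize a sliding window estimator: predict noise on overlapping patches and average the predictions where patches overlap. Whether the paper's SWNE does exactly this is an assumption. The function name `sliding_window_noise`, the `predict_noise` callable, and the `patch` and `stride` values are hypothetical; in a real model the callable would also receive the LDR condition and the timestep.

```python
import numpy as np


def sliding_window_noise(x_t, predict_noise, patch=4, stride=2):
    """Average per-patch noise predictions over overlapping windows.

    x_t           : (H, W, C) noisy image at the current diffusion step.
    predict_noise : hypothetical stand-in for the noise predictor; maps a
                    (patch, patch, C) array to a noise estimate of equal shape.
    """
    h, w = x_t.shape[:2]
    if h < patch or w < patch:
        raise ValueError("image is smaller than the patch size")
    if (h - patch) % stride or (w - patch) % stride:
        raise ValueError("patch/stride must tile the image exactly")

    noise_sum = np.zeros_like(x_t, dtype=np.float64)
    count = np.zeros((h, w, 1), dtype=np.float64)

    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            window = (slice(top, top + patch), slice(left, left + patch))
            # Accumulate this patch's estimate and record the coverage.
            noise_sum[window] += predict_noise(x_t[window])
            count[window] += 1.0

    # Each pixel gets the mean of all patch estimates that covered it.
    return noise_sum / count
```

Averaging over overlaps is what smooths the estimate across patch borders: a pixel near a patch edge is also seen near the centre of a neighbouring patch, so no single patch decides its value alone.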

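Experimental Results

Research Questions

RQ1: Can a conditional diffusion model reconstruct high-quality, ghost-free HDR images from LDR sequences with saturation and motion?
RQ2: Does learning LDR condition features through domain alignment transformations reduce semantic confusion in HDR reconstruction?
RQ3: How do patch-based sliding window noise estimation and the image space loss affect the color fidelity and perceptual quality of HDR diffusion models?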

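Key Findings

The proposed method achieves state-of-the-art performance on multiple metrics on the HDR dataset of Kalantari et al.
Conditioning the diffusion model on DFA features reduces ghosting and preserves detail in saturated and dynamic regions.
Sliding window noise estimation reduces semantic confusion, yielding smoother and more coherent HDR reconstruction across patches.
The image space loss helps mitigate color distortion by constraining the diffusion output in pixel space.
On the Kalantari dataset, the method achieves PSNR-μ = 44.11, PSNR-L = 41.73, SSIM-μ = 0.9911, SSIM-L = 0.9885, HDR-VDP-2 = 65.52, LPIPS = 0.0109, and FID = 6.20.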
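
The two PSNR variants differ only in the domain where the error is measured: PSNR-L on linear HDR values, PSNR-μ after tone mapping. The sketch below assumes the μ-law tone mapping with μ = 5000 that is commonly used with this benchmark, and HDR values normalized to [0, 1]. The summary does not state the paper's exact evaluation settings, and the function names are illustrative.

```python
import numpy as np

MU = 5000.0  # assumed compression constant, common in HDR benchmarks


def mu_law_tonemap(hdr, mu=MU):
    """Compress linear HDR values in [0, 1] with the mu-law curve."""
    return np.log1p(mu * hdr) / np.log1p(mu)


def psnr(pred, target, peak=1.0):
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, peak]."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)


def psnr_linear(pred_hdr, gt_hdr):
    """PSNR-L: error measured directly on linear HDR values."""
    return psnr(pred_hdr, gt_hdr)


def psnr_mu(pred_hdr, gt_hdr, mu=MU):
    """PSNR-mu: error measured after mu-law tone mapping."""
    return psnr(mu_law_tonemap(pred_hdr, mu), mu_law_tonemap(gt_hdr, mu))
```

Because the μ-law curve expands dark values and compresses bright ones, PSNR-μ weights errors in shadows more heavily than PSNR-L does, which is closer to how a tone-mapped result is viewed.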

This review was generated by AI and checked by a human editor.