QUICK REVIEW

[Paper Summary] Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method

Tao Wang, Kaihao Zhang | arXiv (Cornell University) | Dec 22, 2022

Image Enhancement Techniques | Cited by 20

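ONE-SENTENCE SUMMARY

Introduces the UHD-LOL benchmark for 4K/8K low-light image enhancement and proposes LLFormer, a transformer-based method with axis-based attention and cross-layer fusion that achieves state-of-the-art results on UHD-LLIE and on public LLIE datasets.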

ABSTRACT

As the quality of optical sensors improves, there is a need for processing large-scale images. In particular, the ability of devices to capture ultra-high definition (UHD) images and video places new demands on the image processing pipeline. In this paper, we consider the task of low-light image enhancement (LLIE) and introduce a large-scale database consisting of images at 4K and 8K resolution. We conduct systematic benchmarking studies and provide a comparison of current LLIE algorithms. As a second contribution, we introduce LLFormer, a transformer-based low-light enhancement method. The core components of LLFormer are the axis-based multi-head self-attention and cross-layer attention fusion block, which significantly reduces the linear complexity. Extensive experiments on the new dataset and existing public datasets show that LLFormer outperforms state-of-the-art methods. We also show that employing existing LLIE methods trained on our benchmark as a pre-processing step significantly improves the performance of downstream tasks, e.g., face detection in low-light conditions. The source code and pre-trained models are available at https://github.com/TaoWangzj/LLFormer.

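MOTIVATION AND OBJECTIVES

Address the growing need for UHD-capable LLIE, driven by the rise of 4K/8K sensors and streaming media.
Create the first large-scale UHD low-light image enhancement benchmark (UHD-LOL), with 4K and 8K subsets.
Evaluate existing LLIE methods on UHD-LLIE and identify their limitations in the UHD setting.
Propose LLFormer, a transformer model designed for UHD-LLIE with a low computational cost.
Show that improvements in UHD-LLIE benefit downstream tasks such as face detection.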

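PROPOSED METHOD

Introduces axis-based multi-head self-attention (A-MSA), which achieves linear complexity along the spatial dimensions.
Proposes a Dual Gated Feed-Forward Network (DGFN) to improve feature representation.
Applies a Cross-layer Attention Fusion Block (CAFB) to adaptively fuse features across layers.
Uses a hierarchical encoder-decoder structure with skip connections, combined with pixel-shuffle and pixel-unshuffle operations.
Trains with a smooth L1 loss and evaluates on the UHD-LOL, LOL, and MIT-Adobe FiveK datasets.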
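
To make the axis-based attention idea concrete, here is a minimal pure-Python sketch of the generic axial-attention pattern: attend along the height axis, then along the width axis. It is not LLFormer's A-MSA implementation. The function names (`attend`, `axis_attention`, `score_counts`), the single head, and the use of the input itself as queries, keys, and values are assumptions made for brevity. For the exact A-MSA formulation and its complexity, refer to the paper and the linked repository.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(tokens):
    """Single-head self-attention over a 1-D sequence of feature vectors.

    Assumption: queries, keys, and values are the tokens themselves
    (no learned projections), which keeps the sketch short.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        weights = softmax(
            [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        )
        out.append(
            [sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(dim)]
        )
    return out


def axis_attention(grid):
    """Attend along the height axis, then along the width axis.

    grid is a nested list indexed as grid[row][col][channel].
    """
    h, w = len(grid), len(grid[0])
    # Height axis: each column is an independent sequence of h tokens.
    cols = [attend([grid[r][c] for r in range(h)]) for c in range(w)]
    after_h = [[cols[c][r] for c in range(w)] for r in range(h)]
    # Width axis: each row is an independent sequence of w tokens.
    return [attend(row) for row in after_h]


def score_counts(h, w):
    """Attention scores computed: full 2-D attention vs. the two axis passes."""
    full = (h * w) ** 2
    axial = w * h * h + h * w * w
    return full, axial
```

For a 4×4 grid, `score_counts(4, 4)` gives 256 scores for full attention against 128 for the two axis passes. In this generic pattern the gap grows as HW / (H + W), which is why per-axis attention is attractive at 4K and 8K resolutions.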

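EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: How can LLIE be performed effectively on ultra-high-definition (4K/8K) images while remaining computationally efficient?
RQ2: How does a transformer architecture designed for UHD-LLIE compare with existing state-of-the-art LLIE methods on UHD-LOL and on public datasets?
RQ3: Do the UHD-LLIE improvements that LLFormer delivers in low-light conditions carry over to downstream tasks such as face detection?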

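KEY FINDINGS

LLFormer achieves state-of-the-art performance on the UHD-LOL4K and UHD-LOL8K benchmarks, exceeding Restormer by 0.42 dB in PSNR on UHD-LOL4K.
Transformer-based methods (Uformer, Restormer, LLFormer) outperform traditional and CNN-based LLIE methods on the UHD datasets, with LLFormer offering the best trade-off between performance and efficiency.
On the public LOL and MIT-Adobe FiveK datasets, LLFormer ranks among the top methods on PSNR, SSIM, LPIPS, and MAE, surpassing Uformer and Restormer on several metrics.
Ablation studies show that axis-based MSA and DGFN substantially improve PSNR/SSIM, and that CAFB-weighted skip connections improve the results further.
Pre-processing with top LLIE methods (including LLFormer) substantially improves the AP of downstream face detection (e.g., about 71.2% AP for LLFormer).
LLFormer is efficient in terms of MACs and parameter count and has fast inference (example: 22.52G MACs, 24.52M parameters, 0.063 s), and it is competitive with wider and deeper variants.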
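
PSNR, the metric behind the 0.42 dB comparison above, is defined as 10·log10(MAX² / MSE). The sketch below is a plain-Python illustration on flat lists of pixel values in [0, 1]. The helper names (`mse`, `psnr`, `mse_ratio_for_gain`) are hypothetical and are not taken from the LLFormer code base.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length lists of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images match."""
    err = mse(pred, target)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """MSE ratio (improved / baseline) implied by a PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)
```

Under this definition, a 0.42 dB gain corresponds to `mse_ratio_for_gain(0.42)`, that is, a mean squared error roughly 9% lower than the baseline's.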

This summary was generated by AI and reviewed by a human editor.