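[Paper Digest] Enhancing Underwater Images via Adaptive Semantic-aware Codebook Learning
SUCode introduces a semantic-aware, pixel-level codebook and a three-stage training paradigm to adapt to region-level degradation, achieving state-of-the-art results on full-reference metrics and competitive performance on no-reference metrics.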
Underwater Image Enhancement (UIE) is an ill-posed problem where natural clean references are not available, and the degradation levels vary significantly across semantic regions. Existing UIE methods treat images with a single global model and ignore the inconsistent degradation of different scene components. This oversight leads to significant color distortions and loss of fine details in heterogeneous underwater scenes, especially where degradation varies significantly across different image regions. Therefore, we propose SUCode (Semantic-aware Underwater Codebook Network), which achieves adaptive UIE from semantic-aware discrete codebook representation. Compared with one-shot codebook-based methods, SUCode exploits semantic-aware, pixel-level codebook representation tailored to heterogeneous underwater degradation. A three-stage training paradigm is employed to represent raw underwater image features to avoid pseudo ground-truth contamination. Gated Channel Attention Module (GCAM) and Frequency-Aware Feature Fusion (FAFF) jointly integrate channel and frequency cues for faithful color restoration and texture recovery. Extensive experiments on multiple benchmarks demonstrate that SUCode achieves state-of-the-art performance, outperforming recent UIE methods on both reference and no-reference metrics. The code will be made publicly available at https://github.com/oucailab/SUCode.
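Motivation and Objectives
- Address the ill-posed nature of underwater image enhancement and its region-level degradation by introducing a semantic-aware discrete representation.
- Learn pixel-level, class-specific codebooks guided by semantic masks for joint restoration and enhancement.
- Decouple codebook learning, representation, and enhancement through a three-stage training paradigm, reducing pseudo ground-truth contamination.
- Propose GCAM and FAFF to improve color fidelity and texture detail while preserving semantic consistency.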
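Proposed Method
- Stage I (codebook learning): learn a class-specific codebook Z_c for each of the C semantic classes from raw underwater images and semantic masks.
- Stage II (self-restoration): a weight predictor performs a weighted aggregation of the per-class quantized features, synthesizing a unified discrete representation.
- Stage III (enhancement): FAFF fuses raw and enhanced features through domain-adaptive feature modulation, and a dual decoder equipped with GCAM performs color-aware restoration.
- A dual-decoder architecture is used: G_q reconstructs the raw image, while G_r supports domain translation through a Swin Transformer-based weight predictor.
- GCAM reweights color channels to correct underwater color casts and preserve color realism.
- FAFF performs frequency-domain feature fusion using a real fast Fourier transform, phase preservation, amplitude modulation, and affine feature modulation, transferring enhanced textures while preserving structure.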
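
The class-specific quantization and weighted aggregation described in the list above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes nearest-neighbor lookup by squared Euclidean distance (common in vector-quantized models, but not confirmed by this digest) and a convex blend of the per-class quantized features. The names `nearest_code` and `quantize_pixel`, the toy codebooks, and the class labels in the comments are hypothetical. The weights stand in for the output of the weight predictor, which is not modeled here.

```python
from math import fsum

def nearest_code(feature, codebook):
    """Return the codebook entry closest to `feature` (squared Euclidean distance)."""
    return min(
        codebook,
        key=lambda code: fsum((f - c) ** 2 for f, c in zip(feature, code)),
    )

def quantize_pixel(feature, codebooks, weights):
    """Quantize one pixel feature against every class codebook Z_c, then blend.

    codebooks: list of C codebooks, each a list of code vectors.
    weights:   C non-negative weights summing to 1 (hypothetical predictor output).
               A one-hot vector reproduces hard selection by a semantic mask.
    """
    quantized = [nearest_code(feature, z_c) for z_c in codebooks]
    return [
        fsum(w * q[d] for w, q in zip(weights, quantized))
        for d in range(len(feature))
    ]

# Toy setup: C = 2 classes, 2-D features (all values are illustrative).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # Z_0, hypothetical "water" class
    [[0.0, 4.0], [4.0, 0.0]],  # Z_1, hypothetical "reef" class
]
feature = [0.9, 0.8]

hard = quantize_pixel(feature, codebooks, [1.0, 0.0])  # mask assigns class 0
soft = quantize_pixel(feature, codebooks, [0.5, 0.5])  # blend of both classes
print(hard)  # [1.0, 1.0]
print(soft)  # [2.5, 0.5]
```

With a one-hot weight vector the pixel simply takes its nearest code from the masked class. Soft weights let a pixel near a region boundary draw on more than one codebook.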

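The FAFF description above (Fourier transform, phase preservation, amplitude modulation, affine modulation) can be illustrated on a 1-D signal. This is a sketch under stated assumptions, not the paper's module. It uses a naive discrete Fourier transform in place of a real FFT to stay dependency-free. It keeps the phase of the raw signal, blends the two amplitude spectra with a fixed `alpha`, and applies a fixed affine `scale` and `shift`. The names `dft`, `idft`, and `fuse` and all three parameters are hypothetical; in a learned model they would presumably be predicted from features.

```python
import cmath
import math

def dft(signal):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Inverse of `dft`; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

def fuse(raw, enhanced, alpha=0.5, scale=1.0, shift=0.0):
    """Keep the phase of `raw`, blend amplitudes, then apply an affine modulation.

    alpha, scale and shift are hypothetical fixed values used for illustration.
    """
    fused = []
    for r, e in zip(dft(raw), dft(enhanced)):
        amplitude = (1 - alpha) * abs(r) + alpha * abs(e)    # amplitude modulation
        fused.append(cmath.rect(amplitude, cmath.phase(r)))  # phase taken from raw
    return [scale * x + shift for x in idft(fused)]

# Illustrative inputs: `enhanced` is just a brighter copy of `raw`.
raw = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, 1.5, -0.5]
enhanced = [2.0 * x for x in raw]
fused = fuse(raw, enhanced, alpha=1.0)
```

Because phase carries most of the structural layout of a signal while amplitude carries contrast and texture energy, taking the phase from the raw input and the amplitude from the enhanced one is a common way to transfer appearance without moving edges. In this toy case `fused` reproduces `enhanced` up to floating-point error, since the two signals share the same phase.
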
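Experimental Results
Research Questions
- RQ1: How can semantic information be incorporated into discrete codebook learning to handle region-specific degradation in UIE?
- RQ2: Can a three-stage training paradigm reduce reliance on pseudo ground-truth references in UIE while still learning robust representations?
- RQ3: Does a pixel-level, semantic-aware codebook improve restoration quality over one-shot or class-agnostic codebooks?
- RQ4: Do frequency-domain feature fusion (FAFF) and GCAM yield better color fidelity and texture recovery in underwater images?
Key Findings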
| Method | SUIM-E SSIM | SUIM-E PSNR | SUIM-E LPIPS | SUIM-E UCIQE | SUIM-E UIQM | UIEB SSIM | UIEB PSNR | UIEB LPIPS | UIEB UCIQE | UIEB UIQM |
|---|---|---|---|---|---|---|---|---|---|---|
| Fusion | 0.876 | 16.824 | 0.226 | 58.413 | 2.811 | 0.907 | 18.483 | 0.211 | 52.823 | 3.251 |
| IBLA | 0.788 | 16.019 | 0.221 | 62.498 | 1.870 | 0.771 | 15.009 | 0.341 | 53.816 | 2.346 |
| ULAP | 0.860 | 16.574 | 0.232 | 59.746 | 2.174 | 0.902 | 17.871 | 0.233 | 52.620 | 3.309 |
| UDCP | 0.581 | 11.694 | 0.308 | 62.172 | 1.815 | 0.603 | 11.001 | 0.399 | 59.492 | 2.147 |
| WaterNet | 0.907 | 22.295 | 0.144 | 60.999 | 2.807 | 0.898 | 21.566 | 0.237 | 61.805 | 3.314 |
| UColor | 0.898 | 22.860 | 0.145 | 62.436 | 2.860 | 0.906 | 22.266 | 0.187 | 59.176 | 3.316 |
| UShape | 0.851 | 21.369 | 0.147 | 53.451 | 2.969 | 0.819 | 20.266 | 0.219 | 48.406 | 3.296 |
| CCMSR | 0.896 | 22.028 | 0.161 | 60.129 | 2.875 | 0.914 | 22.761 | 0.180 | 57.084 | 3.274 |
| WfDiff | 0.853 | 16.176 | 0.184 | 57.052 | 2.701 | 0.888 | 18.994 | 0.214 | 53.269 | 3.255 |
| SMDR-IS | 0.896 | 22.082 | 0.146 | 62.600 | 2.749 | 0.924 | 22.232 | 0.166 | 61.559 | 2.952 |
| AMSIN | 0.902 | 21.923 | 0.125 | 61.399 | 2.762 | 0.921 | 22.635 | 0.146 | 62.332 | 3.309 |
| RUE-Net | 0.923 | 22.902 | 0.121 | 62.500 | 2.776 | 0.923 | 22.743 | 0.164 | 62.357 | 3.260 |
| HCLR-Net | 0.902 | 22.317 | 0.124 | 58.765 | 3.360 | 0.902 | 22.317 | 0.124 | 58.599 | 3.279 |
| FDCE-Net | 0.923 | 23.039 | 0.141 | 58.765 | 3.360 | 0.923 | 23.039 | 0.141 | 58.765 | 3.360 |
| SS-UIE | 0.871 | 21.713 | 0.182 | 59.538 | 2.815 | 0.850 | 21.006 | 0.255 | 58.919 | 3.066 |
| CDF-UIE | 0.892 | 22.089 | 0.116 | 54.826 | 2.838 | 0.886 | 21.592 | 0.159 | 54.219 | 3.333 |
| FeMaSR | 0.908 | 22.749 | 0.100 | 62.605 | 2.841 | 0.883 | 22.733 | 0.137 | 62.675 | 3.301 |
| AdaCode | 0.886 | 22.329 | 0.105 | 62.409 | 2.812 | 0.818 | 21.792 | 0.156 | 60.835 | 3.216 |
| RIDCP | 0.509 | 13.407 | 0.572 | 42.184 | 2.533 | 0.573 | 14.915 | 0.487 | 48.679 | 2.246 |
| IPC-Dehaze | 0.823 | 13.869 | 0.381 | 50.837 | 2.252 | 0.852 | 16.923 | 0.226 | 54.777 | 2.352 |
| CodeUNet | 0.590 | 17.349 | 0.447 | 54.769 | 2.705 | 0.836 | 21.468 | 0.196 | 59.650 | 3.383 |
| SUCode(Ours) | 0.939 | 23.908 | 0.087 | 62.618 | 2.878 | 0.925 | 23.857 | 0.124 | 63.136 | 3.174 |
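- SUCode achieves state-of-the-art full-reference results (PSNR, SSIM, LPIPS) on the SUIM-E and UIEB datasets.
- SUCode is competitive on no-reference metrics (UCIQE, UIQM), with the highest UCIQE on both SUIM-E (62.618) and UIEB (63.136), although its UIQM scores (2.878 and 3.174) are not the highest in the table.
- Cross-dataset evaluation shows strong generalization: trained on UIEB and tested on LSUI and UFO-120, SUCode surpasses several baselines.
- Compared with non-semantic codebook methods, the semantic-aware codebook yields clearer, more natural color restoration and better texture preservation.
- The three-stage training strategy effectively handles the ill-posed pseudo ground-truth problem and aligns enhancement with semantic content.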
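
For readers unfamiliar with the full-reference metrics in the table, PSNR can be computed as sketched below. This is the standard textbook definition on flat pixel sequences; the evaluation protocol actually used in the paper (color space, value range, any resizing) is not stated in this digest, so the default `max_value=255.0` is an assumption.

```python
import math

def psnr(reference, output, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(output):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - o) ** 2 for r, o in zip(reference, output)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means the output is closer to the reference, which is why the dependence on pseudo ground-truth references matters when interpreting these scores.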

This digest was generated by AI and reviewed by a human editor.