QUICK REVIEW

[Paper Review] Enhancing Underwater Images via Adaptive Semantic-aware Codebook Learning

Bosen Lin, Feng Gao|arXiv (Cornell University)|Feb 11, 2026

Image Enhancement Techniques | Citations: 0

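One-line summary

SUCode introduces a semantic-aware, pixel-level codebook and a three-stage training paradigm that adapts to region-specific degradation, achieving state-of-the-art results on full-reference metrics and competitive no-reference performance.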

ABSTRACT

Underwater Image Enhancement (UIE) is an ill-posed problem where natural clean references are not available, and the degradation levels vary significantly across semantic regions. Existing UIE methods treat images with a single global model and ignore the inconsistent degradation of different scene components. This oversight leads to significant color distortions and loss of fine details in heterogeneous underwater scenes, especially where degradation varies significantly across different image regions. Therefore, we propose SUCode (Semantic-aware Underwater Codebook Network), which achieves adaptive UIE from semantic-aware discrete codebook representation. Compared with one-shot codebook-based methods, SUCode exploits semantic-aware, pixel-level codebook representation tailored to heterogeneous underwater degradation. A three-stage training paradigm is employed to represent raw underwater image features to avoid pseudo ground-truth contamination. Gated Channel Attention Module (GCAM) and Frequency-Aware Feature Fusion (FAFF) jointly integrate channel and frequency cues for faithful color restoration and texture recovery. Extensive experiments on multiple benchmarks demonstrate that SUCode achieves state-of-the-art performance, outperforming recent UIE methods on both reference and no-reference metrics. The code will be made publicly available at https://github.com/oucailab/SUCode.

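Motivation and objectives

Introduce a semantic-aware discrete representation to address the ill-posed nature of underwater image enhancement and its region-wise degradation.
Learn pixel-level, category-specific codebooks guided by semantic masks, performing restoration and enhancement jointly.
Mitigate pseudo ground-truth contamination with a three-stage training paradigm that separates codebook learning, representation, and enhancement.
Propose GCAM and FAFF to improve color fidelity and texture detail while preserving semantic consistency.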

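Proposed method

Stage I (codebook learning): learn semantic-category-specific codebooks Z_c for C semantic classes from raw underwater images and their semantic masks.
Stage II (self-reconstruction): a weight predictor weights the per-class quantized features and merges them into a unified discrete representation.
Stage III (enhancement): FAFF fuses raw and enhanced features through domain-adaptive feature modulation, and a dual decoder performs color-aware restoration with GCAM.
In the dual-decoder structure, G_q reconstructs the raw image, while G_r supports domain translation with a Swin Transformer-based weight predictor.
GCAM reweights the color channels to counter underwater color casts while keeping colors realistic.
FAFF performs frequency-domain fusion using a real-valued FFT, phase retention, amplitude modulation, and affine feature modulation, transferring enhanced texture while preserving structure.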
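
The GCAM and FAFF descriptions above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: `channel_gate` is a generic sigmoid channel gate, and `frequency_fuse` assumes that the phase is taken from the raw features and that the amplitudes are blended linearly. The parameters `w`, `b`, `alpha`, `scale`, and `shift` are hypothetical stand-ins for quantities the real modules would learn or predict.

```python
import numpy as np

def channel_gate(feat, w, b):
    """Reweight the channels of feat (C, H, W) with a sigmoid gate.

    w (C, C) and b (C,) are hypothetical learned parameters.
    """
    pooled = feat.mean(axis=(1, 2))                  # global average pool -> (C,)
    gate = 1.0 / (1.0 + np.exp(-(w @ pooled + b)))   # one value in (0, 1) per channel
    return feat * gate[:, None, None]

def frequency_fuse(raw, enh, alpha=0.5, scale=1.0, shift=0.0):
    """Fuse two (C, H, W) feature maps in the frequency domain.

    Assumption: the phase of `raw` is kept (structure), the amplitude is a
    linear blend of both inputs (texture), and a scalar affine modulation
    is applied afterwards.
    """
    f_raw = np.fft.rfft2(raw, axes=(-2, -1))         # real-input 2D FFT
    f_enh = np.fft.rfft2(enh, axes=(-2, -1))
    amplitude = (1.0 - alpha) * np.abs(f_raw) + alpha * np.abs(f_enh)
    phase = np.angle(f_raw)                          # phase retention
    fused = np.fft.irfft2(amplitude * np.exp(1j * phase),
                          s=raw.shape[-2:], axes=(-2, -1))
    return scale * fused + shift                     # affine feature modulation
```

With `alpha=0.0` the fusion returns the raw features unchanged, which is a quick sanity check that the phase and amplitude are recombined correctly.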

Figure 1: Comparison of the training and testing pipelines and the enhancement results of different codebook generation methods. The proposed SUCode’s result is sharper and clearer, with more natural color.
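
To make the difference from a single shared codebook concrete, the per-category lookup and the weighted synthesis step can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's code: features are quantized by plain nearest-neighbour search, the codebooks are given rather than learned, and the per-pixel `weights` are passed in directly, whereas the review describes them as coming from a weight predictor. All function names are hypothetical.

```python
import numpy as np

def quantize(feat, codebook):
    """Nearest-neighbour lookup: feat (N, D), codebook (K, D) -> (N, D)."""
    d2 = ((feat[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    return codebook[d2.argmin(axis=1)]

def semantic_quantize(feat, mask, codebooks):
    """Quantize each pixel with the codebook of its semantic class.

    feat: (H, W, D) features, mask: (H, W) integer class ids,
    codebooks: list of C arrays of shape (K, D).
    Pixels whose class id has no codebook are left at zero.
    """
    out = np.zeros_like(feat)
    for c, codebook in enumerate(codebooks):
        selected = mask == c
        if selected.any():
            out[selected] = quantize(feat[selected], codebook)
    return out

def weighted_synthesis(feat, codebooks, weights):
    """Blend the per-class quantizations with per-pixel weights (H, W, C)."""
    h, w, d = feat.shape
    flat = feat.reshape(-1, d)
    per_class = np.stack([quantize(flat, cb) for cb in codebooks], axis=1)  # (N, C, D)
    blend = weights.reshape(-1, len(codebooks), 1)
    return (blend * per_class).sum(axis=1).reshape(h, w, d)
```

When the weights are one-hot along the class axis, `weighted_synthesis` reduces to `semantic_quantize`, so the blend can be read as a soft, mask-free version of the masked lookup.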

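Experimental results

Research questions

RQ1: How can semantic information be incorporated into discrete codebook learning for UIE to handle region-specific degradation?
RQ2: Can the three-stage training paradigm reduce reliance on pseudo ground-truth references in UIE while still learning robust representations?
RQ3: Do semantic-aware, pixel-level codebooks improve restoration quality compared with one-shot or category-agnostic codebooks?
RQ4: Do FAFF and GCAM improve color fidelity and texture restoration in underwater images?

Key findings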

Method	SUIM-E SSIM	SUIM-E PSNR	SUIM-E LPIPS	SUIM-E UCIQE	SUIM-E UIQM	UIEB SSIM	UIEB PSNR	UIEB LPIPS	UIEB UCIQE	UIEB UIQM
Fusion	0.876	16.824	0.226	58.413	2.811	0.907	18.483	0.211	52.823	3.251
IBLA	0.788	16.019	0.221	62.498	1.870	0.771	15.009	0.341	53.816	2.346
ULAP	0.860	16.574	0.232	59.746	2.174	0.902	17.871	0.233	52.620	3.309
UDCP	0.581	11.694	0.308	62.172	1.815	0.603	11.001	0.399	59.492	2.147
WaterNet	0.907	22.295	0.144	60.999	2.807	0.898	21.566	0.237	61.805	3.314
UColor	0.898	22.860	0.145	62.436	2.860	0.906	22.266	0.187	59.176	3.316
UShape	0.851	21.369	0.147	53.451	2.969	0.819	20.266	0.219	48.406	3.296
CCMSR	0.896	22.028	0.161	60.129	2.875	0.914	22.761	0.180	57.084	3.274
WfDiff	0.853	16.176	0.184	57.052	2.701	0.888	18.994	0.214	53.269	3.255
SMDR-IS	0.896	22.082	0.146	62.600	2.749	0.924	22.232	0.166	61.559	2.952
AMSIN	0.902	21.923	0.125	61.399	2.762	0.921	22.635	0.146	62.332	3.309
RUE-Net	0.923	22.902	0.121	62.500	2.776	0.923	22.743	0.164	62.357	3.260
HCLR-Net	0.902	22.317	0.124	58.765	3.360	0.902	22.317	0.124	58.599	3.279
FDCE-Net	0.923	23.039	0.141	58.765	3.360	0.923	23.039	0.141	58.765	3.360
SS-UIE	0.871	21.713	0.182	59.538	2.815	0.850	21.006	0.255	58.919	3.066
CDF-UIE	0.892	22.089	0.116	54.826	2.838	0.886	21.592	0.159	54.219	3.333
FeMaSR	0.908	22.749	0.100	62.605	2.841	0.883	22.733	0.137	62.675	3.301
AdaCode	0.886	22.329	0.105	62.409	2.812	0.818	21.792	0.156	60.835	3.216
RIDCP	0.509	13.407	0.572	42.184	2.533	0.573	14.915	0.487	48.679	2.246
IPC-Dehaze	0.823	13.869	0.381	50.837	2.252	0.852	16.923	0.226	54.777	2.352
CodeUNet	0.590	17.349	0.447	54.769	2.705	0.836	21.468	0.196	59.650	3.383
SUCode(Ours)	0.939	23.908	0.087	62.618	2.878	0.925	23.857	0.124	63.136	3.174
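
For readers less familiar with the full-reference columns, PSNR can be computed as in the sketch below. It uses the standard definition in decibels and assumes 8-bit images (`max_val=255`). The review does not state the exact evaluation protocol used in the paper, so this is illustrative only.

```python
import numpy as np

def psnr(reference, estimate, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)      # mean squared error over all pixels
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Higher PSNR means the enhanced image is closer to the reference, so in the table larger values are better for PSNR and SSIM, while smaller values are better for LPIPS.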

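SUCode achieves state-of-the-art performance on the full-reference metrics (PSNR, SSIM, LPIPS) on both the SUIM-E and UIEB datasets.
SUCode is also competitive on the no-reference metrics (UCIQE, UIQM), with the highest UCIQE on both SUIM-E and UIEB.
In cross-dataset evaluation, a model trained on UIEB and tested on LSUI and UFO-120 outperforms the other baselines, indicating strong generalization.
The semantic-aware codebook yields sharper, more natural color restoration and better texture preservation than non-semantic codebook methods.
The three-stage training strategy effectively handles the ill-posed ground-truth problem and aligns enhancement with semantic content.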
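
The first two findings can be checked against the table with a short script. The sketch below copies four SUIM-E rows from the table (a subset chosen for brevity) and reports the best method per metric, assuming the usual conventions that LPIPS is lower-is-better and the other four metrics are higher-is-better.

```python
# SUIM-E values copied from the results table: (SSIM, PSNR, LPIPS, UCIQE, UIQM)
METRICS = ("SSIM", "PSNR", "LPIPS", "UCIQE", "UIQM")
LOWER_IS_BETTER = {"LPIPS"}

rows = {
    "FeMaSR":   (0.908, 22.749, 0.100, 62.605, 2.841),
    "RUE-Net":  (0.923, 22.902, 0.121, 62.500, 2.776),
    "FDCE-Net": (0.923, 23.039, 0.141, 58.765, 3.360),
    "SUCode":   (0.939, 23.908, 0.087, 62.618, 2.878),
}

def best_per_metric(table):
    """Return {metric: best method} given {method: tuple of metric values}."""
    best = {}
    for i, metric in enumerate(METRICS):
        pick = min if metric in LOWER_IS_BETTER else max
        best[metric] = pick(table, key=lambda method: table[method][i])
    return best

print(best_per_metric(rows))
```

On this subset SUCode leads on SSIM, PSNR, LPIPS, and UCIQE, while FDCE-Net has the higher UIQM, which matches the "competitive" wording used for the no-reference metrics.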

Figure 2: The overall structure of the proposed SUCode. In stage I, the semantic-aware category‑specific codebooks are updated with the mask $m$ . Stage II is a partition and synthesis process of the codebook, achieved through the self-reconstruction of raw underwater images. In stage III, domain co

This review was generated by AI and checked by a human editor.