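[Paper Review] UCL-Dehaze: Towards Real-world Image Dehazing via Unsupervised Contrastive Learning
tldr: UCL-Dehaze is an unsupervised contrastive learning framework for real-world image dehazing. It combines contrastive learning with adversarial training on unpaired hazy and clean images and achieves state-of-the-art results without paired data.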
Training an image dehazing model on synthetic hazy data alleviates the difficulty of collecting real-world hazy/clean image pairs, but it introduces the well-known domain shift problem. From a new perspective, this paper explores contrastive learning combined with adversarial training to leverage unpaired real-world hazy and clean images, so that the gap between synthetic and real-world haze does not need to be bridged. We propose an effective unsupervised contrastive learning paradigm for image dehazing, dubbed UCL-Dehaze. Unpaired real-world clean and hazy images are easy to capture, and they serve as the positive and negative samples, respectively, when training the UCL-Dehaze network. To train the network more effectively, we formulate a new self-contrastive perceptual loss function, which encourages the restored images to approach the positive samples and move away from the negative samples in the embedding space. In addition to the overall network architecture of UCL-Dehaze, adversarial training is used to align the distributions of the positive samples and the dehazed images. Compared with recent image dehazing works, UCL-Dehaze does not require paired data during training and uses unpaired positive/negative data to enhance dehazing performance. We conduct comprehensive experiments to evaluate UCL-Dehaze and demonstrate its superiority over state-of-the-art methods, even when only 1,800 unpaired real-world images are used to train the network. Source code is available at https://github.com/yz-wang/UCL-Dehaze.
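The pull/push idea behind the self-contrastive perceptual loss can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the ratio form (distance to the positive divided by distance to the negative), the L1 distance, the `eps` term, and all function names are assumptions, and plain Python lists stand in for the deep features a real network would produce.

```python
from typing import Sequence

Vector = Sequence[float]


def l1_distance(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two equally sized feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def self_contrastive_loss(
    restored_feats: Sequence[Vector],
    positive_feats: Sequence[Vector],
    negative_feats: Sequence[Vector],
    layer_weights: Sequence[float],
    eps: float = 1e-7,
) -> float:
    """Assumed ratio-form loss, summed over feature layers.

    Each layer contributes weight * d(restored, positive) / d(restored, negative),
    so the loss shrinks as the restored features move towards the clean
    (positive) features and away from the hazy (negative) features.
    """
    total = 0.0
    for w, r, p, n in zip(layer_weights, restored_feats, positive_feats, negative_feats):
        pull = l1_distance(r, p)  # should become small
        push = l1_distance(r, n)  # should become large
        total += w * pull / (push + eps)  # eps guards against division by zero
    return total


# Toy features for two layers (hypothetical values).
restored = [[0.2, 0.4], [0.5, 0.5]]
positive = [[0.2, 0.5], [0.5, 0.6]]
negative = [[0.8, 0.9], [0.9, 0.1]]
print(self_contrastive_loss(restored, positive, negative, layer_weights=[0.5, 1.0]))
```

In a real setup the three feature lists would come from the same pretrained feature extractor applied to the dehazed output, a clean image, and a hazy image.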
Research Motivation and Objectives
- Motivate dehazing without paired real-world data, avoiding the synthetic-to-real domain gap that arises when training on synthetic haze.
- Leverage unpaired real-world hazy and clean images as negative/positive samples in training.
- Propose a pixel-wise self-contrastive perceptual loss to guide restoration in embedding space.
- Develop a UNet-like generator with spectral normalization and self-calibrated convolutions to improve haze removal.
- Evaluate against state-of-the-art dehazing methods using full-reference and perceptual quality metrics.
Proposed Method
- Formulate dehazing as image-to-image translation trained in an unsupervised setting with unpaired data.
- Use a UNet-based generator with nine residual blocks and self-calibrated convolutions (SC Conv) for multi-scale feature extraction.
- Incorporate a PatchGAN discriminator with the LSGAN loss for stable adversarial training.
- Introduce a patch-wise contrastive loss (L_PC) across multiple encoder layers to pull corresponding patches together and push negatives apart.
- Introduce a pixel-wise self-contrastive perceptual loss (L_SCP) using VGG-16 features to align restored images with clean samples while distancing them from hazy samples.
- Combine these losses with an identity loss that preserves structure, yielding L_Total = lambda1 L_adv + lambda2 L_PC + lambda3 L_SCP + lambda4 L_ide, where lambda1 through lambda4 are weighting coefficients.
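How the loss terms in this list fit together can be sketched as follows. This is a hedged illustration under several assumptions, not the released code: the patch-wise term is written as a standard InfoNCE loss (query from the output, positive from the corresponding input patch, negatives from other patches), the temperature of 0.07 and the unit lambda weights are placeholders rather than the paper's values, and every function name is hypothetical. The LSGAN generator loss uses the usual least-squares form with 1 as the "real" label.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Vector) -> list:
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def patch_nce_loss(query: Vector, positive: Vector,
                   negatives: Sequence[Vector], temperature: float = 0.07) -> float:
    """InfoNCE for one patch: cross-entropy where the positive is the target."""
    q = normalize(query)
    logits = [dot(q, normalize(positive)) / temperature]
    logits += [dot(q, normalize(n)) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


def lsgan_generator_loss(disc_scores_on_fake: Sequence[float]) -> float:
    """Least-squares GAN generator loss: push D(G(x)) towards the label 1."""
    return sum((s - 1.0) ** 2 for s in disc_scores_on_fake) / len(disc_scores_on_fake)


def total_loss(l_adv: float, l_pc: float, l_scp: float, l_ide: float,
               lambdas: Sequence[float] = (1.0, 1.0, 1.0, 1.0)) -> float:
    """L_Total = lambda1 L_adv + lambda2 L_PC + lambda3 L_SCP + lambda4 L_ide."""
    lam1, lam2, lam3, lam4 = lambdas  # placeholder weights, not the paper's
    return lam1 * l_adv + lam2 * l_pc + lam3 * l_scp + lam4 * l_ide


# Toy values (hypothetical): one query patch, its positive, and two negatives.
l_pc = patch_nce_loss([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
l_adv = lsgan_generator_loss([0.8, 0.6, 0.9])
print(total_loss(l_adv, l_pc, l_scp=0.3, l_ide=0.05))
```

Keeping each term as a separate function makes it easy to drop one of them, which is how an ablation over the loss components would typically be run.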
Experimental Results
Research Questions
- RQ1: Can unsupervised contrastive learning with unpaired real-world hazy and clean images bridge the gap between synthetic and real-world haze?
- RQ2: Does incorporating negative (hazy) samples in contrastive learning improve dehazing performance compared to using only positive (clean) samples?
- RQ3: Can a pixel-wise self-contrastive perceptual loss enhance restoration quality when combined with patch-wise contrastive learning and adversarial training?
- RQ4: How does the proposed framework perform against state-of-the-art dehazing methods on real-world and synthetic datasets?
- RQ5: Is the method robust with limited unpaired real-world data (e.g., 1,800 images)?
Key Findings
- UCL-Dehaze achieves the highest PSNR/SSIM on SOTS outdoor and HSTS among 18 methods.
- It also attains the best (or near-best) color fidelity and contrast measures (e.g., CIEDE2000, Contrast gain) compared to baselines.
- Qualitative results show clearer, more natural dehazed images with preserved details on real-world hazy images.
- The method operates in an unsupervised fashion using 1,800 unpaired real-world images for training, without paired data.
- Ablation and comparative results indicate the effectiveness of integrating self-contrastive perceptual loss and adversarial training.
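For readers unfamiliar with the full-reference metrics mentioned in these findings, PSNR can be computed as below. This is a generic sketch of the standard definition, not the evaluation script used in the paper: images are assumed to be flattened lists of intensities in [0, 1], and the function name and `max_value` parameter are illustrative.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], restored: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example (hypothetical pixel values): a clean image and a dehazed estimate.
clean = [0.10, 0.50, 0.90, 0.30]
dehazed = [0.12, 0.47, 0.88, 0.33]
print(f"PSNR: {psnr(clean, dehazed):.2f} dB")
```

Higher PSNR means the dehazed output is closer to the ground-truth clean image, which is why it requires paired test data even when training is unpaired.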
This review was generated by AI and checked by a human editor.