QUICK REVIEW

[Paper Review] SPECIAL: Zero-shot Hyperspectral Image Classification With CLIP

Pang Li, Jing Yao|ArXiv.org|Jan 27, 2025
Remote-Sensing Image Classification | Cited by 9
One-Sentence Summary

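SPECIAL is a CLIP-based framework for zero-shot hyperspectral image classification. It combines CLIP-based pseudo-label generation with a resolution scaling strategy, then uses spectral information and GMM-based label refinement to reduce label noise.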

ABSTRACT

Hyperspectral image (HSI) classification aims to categorize each pixel in an HSI into a specific land cover class, which is crucial for applications such as remote sensing, environmental monitoring, and agriculture. Although deep learning-based HSI classification methods have achieved significant advancements, existing methods still rely on manually labeled data for training, which is both time-consuming and labor-intensive. To address this limitation, we introduce a novel zero-shot hyperspectral image classification framework based on CLIP (SPECIAL), aiming to eliminate the need for manual annotations. The SPECIAL framework consists of two main stages: (1) CLIP-based pseudo-label generation, and (2) noisy label learning. In the first stage, HSI is spectrally interpolated to produce RGB bands. These bands are subsequently classified using CLIP, resulting in noisy pseudo-labels that are accompanied by confidence scores. To improve the quality of these labels, we propose a scaling strategy that fuses predictions from multiple spatial scales. In the second stage, spectral information and a label refinement technique are incorporated to mitigate label noise and further enhance classification accuracy. Experimental results on three benchmark datasets demonstrate that our SPECIAL outperforms existing methods in zero-shot HSI classification, showing its potential for more practical applications. The code is available at https://github.com/LiPang/SPECIAL.
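The first stage described in the abstract starts by spectrally interpolating the HSI to RGB bands. A minimal sketch of one way to do that is shown below, assuming linear interpolation between the two nearest bands. The function name `hsi_to_rgb`, the target wavelengths (650, 550, 450 nm), and the global min-max scaling are illustrative assumptions, not details taken from the paper or its repository.

```python
import numpy as np

def hsi_to_rgb(cube, wavelengths, rgb_nm=(650.0, 550.0, 450.0)):
    """Linearly interpolate an (H, W, B) cube at three target wavelengths.

    `wavelengths` must be sorted in ascending order (one entry per band).
    The default R, G, B centers are assumed values for illustration only.
    """
    cube = np.asarray(cube, dtype=np.float64)
    wavelengths = np.asarray(wavelengths, dtype=np.float64)
    channels = []
    for target in rgb_nm:
        # Index of the first band at or above the target, kept inside the valid range.
        hi = int(np.clip(np.searchsorted(wavelengths, target), 1, len(wavelengths) - 1))
        lo = hi - 1
        # Interpolation weight between the two neighboring bands.
        t = (target - wavelengths[lo]) / (wavelengths[hi] - wavelengths[lo])
        t = float(np.clip(t, 0.0, 1.0))
        channels.append((1.0 - t) * cube[..., lo] + t * cube[..., hi])
    rgb = np.stack(channels, axis=-1)
    # Scale to [0, 1] so the result can be passed to an image model.
    return (rgb - rgb.min()) / (rgb.max() - rgb.min() + 1e-12)

# Toy cube: 31 bands from 400 to 700 nm, each band's value equals its wavelength.
wavelengths = np.linspace(400.0, 700.0, 31)
cube = np.broadcast_to(wavelengths, (2, 3, 31))
rgb = hsi_to_rgb(cube, wavelengths)
```

The resulting three-channel image is what would then be tiled and passed to CLIP; that step is omitted here because it depends on the specific CLIP implementation used.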

Motivation and Objectives

  • Motivate zero-shot hyperspectral image classification to reduce reliance on manually annotated data.
  • Leverage CLIP to generate pseudo-labels for HSIs by simulating RGB bands from spectral data.
  • Incorporate spectral information and robust label refinement to mitigate pseudo-label noise and improve accuracy.
  • Demonstrate superiority over existing CLIP-based methods on multiple hyperspectral datasets.

Proposed Method

  • Spectrally interpolate HSI to RGB bands to enable CLIP-based classification and obtain pseudo-labels with confidence scores.
  • Apply a resolution scaling (RS) strategy to fuse predictions from multiple image scales for better handling of objects at different sizes.
  • Use a hyperspectral classifier (MambaHSI) for spectral-based training in a warmup phase with class-balanced sampling guided by CLIP priors.
  • Partition predicted samples into confident and hard sets via Gaussian Mixture Models and BvSB-based scoring, then generate soft labels with class-specific GMMs.
  • Optimize a composite loss combining cross-entropy on random, confident, and hard sets with soft labels to refine training.
  • Incorporate PCA-reduced spectral features into GMM-based soft-label estimation to refine pseudo-labels and training data.
  • In ablation studies, train on the full HSI instead of the interpolated RGB bands to show that spectral information improves performance.
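The confident/hard partition in the list above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it computes the BvSB (Best-versus-Second-Best) margin from class probabilities, fits a two-component 1-D Gaussian mixture to the margins with a hand-written EM loop, and treats samples assigned to the higher-mean component as confident. The helper names, the 0.5 posterior threshold, and the synthetic Dirichlet data are all hypothetical.

```python
import numpy as np

def bvsb_margin(probs):
    """Best-versus-Second-Best margin: top-1 minus top-2 probability per sample."""
    ordered = np.sort(probs, axis=1)
    return ordered[:, -1] - ordered[:, -2]

def posterior(x, mu, var, pi):
    """Posterior probability of each mixture component for every value in x."""
    log_p = (-0.5 * (x[:, None] - mu) ** 2 / var
             - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(log_p)
    return resp / resp.sum(axis=1, keepdims=True)

def fit_gmm_1d(x, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    x = np.asarray(x, dtype=np.float64)
    mu = np.array([x.min(), x.max()])      # start the means at the extremes
    var = np.full(2, x.var() + 1e-6)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        resp = posterior(x, mu, var, pi)   # E-step
        nk = resp.sum(axis=0) + 1e-12      # M-step
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / nk.sum()
    return mu, var, pi

def split_confident_hard(probs):
    """Return a boolean mask (True = confident) and the BvSB margins."""
    margin = bvsb_margin(probs)
    mu, var, pi = fit_gmm_1d(margin)
    high = int(np.argmax(mu))              # component with the larger mean margin
    confident = posterior(margin, mu, var, pi)[:, high] > 0.5
    return confident, margin

# Synthetic predictions: 200 peaked (easy) and 200 near-uniform (ambiguous) samples.
rng = np.random.default_rng(0)
easy = rng.dirichlet([20, 1, 1], size=200)
ambiguous = rng.dirichlet([10, 10, 10], size=200)
probs = np.vstack([easy, ambiguous])
confident, margin = split_confident_hard(probs)
```

In the method summarized above, soft labels are additionally produced by class-specific GMMs on PCA-reduced spectral features; that step is not shown here.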

Experimental Results

Research Questions

  • RQ1: Can CLIP-based pseudo-labels be effectively used for zero-shot HSI classification without manual annotations?
  • RQ2: Does incorporating spectral information and label refinement improve zero-shot HSI accuracy beyond CLIP-based pseudo-labels alone?
  • RQ3: What is the impact of resolution scaling on handling objects of varying sizes in CLIP-based HSI classification?
  • RQ4: How does the proposed label refinement strategy mitigate noise in pseudo-labels and improve training stability?

Key Findings

  • SPECIAL outperforms existing CLIP-based zero-shot approaches on three publicly available HSI datasets.
  • Incorporating spectral information via HSIs yields better performance than RGB-only training.
  • The resolution scaling strategy consistently improves recognition of objects at different scales, enhancing pseudo-label quality.
  • The label refinement stage using GMMs and soft labels reduces the impact of noisy pseudo-labels and boosts overall accuracy.
  • Ablation studies show that combining random, confident, and hard subsets with soft labels yields the best performance across datasets.
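To make the resolution scaling finding more concrete, the sketch below fuses class-probability maps predicted at two spatial scales into one pseudo-label map with confidence scores. It assumes the per-scale maps already exist (for example, from CLIP applied to patches of different sizes), that coarse maps are aligned to the fine grid by nearest-neighbor upsampling, and that fusion is a plain average. The summary above does not specify the fusion rule, so these choices and the function names are illustrative only.

```python
import numpy as np

def upsample_nearest(prob_map, factor):
    """Repeat each cell `factor` times along height and width."""
    return np.repeat(np.repeat(prob_map, factor, axis=0), factor, axis=1)

def fuse_scales(prob_maps, factors):
    """Average per-scale (H, W, C) probability maps on a common grid.

    Returns the pseudo-label map and the per-pixel confidence
    (the fused probability of the winning class).
    """
    aligned = [upsample_nearest(p, f) for p, f in zip(prob_maps, factors)]
    fused = np.mean(aligned, axis=0)
    return fused.argmax(axis=-1), fused.max(axis=-1)

# Toy example with two classes: a 4x4 fine map and a 2x2 coarse map.
fine = np.tile(np.array([0.4, 0.6]), (4, 4, 1))
coarse = np.tile(np.array([0.3, 0.7]), (2, 2, 1))
coarse[0, 0] = [0.9, 0.1]  # the coarse scale strongly prefers class 0 in the top-left cell
labels, confidence = fuse_scales([fine, coarse], factors=[1, 2])
```

In this toy case the coarse-scale prediction overrides the weaker fine-scale vote in the top-left block, which is the kind of cross-scale correction the strategy is meant to provide.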


This review was generated by AI and reviewed by human editors.