QUICK REVIEW

[Paper Digest] Unbalanced Optimal Transport Dictionary Learning for Unsupervised Hyperspectral Image Clustering

Joshua Lentz, Nicholas Karris|arXiv (Cornell University)|Mar 10, 2026

Remote-Sensing Image Classification | Cited by 0

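One-Sentence Summary

The paper proposes Unbalanced Optimal Transport Dictionary Learning (UBOT-DL) for unsupervised hyperspectral image clustering: it learns a basis representation built on unbalanced Wasserstein barycenters, then applies spectral clustering to the learned weights.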

ABSTRACT

Hyperspectral images capture vast amounts of high-dimensional spectral information about a scene, making labeling an intensive task that is resistant to out-of-the-box statistical methods. Unsupervised learning of clusters allows for automated segmentation of the scene, enabling a more rapid understanding of the image. Partitioning the spectral information contained within the data via dictionary learning in Wasserstein space has proven an effective method for unsupervised clustering. However, this approach requires balancing the spectral profiles of the data, blurring the classes, and sacrificing robustness to outliers and noise. In this paper, we suggest improving this approach by utilizing unbalanced Wasserstein barycenters to learn a lower-dimensional representation of the underlying data. The deployment of spectral clustering on the learned representation results in an effective approach for the unsupervised learning of labels.

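Motivation and Objectives

Enable unsupervised labeling of high-dimensional hyperspectral data when labeled samples are scarce or unavailable.
Represent each pixel as a distribution over spectral bands and learn a barycentric dictionary that reconstructs the pixels.
Improve robustness to outliers and noise by using unbalanced Wasserstein barycenters instead of normalized (balanced) distributions.
Run spectral clustering on the learned barycentric weights to obtain pixel labels.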

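Proposed Method

Model pixels as distributions over spectral bands and seek a dictionary D and a weight matrix Λ such that P(D,Λ) approximates the pixels via unbalanced Wasserstein barycenters.
Use entropic regularization to obtain a Sinkhorn-like algorithm that efficiently computes unbalanced optimal transport (UOT) with a marginal-relaxation term (τ) and an entropy term (ε).
Minimize a loss L between the barycentric reconstruction P(D,Λ) and the original data X (a quadratic loss is used in the experiments).
Update D and Λ iteratively by gradient-based optimization, enforcing nonnegativity / a lower bound on D and a softmax normalization on Λ.
Cluster in two stages: first learn Λ with UBOT-DL, then apply spectral clustering to Λ (nearest-neighbor graph and Laplacian eigenmaps); the resulting clusters are then matched against the ground-truth labels with the Hungarian algorithm.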
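The Sinkhorn-like step for unbalanced OT can be sketched for the simplest case, a single pair of histograms. This is a minimal illustration and not the paper's barycenter solver. It assumes the common formulation with a negative-entropy regularizer weighted by ε and KL penalties on both marginals weighted by τ. The names `unbalanced_sinkhorn` and `marginal_gap`, the toy Gaussian "spectra", the squared-distance cost, and the τ and ε values are all hypothetical choices made for the example.

```python
import numpy as np

def unbalanced_sinkhorn(a, b, C, eps, tau, n_iter=500):
    """Entropic unbalanced OT plan between nonnegative histograms a and b.

    Assumed objective: <C, P> + eps * sum(P * (log P - 1))
                       + tau * KL(P 1 | a) + tau * KL(P^T 1 | b).
    """
    K = np.exp(-C / eps)           # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    power = tau / (tau + eps)      # < 1, so marginals are only softly enforced
    for _ in range(n_iter):
        u = (a / (K @ v)) ** power
        v = (b / (K.T @ u)) ** power
    return u[:, None] * K * v[None, :]

# Toy data (illustrative only): two bumps over 50 "bands", b has twice the mass.
x = np.linspace(0.0, 1.0, 50)
C = (x[:, None] - x[None, :]) ** 2
a = np.exp(-((x - 0.3) ** 2) / 0.01)
a = a / a.sum()
b = np.exp(-((x - 0.7) ** 2) / 0.01)
b = 2.0 * b / b.sum()

def marginal_gap(plan):
    """L1 distance between the plan's marginals and the inputs (a, b)."""
    return np.abs(plan.sum(axis=1) - a).sum() + np.abs(plan.sum(axis=0) - b).sum()

plan_soft = unbalanced_sinkhorn(a, b, C, eps=0.05, tau=0.001)  # weak marginal penalty
plan_hard = unbalanced_sinkhorn(a, b, C, eps=0.05, tau=100.0)  # close to balanced OT
print("gap with small tau:", marginal_gap(plan_soft))
print("gap with large tau:", marginal_gap(plan_hard))
```

The only change from balanced Sinkhorn is the exponent τ/(τ+ε) on each scaling update. As τ grows the exponent approaches 1 and the marginals are enforced more strictly, while a small τ lets the plan create or destroy mass. Note that the inputs need not have equal total mass.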

Fig. 1: Pictured here is the balanced (left) and unbalanced (right) barycentric interpolation between two Gaussian distributions with the same variance using $\tau=0.5$ and $\epsilon=0.001$. Notice that the unbalanced barycenters do not obey strict mass conservation, but still take the general sha

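Experimental Results

Research Questions

RQ1: How can unbalanced optimal transport be used to learn compact, robust representations of hyperspectral data without heavy normalization?
RQ2: Does spectral clustering on the learned barycentric weights yield accurate unsupervised labels on standard data sets?
RQ3: How do the hyperparameters (τ, ε, number of atoms k, NN) affect clustering accuracy and purity across data sets?
RQ4: How does UBOT-DL compare with balanced Wasserstein dictionary learning in accuracy and robustness to outliers?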

Key Findings

Data set	Accuracy	τ	ε	Atoms	NN
Salinas A	0.89	1000	0.1	24	25
Salinas A	0.86	10000	0.05	30	20
Pavia Centre	0.84	1000	0.07	27	15
Pavia Centre	0.82	100	0.07	27	5
Pavia U	0.63	10000	0.07	18	5
Pavia U	0.58	100000	0.1	36	5
Indian Pines	0.34	1000	0.06	32	5
Indian Pines	0.34	100000	0.06	48	5

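Across hyperparameter settings, UBOT-DL achieves competitive accuracy on standard hyperspectral data sets (Salinas A, Pavia Centre, Pavia University, Indian Pines).
Best-case accuracies are about 0.89 on Salinas A (τ=1000, ε=0.1, k=24, NN=25), about 0.84 on Pavia Centre (τ=1000, ε=0.07, k=27, NN=15), about 0.63 on Pavia U (τ=10000, ε=0.07, k=18, NN=5), and about 0.34 on Indian Pines (τ=1000, ε=0.06, k=32, NN=5).
Purity improves when the clustering uses slightly more clusters than there are ground-truth classes (e.g., purity 0.92 on Salinas A with c=7, τ=1000, ε=0.1, k=60, NN=45).
Under identical hyperparameters, UBOT-DL generally outperforms balanced Wasserstein dictionary learning (BCSC) (e.g., Salinas A: 0.89 vs. 0.86).
Because of the unbalanced OT computation, UBOT-DL is generally slower than BCSC; for example, the best-case Salinas A run takes about 226 seconds (k=24, τ=1000, ε=0.1, 500 iterations).
By learning the weights Λ, the method reduces the dimensionality of the data, which makes the downstream spectral clustering effective.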
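The two scores reported above, accuracy after Hungarian matching and purity, can be sketched as follows. This is an illustration under assumptions, not the paper's evaluation code: the function names and the toy labels are hypothetical, and accuracy is taken to be the fraction of pixels whose cluster maps to their true class under the best one-to-one matching. The matching uses SciPy's `linear_sum_assignment`.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def contingency(y_true, y_pred):
    """Count matrix: rows are predicted clusters, columns are true classes."""
    _, ci = np.unique(np.asarray(y_pred), return_inverse=True)
    _, ti = np.unique(np.asarray(y_true), return_inverse=True)
    M = np.zeros((ci.max() + 1, ti.max() + 1), dtype=int)
    np.add.at(M, (ci, ti), 1)
    return M

def hungarian_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one cluster-to-class matching."""
    M = contingency(y_true, y_pred)
    rows, cols = linear_sum_assignment(M, maximize=True)
    return M[rows, cols].sum() / len(y_true)

def purity(y_true, y_pred):
    """Each cluster votes for its majority class; clusters may share a class."""
    M = contingency(y_true, y_pred)
    return M.max(axis=1).sum() / len(y_true)

# Toy labels (illustrative only): 3 true classes, 9 "pixels".
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [1, 1, 0, 0, 0, 0, 2, 2, 2]        # 3 clusters, one pixel misplaced
y_pred_over = [1, 1, 3, 0, 0, 0, 2, 2, 2]   # 4 clusters: the stray pixel gets its own

acc = hungarian_accuracy(y_true, y_pred)
pur = purity(y_true, y_pred)
acc_over = hungarian_accuracy(y_true, y_pred_over)
pur_over = purity(y_true, y_pred_over)
print(acc, pur, acc_over, pur_over)
```

In the over-clustered toy case the extra cluster is pure on its own, so purity rises to 1.0 while the matched accuracy stays at 8/9. The same mechanism can raise purity whenever more clusters than classes are allowed.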

Fig. 2: This image shows the in-painting process of a trial run of the Salinas A data set with $24$ atoms, $\tau=1000$, and $\epsilon=0.1$ achieving an accuracy before in-painting of $89\%$. We note that the bottom right corner of the image accounts for the majority of the mislabeling, and that t

This digest was generated by AI and reviewed by a human editor.