QUICK REVIEW

[Paper Review] Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning

Tsung-Wei Ke, Jyh-Jing Hwang|arXiv (Cornell University)|May 3, 2021

Domain Adaptation and Few-Shot Learning | References: 53 | Cited by: 36

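One-Sentence Summary

The paper formulates weakly supervised segmentation as semi-supervised pixel-wise metric learning and introduces four pixel-to-segment contrastive relationships to learn universal features from partial annotations, achieving significant gains on Pascal VOC and DensePose.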

ABSTRACT

Weakly supervised segmentation requires assigning a label to every pixel based on training instances with partial annotations such as image-level tags, object bounding boxes, labeled points and scribbles. This task is challenging, as coarse annotations (tags, boxes) lack precise pixel localization whereas sparse annotations (points, scribbles) lack broad region coverage. Existing methods tackle these two types of weak supervision differently: Class activation maps are used to localize coarse labels and iteratively refine the segmentation model, whereas conditional random fields are used to propagate sparse labels to the entire image. We formulate weakly supervised segmentation as a semi-supervised metric learning problem, where pixels of the same (different) semantics need to be mapped to the same (distinctive) features. We propose 4 types of contrastive relationships between pixels and segments in the feature space, capturing low-level image similarity, semantic annotation, co-occurrence, and feature affinity. They act as priors; the pixel-wise feature can be learned from training images with any partial annotations in a data-driven fashion. In particular, unlabeled pixels in training images participate not only in data-driven grouping within each image, but also in discriminative feature learning within and across images. We deliver a universal weakly supervised segmenter with significant gains on Pascal VOC and DensePose. Our code is publicly available at https://github.com/twke18/SPML.

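Motivation and Objectives

Learn a semantic segmenter from training images with partial annotations (image tags, bounding boxes, points, scribbles).
Use a unified contrastive learning framework to propagate and refine pixel-level semantics effectively with unlabeled data.
Extend SegSort, a discriminative, non-parametric method, to weak supervision.
Demonstrate consistent improvements over the state of the art across annotation types on Pascal VOC and DensePose.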

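Proposed Method

Frame weakly supervised segmentation as semi-supervised pixel-wise metric learning.
Propose four pixel-to-segment contrastive relationships: low-level image similarity, semantic annotation, semantic co-occurrence, and feature affinity.
Use these relationships to define positive and negative segment sets for each pixel, extending supervision beyond the labeled pixels.
Optimize a unified pixel-wise contrastive loss L(i) that aggregates the four terms with weights λI, λC, λO, and λA.
Exploit unlabeled pixels and segments during training to learn a discriminative feature structure across images.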
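The weighted aggregation described above can be illustrated with a minimal NumPy sketch. It assumes a softmax-style pixel-to-segment contrastive term in the spirit of SegSort; the function names, the concentration value `kappa`, the relation keys "I", "C", "O", "A", and the choice to return zero when a pixel has no positive segments are all illustrative assumptions, not details taken from the paper or its released code.

```python
import numpy as np

def pixel_to_segment_loss(pixel, prototypes, pos_mask, neg_mask, kappa=10.0):
    """Contrastive loss of one pixel against positive and negative segments.

    pixel:      (D,) unit-norm pixel embedding.
    prototypes: (S, D) unit-norm segment prototype embeddings.
    pos_mask / neg_mask: length-S boolean selections of segments.
    kappa:      concentration (inverse temperature); the value is an assumption.
    """
    pos_mask = np.asarray(pos_mask, dtype=bool)
    neg_mask = np.asarray(neg_mask, dtype=bool)
    if not pos_mask.any():
        return 0.0  # assumption: a pixel without positives contributes nothing
    used = pos_mask | neg_mask
    logits = kappa * (prototypes @ pixel)
    shift = logits[used].max()  # subtract the max for numerical stability
    pos = np.exp(logits[pos_mask] - shift).sum()
    total = np.exp(logits[used] - shift).sum()
    return float(-np.log(pos / total))

def total_pixel_loss(pixel, prototypes, relations, weights, kappa=10.0):
    """Weighted sum of the per-relation terms, mirroring L(i) in the text.

    relations: dict mapping a relation key to a (pos_mask, neg_mask) pair.
    weights:   dict mapping the same keys to lambda weights (hypothetical values).
    """
    return sum(
        weights[name] * pixel_to_segment_loss(pixel, prototypes, pos, neg, kappa)
        for name, (pos, neg) in relations.items()
    )

if __name__ == "__main__":
    prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])  # two toy segments
    pixel = np.array([1.0, 0.0])                     # aligned with segment 0
    masks = ([True, False], [False, True])           # segment 0 positive, 1 negative
    relations = {key: masks for key in ("I", "C", "O", "A")}
    weights = {"I": 0.1, "C": 0.4, "O": 0.2, "A": 0.3}  # made-up lambdas
    print(total_pixel_loss(pixel, prototypes, relations, weights))
```

In this sketch each relation only changes which segments count as positive or negative for a pixel; the loss form itself is shared, which is one plausible reading of how a single framework can cover tags, boxes, points, and scribbles.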

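Experimental Results

Research Questions

RQ1: Can a pixel-to-segment contrastive framework handle all forms of weak supervision in semantic segmentation (tags, boxes, points, scribbles)?
RQ2: Do unlabeled pixels and segments make a meaningful contribution to learning through the multi-relationship priors?
RQ3: How does the proposed SPML compare with the state of the art under different weak supervision settings?
RQ4: Is the learned feature space discriminative enough, within and across images, to support accurate segmentation?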

Key Findings

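With image tags on Pascal VOC, SPML reaches state-of-the-art results or significant gains (+4.4% and +5.1% in the saliency-related comparisons); with bounding-box annotations it gains +3.2%.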
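With scribble supervision on Pascal VOC, SPML reaches 74.2% mIoU on the validation set and 76.1% on the test set, which is 97.5% and 98.4% of full supervision, respectively.
With point supervision on DensePose, SPML reaches 77.1% WvF and 44.2% mIoU, a 12.9% mIoU gain over the previous baseline.
SPML is robust to annotation sparsity: as supervision becomes sparser (for example, from scribbles to points), it retains a high share of fully supervised performance.
Qualitative results show better alignment with region boundaries and, in terms of visual similarity, an advantage over fully supervised methods; the gains grow as more regularizing relationships are added.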
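As a back-of-the-envelope check on the scribble numbers above, the reported mIoU and its share of full supervision imply an approximate fully supervised score. The helper below is a hypothetical illustration; the implied values are derived from rounded figures in this review and are not numbers reported by the paper.

```python
def implied_full_supervision(weak_miou, share_percent):
    """Fully supervised mIoU implied by a weak mIoU and its share of full supervision."""
    return weak_miou / (share_percent / 100.0)

# Scribble supervision on Pascal VOC, using the figures quoted in this review.
val_full = implied_full_supervision(74.2, 97.5)   # about 76.1
test_full = implied_full_supervision(76.1, 98.4)  # about 77.3
print(f"implied full supervision: val {val_full:.1f}, test {test_full:.1f}")
```

Because both inputs are rounded to one decimal place, the implied values are only accurate to roughly the same precision.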

This review was generated by AI and reviewed by human editors.