QUICK REVIEW

[论文解读] SSAP: Single-Shot Instance Segmentation With Affinity Pyramid

Naiyu Gao, Yanhu Shan|arXiv (Cornell University)|Sep 4, 2019

Advanced Image and Video Retrieval Techniques参考文献 50被引用 30

一句话总结

该论文提出SSAP，一种单次推理、无需候选框的实例分割方法，通过层次化的像素对亲和力金字塔，在一次前向传播中联合学习语义类别标注与实例感知特征学习。通过将级联图分割模块与多尺度亲和力结合，该方法在先前方法的基础上实现了5倍加速和9%的相对AP提升，在Cityscapes数据集上达到了36.9% PQ的新SOTA性能。

ABSTRACT

Recently, proposal-free instance segmentation has received increasing attention due to its concise and efficient pipeline. Generally, proposal-free methods generate instance-agnostic semantic segmentation labels and instance-aware features to group pixels into different object instances. However, previous methods mostly employ separate modules for these two sub-tasks and require multiple passes for inference. We argue that treating these two sub-tasks separately is suboptimal. In fact, employing multiple separate modules significantly reduces the potential for application. The mutual benefits between the two complementary sub-tasks are also unexplored. To this end, this work proposes a single-shot proposal-free instance segmentation method that requires only one single pass for prediction. Our method is based on a pixel-pair affinity pyramid, which computes the probability that two pixels belong to the same instance in a hierarchical manner. The affinity pyramid can also be jointly learned with the semantic class labeling and achieve mutual benefits. Moreover, incorporating with the learned affinity pyramid, a novel cascaded graph partition module is presented to sequentially generate instances from coarse to fine. Unlike previous time-consuming graph partition methods, this module achieves $5 imes$ speedup and 9% relative improvement on Average-Precision (AP). Our approach achieves state-of-the-art results on the challenging Cityscapes dataset.

研究动机与目标

为解决现有无候选框实例分割方法中语义标注与实例分组模块分离所导致的效率低下与性能不足问题。
实现语义类别预测与实例感知亲和力计算的联合学习，以充分利用两项任务之间的相互增益。
通过设计一种级联图分割模块，利用从低到高分辨率特征图的层次亲和力，加速推理过程。
在不依赖区域候选框的前提下，在Cityscapes等挑战性基准上实现最先进性能。

提出的方法

提出一种像素对亲和力金字塔，通过同时利用短程与长程亲和力，计算两个像素属于同一实例的概率。
使用密集的小感受野提取短程亲和力，利用稀疏、低分辨率特征图提取长程亲和力，且在多尺度U-Net特征上解耦处理。
将亲和力金字塔与语义分割预测集成于统一的单主干网络中，实现端到端的联合优化。
引入一种级联图分割模块，从亲和力构建图结构，并在粗到细的分辨率层级上逐步优化实例预测。
利用低层特征的高分辨率预测结果，减少高层分辨率图分割中的节点数量，从而实现5倍加速。
采用分层的自底向上图分割策略，利用低分辨率结果的置信度指导并加速高分辨率分割过程。

实验结果

研究问题

RQ1联合学习语义分割与实例感知亲和力计算是否能提升无候选框实例分割的性能？
RQ2层次化的亲和力金字塔是否能有效建模像素间局部与长程关系，以支持实例分组？
RQ3在不同分辨率层级上采用级联图分割是否能显著降低推理时间，同时保持或提升精度？
RQ4单次、端到端可训练的框架是否能在基准数据集上超越多阶段、基于候选框或多次推理的无候选框方法？

主要发现

所提方法在Cityscapes验证集上达到36.9% PQ，成为无候选框方法中的新SOTA。
语义标注与亲和力预测的联合学习带来相互增益，相比非联合基线方法，平均精度（AP）相对提升9%。
级联图分割模块相比非级联版本实现5倍推理加速，同时AP相对提升9%。
在COCO测试开发集上，该方法对汽车实例的AP达到32.7%，PQ达到55.9%，优于此前的无候选框方法（如DeeperLab）。
可视化结果表明，该方法能精确分组被遮挡或碎片化的物体，如被行人或灯柱部分遮挡的汽车。
该方法在城市场景之外也具备良好泛化能力，在COCO数据集上表现强劲，尽管未使用候选框生成机制。

更好的研究，从现在开始

从论文设计到论文写作，大幅缩短您的研究时间。

无需绑定信用卡

本解读由 AI 生成，并经人工编辑审核。