QUICK REVIEW

[Paper Review] Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs

Wei Shen, Kai Zhao|arXiv (Cornell University)|Mar 31, 2016

Advanced Neural Network Applications | References: 31 | Cited by: 23

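One-Sentence Summary

This paper proposes a fully convolutional network with scale-associated side outputs for extracting object skeletons from natural images. Stage-specific supervision ties each network stage to a particular range of skeleton scales, which enables multi-scale feature learning. The scale-specific responses of the different stages are then fused, and the method reports state-of-the-art results on two benchmark datasets, with F-measures of 0.692 and 0.529 on the SK506/WH-SYMMAX and WH-SYMMAX/SK506 settings, respectively.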

ABSTRACT

Object skeleton is a useful cue for object detection, complementary to the object contour, as it provides a structural representation to describe the relationship among object parts. While object skeleton extraction in natural images is a very challenging problem, as it requires the extractor to be able to capture both local and global image context to determine the intrinsic scale of each skeleton pixel. Existing methods rely on per-pixel based multi-scale feature computation, which results in difficult modeling and high time consumption. In this paper, we present a fully convolutional network with multiple scale-associated side outputs to address this problem. By observing the relationship between the receptive field sizes of the sequential stages in the network and the skeleton scales they can capture, we introduce a scale-associated side output to each stage. We impose supervision to different stages by guiding the scale-associated side outputs toward groundtruth skeletons of different scales. The responses of the multiple scale-associated side outputs are then fused in a scale-specific way to localize skeleton pixels with multiple scales effectively. Our method achieves promising results on two skeleton extraction datasets, and significantly outperforms other competitors.

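Motivation and Objectives

Address the challenge of extracting object skeletons from complex natural images, where the scale and structure of object parts vary widely.
Overcome the limitations of existing methods, whose reliance on per-pixel multi-scale features leads to high computational cost and poor generalization.
Model both local and global context through scale-aware feature learning in a fully convolutional architecture, so that skeletons can be extracted accurately.
Use the extracted skeletons to improve downstream tasks such as symmetric part segmentation and object proposal detection.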

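Proposed Method

Design a fully convolutional network (FCN) with multiple scale-associated side outputs, one attached to each network stage.
Each side output is supervised by a ground-truth skeleton map for its quantized scales, which keeps only the skeleton pixels whose scale is smaller than that stage's receptive field size.
The receptive field size grows from stage to stage, so the network can capture skeletons of different intrinsic scales.
Each side output produces scale-specific score maps, which are fused in a scale-specific way to produce the final skeleton prediction.
A multi-task learning strategy optimizes each side output against the skeleton map of its own scales, which strengthens multi-scale feature learning.
The final skeleton map is obtained by fusing the responses of all side outputs, with each output contributing according to its associated scale range.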
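
The scale-associated supervision and scale-specific fusion described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the receptive-field sizes, the margin factor, the clamping of oversized scales, and the names `quantize_scale`, `stage_ground_truth`, and `fuse_pixel` are all assumptions introduced here, and the fusion weights (learned in a real network) are fixed by hand.

```python
# Assumed receptive-field sizes (pixels) of four supervised stages; illustrative only.
RECEPTIVE_FIELDS = [14, 40, 92, 196]
# Assumed margin factor so a stage's receptive field comfortably covers the skeleton scale.
MARGIN = 1.2


def quantize_scale(scale, receptive_fields=RECEPTIVE_FIELDS, margin=MARGIN):
    """Map a skeleton scale to a class: 0 = background, k = first stage that can see it."""
    if scale <= 0:
        return 0
    for k, rf in enumerate(receptive_fields, start=1):
        if rf > margin * scale:
            return k
    # Assumption: scales too large for every stage are clamped to the last stage.
    return len(receptive_fields)


def stage_ground_truth(scale_map, stage):
    """Ground truth for one stage: keep classes <= stage, relabel the rest as background."""
    result = []
    for row in scale_map:
        labels = [quantize_scale(s) for s in row]
        result.append([q if q <= stage else 0 for q in labels])
    return result


def fuse_pixel(stage_probs, weights):
    """Weighted per-class fusion at one pixel.

    stage_probs[i] holds the class probabilities from stage i + 1 (classes 0..i + 1).
    weights[k][i] is the hypothetical weight of stage i + 1 for class k.
    """
    n_classes = len(stage_probs) + 1
    fused = []
    for k in range(n_classes):
        total = 0.0
        for i, probs in enumerate(stage_probs):
            if k < len(probs):  # a stage only votes for the classes it predicts
                total += weights[k][i] * probs[k]
        fused.append(total)
    return fused


scale_map = [[0, 5, 20], [50, 100, 0]]  # toy per-pixel skeleton scales, 0 = non-skeleton
gt_stage2 = stage_ground_truth(scale_map, stage=2)
gt_stage4 = stage_ground_truth(scale_map, stage=4)
fused = fuse_pixel(
    [[0.7, 0.3], [0.5, 0.2, 0.3]],          # two stages, one pixel
    [[0.5, 0.5], [0.5, 0.5], [0.0, 1.0]],   # hand-picked weights per class and stage
)
```

In this toy setup the stage-2 ground truth keeps only the two small-scale pixels, while the stage-4 ground truth keeps all four skeleton pixels, which mirrors the idea that deeper stages with larger receptive fields are responsible for larger skeleton scales.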

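Experimental Results

Research Questions

RQ1: Can a fully convolutional network with scale-associated side outputs effectively model multi-scale skeleton features in natural images?
RQ2: Does supervising each network stage with scale-specific ground-truth skeleton maps improve the accuracy and robustness of skeleton extraction?
RQ3: Does the proposed method outperform existing learning-based and traditional methods in both speed and accuracy on benchmark datasets?
RQ4: To what extent can the extracted skeletons support downstream tasks such as symmetric part segmentation and object proposal detection?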

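Key Findings

On the SK506/WH-SYMMAX setting, the method achieves an F-measure of 0.692, clearly ahead of the second-best method, HED, at 0.637.
On the WH-SYMMAX/SK506 setting, the method achieves an F-measure of 0.529, ahead of HED (0.492) and MIL (0.387).
On the symmetric part segmentation task on the BSDS-Parts dataset, the method's precision-recall curve is better than those of Lee's method and Levinshtein's method, indicating stronger part localization.
Combining skeleton-derived part masks with Edge Boxes yields effective object proposal detection, with significantly improved IoU scores and proposal accuracy.
The network predicts a scale for each skeleton pixel, so object parts can be reliably reconstructed through disk-based expansion; this is validated by quantitative confidence scores.
Ablation experiments show that scale-specific supervision and multi-stage fusion are critical to performance, which drops markedly when the side outputs are removed.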
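
The disk-based reconstruction mentioned in the findings can be illustrated with a small pure-Python sketch: each skeleton pixel paints a filled disk, and the union of the disks approximates the part mask. The function name `reconstruct_mask` is hypothetical, and treating the predicted scale as a disk radius is an assumption made here; the summary does not state how scale maps to disk size.

```python
import math


def reconstruct_mask(height, width, skeleton):
    """Union of filled disks, one per skeleton pixel.

    skeleton: iterable of (row, col, radius) triples.
    Assumption: the predicted scale is used directly as the disk radius.
    """
    mask = [[0] * width for _ in range(height)]
    for r0, c0, radius in skeleton:
        # Restrict the scan to the disk's bounding box, clipped to the image.
        r_lo = max(0, math.floor(r0 - radius))
        r_hi = min(height - 1, math.ceil(r0 + radius))
        c_lo = max(0, math.floor(c0 - radius))
        c_hi = min(width - 1, math.ceil(c0 + radius))
        for r in range(r_lo, r_hi + 1):
            for c in range(c_lo, c_hi + 1):
                if (r - r0) ** 2 + (c - c0) ** 2 <= radius ** 2:
                    mask[r][c] = 1
    return mask


single = reconstruct_mask(5, 5, [(2, 2, 1)])             # one skeleton pixel
pair = reconstruct_mask(5, 5, [(2, 1, 1), (2, 3, 1)])    # two overlapping disks
```

For larger images, a vectorized disk rasterizer from an image-processing library would be the more practical choice; the nested loops here only keep the example dependency-free.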

This review was generated by AI and reviewed by human editors.