QUICK REVIEW

[Paper Review] Weakly Supervised Learning of Instance Segmentation with Inter-pixel Relations

Jiwoon Ahn, Sunghyun Cho|arXiv (Cornell University)|Apr 10, 2019
Advanced Neural Network Applications|References: 51|Citations: 68
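One-sentence summary

This paper proposes IRNet, which learns a class-agnostic instance map and inter-pixel affinities in order to generate pseudo instance segmentation labels from image-level supervision, so that a fully supervised model can be trained without additional proposals or annotations.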

ABSTRACT

This paper presents a novel approach for learning instance segmentation with image-level class labels as supervision. Our approach generates pseudo instance segmentation labels of training images, which are used to train a fully supervised model. For generating the pseudo labels, we first identify confident seed areas of object classes from attention maps of an image classification model, and propagate them to discover the entire instance areas with accurate boundaries. To this end, we propose IRNet, which estimates rough areas of individual instances and detects boundaries between different object classes. It thus enables to assign instance labels to the seeds and to propagate them within the boundaries so that the entire areas of instances can be estimated accurately. Furthermore, IRNet is trained with inter-pixel relations on the attention maps, thus no extra supervision is required. Our method with IRNet achieves an outstanding performance on the PASCAL VOC 2012 dataset, surpassing not only previous state-of-the-art trained with the same level of supervision, but also some of previous models relying on stronger supervision.

Motivation and objectives

  • Learn instance segmentation using only image-level class labels as supervision.
  • Develop a method to produce pseudo instance segmentation labels without external proposals or extra supervision.
  • Leverage class attention maps to derive inter-pixel relations for accurate instance delineation.
  • Enable training of standard segmentation models (e.g., Mask R-CNN) with pseudo labels.

Proposed method

  • Use Class Attention Maps (CAMs) from an image classifier to seed instance areas.
  • Introduce IRNet with two branches: (i) a displacement field predicting a centroid-directed vector for each pixel, (ii) a class boundary detector producing a boundary map.
  • Train IRNet using inter-pixel relations derived from CAMs: pixel displacements for same-instance pairs and class equivalence labels for neighboring pixel pairs.
  • Refine the displacement field iteratively to converge toward centroids and generate a class-agnostic instance map.
  • Compute pixel-wise semantic affinities from the boundary map and propagate CAM scores via random-walk-based propagation to form instance-aware CAMs.
  • Synthesize pseudo instance masks by combining instance maps with refined instance-wise CAMs and affinities, then train standard detectors/segmenters on these pseudo labels.
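
The two propagation ideas in the list above can be illustrated on a single row of pixels. The following is a minimal sketch, not the authors' implementation: the function names, the update rule D(x) <- D(x) + D0(x + D(x)), the affinity definition (one minus the strongest boundary response on the path between two pixels), and the values of radius, beta and steps are all assumptions made for illustration. Grouping pixels by the location they point to is also a simplification of how centroids would be detected in practice.

```python
# Toy 1-D sketch in pure Python. Function names, the radius/beta/steps
# values, and the grouping rule are illustrative assumptions.

def refine_displacement(disp0, iters=10):
    """Refine a displacement field so each pixel points closer to a centroid.

    Assumed update rule: D(x) <- D(x) + D0(x + D(x)).
    """
    n = len(disp0)
    disp = list(disp0)
    for _ in range(iters):
        refined = []
        for x in range(n):
            target = min(max(x + disp[x], 0), n - 1)  # stay inside the row
            refined.append(disp[x] + disp0[target])
        disp = refined
    return disp


def instance_map(disp):
    """Give the same instance id to all pixels that point to the same location."""
    ids = {}
    labels = []
    for x, d in enumerate(disp):
        labels.append(ids.setdefault(x + d, len(ids)))
    return labels


def transition_matrix(boundary, radius=2, beta=8):
    """Row-normalised affinities; a strong boundary between i and j blocks them."""
    n = len(boundary)
    rows = []
    for i in range(n):
        row = [0.0] * n
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            lo, hi = min(i, j), max(i, j)
            row[j] = (1.0 - max(boundary[lo:hi + 1])) ** beta
        total = sum(row)
        rows.append([v / total if total > 0 else 0.0 for v in row])
    return rows


def propagate(scores, trans, steps=16):
    """Random-walk propagation: repeatedly average scores over affine neighbours."""
    n = len(scores)
    for _ in range(steps):
        scores = [sum(trans[i][j] * scores[j] for j in range(n)) for i in range(n)]
    return scores


# Two instances on one row: pixels 0-6 point towards x=3, pixels 7-11 towards x=9.
# The initial field only gives a unit step, so several refinements are needed.
disp0 = [1, 1, 1, 0, -1, -1, -1, 1, 1, 0, -1, -1]
labels = instance_map(refine_displacement(disp0))
print(labels)  # [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# A CAM seed at pixel 0 and a strong boundary at pixel 3.
boundary = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
cam = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
propagated = propagate(cam, transition_matrix(boundary))
print(propagated)  # the score spreads over pixels 0-2 and never crosses pixel 3
```

In this toy setting the refined displacements separate the two instances even though the initial field only points one pixel at a time, and the seed score fills the region on its own side of the boundary while the other side stays at zero.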

Experimental results

Research questions

  • RQ1: Can image-level labels be leveraged to recover per-instance segmentation without external proposals?
  • RQ2: How can inter-pixel relations derived from CAMs be learned to produce reliable pseudo instance masks?
  • RQ3: Does incorporating class boundaries and displacement fields improve pseudo-label quality and downstream segmentation accuracy?
  • RQ4: How does the proposed approach compare to state-of-the-art weakly supervised methods on PASCAL VOC 2012?
  • RQ5: Can pseudo labels generated by IRNet yield competitive results when used to train Mask R-CNN and DeepLab under weak supervision?

Key findings

  • IRNet with CAMs and inter-pixel relations yields higher-quality pseudo instance labels than prior image-level supervision methods (e.g., CAM alone).
  • Incorporating class boundaries significantly improves pseudo-label quality, increasing AP^r_50 by over 25% in their ablation study.
  • Adding the displacement field helps distinguish multiple instances of the same class and further improves performance.
  • Mask R-CNN trained on the pseudo labels outperforms several methods that use stronger supervision on PASCAL VOC 2012.
  • Pseudo semantic segmentation labels produced by IRNet surpass AffinityNet in mIoU on PASCAL VOC 2012 train/val sets, indicating better pixel-level affinities.
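
For readers unfamiliar with the mIoU metric mentioned in the last finding, the sketch below shows the standard per-class intersection-over-union averaged over classes. It is a simplified illustration on flat label lists: the function name and the toy labels are assumptions, and benchmark evaluation on PASCAL VOC accumulates counts over the whole dataset rather than over a single image.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both label maps
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Hypothetical flattened label maps with three classes (0, 1, 2).
pred = [0, 0, 1, 1, 1, 2]
target = [0, 1, 1, 1, 2, 2]
miou = mean_iou(pred, target, num_classes=3)
print(miou)  # 0.5 (each class has IoU 0.5 here)
```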


This review was generated by AI and checked by a human editor.