QUICK REVIEW

[Paper Review] SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation

Xiaolin Zhang, Yunchao Wei | arXiv (Cornell University) | Oct 22, 2018

Domain Adaptation and Few-Shot Learning | References: 54 | Citations: 84

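One-Line Summary

SG-One introduces a unified network that uses masked average pooling to produce an object-centered guidance vector and a cosine similarity map, which together guide one-shot semantic segmentation of unseen classes. It achieves state-of-the-art mean IoU on PASCAL-5i.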

ABSTRACT

One-shot image semantic segmentation poses a challenging task of recognizing the object regions from unseen categories with only one annotated example as supervision. In this paper, we propose a simple yet effective Similarity Guidance network to tackle the One-shot (SG-One) segmentation problem. We aim at predicting the segmentation mask of a query image with the reference to one densely labeled support image of the same category. To obtain the robust representative feature of the support image, we firstly adopt a masked average pooling strategy for producing the guidance features by only taking the pixels belonging to the support image into account. We then leverage the cosine similarity to build the relationship between the guidance features and features of pixels from the query image. In this way, the possibilities embedded in the produced similarity maps can be adapted to guide the process of segmenting objects. Furthermore, our SG-One is a unified framework which can efficiently process both support and query images within one network and be learned in an end-to-end manner. We conduct extensive experiments on Pascal VOC 2012. In particular, our SG-One achieves the mIoU score of 46.3%, surpassing the baseline methods.

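Motivation and Objectives

Motivate one-shot semantic segmentation: segmenting unseen categories from a single annotated example.
Develop a robust representation of the support object without changing the network input.
Use pixel-wise cosine similarity to guide segmentation of the query image.
Integrate support and query processing into a single end-to-end trainable network.
Demonstrate improved performance over prior methods on PASCAL-5i.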

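Proposed Method

Extract high-level features from the support and query images with a shared stem network.
Obtain a robust object representation by masked average pooling over the support mask.
Compute the pixel-wise cosine similarity between the support representation and the query features to form a similarity guidance map.
Multiply the query features by the similarity guidance map to steer segmentation toward the target object.
Predict the final mask in an end-to-end framework, using a segmentation branch that takes the guidance map and the query features as input.
Train with a cross-entropy loss; one-shot testing requires no fine-tuning.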
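The pooling, similarity, and guidance steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the support mask is assumed to be already resized to the feature-map resolution, and random arrays stand in for the features that the shared stem network would produce.

```python
import numpy as np


def masked_average_pooling(features, mask, eps=1e-8):
    """Average a (C, H, W) feature map over the pixels where the (H, W) mask is 1."""
    mask = mask.astype(features.dtype)
    weighted_sum = (features * mask[None, :, :]).sum(axis=(1, 2))
    return weighted_sum / (mask.sum() + eps)  # shape (C,)


def cosine_similarity_map(query_features, guidance_vector, eps=1e-8):
    """Cosine similarity between the guidance vector and every query pixel."""
    # Dot product over the channel axis gives one value per pixel: shape (H, W).
    dot = np.tensordot(guidance_vector, query_features, axes=([0], [0]))
    norms = np.linalg.norm(query_features, axis=0) * np.linalg.norm(guidance_vector)
    return dot / (norms + eps)


def guide_query_features(query_features, similarity_map):
    """Scale every channel of the query features by the similarity map."""
    return query_features * similarity_map[None, :, :]


# Stand-in data (assumption): 4 channels on a 6x6 feature grid.
rng = np.random.default_rng(0)
support_features = rng.normal(size=(4, 6, 6))
query_features = rng.normal(size=(4, 6, 6))
support_mask = np.zeros((6, 6))
support_mask[2:5, 1:4] = 1  # hypothetical object region in the support image

guidance = masked_average_pooling(support_features, support_mask)
similarity = cosine_similarity_map(query_features, guidance)
guided = guide_query_features(query_features, similarity)
```

In the full model the guided features would then go to the segmentation branch; that branch and the cross-entropy training loop are omitted here.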

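Experimental Results

Research Questions

RQ1: Can a unified network with similarity-guided attention improve one-shot segmentation of unseen classes?
RQ2: Does masked average pooling produce a better object representation than masking the input or concatenation-based methods?
RQ3: How does cosine similarity guidance affect segmentation performance across the PASCAL-5i folds?
RQ4: Is the method robust to multi-class query images and to few-shot (K-shot) extension without retraining?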

Key Findings

Method	PASCAL-5^0	PASCAL-5^1	PASCAL-5^2	PASCAL-5^3	Mean
1-NN	25.3	44.9	41.7	18.4	32.6
LogReg	26.9	42.9	37.1	18.4	31.4
Siamese	28.1	39.9	31.8	25.8	31.4
OSVOS [37]	24.9	38.8	36.5	30.1	32.6
OSLSM [15]	33.6	55.3	40.9	33.5	40.8
co-FCN [16]	36.7	50.6	44.9	32.4	41.1
SG-One(Ours)	40.2	58.4	48.4	38.4	46.3
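As a quick check on how the last column relates to the per-fold scores, the snippet below recomputes the mean over the four folds for the three strongest methods in the table. The dictionary and function names are only illustrative; the numbers are copied from the table above, and the 0.1 tolerance is an assumption that allows for rounding in the reported values.

```python
# Per-fold mean IoU (%) copied from the table above.
fold_scores = {
    "OSLSM": [33.6, 55.3, 40.9, 33.5],
    "co-FCN": [36.7, 50.6, 44.9, 32.4],
    "SG-One": [40.2, 58.4, 48.4, 38.4],
}
# Values from the last column of the table.
reported_mean = {"OSLSM": 40.8, "co-FCN": 41.1, "SG-One": 46.3}


def mean_over_folds(scores):
    """Arithmetic mean of the per-fold scores."""
    return sum(scores) / len(scores)


for method, scores in fold_scores.items():
    recomputed = mean_over_folds(scores)
    # Allow a small tolerance because the table values are rounded.
    consistent = abs(recomputed - reported_mean[method]) < 0.1
    print(f"{method}: recomputed {recomputed:.2f}, reported {reported_mean[method]}, consistent={consistent}")
```

Going by the reported means, SG-One's margin over co-FCN is 46.3 - 41.1 = 5.2 points.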

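SG-One reaches 46.3% mean IoU on PASCAL-5i for one-shot segmentation, surpassing the baselines.
Masked average pooling over the support mask yields a better representative vector than masking the input or concatenation-based methods.
The five-shot result (averaging the support vectors) is 47.1% mean IoU, slightly higher than the one-shot result of 46.3% but not by a large margin.
Compared with OSLSM and co-FCN, SG-One improves notably on all four PASCAL-5i folds.
SG-One is robust in multi-class query scenarios and outperforms the co-FCN baseline in the multi-class setting.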
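The K-shot extension mentioned above (averaging the support vectors) can be sketched as follows. This is a hedged illustration: `k_shot_guidance` is a hypothetical name, and the two example vectors are made up rather than taken from the paper. Each input is assumed to be a per-shot guidance vector already obtained by masked average pooling.

```python
import numpy as np


def k_shot_guidance(support_vectors):
    """Average K per-shot guidance vectors, each of shape (C,), into one (C,) vector."""
    stacked = np.stack(support_vectors, axis=0)  # shape (K, C)
    return stacked.mean(axis=0)


# Made-up guidance vectors from two support images (K = 2, C = 3).
shots = [np.array([1.0, 2.0, 3.0]), np.array([3.0, 4.0, 5.0])]
averaged = k_shot_guidance(shots)
```

Because only the guidance vector changes, the same trained network can be reused for K-shot testing without retraining, which is consistent with the RQ4 setting described above.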


This review was generated by AI and checked by a human editor.