QUICK REVIEW

[Paper Review] Improving Fast Segmentation With Teacher-student Learning

Jiafeng Xie, Bing Shuai|arXiv (Cornell University)|Oct 19, 2018

Advanced Neural Network Applications|References: 16|Citations: 51

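One-sentence summary

This paper proposes a teacher-student learning framework that transfers zero-order and first-order knowledge from a heavy teacher to a fast student, using both labeled and unlabeled data to improve fast semantic segmentation models at no extra inference cost.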

ABSTRACT

Recently, segmentation neural networks have been significantly improved by demonstrating very promising accuracies on public benchmarks. However, these models are very heavy and generally suffer from low inference speed, which limits their application scenarios in practice. Meanwhile, existing fast segmentation models usually fail to obtain satisfactory segmentation accuracies on public benchmarks. In this paper, we propose a teacher-student learning framework that transfers the knowledge gained by a heavy and better performed segmentation network (i.e. teacher) to guide the learning of fast segmentation networks (i.e. student). Specifically, both zero-order and first-order knowledge depicted in the fine annotated images and unlabeled auxiliary data are transferred to regularize our student learning. The proposed method can improve existing fast segmentation models without incurring extra computational overhead, so it can still process images with the same fast speed. Extensive experiments on the Pascal Context, Cityscape and VOC 2012 datasets demonstrate that the proposed teacher-student learning framework is able to significantly boost the performance of student network.

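Motivation and objectives

Motivate the need for fast yet accurate segmentation in real-time or resource-constrained settings.
Propose a teacher-student framework that regularizes the fast student's training with knowledge from the teacher.
Extend the framework to exploit unlabeled data by using teacher-generated pseudo ground truth.
Show improved performance on several benchmarks (Pascal Context, Cityscapes, VOC 2012) without increasing inference cost.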

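Proposed method

Define a fast student S and a fixed heavy teacher T, and optimize L = L_S + r(S,T).
Regularize S with zero-order knowledge through a probability loss L_p between the teacher and student outputs.
Regularize S with first-order knowledge through a consistency loss L_c on boundary information between the teacher and student outputs.
Distill knowledge on the finely annotated data, and extend to unlabeled data by treating teacher-generated pseudo-labels as ground truth.
When unlabeled data is used, train with L = L_LabeledData + λ L_unlabeledData, optimizing both data regimes jointly.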
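
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the review does not give the exact formulas, so the sketch assumes that L_p is a per-pixel cross-entropy against the teacher's softmax output, that L_c is a mean absolute difference between neighbouring-pixel differences of the two probability maps, and that r(S,T) = alpha * L_p + beta * L_c. The names `alpha`, `beta`, `lam`, and every function name are hypothetical.

```python
import math

EPS = 1e-12  # avoids log(0)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def prob_map(logit_map):
    """Per-pixel softmax over an H x W x C nested list of logits."""
    return [[softmax(pixel) for pixel in row] for row in logit_map]

def flatten(grid):
    """Flatten an H x W grid into a list of its per-pixel entries."""
    return [pixel for row in grid for pixel in row]

def label_loss(student, labels):
    """L_S: mean cross-entropy against hard (integer) labels."""
    terms = [-math.log(p[y] + EPS) for p, y in zip(flatten(student), flatten(labels))]
    return sum(terms) / len(terms)

def probability_loss(student, teacher):
    """L_p (assumed form): mean cross-entropy against teacher soft targets."""
    terms = [-sum(t * math.log(s + EPS) for s, t in zip(sp, tp))
             for sp, tp in zip(flatten(student), flatten(teacher))]
    return sum(terms) / len(terms)

def spatial_diffs(prob):
    """Forward differences to the right and lower neighbour, per class."""
    h, w = len(prob), len(prob[0])
    diffs = []
    for i in range(h):
        for j in range(w):
            if j + 1 < w:
                diffs.append([a - b for a, b in zip(prob[i][j + 1], prob[i][j])])
            if i + 1 < h:
                diffs.append([a - b for a, b in zip(prob[i + 1][j], prob[i][j])])
    return diffs

def consistency_loss(student, teacher):
    """L_c (assumed form): mean |student difference - teacher difference|."""
    pairs = zip(spatial_diffs(student), spatial_diffs(teacher))
    terms = [abs(ds - dt) for sd, td in pairs for ds, dt in zip(sd, td)]
    return sum(terms) / len(terms) if terms else 0.0

def student_loss(student_logits, teacher_logits, labels=None, alpha=1.0, beta=1.0):
    """L = L_S + r(S,T); without labels, the teacher's argmax is the pseudo-label."""
    student, teacher = prob_map(student_logits), prob_map(teacher_logits)
    if labels is None:
        labels = [[max(range(len(p)), key=p.__getitem__) for p in row] for row in teacher]
    return (label_loss(student, labels)
            + alpha * probability_loss(student, teacher)
            + beta * consistency_loss(student, teacher))

# Toy 2 x 2 labeled image with 2 classes, plus a 1 x 2 unlabeled image.
teacher_logits = [[[4.0, 0.0], [3.0, 1.0]], [[0.0, 4.0], [0.5, 3.5]]]
student_logits = [[[1.0, 0.5], [1.0, 0.8]], [[0.4, 1.0], [0.6, 0.9]]]
labels = [[0, 0], [1, 1]]
unlabeled_teacher = [[[2.0, 0.0], [0.0, 2.0]]]
unlabeled_student = [[[0.3, 0.1], [0.2, 0.4]]]

lam = 0.5  # hypothetical weight for the unlabeled term
total = (student_loss(student_logits, teacher_logits, labels)
         + lam * student_loss(unlabeled_student, unlabeled_teacher))
print(f"joint loss: {total:.4f}")
```

Only the student's parameters would be updated against this loss; the teacher is fixed and is needed only during training, which is why the student's inference cost does not change.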

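Experimental results

Research questions

RQ1: Can knowledge from a teacher network improve fast segmentation models without increasing inference cost?
RQ2: Does combining zero-order (probability) and first-order (consistency) knowledge improve student learning more than using either alone?
RQ3: Can unlabeled data further improve performance through teacher-generated supervision, without manual annotation?
RQ4: How does the method perform across standard segmentation benchmarks and different teacher/student backbones?

Key findings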

Model	mIoU (%)	speed (FPS)
ResNet-101-DeepLab-v2 (teacher)	48.5	16.7
MobileNet-1.0-DeepLab-v2	40.9	46.5
MobileNet-1.0-DeepLab-v2 (Lp)	42.3	46.5
MobileNet-1.0-DeepLab-v2 (Lp+Lc)	42.8	46.5
MobileNet-1.0-DeepLab-v2 (Lp+Lc+UnlabeledData)	43.8	46.5
FCN-8s	37.8	N/A
ParseNet	40.4	N/A
UoA-Context + CRF	43.3	< 1
DAG-RNN	42.6	9.8
DAG-RNN + CRF	43.7	< 1
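
For readers unfamiliar with the mIoU column above, the metric is the intersection-over-union between predicted and ground-truth pixels of each class, averaged over classes. The sketch below is a simplified illustration on a flat list of pixel labels. Benchmark implementations typically accumulate counts over the whole dataset and handle ignored labels, which this version does not; the function name `mean_iou` is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in the prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, target))
        union = sum(p == c or t == c for p, t in zip(pred, target))
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Six pixels, three classes (toy data, not from the paper).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU: {100 * score:.1f}%")
```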

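On Pascal Context, the enhanced MobileNet-1.0-DeepLab-v2 (with Lp, Lc, and unlabeled data) reaches 43.8% mIoU at 46.5 FPS, up from the baseline's 40.9% mIoU.
In the ablation, L_p alone gives 42.3% and adding L_c gives 42.8%; unlabeled data adds a further 1.0 point, for 43.8% mIoU.
With a fixed high-capacity teacher (ResNet-101 DeepLab-v2) and a MobileNet-based student, the student reaches 71.9% mIoU on the Cityscapes validation set (up from 67.3%) while inference stays fast (20.6 FPS).
On the VOC 2012 validation set, the enhanced MobileNet-1.0-DeepLab-v2 reaches 69.6% mIoU, 2.3 points above the baseline student.
Across all three datasets, the method consistently improves student performance with no additional computational overhead.
The improvement generally grows with the teacher-student performance gap, indicating that the knowledge transfer is effective.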
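
The ablation figures can be checked with a little arithmetic. The snippet below copies the mIoU and FPS values from the table in this review, assuming all rows refer to Pascal Context, and derives the per-step gain, the share of the teacher-student gap that the student recovers, and the student's speed advantage. The variable names are illustrative only.

```python
# Values copied from the results table (mIoU in %, speed in FPS).
teacher_miou, teacher_fps = 48.5, 16.7
student_fps = 46.5
student_variants = [
    ("baseline", 40.9),
    ("+ Lp", 42.3),
    ("+ Lp + Lc", 42.8),
    ("+ Lp + Lc + unlabeled data", 43.8),
]

baseline = student_variants[0][1]
previous = baseline
for name, miou in student_variants:
    step_gain = round(miou - previous, 1)   # gain over the previous row
    total_gain = round(miou - baseline, 1)  # gain over the plain student
    print(f"{name:<28} {miou:5.1f}  step {step_gain:+.1f}  total {total_gain:+.1f}")
    previous = miou

# Fraction of the teacher-student accuracy gap closed by the full method.
gap_closed = (student_variants[-1][1] - baseline) / (teacher_miou - baseline)
# Speed of the student relative to the teacher.
speedup = student_fps / teacher_fps
print(f"gap closed: {gap_closed:.0%}, student/teacher speed: {speedup:.1f}x")
```

Under these figures the student recovers a bit more than a third of the accuracy gap to the teacher while running at a little under three times the teacher's frame rate.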

This review was generated by AI and checked by a human editor.