QUICK REVIEW

[Paper Review] RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free

Cheng-Yang Fu, Mykhailo Shvets|arXiv (Cornell University)|Jan 10, 2019

Advanced Neural Network Applications|References: 38|Citations: 119

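One-sentence summary

RetinaMask augments RetinaNet with an instance mask head, a self-adjusting Smooth L1 loss, and additional hard-example sampling during training, improving detection accuracy without increasing the inference cost of the detection component.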

ABSTRACT

Recently two-stage detectors have surged ahead of single-shot detectors in the accuracy-vs-speed trade-off. Nevertheless single-shot detectors are immensely popular in embedded vision applications. This paper brings single-shot detectors up to the same level as current two-stage techniques. We do this by improving training for the state-of-the-art single-shot detector, RetinaNet, in three ways: integrating instance mask prediction for the first time, making the loss function adaptive and more stable, and including additional hard examples in training. We call the resulting augmented network RetinaMask. The detection component of RetinaMask has the same computational cost as the original RetinaNet, but is more accurate. COCO test-dev results are up to 41.4 mAP for RetinaMask-101 vs 39.1 mAP for RetinaNet-101, while the runtime is the same during evaluation. Adding Group Normalization increases the performance of RetinaMask-101 to 41.7 mAP. Code is at: https://github.com/chengyangfu/retinamask

Motivation and objectives

Improve the accuracy of single-shot detectors while keeping inference cost constant

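Proposed method

Add an instance mask prediction head to RetinaNet during training
Introduce a self-adjusting Smooth L1 loss whose transition point adapts to the running mean and variance of the regression error
Adopt a best matching policy that relaxes the IoU threshold used to assign positive anchors
Distribute mask proposals to the appropriate FPN levels and apply ROI-Align for mask prediction
Train with a multi-scale schedule and extended iterations to accommodate the mask module
Compare RetinaMask with RetinaNet and Mask R-CNN on COCO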
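The self-adjusting Smooth L1 loss listed above can be illustrated with a minimal sketch. This review does not spell out the update rule, so the details here are assumptions: the transition point beta is taken as the running mean minus the running variance of the absolute regression error, clipped to the range [0, beta_cap]; the momentum of 0.9, the cap of 0.11, and the zero initialisation are likewise assumed values, and the class name is hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SelfAdjustingSmoothL1:
    """Smooth L1 whose transition point beta follows running error statistics.

    Hypothetical helper: the update rule and defaults are assumptions,
    not values confirmed by the review text.
    """
    beta_cap: float = 0.11      # assumed upper bound on beta
    momentum: float = 0.9       # assumed weight on the previous running value
    running_mean: float = 0.0   # assumed initial state
    running_var: float = 0.0    # assumed initial state
    beta: float = 0.0           # last beta used, kept for inspection

    def __call__(self, residuals: List[float]) -> float:
        abs_res = [abs(r) for r in residuals]
        n = len(abs_res)
        batch_mean = sum(abs_res) / n
        batch_var = sum((a - batch_mean) ** 2 for a in abs_res) / n

        # Exponential moving averages of the absolute-error statistics.
        m = self.momentum
        self.running_mean = m * self.running_mean + (1 - m) * batch_mean
        self.running_var = m * self.running_var + (1 - m) * batch_var

        # beta tracks (mean - variance), clipped to [0, beta_cap].
        self.beta = max(0.0, min(self.beta_cap,
                                 self.running_mean - self.running_var))

        total = 0.0
        for a in abs_res:
            if self.beta > 0 and a < self.beta:
                total += 0.5 * a * a / self.beta   # quadratic (L2-like) region
            else:
                total += a - 0.5 * self.beta       # linear (L1-like) region
        return total / n


loss_fn = SelfAdjustingSmoothL1()
print(loss_fn([0.5, -0.5]), loss_fn.beta)
```

With a fixed beta the switch between the quadratic and linear regions never moves; in this sketch it shrinks or grows with the observed errors, and when beta reaches zero the loss reduces to plain L1.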

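Experimental results

Research questions

RQ1: Can adding a mask prediction task during training improve the accuracy of a single-shot detector without changing its test-time cost?
RQ2: Do the adaptive loss and the expanded positive anchor sampling improve training stability and final performance?
RQ3: How does RetinaMask compare with RetinaNet and Mask R-CNN on COCO in both bounding-box and mask accuracy?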

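Key findings

RetinaMask-101 with GN achieves 41.7 bbox AP and 52.8 mask AP on COCO test-dev, ahead of RetinaNet-101 (39.1 bbox AP)
RetinaMask with a ResNeXt-101-FPN-GN backbone reaches 42.6 bbox AP and 53.8 mask AP, showing further gains from a stronger backbone
The mask prediction head improves detection performance when combined with a 1.5x schedule and an appropriate feature assignment (P2–P5 for masks)
The self-adjusting Smooth L1 loss gives robust bounding-box regression across settings and outperforms fixed-beta configurations
The Best Matching Policy (relaxing the IoU requirement for the best-matching anchor) improves accuracy and reduces duplicate detections
In the reported settings, bounding-box performance is competitive with Mask R-CNN, while mask performance is slightly lower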
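The Best Matching Policy mentioned in the findings can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: anchors are first labelled with the usual IoU thresholds (0.5 for positives and 0.4 for background are assumed values), and then each ground-truth box additionally claims its single best-overlapping anchor as a positive whenever that overlap is nonzero. The function names and the label encoding are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def assign_anchors(anchors: List[Box], gt_boxes: List[Box],
                   pos_thr: float = 0.5, neg_thr: float = 0.4,
                   best_matching: bool = True) -> List[int]:
    """Label each anchor: ground-truth index, -1 (background) or -2 (ignored)."""
    if not anchors:
        return []
    labels: List[int] = []
    for anchor in anchors:
        ious = [iou(anchor, g) for g in gt_boxes]
        best = max(range(len(ious)), key=ious.__getitem__) if ious else -1
        if best >= 0 and ious[best] >= pos_thr:
            labels.append(best)   # standard positive
        elif best < 0 or ious[best] < neg_thr:
            labels.append(-1)     # background
        else:
            labels.append(-2)     # between thresholds: ignored
    if best_matching:
        # Relaxed rule: every ground truth keeps its best anchor as a
        # positive, even when the overlap is below pos_thr.
        for g_idx, g in enumerate(gt_boxes):
            overlaps = [iou(a, g) for a in anchors]
            best_a = max(range(len(anchors)), key=overlaps.__getitem__)
            if overlaps[best_a] > 0:
                labels[best_a] = g_idx
    return labels


anchors = [(0, 0, 10, 10), (20, 20, 30, 30), (0, 0, 4, 4)]
gts = [(0, 0, 10, 10), (22, 22, 40, 40)]
print(assign_anchors(anchors, gts, best_matching=False))  # [0, -1, -1]
print(assign_anchors(anchors, gts))                       # [0, 1, -1]
```

In the example, the second ground-truth box overlaps no anchor above the positive threshold, so without the relaxed rule it would contribute no positive training sample at all.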

This review was generated by AI and checked by a human editor.