QUICK REVIEW

[Paper Review] Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation

Zhaohui Zheng, Ping Wang|arXiv (Cornell University)|May 7, 2020

Advanced Neural Network Applications|References: 55|Citations: 83

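One-sentence summary

This paper introduces Complete-IoU (CIoU) loss and Cluster-NMS, which incorporate three geometric factors (overlap area, normalized central point distance, and aspect ratio) into bounding box regression and NMS, improving AP and AR across detection and segmentation models while preserving real-time inference.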

ABSTRACT

Deep learning-based object detection and instance segmentation have achieved unprecedented progress. In this paper, we propose Complete-IoU (CIoU) loss and Cluster-NMS for enhancing geometric factors in both bounding box regression and Non-Maximum Suppression (NMS), leading to notable gains of average precision (AP) and average recall (AR), without the sacrifice of inference efficiency. In particular, we consider three geometric factors, i.e., overlap area, normalized central point distance and aspect ratio, which are crucial for measuring bounding box regression in object detection and instance segmentation. The three geometric factors are then incorporated into CIoU loss for better distinguishing difficult regression cases. The training of deep models using CIoU loss results in consistent AP and AR improvements in comparison to widely adopted $\ell_n$-norm loss and IoU-based loss. Furthermore, we propose Cluster-NMS, where NMS during inference is done by implicitly clustering detected boxes and usually requires less iterations. Cluster-NMS is very efficient due to its pure GPU implementation, and geometric factors can be incorporated to improve both AP and AR. In the experiments, CIoU loss and Cluster-NMS have been applied to state-of-the-art instance segmentation (e.g., YOLACT and BlendMask-RT), and object detection (e.g., YOLO v3, SSD and Faster R-CNN) models. Taking YOLACT on MS COCO as an example, our method achieves performance gains as +1.7 AP and +6.2 AR$_{100}$ for object detection, and +0.9 AP and +3.5 AR$_{100}$ for instance segmentation, with 27.1 FPS on one NVIDIA GTX 1080Ti GPU. All the source code and trained models are available at https://github.com/Zzh-tju/CIoU

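Motivation and objectives

Identify the limitations of IoU-based losses for bounding box regression in detection and segmentation.
Propose a complete geometric-factor loss (CIoU) that combines overlap, central point distance, and aspect ratio terms.
Develop Cluster-NMS, which speeds up non-maximum suppression while allowing geometric factors to be incorporated.
Demonstrate training and inference gains on state-of-the-art detectors and segmentation models.
Demonstrate real-time performance on a GPU without loss of accuracy.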

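Proposed method

Define CIoU loss as 1 - IoU plus a normalized central point distance term and an aspect ratio term weighted by an adaptive trade-off parameter (alpha).
Formulate the three geometric factors S (overlap), D (distance), and V (aspect ratio) so that each is scale-invariant and normalized to [0, 1].
Provide analysis and simulations comparing CIoU with IoU and GIoU losses, showing faster convergence and better regression in extreme cases.
Introduce Cluster-NMS, which runs NMS on the GPU in fewer iterations by implicitly grouping boxes into clusters.
Incorporate geometric factors into Cluster-NMS through score penalties and distance-based terms (Cluster-NMS_S, Cluster-NMS_S+D, Cluster-NMS_W, Cluster-NMS_W+D).
Apply CIoU and Cluster-NMS to YOLACT, BlendMask-RT, YOLOv3, SSD, and Faster R-CNN to validate the gains.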
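
To make the loss definition above concrete, here is a minimal sketch of a CIoU-style loss for a single pair of axis-aligned boxes. It is not the authors' implementation. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration. The trade-off weight uses the commonly cited form alpha = v / ((1 - IoU) + v); the paper may define alpha differently (for example, with an IoU-dependent condition), so treat that line as an assumption as well.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """CIoU-style loss for two boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    # Overlap term: plain IoU.
    inter_w = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    inter_h = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = inter_w * inter_h
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    union = w * h + wg * hg - inter
    iou = inter / (union + eps)

    # Distance term: squared center distance divided by the squared
    # diagonal of the smallest box enclosing both boxes.
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    cxg, cyg = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (cx - cxg) ** 2 + (cy - cyg) ** 2
    enc_w = max(box[2], gt[2]) - min(box[0], gt[0])
    enc_h = max(box[3], gt[3]) - min(box[1], gt[1])
    dist = rho2 / (enc_w ** 2 + enc_h ** 2 + eps)

    # Aspect ratio term: squared difference of arctan(w / h), scaled by 4 / pi^2.
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2

    # Assumed form of the adaptive weight; see the note above.
    alpha = v / ((1 - iou) + v + eps)

    return (1 - iou) + dist + alpha * v
```

For two non-overlapping boxes, 1 - IoU stays at 1 however far apart they are, while the distance term keeps growing with the gap. That is why this kind of loss can still rank difficult regression cases that a pure IoU loss cannot distinguish.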

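Experimental results

Research questions

RQ1: Do CIoU loss and Cluster-NMS improve the quality of bounding box regression and suppression compared with conventional losses and NMS variants?
RQ2: How do the three geometric factors (overlap area, central point distance, and aspect ratio) affect training dynamics and convergence?
RQ3: Can CIoU and Cluster-NMS maintain or improve inference speed when integrated into state-of-the-art detectors and segmentation models?
RQ4: Are these methods effective for both object detection and instance segmentation?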

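Key findings

CIoU loss yields consistent AP and AR improvements over the $\ell_n$-norm loss and IoU-based losses.
Cluster-NMS delivers notable AP and AR gains while maintaining real-time inference.
Applied to YOLACT on MS COCO, the method achieves +1.7 AP and +6.2 AR$_{100}$ for object detection and +0.9 AP and +3.5 AR$_{100}$ for instance segmentation, at 27.1 FPS on one NVIDIA GTX 1080Ti GPU.
Gains were also observed when the methods were applied to other models (YOLOv3, SSD, Faster R-CNN).
CIoU converges faster than GIoU and handles extreme aspect ratios better.
Cluster-NMS can be implemented purely on the GPU and reproduces the results of original NMS, usually in fewer iterations.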
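
The claim that Cluster-NMS reproduces original NMS can be illustrated with a small sketch. The code below is a plain-Python CPU illustration, not the authors' GPU implementation, and it leaves out the geometric-factor variants. The names `cluster_nms`, `greedy_nms`, and `iou`, the `(x1, y1, x2, y2)` box layout, and the rule "suppress when IoU > thresh" are assumptions made for illustration. The idea sketched is to build an upper-triangular IoU matrix over score-sorted boxes and repeatedly recompute, for every box at once, whether any currently kept higher-scoring box overlaps it too much, stopping when the keep mask no longer changes.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, thresh=0.5):
    """Reference greedy NMS: visit boxes by descending score, one at a time."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= thresh for k in kept):
            kept.append(i)
    return kept


def cluster_nms(boxes, scores, thresh=0.5):
    """Iterative matrix-style NMS sketch; returns (kept indices, iterations)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    # x[i][j] holds the IoU between the i-th and j-th highest-scoring boxes
    # for i < j, and 0 elsewhere (upper-triangular matrix).
    x = [[iou(boxes[order[i]], boxes[order[j]]) if i < j else 0.0
          for j in range(n)] for i in range(n)]
    keep = [True] * n
    iterations = 0
    while True:
        iterations += 1
        # Column-wise maximum over rows of boxes that are currently kept.
        new_keep = [
            max((x[i][j] for i in range(n) if keep[i]), default=0.0) <= thresh
            for j in range(n)
        ]
        if new_keep == keep:  # the mask has stopped changing
            break
        keep = new_keep
    return [order[j] for j in range(n) if keep[j]], iterations


# Hypothetical example: a chain of three overlapping boxes plus one isolated box.
example_boxes = [(0, 0, 10, 10), (2, 0, 12, 10), (4, 0, 14, 10), (100, 100, 110, 110)]
example_scores = [0.9, 0.8, 0.7, 0.6]
kept_cluster, n_iter = cluster_nms(example_boxes, example_scores)
kept_greedy = greedy_nms(example_boxes, example_scores)
```

In this example the third box is suppressed in the first pass by the second box, then restored once the second box is itself suppressed, which is the correction that makes the iterated result agree with greedy NMS. In a GPU setting each pass would be a matrix operation over all boxes rather than the nested Python loops shown here.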


This review was generated by AI and checked by a human editor.