QUICK REVIEW

[Paper Review] FSSD: Feature Fusion Single Shot Multibox Detector

Zuoxin Li, Yang, Lu|arXiv (Cornell University)|Dec 4, 2017

Advanced Neural Network Applications | References: 7 | Citations: 391

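One-line summary

FSSD enhances SSD with a lightweight feature fusion module that concatenates features from multiple layers and builds a new feature pyramid from them, improving detection accuracy (especially on small objects) at the cost of a slight drop in speed.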

ABSTRACT

SSD (Single Shot Multibox Detector) is one of the best object detection algorithms with both high accuracy and fast speed. However, SSD's feature pyramid detection method makes it hard to fuse the features from different scales. In this paper, we proposed FSSD (Feature Fusion Single Shot Multibox Detector), an enhanced SSD with a novel and lightweight feature fusion module which can improve the performance significantly over SSD with just a little speed drop. In the feature fusion module, features from different layers with different scales are concatenated together, followed by some down-sampling blocks to generate new feature pyramid, which will be fed to multibox detectors to predict the final detection results. On the Pascal VOC 2007 test, our network can achieve 82.7 mAP (mean average precision) at the speed of 65.8 FPS (frame per second) with the input size 300$\times$300 using a single Nvidia 1080Ti GPU. In addition, our result on COCO is also better than the conventional SSD with a large margin. Our FSSD outperforms a lot of state-of-the-art object detection algorithms in both aspects of accuracy and speed. Code is available at https://github.com/lzx1413/CAFFE_SSD/tree/fssd.

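Motivation and objectives

Address the difficulty of multi-scale object detection in SSD-based detectors.
Propose a lightweight feature fusion module that concatenates features from different layers and then downsamples the result.
Generate a new feature pyramid from the fused features and feed it to the multibox detectors.
Evaluate FSSD on PASCAL VOC and MS COCO to quantify the changes in accuracy and speed.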

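Proposed method

Define a feature fusion framework that projects features from selected layers with 1x1 convolutions, resizes them to a common spatial size, and concatenates them.
For the SSD300 backbone, fuse features from conv3_3, conv4_3, fc7, and conv7_2 by concatenation rather than element-wise summation (conv3_3 is omitted in some configurations).
Apply Batch Normalization after fusion to normalize the feature scales.
Apply downsampling blocks (stride-2 convolutions) to the fused feature map to build the pyramid feature extractor.
Train FSSD with the SSD-style loss and hard negative mining, fine-tuning from a VGG16/SSD-pretrained or COCO-pretrained model.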
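The fusion and pyramid steps listed above can be traced at the level of tensor shapes. The following is a minimal sketch, not the authors' implementation: it tracks only channel counts and spatial sizes. The per-layer shapes (the usual VGG16-based SSD300 values), the projection width `PROJ_CHANNELS`, and the 3x3/stride-2/padding-1 downsampling convolution are assumptions that the review does not state, and conv3_3 is left out, which the review notes is sometimes done.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FeatureMap:
    """Shape-only stand-in for a feature map: channels plus a square spatial size."""
    name: str
    channels: int
    size: int


def project_1x1(fm: FeatureMap, out_channels: int) -> FeatureMap:
    # A 1x1 convolution changes the channel count and keeps the spatial size.
    return FeatureMap(fm.name, out_channels, fm.size)


def resize_to(fm: FeatureMap, size: int) -> FeatureMap:
    # Resizing (for example by interpolation) changes the spatial size only.
    return FeatureMap(fm.name, fm.channels, size)


def concatenate(maps: List[FeatureMap]) -> FeatureMap:
    # Channel-wise concatenation requires identical spatial sizes.
    if len({m.size for m in maps}) != 1:
        raise ValueError("all maps must share one spatial size before concatenation")
    return FeatureMap("fused", sum(m.channels for m in maps), maps[0].size)


def conv_output_size(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    # Standard convolution output-size formula.
    return (size + 2 * padding - kernel) // stride + 1


def pyramid_sizes(start: int, levels: int) -> List[int]:
    # Repeatedly apply the (assumed) stride-2 downsampling block.
    sizes = [start]
    for _ in range(levels - 1):
        sizes.append(conv_output_size(sizes[-1]))
    return sizes


# Assumed SSD300/VGG16 source shapes; these numbers are not given in the review.
sources = [
    FeatureMap("conv4_3", 512, 38),
    FeatureMap("fc7", 1024, 19),
    FeatureMap("conv7_2", 512, 10),
]
PROJ_CHANNELS = 256  # hypothetical projection width

target = max(m.size for m in sources)
fused = concatenate([resize_to(project_1x1(m, PROJ_CHANNELS), target) for m in sources])
# Batch Normalization after fusion rescales values and leaves the shape unchanged.
print(fused)
print(pyramid_sizes(fused.size, 6))
```

Under these assumptions the fused map has three times the projection width in channels, and each downsampling block roughly halves the spatial size. The real network may use different padding at the smallest levels, so the last entries of the printed pyramid should not be read as the paper's exact sizes.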

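Experimental results

Research questions

RQ1: Can a single lightweight feature fusion module improve SSD performance by exploiting multi-scale features?
RQ2: Is concatenation-based fusion better than element-wise summation at integrating multi-scale features?
RQ3: How does the fused-feature design affect the accuracy/speed trade-off on the VOC and COCO datasets?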
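RQ2 contrasts two fusion operators, and a toy example at a single pixel may make the difference concrete. The sketch below uses made-up two-channel feature vectors and hand-picked weights, all hypothetical. It shows that summation followed by a linear layer is the special case of concatenation in which both sources share the same weights, so concatenation is at least as expressive. This is an illustration of the operators only and does not by itself account for the paper's ablation numbers.

```python
from typing import List


def concat_fuse(a: List[float], b: List[float]) -> List[float]:
    # Concatenation keeps the channels of both sources side by side.
    return a + b


def sum_fuse(a: List[float], b: List[float]) -> List[float]:
    # Element-wise summation requires equal channel counts and merges them.
    if len(a) != len(b):
        raise ValueError("element-wise sum needs equal channel counts")
    return [x + y for x, y in zip(a, b)]


def linear(weights: List[float], features: List[float]) -> float:
    # One output channel of a 1x1 convolution evaluated at a single pixel.
    if len(weights) != len(features):
        raise ValueError("weights and features must have equal length")
    return sum(w * f for w, f in zip(weights, features))


shallow = [1.0, 2.0]   # hypothetical features from a shallow layer
deep = [3.0, 4.0]      # hypothetical features from a deep layer
w = [0.5, -1.0]        # hypothetical layer weights

# Sum-then-linear equals concat-then-linear with the weights tied across sources.
summed = linear(w, sum_fuse(shallow, deep))
tied = linear(w + w, concat_fuse(shallow, deep))

# With concatenation the layer may also weight each source differently.
untied = linear([0.5, -1.0, 2.0, 0.0], concat_fuse(shallow, deep))

print(summed, tied, untied)
```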

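Key findings

On the VOC2007 test set with 300x300 input, FSSD reaches 82.7 mAP at 65.8 FPS on a single 1080Ti, using a COCO-pretrained model.
On VOC2012 with COCO pretraining, FSSD300 reaches 82.0% mAP and FSSD512 reaches 84.2% mAP, outperforming the SSD baselines.
On COCO test-dev, FSSD300 achieves 27.1% AP, above SSD300* (25.1%), and FSSD512 achieves 31.8% AP.
In the ablation study, concatenation outperforms element-wise summation, and Batch Normalization after fusion improves mAP by about 0.7%.
The proposed fused pyramid design yields notable gains on small objects and, compared with standard SSD, reduces cases where several parts of one object are detected as separate objects.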
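The headline numbers are easier to compare once converted to per-image latency and to AP differences. The short sketch below does only that arithmetic on figures quoted above; the helper names are hypothetical, and the latency is a simple reciprocal of throughput that ignores batching and I/O.

```python
def fps_to_latency_ms(fps: float) -> float:
    # Per-image latency implied by a throughput figure.
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def absolute_gain(new: float, base: float) -> float:
    # Difference in percentage points.
    return new - base


def relative_gain_percent(new: float, base: float) -> float:
    # Improvement relative to the baseline, in percent.
    return 100.0 * (new - base) / base


latency = fps_to_latency_ms(65.8)        # FSSD300 on a single 1080Ti
ap_gain = absolute_gain(27.1, 25.1)      # FSSD300 vs SSD300* on COCO test-dev
ap_rel = relative_gain_percent(27.1, 25.1)

print(f"{latency:.1f} ms per image")
print(f"{ap_gain:.1f} AP points, {ap_rel:.1f}% relative")
```

At 65.8 FPS this works out to roughly 15 ms per image, and the COCO improvement over SSD300* is 2.0 AP points.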


This review was generated by AI and checked by a human editor.