QUICK REVIEW

[Paper Review] Improved Selective Refinement Network for Face Detection

Shifeng Zhang, Rui Zhu|arXiv (Cornell University)|Jan 20, 2019

Face recognition and analysis | References: 55 | Citations: 32

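One-Sentence Summary

This paper improves the Selective Refinement Network (SRN) for face detection by evaluating data augmentation, a stronger backbone, MS COCO pretraining, a decoupled classification module, a segmentation branch, and SE blocks, then combining the techniques that help to reach state-of-the-art results on WIDER FACE.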

ABSTRACT

As a long-standing problem in computer vision, face detection has attracted much attention in recent decades for its practical applications. With the availability of face detection benchmark WIDER FACE dataset, much of the progresses have been made by various algorithms in recent years. Among them, the Selective Refinement Network (SRN) face detector introduces the two-step classification and regression operations selectively into an anchor-based face detector to reduce false positives and improve location accuracy simultaneously. Moreover, it designs a receptive field enhancement block to provide more diverse receptive field. In this report, to further improve the performance of SRN, we exploit some existing techniques via extensive experiments, including new data augmentation strategy, improved backbone network, MS COCO pretraining, decoupled classification module, segmentation branch and Squeeze-and-Excitation block. Some of these techniques bring performance improvements, while few of them do not well adapt to our baseline. As a consequence, we present an improved SRN face detector by combining these useful techniques together and obtain the best performance on widely used face detection benchmark WIDER FACE dataset.

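Motivation and Objectives

Improve SRN's performance on the challenging WIDER FACE benchmark, particularly on tiny and occluded faces.
Examine how architectural and training improvements affect SRN's performance.
Identify which techniques are effective or ineffective when combined with the SRN baseline.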

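Proposed Method

Adopt a data augmentation strategy that includes photometric distortion, random patch cropping, and optionally applied data-anchor-sampling.
Improve the backbone by modifying ResNet-50 into a Root-ResNet-based structure with DRN-style adjustments.
Pretrain the modified backbone on MS COCO and then fine-tune it on WIDER FACE; Group Normalization is used to make training from scratch possible.
Apply a decoupled classification module, and explore a segmentation branch and SE blocks as possible further improvements.
Use Selective Two-step Classification (STC) and Selective Two-step Regression (STR) within SRN, use the Receptive Field Enhancement (RFE) block to provide diverse receptive fields, and run inference by keeping the top-scoring detections and applying NMS.
Train with SGD using a specific learning-rate schedule and a large 1024x1024 input, with a 5-epoch warm-up and 260 epochs in total.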
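
The inference step in the list above (keep the top-scoring detections, then apply NMS) can be illustrated with a minimal sketch of greedy non-maximum suppression in plain Python. This is not the authors' implementation: the function names `iou` and `top_k_nms` and the default values of `top_k`, `iou_threshold`, and `keep_max` are assumptions chosen for illustration, since the review does not state them.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x2 >= x1 and y2 >= y1


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def top_k_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    top_k: int = 5000,           # assumed value: candidates kept before NMS
    iou_threshold: float = 0.5,  # assumed value: overlap above this is suppressed
    keep_max: int = 750,         # assumed value: cap on final detections
) -> List[int]:
    """Return indices of the kept boxes, highest score first."""
    # Sort candidates by score and keep only the top_k best ones.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)[:top_k]
    kept: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap an already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
            if len(kept) == keep_max:
                break
    return kept


if __name__ == "__main__":
    boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
    scores = [0.9, 0.8, 0.95]
    print(top_k_nms(boxes, scores))  # the second box overlaps the first and is dropped
```

The pure-Python loop is quadratic in the number of candidates, which is fine for illustration; a practical detector would typically use a vectorized or GPU implementation of the same greedy rule.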

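Experimental Results

Research Questions

RQ1: Do data augmentation and backbone improvements substantially improve SRN's performance on WIDER FACE?
RQ2: Do MS COCO pretraining, decoupled classification, segmentation supervision, and SE blocks consistently improve SRN across the Easy, Medium, and Hard subsets?
RQ3: Which of the examined techniques are beneficial, neutral, or harmful when incorporated into SRN for face detection?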

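Key Findings

The proposed ISRN achieves state-of-the-art average precision (AP) on the Easy, Medium, and Hard subsets of both the validation and test sets.
Validation AP: Easy 96.7%, Medium 95.8%, Hard 90.9%; test AP: Easy 96.3%, Medium 95.4%, Hard 90.3%.
With the improved backbone and pretraining, the effect of STC/STR is pronounced, contributing gains especially on small faces (the Hard subset).
Some techniques (e.g., the segmentation branch and the SE block) do not necessarily improve performance on this baseline, depending on the configuration.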


This review was generated by AI and checked by a human editor.