QUICK REVIEW

[Paper Review] Improved Selective Refinement Network for Face Detection

Shifeng Zhang, Rui Zhu | arXiv (Cornell University) | Jan 20, 2019

Face recognition and analysis | References: 55 | Cited by: 32

ONE-SENTENCE SUMMARY

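This paper improves the face detection performance of the Selective Refinement Network (SRN) by evaluating data augmentation, a stronger backbone, MS COCO pretraining, a decoupled classification module, a segmentation branch, and SE blocks, then combining the techniques that help, achieving state-of-the-art results on WIDER FACE.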

ABSTRACT

As a long-standing problem in computer vision, face detection has attracted much attention in recent decades for its practical applications. With the availability of face detection benchmark WIDER FACE dataset, much of the progresses have been made by various algorithms in recent years. Among them, the Selective Refinement Network (SRN) face detector introduces the two-step classification and regression operations selectively into an anchor-based face detector to reduce false positives and improve location accuracy simultaneously. Moreover, it designs a receptive field enhancement block to provide more diverse receptive field. In this report, to further improve the performance of SRN, we exploit some existing techniques via extensive experiments, including new data augmentation strategy, improved backbone network, MS COCO pretraining, decoupled classification module, segmentation branch and Squeeze-and-Excitation block. Some of these techniques bring performance improvements, while few of them do not well adapt to our baseline. As a consequence, we present an improved SRN face detector by combining these useful techniques together and obtain the best performance on widely used face detection benchmark WIDER FACE dataset.

MOTIVATION AND OBJECTIVES

Improve SRN performance on the challenging WIDER FACE benchmark, which contains many tiny and occluded faces.
Investigate how architectural and training enhancements affect SRN performance.
Identify which techniques are effective and which are ineffective when combined with the SRN baseline.

PROPOSED METHOD

Adopt data augmentation strategies, including photometric distortions, random patch cropping, and optional data-anchor-sampling.
Improve the backbone by modifying ResNet-50 into a Root-ResNet-based structure with DRN-inspired adjustments.
Pretrain the modified backbone on MS COCO and then fine-tune it on WIDER FACE, using Group Normalization so that the network can be trained from scratch.
Apply a decoupled classification module, and evaluate a segmentation branch and Squeeze-and-Excitation (SE) blocks as further candidates for performance gains.
Use Selective Two-step Classification (STC) and Selective Two-step Regression (STR) within SRN, with the Receptive Field Enhancement (RFE) block for more diverse receptive fields; at inference, keep the top-scoring detections and apply non-maximum suppression (NMS).
Train with SGD on 1024x1024 inputs, using a 5-epoch warm-up and a learning-rate schedule over 260 total epochs.
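
The inference step above (keep the top-scoring detections, then apply NMS) can be sketched as follows. This is a minimal pure-Python illustration of standard greedy NMS, not the authors' implementation. The function names and the values of `pre_nms_top_k`, `iou_threshold`, and `max_detections` are assumptions chosen for the example; the summary does not state the thresholds used in the paper.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def top_k_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    pre_nms_top_k: int = 1000,    # hypothetical value, not from the paper
    iou_threshold: float = 0.5,   # hypothetical value, not from the paper
    max_detections: int = 100,    # hypothetical value, not from the paper
) -> List[int]:
    """Return indices of the boxes kept by top-k filtering plus greedy NMS."""
    # Rank candidates by confidence and keep only the highest-scoring ones.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    order = order[:pre_nms_top_k]

    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
            if len(keep) == max_detections:
                break
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = top_k_nms(boxes, scores)  # the second box overlaps the first and is suppressed
```

Processing candidates in descending score order guarantees that, within each cluster of overlapping boxes, the most confident one survives.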

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Can data augmentation and backbone improvements substantially boost SRN performance on WIDER FACE?
RQ2: Do MS COCO pretraining, decoupled classification, segmentation supervision, and SE blocks consistently improve SRN across the Easy, Medium, and Hard subsets?
RQ3: Which of the explored techniques are beneficial, neutral, or detrimental when integrated with SRN for face detection?

KEY FINDINGS

The proposed improved SRN (ISRN) achieves state-of-the-art average precision (AP) on the WIDER FACE Easy, Medium, and Hard subsets, on both the validation and testing sets.
Validation APs: Easy 96.7%, Medium 95.8%, Hard 90.9%; Testing APs: Easy 96.3%, Medium 95.4%, Hard 90.3%.
STC/STR with the improved backbone and pretraining contribute to gains, especially for tiny faces (Hard subset).
Some techniques (e.g., segmentation branch, SE blocks) may not always improve performance in this baseline, depending on configuration.
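
For readers unfamiliar with the SE block referred to above, the following is a minimal pure-Python sketch of the generic Squeeze-and-Excitation operation: global average pooling per channel, two small fully connected layers, and channel-wise rescaling. It illustrates the general technique only. The function names, the nested-list tensor layout, and the toy weights are assumptions for illustration and do not reflect how the block was wired into SRN.

```python
import math
from typing import List

Matrix = List[List[float]]  # one channel: H rows of W values


def squeeze(feature_map: List[Matrix]) -> List[float]:
    """Global average pooling: reduce each channel to a single scalar."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]


def dense(x: List[float], weights: Matrix, bias: List[float]) -> List[float]:
    """Fully connected layer computing W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def excite(z: List[float], w1: Matrix, b1: List[float], w2: Matrix, b2: List[float]) -> List[float]:
    """Bottleneck FC -> ReLU -> FC -> sigmoid, giving one gate in (0, 1) per channel."""
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]


def se_block(feature_map: List[Matrix], w1: Matrix, b1: List[float],
             w2: Matrix, b2: List[float]) -> List[Matrix]:
    """Rescale every channel of the feature map by its learned gate."""
    gates = excite(squeeze(feature_map), w1, b1, w2, b2)
    return [[[g * v for v in row] for row in ch] for ch, g in zip(feature_map, gates)]


# Toy input: 2 channels of size 2x2, bottleneck of width 1 (reduction ratio 2).
fmap = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1, b1 = [[0.0, 0.0]], [0.0]          # hypothetical weights, all zero
w2, b2 = [[0.0], [0.0]], [0.0, 0.0]   # zero logits give a gate of sigmoid(0) = 0.5
out = se_block(fmap, w1, b1, w2, b2)
```

With all-zero weights every gate equals 0.5, so the output is the input halved; trained weights would instead emphasize some channels and suppress others.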

This review was generated by AI and reviewed by a human editor.