QUICK REVIEW

[Paper Review] Selective Refinement Network for High Performance Face Detection

Cheng Chi, Shifeng Zhang|arXiv (Cornell University)|Sep 7, 2018

Face recognition and analysis|References: 38|Cited by: 23

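ONE-SENTENCE SUMMARY

This paper proposes the Selective Refinement Network (SRN), a single-shot face detector that applies two-step classification and two-step regression selectively to reduce false positives and improve localization accuracy. SRN achieves state-of-the-art results on AFW, PASCAL Face, FDDB, and WIDER FACE; on the WIDER FACE validation set it reaches 96.4% AP on the Easy subset and 90.2% AP on the Hard subset.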

ABSTRACT

High performance face detection remains a very challenging problem, especially when there exists many tiny faces. This paper presents a novel single-shot face detector, named Selective Refinement Network (SRN), which introduces novel two-step classification and regression operations selectively into an anchor-based face detector to reduce false positives and improve location accuracy simultaneously. In particular, the SRN consists of two modules: the Selective Two-step Classification (STC) module and the Selective Two-step Regression (STR) module. The STC aims to filter out most simple negative anchors from low level detection layers to reduce the search space for the subsequent classifier, while the STR is designed to coarsely adjust the locations and sizes of anchors from high level detection layers to provide better initialization for the subsequent regressor. Moreover, we design a Receptive Field Enhancement (RFE) block to provide more diverse receptive field, which helps to better capture faces in some extreme poses. As a consequence, the proposed SRN detector achieves state-of-the-art performance on all the widely used face detection benchmarks, including AFW, PASCAL face, FDDB, and WIDER FACE datasets. Codes will be released to facilitate further studies on the face detection problem.

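MOTIVATION AND GOALS

Reduce the high false-positive rate that face detectors incur at high recall, especially on tiny faces.
Improve bounding-box localization accuracy, particularly as the evaluation IoU threshold is raised.
Shrink the search space and the computation for the later classifier by filtering out easy negative anchors early in the network.
Make detection more robust to faces in extreme poses by diversifying the receptive field.
Achieve state-of-the-art performance on multiple benchmarks without resorting to multi-stage inference.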

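PROPOSED METHOD

The Selective Two-step Classification (STC) module filters out most easy negative anchors on the low-level detection layers, reducing the search space for the subsequent classifier.
The Selective Two-step Regression (STR) module coarsely adjusts the locations and sizes of anchors on the high-level detection layers, giving the final regressor a better initialization.
The Receptive Field Enhancement (RFE) block is integrated into the feature layers to diversify the receptive field, which helps capture faces in extreme poses.
SRN builds on an anchor-based single-shot detector with a feature pyramid network and applies STC and STR only on selected feature layers.
The network is trained end to end: the classification branches use focal loss to counter class imbalance, and the regression branches use smooth L1 loss for box regression.
Performance is evaluated on multiple benchmarks with standard metrics (AP and precision-recall curves); IoU thresholds of up to 0.8 are used to assess localization accuracy.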
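
The two-step operations and the two losses listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the (cx, cy, w, h) box encoding with (dx, dy, dw, dh) offsets, the 0.99 negative-confidence threshold, and the focal-loss defaults alpha = 0.25 and gamma = 2 are all chosen here for illustration and do not come from this review.

```python
import numpy as np

def decode(boxes, deltas):
    """Apply (dx, dy, dw, dh) offsets to boxes stored as (cx, cy, w, h).
    Assumed parameterization; the review does not specify the encoding."""
    cx = boxes[:, 0] + deltas[:, 0] * boxes[:, 2]
    cy = boxes[:, 1] + deltas[:, 1] * boxes[:, 3]
    w = boxes[:, 2] * np.exp(deltas[:, 2])
    h = boxes[:, 3] * np.exp(deltas[:, 3])
    return np.stack([cx, cy, w, h], axis=1)

def stc_keep_mask(first_step_face_prob, neg_threshold=0.99):
    """Two-step classification, step one: drop an anchor when its
    negative confidence (1 - p_face) exceeds neg_threshold (hypothetical value)."""
    return (1.0 - first_step_face_prob) <= neg_threshold

def two_step_regress(anchors, first_deltas, second_deltas):
    """Two-step regression: the first step refines the anchors, and the
    second step regresses from the refined anchors instead of the originals."""
    refined = decode(anchors, first_deltas)
    return decode(refined, second_deltas)

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Per-anchor binary focal loss; p is the predicted face probability."""
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

def smooth_l1(x):
    """Smooth L1: quadratic for |x| < 1, linear otherwise."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * ax ** 2, ax - 0.5)

anchors = np.array([[50.0, 50.0, 20.0, 20.0],
                    [100.0, 100.0, 40.0, 40.0]])
first_deltas = np.array([[0.5, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0, 0.0]])
second_deltas = np.array([[0.0, 0.25, 0.0, 0.0],
                          [0.0, 0.0, 0.0, 0.0]])
# First anchor ends at cx = 50 + 0.5 * 20 = 60 and cy = 50 + 0.25 * 20 = 55.
final_boxes = two_step_regress(anchors, first_deltas, second_deltas)

face_prob = np.array([0.002, 0.3, 0.95])
keep = stc_keep_mask(face_prob)  # only the first anchor is filtered out
```

In this sketch the filter runs on every anchor passed to it; in SRN, per the list above, the first-step filtering is applied on low-level layers and the coarse regression on high-level layers, so a fuller version would select anchors by pyramid level before calling these functions.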

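EXPERIMENTAL RESULTS

Research Questions

RQ1: Can selective two-step classification reduce false positives in face detection, especially on tiny faces, without sacrificing recall?
RQ2: Does coarsely refining anchor locations on the high-level features lead to more accurate final bounding boxes?
RQ3: How does integrating diverse receptive fields through RFE affect the detection of faces in extreme poses?
RQ4: Can the SRN framework achieve state-of-the-art performance on multiple benchmarks, including hard cases such as occlusion and blur?
RQ5: How much does each component (STC, STR, RFE) contribute to overall detection accuracy and to the precision-recall trade-off?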

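Key Findings

On the WIDER FACE validation set, SRN reaches 96.4% AP on the Easy subset and 90.2% AP on the Hard subset, outperforming the prior methods it is compared against.
On the Hard subset of the WIDER FACE test set, SRN reaches 89.7% AP, ahead of the previous state of the art.
With only the STR module added, AP improves markedly at high IoU thresholds: 38.2% at IoU = 0.8 versus 28.5% for RetinaNet, indicating more accurate localization.
The STC module raises the positive-to-negative sample ratio by about 38 times and lowers the false-positive rate at high recall.
The RFE block improves AP by 0.3, 0.3, and 0.1 percentage points on the Easy, Medium, and Hard subsets respectively, supporting its usefulness for faces in extreme poses.
With STC and STR combined, SRN reaches 96.1% AP on the Easy subset, which is consistent with the final 96.4% Easy result once RFE adds its 0.3 points; this indicates that the two modules are complementary.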
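
To make the role of the IoU threshold in the findings above concrete, here is a small sketch of the intersection-over-union computation. The corner-format boxes (x1, y1, x2, y2), the helper name `iou`, and the example coordinates are illustrative assumptions; benchmark toolkits may differ in details such as pixel-inclusive box widths.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A 100 x 100 ground-truth face and a prediction shifted by 10 pixels in x and y.
ground_truth = (0.0, 0.0, 100.0, 100.0)
prediction = (10.0, 10.0, 110.0, 110.0)

# Intersection is 90 * 90 = 8100 and union is 10000 + 10000 - 8100 = 11900.
value = iou(ground_truth, prediction)

matches_at_05 = value >= 0.5  # counted as a correct detection at IoU 0.5
matches_at_08 = value >= 0.8  # rejected at the stricter IoU 0.8
```

A 10% shift on each axis is already enough to fail the 0.8 threshold in this example, which is why AP at high IoU is a sensitive indicator of localization quality.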


This review was generated by AI and reviewed by a human editor.