QUICK REVIEW

[Paper Review] BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs

Valentin Bazarevsky, Yury Kartynnik|arXiv (Cornell University)|Jul 11, 2019

Face recognition and analysis|References: 7|Cited by: 248

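One-Sentence Summary

BlazeFace is a lightweight face detector optimized for mobile GPU inference; using a GPU-friendly SSD-style anchor scheme and a novel tie-resolution method, it reaches 200–1000+ FPS on flagship devices. It also outputs 6 facial keypoints that enable rotation-aware cropping in AR pipelines.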

ABSTRACT

We present BlazeFace, a lightweight and well-performing face detector tailored for mobile GPU inference. It runs at a speed of 200-1000+ FPS on flagship devices. This super-realtime performance enables it to be applied to any augmented reality pipeline that requires an accurate facial region of interest as an input for task-specific models, such as 2D/3D facial keypoint or geometry estimation, facial features or expression classification, and face region segmentation. Our contributions include a lightweight feature extraction network inspired by, but distinct from MobileNetV1/V2, a GPU-friendly anchor scheme modified from Single Shot MultiBox Detector (SSD), and an improved tie resolution strategy alternative to non-maximum suppression.

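Motivation and Objectives

Develop a compact, GPU-friendly face detector for mobile devices, optimized for AR pipelines.
Increase inference speed while maintaining high face detection accuracy.
Introduce architectural adjustments (anchors, tie resolution) that suit mobile GPUs and reduce jitter in video streams.
Enable rotation-aware face cropping via six keypoints to benefit downstream tasks.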

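Proposed Method

Design a lightweight feature extractor inspired by MobileNetV1/V2 but tailored for fast detection.
Introduce a GPU-friendly anchor scheme that stops downsampling at the 8x8 feature map and places 6 anchors per pixel at that resolution.
Propose a tie-resolution strategy, as an alternative to non-maximum suppression, that stabilizes overlapping predictions.
Predict six facial keypoints (eye centers, ear tragions, mouth center, and nose tip) for rotation estimation.
Keep the 8x8 feature map at full resolution, without further downsampling, to reduce jitter caused by overlapping anchors and to obtain smoother temporal predictions.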
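
The anchor scheme in the list above can be made concrete by counting anchors. The sketch below is illustrative rather than the authors' code: it takes the 6 anchors per pixel at 8x8 from this review, and additionally assumes a 16x16 layer with 2 anchors per pixel, as described in the BlazeFace paper but not stated here. The function name `anchor_centers` is hypothetical.

```python
# Sketch of a BlazeFace-style anchor layout (illustrative, not the authors' code).
# From the review: 8x8 map with 6 anchors per pixel.
# Assumed (not in the review): a 16x16 map with 2 anchors per pixel.
from typing import List, Tuple


def anchor_centers(layers: List[Tuple[int, int]]) -> List[Tuple[float, float]]:
    """Return one normalized (cx, cy) center per anchor.

    `layers` is a list of (grid_size, anchors_per_cell) pairs.
    """
    centers = []
    for grid, per_cell in layers:
        for row in range(grid):
            for col in range(grid):
                # Center of the cell in normalized [0, 1] image coordinates.
                cx = (col + 0.5) / grid
                cy = (row + 0.5) / grid
                centers.extend([(cx, cy)] * per_cell)
    return centers


LAYERS = [(16, 2), (8, 6)]
anchors = anchor_centers(LAYERS)
print(len(anchors))  # 16*16*2 + 8*8*6 = 896
```

Under these assumptions, all anchors sit on just two feature-map resolutions, so no layers coarser than 8x8 need to be computed.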

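Experimental Results

Research Questions

RQ1: Can a compact CNN backbone and GPU-friendly anchors deliver real-time face detection on mobile GPUs?
RQ2: In dense-anchor settings, does the new tie-resolution method improve stability over conventional NMS?
RQ3: How do BlazeFace's accuracy and latency compare with MobileNetV2-SSD on mobile GPUs?
RQ4: Do the additional facial keypoints enable rotation-aware cropping that improves downstream AR tasks?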

Key Findings

Model	Average Precision	Inference Time, ms
MobileNetV2-SSD	97.95%	2.1
Ours	98.61%	0.6
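
For a sense of scale, the two latencies in the table can be converted into throughput and a relative speedup. This is plain arithmetic on the reported numbers; it covers the network's inference time only, so the FPS values describe the model alone rather than a full camera pipeline.

```python
# Convert the reported per-frame latencies (ms) into frames per second
# and compute how much faster BlazeFace is than the baseline.
latency_ms = {"MobileNetV2-SSD": 2.1, "BlazeFace": 0.6}

fps = {name: 1000.0 / ms for name, ms in latency_ms.items()}
speedup = latency_ms["MobileNetV2-SSD"] / latency_ms["BlazeFace"]

for name, value in fps.items():
    print(f"{name}: {value:.0f} FPS")
print(f"Speedup: {speedup:.1f}x")
```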

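BlazeFace achieves 98.61% average precision (AP) on frontal faces, with an inference time of 0.6 ms on an iPhone XS using TensorFlow Lite GPU in FP16.
Under the same framework, MobileNetV2-SSD reaches 97.95% AP with an inference time of 2.1 ms.
Across devices, BlazeFace is substantially faster than MobileNetV2-SSD at inference (e.g., iPhone XS: 0.6 ms vs. 2.1 ms).
The proposed tie-resolution strategy reduces temporal jitter by up to 40% on the front-camera dataset and by up to 30% on the rear-camera dataset.
BlazeFace's regression parameter error is 10.4% of the inter-ocular distance (versus 7.4% for MobileNetV2-SSD), and its jitter metric is 5.3%.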
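
The tie-resolution idea behind the jitter figures above can be sketched as a blending variant of NMS. Instead of keeping only the highest-scoring box in a cluster of overlapping detections, the boxes are merged by a score-weighted mean, following the paper's description. This is a minimal NumPy sketch, not the authors' implementation: the function names `iou` and `blend_overlapping`, the `[x1, y1, x2, y2]` box layout, and the 0.3 overlap threshold are all assumptions, and boxes are assumed to have positive area and positive scores.

```python
import numpy as np


def iou(box: np.ndarray, boxes: np.ndarray) -> np.ndarray:
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def blend_overlapping(boxes, scores, iou_threshold=0.3):
    """Cluster boxes greedily as in NMS, then merge each cluster by score-weighted mean."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    remaining = np.argsort(-scores)  # indices, highest score first
    merged = []
    while remaining.size > 0:
        top = remaining[0]
        overlaps = iou(boxes[top], boxes[remaining]) >= iou_threshold
        cluster = remaining[overlaps]  # includes `top` itself (IoU = 1)
        weights = scores[cluster]
        merged.append((weights[:, None] * boxes[cluster]).sum(axis=0) / weights.sum())
        remaining = remaining[~overlaps]
    return np.array(merged)


detections = [[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]]
confidences = [0.9, 0.6, 0.8]
print(blend_overlapping(detections, confidences))
```

In this example the first two boxes overlap heavily and are merged into one box lying between them, closer to the higher-scoring one, while the third box is left unchanged. Because the merged box shifts gradually as the inputs shift, instead of switching between competing candidates, this kind of blending is a plausible intuition for the reduced frame-to-frame jitter.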


This review was generated by AI and reviewed by human editors.