QUICK REVIEW

[Paper Review] Crystal Loss and Quality Pooling for Unconstrained Face Verification and Recognition

Rajeev Ranjan, Ankan Bansal | arXiv (Cornell University) | Apr 3, 2018

Face recognition and analysis | References: 49 | Cited by: 46

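ONE-SENTENCE SUMMARY

This paper introduces Crystal Loss, which improves face verification and recognition performance by constraining features to lie on a hypersphere, and uses face quality scores through Quality Pooling and Quality Attenuation to build compact video/template representations and to adjust similarity scores.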

ABSTRACT

In recent years, the performance of face verification and recognition systems based on deep convolutional neural networks (DCNNs) has significantly improved. A typical pipeline for face verification includes training a deep network for subject classification with softmax loss, using the penultimate layer output as the feature descriptor, and generating a cosine similarity score given a pair of face images or videos. The softmax loss function does not optimize the features to have higher similarity score for positive pairs and lower similarity score for negative pairs, which leads to a performance gap. In this paper, we propose a new loss function, called Crystal Loss, that restricts the features to lie on a hypersphere of a fixed radius. The loss can be easily implemented using existing deep learning frameworks. We show that integrating this simple step in the training pipeline significantly improves the performance of face verification and recognition systems. We achieve state-of-the-art performance for face verification and recognition on challenging LFW, IJB-A, IJB-B and IJB-C datasets over a large range of false alarm rates ($10^{-1}$ to $10^{-7}$).

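MOTIVATION AND OBJECTIVES

Close the gap left by the softmax loss, which does not directly optimize the similarity of positive and negative pairs for the verification task.
Enforce a fixed L2 norm on the features to improve verification margins in the normalized (angular) space.
Build compact video/template representations with Quality Pooling, which uses face detection scores.
Introduce Quality Attenuation, which rescales similarity scores based on face quality to improve performance at low false alarm rates (FAR).
Demonstrate state-of-the-art performance on challenging unconstrained datasets.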

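PROPOSED METHOD

Introduce Crystal Loss by adding an L2-norm constraint that places the features on a hypersphere of radius alpha.
Implement the constraint with an L2-normalization layer followed by a scaling layer whose scale can be adjusted during training.
Interpret Crystal Loss as a special case of the von Mises-Fisher distribution and analyze the role of the scale parameter alpha.
Propose Quality Pooling, which weights frame features by face detection scores to form a compact video/template descriptor.
Introduce Quality Attenuation, which rescales the verification score when the faces in a pair are of low quality.
Evaluate on the LFW, IJB-A, IJB-B, and IJB-C datasets and compare against a softmax baseline and other methods.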
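
The normalize-then-scale construction described above can be sketched in a few lines of NumPy. This is a minimal forward-pass illustration, not the authors' implementation: the function name `crystal_loss`, the toy shapes, and the value `alpha=10.0` are assumptions chosen for the example, and the scale is held fixed here rather than learned.

```python
import numpy as np

def crystal_loss(features, weights, bias, labels, alpha=10.0, eps=1e-12):
    """Softmax cross-entropy on features rescaled to a fixed L2 norm `alpha`.

    Hypothetical helper for illustration; `alpha` is a fixed constant here.
    """
    # L2-normalization layer: project each feature onto the unit hypersphere.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, eps)
    # Scale layer: move the features onto the hypersphere of radius alpha.
    scaled = alpha * unit
    # Ordinary classification layer followed by softmax cross-entropy.
    logits = scaled @ weights + bias
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    loss = -log_probs[np.arange(len(labels)), labels].mean()
    return loss, scaled

# Toy data: 4 samples, 8-dimensional features, 3 identities (all made up).
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 3))
b = np.zeros(3)
y = np.array([0, 1, 2, 1])
loss, scaled = crystal_loss(feats, W, b, y, alpha=10.0)
```

Because the features are normalized before scaling, multiplying the raw features by any positive constant leaves the loss unchanged; only their direction matters, which is the property the constraint is meant to enforce.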

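EXPERIMENTAL RESULTS

Research Questions

RQ1: Does constraining features to a fixed L2 norm improve the discriminative power of face representations in unconstrained settings?
RQ2: How does the scale parameter alpha affect performance, and what are the practical bounds for stable training?
RQ3: Does weighting frame features by face detection quality improve video/template verification performance?
RQ4: Does rescaling similarity scores based on face quality reduce false matches at very low FAR?
RQ5: What results are observed on leading benchmarks such as LFW and the IJB series?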
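
One way to build intuition for RQ2 is a back-of-the-envelope calculation under an idealized geometry. The assumptions here are ours, stated for illustration: class weight vectors have unit norm, a feature of norm alpha is perfectly aligned with its own class (logit alpha), exactly opposite to one other class (logit -alpha), and orthogonal to the remaining C - 2 classes (logit 0). Under those assumptions the best achievable softmax probability is p = e^alpha / (e^alpha + (C - 2) + e^-alpha), and dropping the small e^-alpha term gives an approximate lower bound on alpha for a target p. The function names below are hypothetical.

```python
import math

def ideal_softmax_prob(alpha, num_classes):
    """Softmax probability of the true class under the idealized geometry."""
    # Logits: alpha (true class), -alpha (one opposite class), 0 (C - 2 others).
    return math.exp(alpha) / (
        math.exp(alpha) + (num_classes - 2) + math.exp(-alpha)
    )

def alpha_lower_bound(p, num_classes):
    """Approximate smallest alpha reaching probability p (needs C > 2, 0 < p < 1)."""
    # Obtained by dropping exp(-alpha) from the denominator and solving for alpha.
    return math.log(p * (num_classes - 2) / (1.0 - p))

# The required scale grows with the number of classes.
for c in (10, 1000, 100000):
    a = alpha_lower_bound(0.9, c)
    print(c, round(a, 2), round(ideal_softmax_prob(a, c), 4))
```

The sketch suggests why a very small alpha can make training hard when there are many identities: the features cannot produce confident predictions even when perfectly separated.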

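Key Findings

Crystal Loss significantly outperforms the standard softmax loss in verification and recognition on challenging datasets.
Fixing the L2 norm of the features reduces intra-class angular variability and increases the inter-class angular margin.
Quality Pooling, which weights frames by face detection scores, yields more discriminative video/template representations.
Quality Attenuation lowers the scores of low-quality verification pairs, which raises the true accept rate (TAR) at very low FAR.
The method achieves state-of-the-art results on the LFW, IJB-A, IJB-B, and IJB-C datasets and is complementary to other metric learning methods.
The framework can be integrated without additional networks or multiple losses and remains end-to-end trainable.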
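
The two quality-based steps can be illustrated with a short NumPy sketch. The exact weighting and rescaling rules are not given in this summary, so everything specific below is an assumption: weights are a softmax over `lam` times the detection score, a pair counts as low quality when both qualities fall below `threshold`, and low-quality positive scores are multiplied by `factor`. All names and default values are hypothetical.

```python
import numpy as np

def quality_pool(frame_features, detection_scores, lam=1.0):
    """Weighted average of frame features (assumed softmax weighting)."""
    feats = np.asarray(frame_features, dtype=float)
    z = lam * np.asarray(detection_scores, dtype=float)
    w = np.exp(z - z.max())  # subtract the max for numerical stability
    w /= w.sum()             # weights sum to 1
    return w @ feats, w

def cosine_similarity(a, b, eps=1e-12):
    return float(a @ b / max(np.linalg.norm(a) * np.linalg.norm(b), eps))

def attenuate(similarity, quality_a, quality_b, threshold=0.5, factor=0.5):
    """Shrink a positive score when both sides of the pair are low quality.

    Assumed rule for illustration only, not the paper's exact formula.
    """
    pair_quality = max(quality_a, quality_b)
    if pair_quality < threshold and similarity > 0:
        return similarity * factor
    return similarity

# Toy template of three frames; the second frame has a poor detection score.
frames = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
scores = [0.99, 0.20, 0.95]
template, weights = quality_pool(frames, scores, lam=5.0)

probe = np.array([1.0, 0.0])
raw = cosine_similarity(template, probe)
adjusted = attenuate(raw, quality_a=0.3, quality_b=0.2)
```

With `lam=0` the pooling reduces to a plain average of the frames, so `lam` controls how strongly low-quality frames are suppressed.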

This review was generated by AI and checked by human editors.