QUICK REVIEW

[Paper Review] Deeply learned face representations are sparse, selective, and robust

Yi Sun, Xiaogang Wang|arXiv (Cornell University)|Dec 3, 2014

Face recognition and analysis | References: 31 | Cited by: 45

One-Sentence Summary

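This paper proposes DeepID2+, a deep convolutional network that learns face representations under joint identification-verification supervision and achieves state-of-the-art performance on the LFW and YouTube Faces benchmarks. The model retains high accuracy even when its activations are binarized, and its deep features turn out to be sparse, selective to identities and attributes, and robust to occlusion, all without explicit regularization.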

ABSTRACT

This paper designs a high-performance deep convolutional network (DeepID2+) for face recognition. It is learned with the identification-verification supervisory signal. By increasing the dimension of hidden representations and adding supervision to early convolutional layers, DeepID2+ achieves new state-of-the-art on LFW and YouTube Faces benchmarks. Through empirical studies, we have discovered three properties of its deep neural activations critical for the high performance: sparsity, selectiveness and robustness. (1) It is observed that neural activations are moderately sparse. Moderate sparsity maximizes the discriminative power of the deep net as well as the distance between images. It is surprising that DeepID2+ still can achieve high recognition accuracy even after the neural responses are binarized. (2) Its neurons in higher layers are highly selective to identities and identity-related attributes. We can identify different subsets of neurons which are either constantly excited or inhibited when different identities or attributes are present. Although DeepID2+ is not taught to distinguish attributes during training, it has implicitly learned such high-level concepts. (3) It is much more robust to occlusions, although occlusion patterns are not included in the training set.

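Motivation and Objectives

Design a high-performance deep convolutional network for face recognition that outperforms existing models on standard benchmarks.
Investigate how intrinsic properties of deep neural activations, specifically sparsity, selectiveness, and robustness, contribute to high performance.
Understand whether these beneficial properties emerge naturally from large-scale training, without explicit regularization or architectural changes.
Assess how effective binarized deep features are for efficient large-scale face recognition.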

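Proposed Method

DeepID2+ increases the dimension of the hidden representations and adds supervision to the early convolutional layers to improve feature learning.
The model is trained with a joint identification-verification loss to strengthen discrimination between identities.
Neural activation patterns in each layer are analyzed to assess sparsity, selectiveness, and robustness to occlusion.
Binarized representations are extracted by thresholding the top-layer activations, enabling efficient face recognition with only a small drop in accuracy.
Robustness is evaluated under partial occlusion and random block occlusion, comparing DeepID2+ representations with hand-crafted LBP features.
Multiple DeepID2+ networks, each applied to a different face region, are combined to improve robustness to occlusion.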
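
The joint identification-verification objective can be illustrated with a minimal sketch. It assumes a softmax cross-entropy identification term and a contrastive verification term on pairs of feature vectors, which is a common formulation of this kind of supervision and is not spelled out in this review. The function names, the `margin` value, and the weight `lam` are illustrative assumptions, not values taken from the paper.

```python
import math


def softmax_cross_entropy(logits, target):
    """Identification term: -log softmax(logits)[target], computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def verification_loss(f_i, f_j, same_identity, margin=1.0):
    """Contrastive verification term on a pair of feature vectors.

    Same-identity pairs are pulled together; different-identity pairs are
    pushed apart until their distance exceeds `margin` (assumed value).
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(f_i, f_j)))
    if same_identity:
        return 0.5 * dist ** 2
    return 0.5 * max(0.0, margin - dist) ** 2


def joint_loss(logits_i, logits_j, label_i, label_j, f_i, f_j,
               margin=1.0, lam=0.05):
    """Identification loss on both faces plus a weighted verification loss.

    `lam` is a hypothetical weighting chosen for illustration only.
    """
    ident = (softmax_cross_entropy(logits_i, label_i)
             + softmax_cross_entropy(logits_j, label_j))
    verif = verification_loss(f_i, f_j, label_i == label_j, margin)
    return ident + lam * verif
```

In this sketch the identification term separates identities through classification, while the verification term acts directly on feature distances, so the two signals shape the same representation from different directions.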

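Experimental Results

Research Questions

RQ1: Do deep neural activations in face recognition models naturally exhibit sparsity, selectiveness, and robustness without explicit regularization?
RQ2: How much recognition accuracy do binarized deep feature representations retain compared with full-precision activations?
RQ3: How do deep features compare with hand-crafted features such as LBP in robustness to occlusion and image degradation?
RQ4: Can higher-layer neurons act as strong indicators of specific identities or attributes without being explicitly trained on those attributes?
RQ5: How does network depth affect the stability of feature representations under image degradation?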

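Key Findings

DeepID2+ reaches 98.70% verification accuracy on LFW with a single network and 99.47% when 25 networks are combined, setting a new state of the art.
Top hidden-layer activations are moderately sparse: about half of the neurons are activated for each image, and each neuron is activated on about half of the images, which maximizes discriminative power.
Binarizing the top-layer activations lowers LFW verification accuracy by 1% or less, showing that binary codes are effective for recognition.
Higher-layer neurons are highly selective: specific subsets of neurons are consistently excited or inhibited for particular identities or attributes, even though the network was never explicitly trained on those attributes.
With 40% of the face occluded, DeepID2+ maintains over 90% verification accuracy on LFW, whereas LBP features fall below 70%, indicating markedly better robustness.
Combining features from 25 DeepID2+ networks over face regions yields 93.9% accuracy under 40% occlusion and 88.2% when only the forehead and hair are visible, outperforming the single-network and LBP baselines.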
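
The sparsity and binarization findings can be made concrete with a small sketch. It assumes non-negative (ReLU-style) top-layer activations and a threshold of zero, so that a neuron counts as "activated" whenever its response is positive; the threshold, the function names, and the toy `batch` values are assumptions for illustration and are not data from the paper.

```python
def binarize(features, threshold=0.0):
    """Map each activation to 1 if it exceeds the threshold, else 0."""
    return [1 if x > threshold else 0 for x in features]


def fraction_active_per_image(batch, threshold=0.0):
    """For each image, the share of neurons that are activated."""
    return [sum(binarize(f, threshold)) / len(f) for f in batch]


def fraction_active_per_neuron(batch, threshold=0.0):
    """For each neuron, the share of images on which it is activated."""
    codes = [binarize(f, threshold) for f in batch]
    return [sum(column) / len(batch) for column in zip(*codes)]


def hamming_distance(code_a, code_b):
    """Number of positions where two binary codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


# Toy activations: 3 images x 4 neurons (made-up numbers).
batch = [
    [0.0, 1.2, 0.0, 0.7],
    [0.5, 0.0, 0.0, 0.9],
    [0.0, 2.0, 0.3, 0.0],
]

print(fraction_active_per_image(batch))   # [0.5, 0.5, 0.5]
print(hamming_distance(binarize(batch[0]), binarize(batch[1])))  # 2
```

Comparing faces by Hamming distance between binary codes is one plausible way such codes could support fast large-scale matching; the review itself only states that thresholded activations keep accuracy within about 1%.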


This review was generated by AI and reviewed by human editors.