QUICK REVIEW

[Paper Review] AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations

Xiao Zhang, Rui Zhao|arXiv (Cornell University)|May 1, 2019

Face recognition and analysis|References: 44|Cited by: 27

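One-Sentence Summary

This paper proposes AdaCos, a hyperparameter-free cosine-based softmax loss that adaptively scales the cosine logits during training so that the predicted classification probabilities stay aligned with the underlying angles, which improves deep face representation learning; with the scale adjusted dynamically and no manual tuning, training is stable and the method outperforms the compared softmax losses on the LFW, MegaFace, and IJB-C benchmarks.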

ABSTRACT

The cosine-based softmax losses and their variants achieve great success in deep learning based face recognition. However, hyperparameter settings in these losses have significant influences on the optimization path as well as the final recognition performance. Manually tuning those hyperparameters heavily relies on user experience and requires many training tricks. In this paper, we investigate in depth the effects of two important hyperparameters of cosine-based softmax losses, the scale parameter and angular margin parameter, by analyzing how they modulate the predicted classification probability. Based on these analysis, we propose a novel cosine-based softmax loss, AdaCos, which is hyperparameter-free and leverages an adaptive scale parameter to automatically strengthen the training supervisions during the training process. We apply the proposed AdaCos loss to large-scale face verification and identification datasets, including LFW, MegaFace, and IJB-C 1:1 Verification. Our results show that training deep neural networks with the AdaCos loss is stable and able to achieve high face recognition accuracy. Our method outperforms state-of-the-art softmax losses on all the three datasets.

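Motivation and Objectives

Address the instability and sensitivity that hyperparameter tuning introduces into existing cosine-based softmax losses for face recognition.
Analyze how the scale and margin parameters affect the classification probabilities predicted by cosine-based losses.
Develop a loss function whose scale adapts automatically, strengthening training supervision and improving generalization.
Maintain or improve recognition accuracy on large-scale face datasets without relying on manual tuning.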

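Proposed Method

Proposes AdaCos, a new cosine-based softmax loss whose scale parameter is adjusted dynamically according to the number of classes and the angular distribution of the features.
Introduces an adaptive scale parameter $\tilde{s}_d^{(t)}$ that keeps the predicted probability of the correct class consistent with the actual cosine similarity.
Derives the scale parameter analytically so that a balance between inter-class and intra-class angular margins is maintained throughout training.
Combines feature normalization with a modified cross-entropy loss, optimizing angular boundaries while preserving the geometric interpretation of cosine similarity.
Uses a closed-form expression for the scale parameter, which avoids iterative tuning and reduces computational overhead.
Integrates into standard deep learning frameworks using built-in operations, which keeps deployment simple.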
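
The scale computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes the fixed scale is sqrt(2) * log(C - 1) for C classes, and that the dynamic scale is log(B_avg) / cos(min(pi/4, theta_med)), where B_avg is the batch average of the summed non-target exponentiated logits and theta_med is the batch median of the angles to the target class, following the formulas given in the AdaCos paper. The function names (`fixed_scale`, `dynamic_scale`, `cosine_softmax_loss`) and the random toy data are hypothetical.

```python
import numpy as np


def fixed_scale(num_classes):
    """Fixed scale, assumed to be sqrt(2) * log(C - 1)."""
    return np.sqrt(2.0) * np.log(num_classes - 1)


def dynamic_scale(cos_theta, labels, prev_scale):
    """One update of the dynamic scale from a mini-batch.

    cos_theta:  (N, C) cosines between normalized features and class weights
    labels:     (N,) integer class ids
    prev_scale: scale used in the previous iteration
    """
    n = cos_theta.shape[0]
    rows = np.arange(n)
    target_mask = np.zeros(cos_theta.shape, dtype=bool)
    target_mask[rows, labels] = True
    # B_avg: sum of exponentiated non-target logits per sample, averaged over the batch
    non_target = np.where(target_mask, 0.0, np.exp(prev_scale * cos_theta))
    b_avg = non_target.sum(axis=1).mean()
    # Median angle between each feature and its own class weight
    theta_target = np.arccos(np.clip(cos_theta[rows, labels], -1.0, 1.0))
    theta_med = np.median(theta_target)
    return np.log(b_avg) / np.cos(min(np.pi / 4, theta_med))


def cosine_softmax_loss(cos_theta, labels, scale):
    """Cross-entropy over scaled cosine logits (no margin term)."""
    logits = scale * cos_theta
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


# Toy usage with random unit-norm features and class weights (illustrative only)
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
weights = rng.normal(size=(10, 16))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
weights /= np.linalg.norm(weights, axis=1, keepdims=True)
cos_theta = feats @ weights.T
labels = rng.integers(0, 10, size=8)

s0 = fixed_scale(10)                          # starting value for the dynamic scale
s1 = dynamic_scale(cos_theta, labels, s0)     # scale after one mini-batch
print(s0, s1, cosine_softmax_loss(cos_theta, labels, s1))
```

In a real training loop the scale update would be computed without gradient tracking and the cosines would come from the network's last layer; both are left out here to keep the sketch framework-independent.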

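Experimental Results

Research Questions

RQ1: How do the scale and margin hyperparameters of cosine-based losses affect the predicted classification probability?
RQ2: Why do existing cosine-based losses require extensive hyperparameter tuning and tend toward unstable training?
RQ3: Can the scale parameter be adapted automatically during training to strengthen supervision and improve recognition performance?
RQ4: Does the adaptive scaling mechanism lead to better generalization and convergence?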
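
To make RQ1 concrete, the following sketch evaluates the target-class probability of a margin-free cosine softmax at a few angles and scales. It relies on a simplifying assumption that the paper also uses in its analysis: every non-target cosine is close to 0, so the non-target logits sum to roughly C - 1. The helper name `target_probability`, the class count, and the chosen scales are hypothetical.

```python
import math


def target_probability(theta, scale, num_classes):
    """Probability of the target class at angle theta (radians),
    assuming all non-target cosines are ~0 so their exponentials sum to C - 1."""
    e = math.exp(scale * math.cos(theta))
    return e / (e + (num_classes - 1))


C = 10000  # hypothetical number of identities
s_fixed = math.sqrt(2.0) * math.log(C - 1)  # assumed fixed AdaCos scale

for s in (4.0, s_fixed, 64.0):
    probs = [target_probability(math.radians(d), s, C) for d in (15, 45, 75)]
    print(f"s={s:6.2f}  P(15deg)={probs[0]:.4f}  P(45deg)={probs[1]:.4f}  P(75deg)={probs[2]:.4f}")
```

Under this assumption, a very small scale keeps the probability low even when the feature is nearly aligned with its class weight, a very large scale saturates the probability even at large angles, and the fixed scale places the 0.5 crossing at 45 degrees.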

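Key Findings

On the LFW benchmark, AdaCos achieves state-of-the-art performance, outperforming existing losses such as ArcFace and CosFace.
On the MegaFace 1M identification benchmark, AdaCos reaches higher accuracy than all compared losses, including ArcFace and CosFace, with the same training data and network architecture.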
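Under the IJB-C 1:1 verification protocol, dynamic AdaCos achieves a true accept rate (TAR) of 99.06% at a false accept rate (FAR) of 10^-7, ahead of ArcFace and other state-of-the-art methods.
The dynamic AdaCos variant achieves a TAR of 83.28% at a FAR of 10^-6 on IJB-C, ahead of fixed AdaCos and the other losses.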
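Training with AdaCos converges faster and more stably, with no manual tuning.
The proposed adaptive scale parameter aligns the predicted probabilities with the geometric meaning of cosine similarity, narrowing the gap between training and inference.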

This review was generated by AI and checked by a human editor.