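[Paper Explained] Face.evoLVe: A High-Performance Face Recognition Library
TL;DR: face.evoLVe is a comprehensive, modular, high-performance face recognition library for PyTorch and PaddlePaddle. It provides a range of backbones, loss functions, data-processing tricks, and pretrained models to enable fair, reproducible comparisons.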
In this paper, we develop face.evoLVe -- a comprehensive library that collects and implements a wide range of popular deep learning-based methods for face recognition. First of all, face.evoLVe is composed of key components that cover the full process of face analytics, including face alignment, data processing, various backbones, losses, and alternatives with bags of tricks for improving performance. Later, face.evoLVe supports multi-GPU training on top of different deep learning platforms, such as PyTorch and PaddlePaddle, which facilitates researchers to work on both large-scale datasets with millions of images and low-shot counterparts with limited well-annotated data. More importantly, along with face.evoLVe, images before & after alignment in the common benchmark datasets are released with source codes and trained models provided. All these efforts lower the technical burdens in reproducing the existing methods for comparison, while users of our library could focus on developing advanced approaches more efficiently. Last but not least, face.evoLVe is well designed and vibrantly evolving, so that new face recognition approaches can be easily plugged into our framework. Note that we have used face.evoLVe to participate in a number of face recognition competitions and secured the first place. The version that supports PyTorch is publicly available at https://github.com/ZhaoJ9014/face.evoLVe.PyTorch and the PaddlePaddle version is available at https://github.com/ZhaoJ9014/face.evoLVe.PyTorch/tree/master/paddle. Face.evoLVe has been widely used for face analytics, receiving 2.4K stars and 622 forks.
Research Motivation and Goals
- Provide a unified, reproducible framework for state-of-the-art face recognition methods.
- Cover the full face recognition pipeline: detection, alignment, feature extraction, and loss heads.
- Support multiple platforms (PyTorch and PaddlePaddle) with distributed multi-GPU training.
- Offer a modular design enabling easy plugging of custom backbones and heads.
- Release preprocessed data, trained models, and training/evaluation environments to facilitate fair benchmarking.
Proposed Method
- Modular pipeline: face detection, landmark-based alignment, backbone feature extraction, and loss head computation.
- Support for multiple backbones (e.g., ResNet variants, IR, LightCNN, MobileNet, GhostNet, EfficientNet, etc.).
- A wide range of loss functions (e.g., Softmax, ArcFace, CosFace, SphereFace, Triplet, ArcNegFace, CircleLoss, MagFace, etc.) with optional focal variants.
- Bag of tricks for training stability and performance (warmup, cosine annealing, label smoothing, distillation/knowledge transfer, curricular approaches).
- Data processing steps include handling low-shot classes, augmentation, balanced sampling, and dataset-wide preprocessing for reproducibility.
- Cross-platform training support with distributed training to scale on large datasets (e.g., MS-Celeb-1M, WebFace260M).
- Provision of pre-processed data, trained models, and evaluation pipelines for fair benchmarking.
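The results table in this explainer pairs an ArcFace head with a focal loss, so a minimal sketch of that combination may help make the head and loss bullets concrete. It is written in plain Python for readability and is not the library's code: the function names `arcface_logits` and `focal_loss` are hypothetical, the scale `s=64` and margin `m=0.5` are the commonly cited ArcFace defaults, and `gamma=2` is the usual focal-loss default. All three values are assumptions here rather than settings taken from face.evoLVe.

```python
import math


def arcface_logits(cosines, target, s=64.0, m=0.5):
    """Return scaled logits with an additive angular margin on the target class.

    cosines: cosine similarities between one embedding and each class centre.
    target:  index of the ground-truth class.
    s, m:    scale and margin (assumed defaults, not read from the library).
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # keep acos in its valid domain
        if j == target:
            # Simplified form: production code usually adds a fallback for
            # the case theta + m > pi, which this sketch omits.
            c = math.cos(math.acos(c) + m)
        logits.append(s * c)
    return logits


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for a single sample: -(1 - p_t)^gamma * log(p_t)."""
    mx = max(logits)
    log_total = math.log(sum(math.exp(z - mx) for z in logits))
    log_p_t = logits[target] - mx - log_total  # numerically stable log-softmax
    p_t = math.exp(log_p_t)
    return -((1.0 - p_t) ** gamma) * log_p_t


# Usage: one sample, three classes, ground truth is class 0.
cosines = [0.8, 0.2, -0.1]
loss = focal_loss(arcface_logits(cosines, target=0), target=0)
```

With `gamma=0` the focal term disappears and the function reduces to ordinary cross-entropy, which is one way to sanity-check a head/loss pairing before training.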
Experimental Results
Research Questions
- RQ1: How can a unified, modular library accelerate reproducible research in face recognition across different backbones and losses?
- RQ2: What is the impact of diverse backbones and angular-margin loss functions on standardized face recognition benchmarks under fair settings?
- RQ3: Can distributed training on PyTorch and PaddlePaddle effectively scale to large-scale datasets while maintaining fair comparisons?
- RQ4: Do training tricks (warmup, cosine annealing, label smoothing, distillation) meaningfully improve stability and accuracy across models?
- RQ5: How do released pre-processed datasets and trained models facilitate reproducibility and rapid experimentation?
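Two of the training tricks named in RQ4, learning-rate warmup with cosine annealing and label smoothing, are easy to state precisely. The sketch below shows one common formulation of each; the function names and every default value (`base_lr`, `warmup_steps`, `eps`) are illustrative assumptions, not settings reported for face.evoLVe.

```python
import math


def lr_at_step(step, total_steps, base_lr=0.1, warmup_steps=500, min_lr=0.0):
    """Linear warmup to base_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Ramp linearly so that the last warmup step reaches base_lr.
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def smoothed_targets(num_classes, target, eps=0.1):
    """Label smoothing: move eps of the probability mass to a uniform prior."""
    uniform = eps / num_classes
    return [
        (1.0 - eps) + uniform if j == target else uniform
        for j in range(num_classes)
    ]


# Usage: inspect the schedule at a few steps of a 10,000-step run.
schedule = [lr_at_step(s, 10_000) for s in (0, 499, 500, 5_000, 10_000)]
targets = smoothed_targets(num_classes=4, target=2)
```

Warmup avoids large early updates while the class centres are still random, and the cosine phase lowers the rate smoothly instead of in abrupt steps.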
Key Findings
| Backbone | Head | Loss | Testing Dataset | LFW | CFP_FF | CFP_FP | AgeDB | CALFW | CPLFW | Vggface2_FP |
|---|---|---|---|---|---|---|---|---|---|---|
| IR152 | ArcFace | Focal | LFW | 99.82 | 99.84 | 98.37 | 98.07 | 96.03 | 93.05 | 95.50 |
| IR101 | ArcFace | Focal | LFW | 99.81 | 99.74 | 98.25 | 97.77 | 95.93 | 92.74 | 95.44 |
| IR50 | AdaCos | Focal | LFW | 99.75 | 99.53 | 98.39 | 97.25 | 95.55 | 92.25 | 95.27 |
| IR152 | AdaCos | Focal | LFW | 99.82 | 99.84 | 98.37 | 98.07 | 96.03 | 93.05 | 95.50 |
| HRNet | MV-Softmax | Focal | LFW | 99.82 | 99.51 | 98.41 | 97.88 | 95.43 | 88.95 | 94.70 |
| TF-NAS-A | AM-Softmax | Focal | LFW | 99.82 | 99.47 | 98.33 | 96.65 | 94.32 | 84.88 | 91.38 |
| GhostNet | ArcFace | Focal | LFW | 99.69 | 99.52 | 98.48 | 97.29 | 94.92 | 85.25 | 90.88 |
| AttentionNet | AdaCos | Focal | LFW | 99.82 | 99.47 | 98.52 | 96.89 | 95.12 | 87.23 | 94.23 |
| MobileFaceNet | AdaCos | Focal | LFW | 99.73 | 99.84 | 97.75 | 95.87 | 94.87 | 89.29 | 93.20 |
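Benchmarks such as LFW are commonly scored as face verification: the model embeds both images of a labelled pair, and the pair is judged "same identity" when the similarity clears a threshold. Assuming the percentages above are accuracies of this kind, the sketch below shows how such a number can be computed. The function names are hypothetical and the fixed threshold is a simplification; the standard LFW protocol selects the threshold by 10-fold cross-validation, which is omitted here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verification_accuracy(pairs, threshold):
    """Percentage of pairs whose same/different prediction matches the label.

    pairs: iterable of (embedding_a, embedding_b, is_same_identity).
    """
    pairs = list(pairs)
    correct = 0
    for emb_a, emb_b, is_same in pairs:
        predicted_same = cosine_similarity(emb_a, emb_b) >= threshold
        correct += predicted_same == is_same
    return 100.0 * correct / len(pairs)


def best_threshold(pairs, candidates):
    """Pick the candidate threshold with the highest accuracy on these pairs."""
    pairs = list(pairs)
    return max(candidates, key=lambda t: verification_accuracy(pairs, t))


# Usage with toy 2-D embeddings (real embeddings are typically 512-D).
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0], True),
    ([1.0, 0.0], [0.0, 1.0], False),
    ([1.0, 1.0], [1.0, 0.0], True),
    ([1.0, 0.0], [-1.0, 0.0], False),
]
accuracy = verification_accuracy(toy_pairs, threshold=0.5)
```

Selecting the threshold on the same pairs used for scoring would inflate the result, which is why the cross-validated protocol tunes it on held-out folds.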
- The library supports numerous backbones (e.g., IR50/101/152, ResNet variants, LightCNN, MobileNet, GhostNet, EfficientNet) and losses (ArcFace, CosFace, SphereFace, AM-Softmax, AdaCos, CircleLoss, MagFace, etc.).
- Tables show competitive accuracy across standard benchmarks (LFW, CALFW, CPLFW, AgeDB, CFP variants, VGGFace2) using various backbone/head/loss combinations.
- IR152 with AdaCos or ArcFace losses achieves high results on LFW, CALFW, and CPLFW when trained on large-scale data (MS-Celeb-1M or WebFace260M).
- Face.evoLVe demonstrates competitive or superior performance compared to other toolboxes (e.g., FaceX-Zoo) on key datasets.
- Distributed training is supported, enabling scaling to datasets with millions of images.
- The authors report using the library to take first place in a number of face recognition competitions, and it continues to evolve with active contributors.
This explainer was generated by AI and reviewed by a human editor.