[Paper Review] In Defense of the Triplet Loss for Person Re-Identification
The paper argues for end-to-end metric learning with a variant of the triplet loss (batch hard with soft margin) and shows it achieves state-of-the-art results on Market-1501, MARS, and CUHK03, including training from scratch.
In the past few years, the field of computer vision has gone through a revolution fueled mainly by the advent of large datasets and the adoption of deep convolutional neural networks for end-to-end learning. The person re-identification subfield is no exception to this. Unfortunately, a prevailing belief in the community seems to be that the triplet loss is inferior to using surrogate losses (classification, verification) followed by a separate metric learning step. We show that, for models trained from scratch as well as pretrained ones, using a variant of the triplet loss to perform end-to-end deep metric learning outperforms most other published methods by a large margin.
Motivation & Objective
- Re-evaluate the triplet loss for person re-identification (ReID) and show that it is competitive with surrogate losses.
- Propose batch-hard triplet loss variants that remove the need for expensive offline hard negative mining.
- Demonstrate end-to-end training efficacy on both pretrained and from-scratch networks.
- Show that a well-designed triplet loss can surpass many published methods on major ReID datasets.
Proposed method
- Review and contextualize metric embedding losses (including the LMNN loss and the triplet loss).
- Introduce the Batch Hard (L_BH) and Batch All (L_BA) formulations; emphasize hard mining within the batch.
- Propose a soft-margin version of the batch-hard loss for stability.
- Compare multiple triplet formulations (vanilla, Lifted, soft-margin variants) on a MARS-based validation set.
- Use Euclidean distance in embedding space and avoid embedding normalization.
- Evaluate on Market-1501, MARS, and CUHK03 with pretrained (TriNet) and from-scratch (LuNet) networks.
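The batch-hard idea listed above can be sketched in a few lines of NumPy: for each anchor in the batch, take the farthest same-identity sample and the closest different-identity sample, then penalize their distance gap. This is a minimal illustration, not the authors' implementation. The function names, the `margin="soft"` switch, and averaging over anchors are assumptions made here for readability. The soft margin is written as the softplus function ln(1 + exp(x)), and distances are plain (non-squared) Euclidean on unnormalized embeddings, as the list above states.

```python
import numpy as np

def pairwise_euclidean(emb):
    """All-pairs (non-squared) Euclidean distances between rows of emb."""
    sq = np.sum(emb ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (emb @ emb.T)
    # Clamp tiny negative values caused by floating-point error before sqrt.
    return np.sqrt(np.maximum(d2, 0.0))

def batch_hard_triplet_loss(emb, labels, margin="soft"):
    """Batch-hard triplet loss; margin is a float, or "soft" for softplus."""
    labels = np.asarray(labels)
    dist = pairwise_euclidean(np.asarray(emb, dtype=float))
    same = labels[:, None] == labels[None, :]
    # Hardest positive: the farthest sample sharing the anchor's identity.
    hardest_pos = np.max(np.where(same, dist, -np.inf), axis=1)
    # Hardest negative: the closest sample with a different identity.
    hardest_neg = np.min(np.where(same, np.inf, dist), axis=1)
    diff = hardest_pos - hardest_neg
    if isinstance(margin, str) and margin == "soft":
        # Softplus ln(1 + exp(diff)), computed stably.
        per_anchor = np.logaddexp(0.0, diff)
    else:
        # Hinge with a fixed margin.
        per_anchor = np.maximum(margin + diff, 0.0)
    # Assumption: average over anchors (a sum would only rescale the loss).
    return float(per_anchor.mean())

# Toy batch: two identities, two samples each, well separated.
emb = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]])
labels = [0, 0, 1, 1]
loss_soft = batch_hard_triplet_loss(emb, labels, margin="soft")
loss_hinge = batch_hard_triplet_loss(emb, labels, margin=0.5)
```

In this toy batch every anchor's hardest positive is at distance 1 and its hardest negative at distance 3, so the hinge loss with margin 0.5 is zero, while the soft-margin loss stays small but positive and keeps pulling same-identity samples together.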
Experimental results
Research questions
- RQ1: Can end-to-end triplet-loss-based metric learning outperform surrogate losses with an extra metric learning step in person ReID?
- RQ2: Does batch-hard mining inside small PK batches eliminate the need for expensive offline hard-negative mining?
- RQ3: How do different triplet-loss formulations (batch hard/soft margin, batch all, Lifted) compare for ReID performance?
- RQ4: What is the effect of pretrained versus from-scratch training on ReID performance with triplet losses?
- RQ5: Is a margin-less or soft-margin formulation preferable for stable and strong ReID embeddings?
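The "PK batches" in RQ2 are batches built from P identities with K images each, so every anchor is guaranteed positives and negatives inside its own batch. A minimal sampling sketch follows. The function name `sample_pk_batch`, the `rng` parameter, and the fallback of sampling with replacement when an identity has fewer than K images are all assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import defaultdict

def sample_pk_batch(labels, P, K, rng=None):
    """Return dataset indices for a batch of P identities with K images each."""
    rng = rng or random.Random()
    # Group dataset indices by identity label.
    by_id = defaultdict(list)
    for idx, pid in enumerate(labels):
        by_id[pid].append(idx)
    if len(by_id) < P:
        raise ValueError("need at least P distinct identities")
    batch = []
    for pid in rng.sample(sorted(by_id), P):
        pool = by_id[pid]
        if len(pool) >= K:
            batch.extend(rng.sample(pool, K))
        else:
            # Assumption: sample with replacement when fewer than K images exist.
            batch.extend(rng.choices(pool, k=K))
    return batch

# Illustrative values only: four identities, one of them with a single image.
labels = [0] * 5 + [1] * 5 + [2] * 1 + [3] * 5
batch = sample_pk_batch(labels, P=3, K=4, rng=random.Random(0))
```

The resulting index list has P * K entries and can be used to gather images and labels before computing a batch-hard or batch-all loss on their embeddings.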
Key findings
- A batch-hard triplet loss with a soft margin achieves state-of-the-art results on Market-1501 and MARS, and competitive performance on CUHK03, when combined with test-time augmentation.
- Batch-hard consistently outperforms batch-all and vanilla triplet formulations in their experiments, while removing the overhead of offline hard-mining.
- The soft-margin variant further improves results and reduces training instability.
- Pretrained networks (TriNet) yield the strongest results, but a well-designed network trained from scratch (LuNet) is competitive, showing end-to-end triplet learning can work without large pretrained backbones.
- Training with their batch-hard triplet loss and end-to-end embedding learning yields significant gains over a classification-loss baseline (IDE) with metric learning, underscoring the effectiveness of the triplet approach.
- Performance gains persist even when evaluated with test-time augmentation and additional distractor images.
This review was created by AI and reviewed by human editors.