QUICK REVIEW

[Paper Review] In Defense of the Triplet Loss for Person Re-Identification

Alexander Hermans, Lucas Beyer|arXiv (Cornell University)|Mar 22, 2017
Video Surveillance and Tracking Methods|13 references|2,885 citations
TL;DR

The paper argues for end-to-end metric learning with a variant of the triplet loss (batch hard with soft margin) and shows that it achieves state-of-the-art results on Market-1501, MARS, and CUHK03, including when training from scratch.

ABSTRACT

In the past few years, the field of computer vision has gone through a revolution fueled mainly by the advent of large datasets and the adoption of deep convolutional neural networks for end-to-end learning. The person re-identification subfield is no exception to this. Unfortunately, a prevailing belief in the community seems to be that the triplet loss is inferior to using surrogate losses (classification, verification) followed by a separate metric learning step. We show that, for models trained from scratch as well as pretrained ones, using a variant of the triplet loss to perform end-to-end deep metric learning outperforms most other published methods by a large margin.

Motivation & Objective

  • Motivate re-evaluating triplet loss for person re-identification (ReID) as competitive with surrogate losses.
  • Propose batch-hard triplet loss variants that remove the need for expensive offline hard negative mining.
  • Demonstrate end-to-end training efficacy on both pretrained and from-scratch networks.
  • Show that a well-designed triplet loss can surpass many published methods on major ReID datasets.

Proposed method

  • Review and contextualize metric embedding losses (including the large margin nearest neighbor loss, L_LMNN, and the triplet loss).
  • Introduce Batch Hard (L_BH) and Batch All (L_BA) formulations; emphasize hard mining within the batch.
  • Propose a soft-margin version of the batch-hard loss for stability.
  • Compare multiple triplet formulations (vanilla, Lifted, soft-margin variants) on a MARS-based validation set.
  • Use Euclidean distance in embedding space and avoid embedding normalization.
  • Evaluate on Market-1501, MARS, and CUHK03 with pretrained (TriNet) and from-scratch (LuNet) networks.
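
The batch-hard idea above can be sketched in a few lines: for each anchor in the batch, take its farthest same-identity sample and its nearest different-identity sample, then penalize the gap between the two distances. The sketch below is illustrative only and is not the authors' implementation. It assumes plain Python lists instead of a deep learning framework, averages over anchors, and uses hypothetical names (`euclidean`, `softplus`, `batch_hard_loss`). The soft-margin form replaces the hinge max(0, m + x) with ln(1 + exp(x)).

```python
import math


def euclidean(a, b):
    """Non-squared Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def softplus(x):
    """Numerically stable ln(1 + exp(x)), used as the soft margin."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def batch_hard_loss(embeddings, labels, margin=None):
    """Mean batch-hard triplet loss over all anchors in the batch.

    margin=None selects the soft-margin form; a float selects the hinge form.
    Assumption of this sketch: every identity has at least two samples and
    the batch contains at least two identities.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        # Distances to all other samples of the same identity.
        pos = [euclidean(anchor, e)
               for j, (e, l) in enumerate(zip(embeddings, labels))
               if l == label and j != i]
        # Distances to all samples of other identities.
        neg = [euclidean(anchor, e)
               for e, l in zip(embeddings, labels) if l != label]
        # Hardest positive is the farthest one, hardest negative the nearest.
        gap = max(pos) - min(neg)
        losses.append(softplus(gap) if margin is None
                      else max(0.0, margin + gap))
    return sum(losses) / len(losses)
```

Calling `batch_hard_loss(embeddings, labels)` gives the soft-margin variant, while `batch_hard_loss(embeddings, labels, margin=0.5)` gives the hinge variant. Embeddings are used unnormalized, in line with the bullet on Euclidean distance above.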

Experimental results

Research questions

  • RQ1: Can end-to-end triplet-loss-based metric learning outperform surrogate losses with an extra metric learning step in person ReID?
  • RQ2: Does batch-hard mining inside small PK batches (P identities with K images each) eliminate the need for expensive offline hard-negative mining?
  • RQ3: How do different triplet-loss formulations (batch hard/soft margin, batch all, Lifted) compare for ReID performance?
  • RQ4: What is the effect of pretrained versus from-scratch training on ReID performance with triplet losses?
  • RQ5: Is a margin-less or soft-margin formulation preferable for stable and strong ReID embeddings?
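
The PK batches mentioned in RQ2 can be illustrated with a small sampler: draw P identities, then K images for each of them, so that every anchor in the batch has in-batch positives and negatives to mine from. This is a minimal sketch under stated assumptions, not the paper's data pipeline. The function name `sample_pk_batch`, the `images_by_id` dictionary layout, and the choice to sample with replacement when an identity has fewer than K images are all hypothetical.

```python
import random


def sample_pk_batch(images_by_id, P, K, rng=random):
    """Draw P identities, then K images per identity.

    images_by_id maps an identity to a list of its images (any objects).
    Returns a list of (image, identity) pairs of length P * K.
    Assumption of this sketch: identities with fewer than K images are
    sampled with replacement.
    """
    ids = rng.sample(sorted(images_by_id), P)
    batch = []
    for pid in ids:
        imgs = images_by_id[pid]
        if len(imgs) >= K:
            chosen = rng.sample(imgs, K)
        else:
            chosen = [rng.choice(imgs) for _ in range(K)]
        batch.extend((img, pid) for img in chosen)
    return batch
```

A batch built this way contains P * K images, so a batch-hard loss contributes one term per anchor, P * K in total, with hard positives and negatives selected inside the batch rather than by a separate offline mining pass.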

Key findings

  • A batch-hard triplet loss with a soft margin achieves state-of-the-art results on Market-1501 and MARS, and competitive performance on CUHK03, when combined with test-time augmentation.
  • Batch hard consistently outperforms the batch-all and vanilla triplet formulations in the authors' experiments, while removing the overhead of offline hard mining.
  • The soft-margin variant further improves results and reduces training instability.
  • Pretrained networks (TriNet) yield the strongest results, but a well-designed network trained from scratch (LuNet) is competitive, showing end-to-end triplet learning can work without large pretrained backbones.
  • Training with the batch-hard triplet loss and end-to-end embedding learning yields significant gains over a classification-loss baseline (IDE) followed by metric learning, underscoring the effectiveness of the triplet approach.
  • Performance gains persist even when evaluated with test-time augmentation and additional distractor images.


This review was created by AI and reviewed by human editors.