QUICK REVIEW

[Paper Review] Learning Deep Features via Congenerous Cosine Loss for Person Recognition

Yu Liu, Hongyang Li | arXiv (Cornell University) | Feb 22, 2017
Video Surveillance and Tracking Methods | 17 references | 56 citations
TL;DR

This paper introduces Congenerous Cosine (COCO) loss to train region-based deep features by maximizing intra-class similarity and inter-class variance using centroid-based softmax, enabling one-time training without test-time fine-tuning.

ABSTRACT

Person recognition aims at recognizing the same identity across time and space with complicated scenes and similar appearance. In this paper, we propose a novel method to address this task by training a network to obtain robust and representative features. The intuition is that we directly compare and optimize the cosine distance between two features - enlarging inter-class distinction as well as alleviating inner-class variance. We propose a congenerous cosine loss by minimizing the cosine distance between samples and their cluster centroid in a cooperative way. Such a design reduces the complexity and could be implemented via softmax with normalized inputs. Our method also differs from previous work in person recognition that we do not conduct a second training on the test subset. The identity of a person is determined by measuring the similarity from several body regions in the reference set. Experimental results show that the proposed approach achieves better classification accuracy against previous state-of-the-arts.

Motivation & Objective

  • Address robust person recognition across time and space in unconstrained scenes.
  • Develop a loss that directly optimizes cosine similarities to reduce intra-class variance and increase inter-class separation.
  • Enable a multi-region, alignment-based recognition pipeline that avoids second training on test data.
  • Show that single-stage training with COCO loss yields competitive or superior results on PIPA.
  • Provide analysis of region contributions and alignment to reduce overfitting.

Proposed method

  • Define cosine similarity between samples and their class centroids within mini-batches.
  • Introduce COCO loss which uses softmax over normalized features and centroids to optimize intra-class compactness and inter-class separability.
  • Align four region patches (face, head, upper body, whole body) to a base location via affine transformation to reduce variance.
  • Train four region-specific COCO models on the PIPA training set and combine region scores during inference.
  • Merge region similarities with a logistic normalization and weighted averaging to predict identities in test_1 without retraining on test_0.
  • Detail backpropagation for COCO loss with normalized features and centroids to allow end-to-end training.
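
The loss described in the list above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `scale` factor applied to the cosine logits, and the choice to average the already-normalized features when forming each centroid are all assumptions made for this example. It covers only the forward computation (cosine similarity to per-class mini-batch centroids, followed by a softmax cross-entropy), not the backpropagation.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit L2 norm (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def coco_loss(features, labels, scale=1.0):
    """Centroid-based cosine softmax loss over one mini-batch (sketch).

    features: list of feature vectors, one per sample.
    labels:   list of class ids, aligned with `features`.
    scale:    hypothetical multiplier on the cosine logits (an assumption).
    """
    feats = [l2_normalize(f) for f in features]
    classes = sorted(set(labels))

    # One centroid per class: mean of that class's features in the batch,
    # then normalized so the dot product below is a cosine similarity.
    centroids = {}
    for k in classes:
        members = [f for f, y in zip(feats, labels) if y == k]
        mean = [sum(column) / len(members) for column in zip(*members)]
        centroids[k] = l2_normalize(mean)

    total = 0.0
    for f, y in zip(feats, labels):
        logits = {
            k: scale * sum(a * b for a, b in zip(f, centroids[k]))
            for k in classes
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits.values())
        log_z = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        total += log_z - logits[y]  # negative log-probability of the true class
    return total / len(feats)


# Usage: two tight clusters give a much smaller loss than mixed-up labels.
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
compact = coco_loss(batch, [0, 0, 1, 1], scale=10.0)
mixed = coco_loss(batch, [0, 1, 0, 1], scale=10.0)
print(compact < mixed)
```

Because both features and centroids have unit norm, each logit is bounded by `scale`, so the loss rewards only the angle between a sample and its own centroid relative to the other centroids.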

Experimental results

Research questions

  • RQ1: Can a centroid-based cosine loss improve intra-class compactness and inter-class separation for person recognition in unconstrained images?
  • RQ2: Does aligning regional patches to a base location reduce inner-class variance and improve recognition accuracy?
  • RQ3: How does aggregating multiple body-region cues affect recognition performance on PIPA under a single-stage training regime?
  • RQ4: Is it feasible to perform person recognition by directly comparing features across test splits without a second training on test_0?
  • RQ5: What is the impact of COCO loss compared to standard softmax in feature visualization and discrimination?

Key findings

  • COCO loss magnifies inter-class distance while reducing inner-class variance in feature space.
  • Alignment of region patches substantially improves performance across regions.
  • Face and head regions provide strongest signals among regions, with whole-body contributing variably depending on alignment.
  • Combining all four regions yields higher accuracy on the original split than any single-region cue, and also improves the other splits.
  • The method outperforms prior state-of-the-art on the PIPA dataset across original, album, time, and day splits.
  • Region-based scores can be merged via normalization and weighted averaging to produce final identities without test-time retraining.
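
The score-merging step mentioned in the last finding can be illustrated with a small sketch. Everything specific here is an assumption made for the example: the `slope` and `offset` of the logistic function, the region weights, the dictionary layout, and the function names are hypothetical, and the review does not state the values or the exact formula the paper uses.

```python
import math


def logistic(score, slope=1.0, offset=0.0):
    """Squash a raw similarity into (0, 1); slope and offset are assumed."""
    return 1.0 / (1.0 + math.exp(-(slope * score + offset)))


def merge_region_scores(region_scores, region_weights):
    """Weighted average of logistic-normalized similarities per identity.

    region_scores:  {region: {identity: raw similarity to the reference set}}
    region_weights: {region: non-negative weight}, normalized to sum to 1 here.
    An identity missing from a region simply gets no contribution from it.
    """
    total_weight = sum(region_weights[r] for r in region_scores)
    merged = {}
    for region, scores in region_scores.items():
        weight = region_weights[region] / total_weight
        for identity, score in scores.items():
            merged[identity] = merged.get(identity, 0.0) + weight * logistic(score)
    return merged


def predict_identity(region_scores, region_weights):
    """Return the identity with the highest merged score."""
    merged = merge_region_scores(region_scores, region_weights)
    return max(merged, key=merged.get)


# Usage with made-up similarities for two of the four regions.
scores = {
    "face": {"alice": 0.9, "bob": 0.2},
    "head": {"alice": 0.4, "bob": 0.5},
}
weights = {"face": 0.6, "head": 0.4}
print(predict_identity(scores, weights))
```

Normalizing each region's similarity before averaging keeps one region with a wider raw score range from dominating the others; the weights then express how much each region is trusted.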

This review was created by AI and reviewed by human editors.