QUICK REVIEW

[Paper Review] Learning Shape Representations for Clothing Variations in Person Re-Identification

Yu-Jhe Li, Zhengyi Luo|arXiv (Cornell University)|Mar 16, 2020

Video Surveillance and Tracking Methods|References: 49|Cited by: 33

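One-Sentence Summary

CASE-Net learns a body-shape representation that is invariant to clothing color and, in clothing-change scenarios, outperforms existing methods on the synthetic datasets SMPL-reID and Div-Market.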

ABSTRACT

Person re-identification (re-ID) aims to recognize instances of the same person contained in multiple images taken across different cameras. Existing methods for re-ID tend to rely heavily on the assumption that both query and gallery images of the same person have the same clothing. Unfortunately, this assumption may not hold for datasets captured over long periods of time (e.g., weeks, months or years). To tackle the re-ID problem in the context of clothing changes, we propose a novel representation learning model which is able to generate a body shape feature representation without being affected by clothing color or patterns. We call our model the Color Agnostic Shape Extraction Network (CASE-Net). CASE-Net learns a representation of identity that depends only on body shape via adversarial learning and feature disentanglement. Due to the lack of large-scale re-ID datasets which contain clothing changes for the same person, we propose two synthetic datasets for evaluation. We create a rendered dataset SMPL-reID with different clothes patterns and a synthesized dataset Div-Market with different clothing color to simulate two types of clothing changes. The quantitative and qualitative results across 5 datasets (SMPL-reID, Div-Market, two benchmark re-ID datasets, a cross-modality re-ID dataset) confirm the robustness and superiority of our approach against several state-of-the-art approaches.

Research Motivation and Objectives

Address the clothing dependence of person re-identification: methods that assume unchanged clothing degrade when the same person is captured over long periods (weeks, months, or years).
Develop a representation that captures body shape independent of clothing color or texture.
Evaluate the approach on synthetic datasets simulating clothing changes and on standard re-ID benchmarks to demonstrate robustness and generalization.

Proposed Method

Propose Color Agnostic Shape Extraction Network (CASE-Net) that disentangles body shape from color via adversarial learning.
Use a shape encoder to produce color-invariant features from RGB and gray-scale images.
Employ a feature discriminator to align color-invariant feature distributions across color variations.
Use a color encoder to capture color-related features and a generator conditioned on shape and color features to enable pose-guided image recovery.
Train with identity and triplet losses to ensure discriminative shape features (L_id and L_tri).
Incorporate an image discriminator and reconstruction loss to enforce realistic pose-guided image synthesis (L_rec and L_adv^D_I).
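
Two ingredients of the list above can be illustrated in a few lines: producing a gray-scale counterpart of an RGB input, and the triplet loss applied to shape features. The following is a minimal sketch, not the authors' implementation. The function names, the BT.601 luma weights, the Euclidean distance, and the margin of 0.3 are all assumptions chosen for illustration; the summary does not state which values the paper uses.

```python
import math


def to_grayscale(pixel):
    """Map one (R, G, B) pixel to a gray pixel replicated over 3 channels.

    Assumption: ITU-R BT.601 luma weights; the paper may use another conversion.
    """
    r, g, b = pixel
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return (y, y, y)


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    anchor and positive share an identity; negative has a different one.
    The margin value is a hypothetical default, not taken from the paper.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, which is what pushes shape features of the same identity together regardless of clothing color.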

Experimental Results

Research Questions

RQ1: Can CASE-Net learn a body-shape representation that is invariant to clothing color and patterns?
RQ2: How strongly does clothing color variation affect existing re-ID methods compared with CASE-Net?
RQ3: Do synthesized clothing-change datasets reveal weaknesses in state-of-the-art re-ID models?
RQ4: Does CASE-Net generalize to cross-domain or cross-modality scenarios?
RQ5: What is the impact of each loss component on the final body-shape representation?

Key Findings

CASE-Net achieves state-of-the-art performance on SMPL-reID and Div-Market under clothing-change conditions.
Standard re-ID methods exhibit severe performance drops when clothing changes are present.
CASE-Net outperforms both standard and cross-modality baselines on Market-1501 and DukeMTMC-reID in standard and extended (color-change) settings.
On cross-modality SYSU-MM01, CASE-Net shows competitive generalization to RGB-IR scenarios.
Ablation studies show that removing components like the color-adversarial loss, triplet loss, identity loss, or reconstruction loss degrades Rank-1 and mAP, confirming the contribution of each term.
The paper provides two synthetic datasets (SMPL-reID and Div-Market) to evaluate clothing-change robustness where many baselines fail.
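
The findings above are reported in terms of Rank-1 and mAP. As a rough sketch of how these two metrics are commonly computed from a ranked gallery, consider the code below. It is a simplified illustration under stated assumptions: the function names are hypothetical, each query comes with gallery identity labels already sorted by ascending distance, and the same-camera filtering used by standard re-ID protocols is omitted.

```python
def rank1_and_ap(ranked_gallery_ids, query_id):
    """Return (Rank-1 hit, average precision) for a single query.

    ranked_gallery_ids: gallery identity labels, nearest match first.
    """
    hits = 0
    precisions = []
    for rank, gallery_id in enumerate(ranked_gallery_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            # Precision measured at each position where a correct match occurs.
            precisions.append(hits / rank)
    top_is_match = bool(ranked_gallery_ids) and ranked_gallery_ids[0] == query_id
    rank1 = 1.0 if top_is_match else 0.0
    ap = sum(precisions) / hits if hits else 0.0
    return rank1, ap


def evaluate(queries):
    """Average Rank-1 and AP over (query_id, ranked_gallery_ids) pairs."""
    if not queries:
        return 0.0, 0.0
    results = [rank1_and_ap(ranked, qid) for qid, ranked in queries]
    mean_rank1 = sum(r for r, _ in results) / len(results)
    mean_ap = sum(a for _, a in results) / len(results)
    return mean_rank1, mean_ap
```

Rank-1 only checks the top match, while mAP rewards placing every correct gallery image near the top, so a method can lose more mAP than Rank-1 when clothing changes scatter the correct matches down the ranking.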

This review was generated by AI and reviewed by human editors.