QUICK REVIEW

[Paper Review] Embedding Deep Metric for Person Re-identification: A Study Against Large Variations

Hailin Shi, Yang Yang|arXiv (Cornell University)|Nov 1, 2016

Video Surveillance and Tracking Methods | References: 15 | Cited by: 69

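One-Sentence Summary

This paper proposes a novel moderate positive sample mining strategy and a metric weight constraint to improve deep metric learning for person re-identification under large intra-class variations. By adaptively selecting moderate positive pairs within a local neighborhood of the feature space and regularizing the weights of the metric layer, the method achieves state-of-the-art performance on CUHK03 and CUHK01 and competitive results on VIPeR, with rank-1 accuracy of 69% on CUHK01 and 40.91% on VIPeR.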

ABSTRACT

Person re-identification is challenging due to the large variations of pose, illumination, occlusion and camera view. Owing to these variations, the pedestrian data is distributed as highly-curved manifolds in the feature space, despite the current convolutional neural networks (CNN)'s capability of feature extraction. However, the distribution is unknown, so it is difficult to use the geodesic distance when comparing two samples. In practice, the current deep embedding methods use the Euclidean distance for the training and test. On the other hand, the manifold learning methods suggest to use the Euclidean distance in the local range, combining with the graphical relationship between samples, for approximating the geodesic distance. From this point of view, selecting suitable positive (i.e. intra-class) training samples within a local range is critical for training the CNN embedding, especially when the data has large intra-class variations. In this paper, we propose a novel moderate positive sample mining method to train robust CNN for person re-identification, dealing with the problem of large variation. In addition, we improve the learning by a metric weight constraint, so that the learned metric has a better generalization ability. Experiments show that these two strategies are effective in learning robust deep metrics for person re-identification, and accordingly our deep model significantly outperforms the state-of-the-art methods on several benchmarks of person re-identification. Therefore, the study presented in this paper may be useful in inspiring new designs of deep models for person re-identification.
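
The manifold-learning idea mentioned in the abstract (Euclidean distance inside a local range, combined with the graph relationship between samples, as a stand-in for geodesic distance) can be illustrated with a small Isomap-style sketch. This is not code from the paper: the function name `approx_geodesic`, the neighbourhood size `k`, and the use of Floyd-Warshall shortest paths are all assumptions chosen for illustration, written in plain Python to stay dependency-free.

```python
import math


def approx_geodesic(points, k):
    """Approximate pairwise geodesic distances (illustrative sketch).

    Each point is linked to its k nearest neighbours by Euclidean
    distance; longer distances are measured as shortest paths through
    that neighbourhood graph. Unreachable pairs stay at infinity.
    """
    n = len(points)
    inf = float("inf")
    dist = [[inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        # Rank all other points by Euclidean distance and keep the k closest.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in others[:k]:
            d = math.dist(points[i], points[j])
            # Store the edge symmetrically so the graph is undirected.
            dist[i][j] = min(dist[i][j], d)
            dist[j][i] = min(dist[j][i], d)
    # Floyd-Warshall: relax every pair through every intermediate node.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist
```

For points sampled along a unit half circle, the two endpoints are 2 apart in Euclidean distance, while the graph distance stays close to (and slightly below) the arc length π. That gap is the reason the abstract argues for trusting Euclidean distance only within a local range.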

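Research Motivation and Objectives

Address the large intra-class variations in person re-identification caused by changes in pose, illumination, and camera view.
Recognize that current deep learning methods neglect the careful selection of positive samples on highly curved feature manifolds.
Improve deep metric learning with a moderate positive sample mining strategy that better captures the intrinsic structure of the data.
Improve generalization and reduce overfitting by introducing a novel weight constraint on the metric layer.
Achieve state-of-the-art performance on major person re-identification benchmarks in the presence of large variations.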

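Proposed Method

Propose a moderate positive sample mining strategy that adaptively selects positive pairs within a local neighborhood of the feature space, avoiding extreme intra-class variations.
Combine local Euclidean distances with the graph relationships between samples to approximate geodesic distances on the highly curved manifold.
Introduce a metric weight constraint that regularizes the metric learning layer, improving generalization and reducing overfitting.
Train the CNN with a triplet loss on the newly selected moderate positive pairs to strengthen feature discriminability.
Fine-tune across datasets (e.g. CUHK03 → CUHK01) and apply data augmentation (e.g. random translation) to improve robustness.
Use features pre-trained on large-scale datasets and transfer learning to improve performance on small benchmarks such as VIPeR.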
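
To make the mining step concrete, here is a minimal sketch of one way a "moderate" positive could be chosen for an anchor inside a mini-batch. This review does not spell out the exact selection rule, so the rule below is an assumption: take the hardest (closest) negative as the boundary of the local range, keep the positives that lie within it, and pick the farthest of those; if no positive qualifies, fall back to the closest positive. The function name `mine_moderate_positive` and the plain-tuple feature vectors are hypothetical.

```python
import math


def mine_moderate_positive(anchor, positives, negatives):
    """Return the index of the moderate positive for `anchor`.

    Assumed rule (not confirmed by the review text):
      1. The hardest negative is the negative closest to the anchor.
      2. Positives no farther than that negative form the local range.
      3. The farthest positive inside the local range is selected.
      4. If the range is empty, the closest positive is used instead.
    """
    hardest_neg = min(math.dist(anchor, n) for n in negatives)
    pos_dist = [math.dist(anchor, p) for p in positives]
    in_range = [i for i, d in enumerate(pos_dist) if d <= hardest_neg]
    if in_range:
        # Hard enough to be informative, but not an extreme outlier.
        return max(in_range, key=lambda i: pos_dist[i])
    # Fallback: no positive is closer than the hardest negative.
    return min(range(len(pos_dist)), key=lambda i: pos_dist[i])
```

Under this rule an extreme positive (for example, the same person seen from a drastically different view) is skipped in favour of one that is still challenging but lies inside the anchor's neighbourhood.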

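Experimental Results

Research Questions

RQ1: Under large intra-class variations, how does the selection of positive training samples affect deep metric learning for person re-identification?
RQ2: On a highly curved feature manifold, can local Euclidean distance combined with graph relationships effectively approximate geodesic distance?
RQ3: Compared with standard hard negative mining, does moderate positive mining (selecting non-extreme positive pairs) improve model robustness and accuracy?
RQ4: To what extent does a weight constraint on the metric layer reduce overfitting and improve the generalization of person re-identification models?
RQ5: Can the proposed method achieve state-of-the-art performance on standard benchmarks such as CUHK03, CUHK01, and VIPeR?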
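
For the weight constraint on the metric layer, the review gives no formula. One common way to regularize a learned Mahalanobis-style metric is to penalize how far the metric matrix W Wᵀ drifts from the identity, i.e. from plain Euclidean distance. The sketch below assumes that form, (λ/2)·‖W Wᵀ − I‖²_F; the function name, the row-list representation of `W`, and the default `lam` are illustrative choices, not the authors' implementation.

```python
def weight_constraint_penalty(W, lam=0.01):
    """Return lam/2 * ||W W^T - I||_F^2 for W given as a list of rows.

    Assumed form of the metric weight constraint: the penalty is zero
    when the rows of W are orthonormal, in which case the learned
    metric W W^T reduces to the Euclidean one.
    """
    n = len(W)
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Entry (i, j) of W W^T is the dot product of rows i and j.
            m_ij = sum(a * b for a, b in zip(W[i], W[j]))
            target = 1.0 if i == j else 0.0
            total += (m_ij - target) ** 2
    return 0.5 * lam * total
```

In training, a term like this would be added to the main loss, so that a larger `lam` pulls the metric closer to Euclidean distance and a smaller one leaves it freer to fit the training data.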

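Key Findings

The proposed method achieves a rank-1 identification rate of 69% on CUHK01, outperforming previous state-of-the-art methods.
With fine-tuning on Market1501 and training on CUHK03, the model reaches 87% rank-1 accuracy on CUHK01, showing the benefit of larger-scale training data.
On the challenging VIPeR dataset, the method achieves a rank-1 identification rate of 40.91%, the highest among deep-learning-based methods.
The method markedly reduces failure cases caused by color mismatch between true positives and similarly colored negatives, a problem that is common in real surveillance scenes.
Ablation experiments confirm that moderate positive mining and the weight constraint each bring independent gains, with a particularly strong effect on reducing intra-class variance.
Visualizations show that the learned filters focus on color features and, when moderate positives are used, are robust to illumination and color changes.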
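
The rank-1 figures above follow the usual re-identification protocol: each query is compared against a gallery, and a query counts as a hit if a gallery sample with the same identity appears among its top-k nearest matches. The following sketch shows that computation under simplifying assumptions (Euclidean distance on plain tuples, a single gallery, no camera filtering); `rank_k_accuracy` and its parameters are hypothetical names, and the benchmark-specific evaluation splits are not reproduced here.

```python
import math


def rank_k_accuracy(query_feats, query_ids, gallery_feats, gallery_ids, k=1):
    """Fraction of queries whose identity appears in the top-k gallery matches."""
    hits = 0
    for feat, pid in zip(query_feats, query_ids):
        # Sort gallery indices from nearest to farthest for this query.
        order = sorted(
            range(len(gallery_feats)),
            key=lambda g: math.dist(feat, gallery_feats[g]),
        )
        top_ids = [gallery_ids[g] for g in order[:k]]
        if pid in top_ids:
            hits += 1
    return hits / len(query_feats)
```

Evaluating the same function for increasing values of `k` gives the points of a cumulative matching characteristic (CMC) curve, of which rank-1 is the first.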


This review was generated by AI and reviewed by human editors.