QUICK REVIEW

[Paper Review] Quality Aware Network for Set to Set Recognition

Yu Liu, Junjie Yan|arXiv (Cornell University)|Apr 11, 2017

Video Surveillance and Tracking Methods|References: 25|Cited by: 63

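One-Sentence Summary

This paper proposes the Quality Aware Network (QAN), which learns a quality score for each image and uses it to weight features when aggregating an image set, improving set-to-set recognition for face verification and person re-identification without explicit quality annotations.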

ABSTRACT

This paper targets on the problem of set to set recognition, which learns the metric between two image sets. Images in each set belong to the same identity. Since images in a set can be complementary, they hopefully lead to higher accuracy in practical applications. However, the quality of each sample cannot be guaranteed, and samples with poor quality will hurt the metric. In this paper, the quality aware network (QAN) is proposed to confront this problem, where the quality of each sample can be automatically learned although such information is not explicitly provided in the training stage. The network has two branches, where the first branch extracts appearance feature embedding for each sample and the other branch predicts quality score for each sample. Features and quality scores of all samples in a set are then aggregated to generate the final feature embedding. We show that the two branches can be trained in an end-to-end manner given only the set-level identity annotation. Analysis on gradient spread of this mechanism indicates that the quality learned by the network is beneficial to set-to-set recognition and simplifies the distribution that the network needs to fit. Experiments on both face verification and person re-identification show advantages of the proposed QAN. The source code and network structure can be downloaded at https://github.com/sciencefans/Quality-Aware-Network.

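Motivation and Objectives

Advance robust set-to-set recognition by exploiting multiple images of the same identity while reducing the influence of low-quality samples.
Develop an end-to-end trainable network that jointly learns per-image features and per-image quality scores.
Show that quality-aware aggregation yields more discriminative set representations than simple pooling.
Demonstrate state-of-the-art or competitive performance on person re-identification and unconstrained face verification benchmarks.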

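Proposed Method

Propose a two-branch Quality Aware Network (QAN): one branch extracts an appearance feature for each image, and the other predicts a quality score for each image.
Aggregate the set embedding in a set pooling unit that weights each image's feature by its learned quality score: R_a(S) = (sum_i mu_i R_{I_i}) / (sum_i mu_i), where R_{I_i} is the feature of image I_i and mu_i = Q(I_i) is its predicted quality score.
Train end to end with an image-level identity Softmax loss combined with a set-level triplet loss that pulls anchor and positive sets together and pushes negative sets apart.
Derive the gradient through the set pooling unit, showing that higher-quality samples contribute more to the final representation, so mu_i effectively acts as attention over images.
Show that the learned quality correlates with human judgment and can outperform human-provided quality labels on the recognition task.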
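
The set pooling unit described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `set_pooling` and `grad_wrt_quality`, the toy features, and the quality scores are all made up for the example. The second function uses the gradient that follows directly from the pooling formula, dR_a(S)/dmu_i = (R_{I_i} - R_a(S)) / sum_j mu_j.

```python
import numpy as np


def set_pooling(features, quality):
    """Quality-weighted average of per-image features.

    features: (n_images, dim) array, one row R_{I_i} per image.
    quality:  (n_images,) array of non-negative scores mu_i.
    """
    features = np.asarray(features, dtype=float)
    quality = np.asarray(quality, dtype=float)
    total = quality.sum()
    if total <= 0:
        raise ValueError("quality scores must sum to a positive value")
    # Broadcast each mu_i over its feature row, then normalize by sum_i mu_i.
    return (quality[:, None] * features).sum(axis=0) / total


def grad_wrt_quality(features, quality):
    """Row i holds dR_a/dmu_i = (R_{I_i} - R_a) / sum_j mu_j."""
    features = np.asarray(features, dtype=float)
    quality = np.asarray(quality, dtype=float)
    pooled = set_pooling(features, quality)
    return (features - pooled) / quality.sum()


# Toy set (hypothetical values): two clean images near (1, 0)
# and one corrupted image at (0, 1).
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
quality = np.array([0.9, 0.8, 0.1])  # made-up scores, not from the paper

weighted = set_pooling(features, quality)      # quality-aware pooling
average = set_pooling(features, np.ones(3))    # plain average pooling
grads = grad_wrt_quality(features, quality)

print("quality-weighted:", weighted)
print("plain average:   ", average)
```

In this toy set, the weighted embedding stays close to the two clean images, whereas plain averaging is pulled toward the corrupted one. The gradient with respect to mu_i points from the pooled embedding toward image i's feature, so raising an image's score moves the set embedding toward that image.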

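Experimental Results

Research Questions

RQ1: Can automatically learned per-image quality scores improve set-to-set aggregation without explicit quality supervision?
RQ2: Does jointly training the feature generation and quality generation parts end to end produce better representations than fixed or externally defined quality cues?
RQ3: How does QAN perform on real-world face verification and person re-identification benchmarks, especially under noisy conditions?
RQ4: Does the quality distribution learned by QAN transfer across datasets (cross-dataset robustness)?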

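Key Findings

QAN substantially improves top-1 accuracy in person re-identification: +11.1% on PRID2011 and +12.21% on iLIDS-VID over a strong baseline.
QAN also yields large gains in cross-dataset testing, improving top-1 accuracy over the baseline by 15.6% on PRID2011 and 8.2% on iLIDS-VID.
In unconstrained face verification, QAN reduces the miss rate at FPR = 0.001 by 15.6% on YouTube Face and 29.32% on IJB-A relative to the baseline.
Across the four benchmarks, QAN consistently outperforms the baseline and several state-of-the-art methods, is robust to noisy samples, and performs especially well at low false positive rates.
Qualitative analysis shows that the quality scores learned by QAN agree with human perception of image quality, and an ablation study finds that mid-level features (Pool3) are the most effective input for quality generation.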
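
The face verification results are reported as a miss rate at a fixed false positive rate (FPR). As a rough sketch of how such a metric is commonly computed, and not the paper's evaluation code, the hypothetical helper `miss_rate_at_fpr` below picks the threshold that lets at most the target fraction of impostor pairs through, then counts the genuine pairs that fall at or below it. All scores are synthetic and chosen only to make the arithmetic easy to follow.

```python
import numpy as np


def miss_rate_at_fpr(genuine, impostor, target_fpr):
    """Miss rate (false negative rate) at a fixed false positive rate.

    genuine:  similarity scores of same-identity pairs.
    impostor: similarity scores of different-identity pairs.
    A pair is accepted when its score is strictly above the threshold.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.sort(np.asarray(impostor, dtype=float))[::-1]  # descending
    # Number of impostor pairs that may be accepted at this FPR.
    k = int(np.floor(target_fpr * impostor.size))
    # Using the (k+1)-th largest impostor score as threshold means
    # at most k impostor scores lie strictly above it.
    threshold = float(impostor[k]) if k < impostor.size else float("-inf")
    miss_rate = float(np.mean(genuine <= threshold))
    return miss_rate, threshold


# Synthetic scores for illustration only (not results from the paper).
impostor_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
genuine_scores = [0.85, 0.92, 0.97, 0.99]

miss, thr = miss_rate_at_fpr(genuine_scores, impostor_scores, target_fpr=0.1)
print(f"threshold={thr}, miss rate={miss}")
```

With only ten impostor pairs the smallest nonzero FPR that can be measured is 0.1; evaluating at FPR = 0.001 requires at least a thousand impostor pairs for the threshold to be meaningful.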

This review was generated by AI and reviewed by human editors.