QUICK REVIEW

[Paper Review] Quality Aware Network for Set to Set Recognition

Yu Liu, Junjie Yan|arXiv (Cornell University)|Apr 11, 2017

Video Surveillance and Tracking Methods | References: 25 | Citations: 63

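One-line summary

This paper proposes the Quality Aware Network (QAN), which learns a quality score for each image and uses it to weight that image's features, improving the representation obtained when an image set is aggregated. It improves set-to-set recognition for face verification and person re-identification without explicit quality annotations.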

ABSTRACT

This paper targets on the problem of set to set recognition, which learns the metric between two image sets. Images in each set belong to the same identity. Since images in a set can be complementary, they hopefully lead to higher accuracy in practical applications. However, the quality of each sample cannot be guaranteed, and samples with poor quality will hurt the metric. In this paper, the quality aware network (QAN) is proposed to confront this problem, where the quality of each sample can be automatically learned although such information is not explicitly provided in the training stage. The network has two branches, where the first branch extracts appearance feature embedding for each sample and the other branch predicts quality score for each sample. Features and quality scores of all samples in a set are then aggregated to generate the final feature embedding. We show that the two branches can be trained in an end-to-end manner given only the set-level identity annotation. Analysis on gradient spread of this mechanism indicates that the quality learned by the network is beneficial to set-to-set recognition and simplifies the distribution that the network needs to fit. Experiments on both face verification and person re-identification show advantages of the proposed QAN. The source code and network structure can be downloaded at https://github.com/sciencefans/Quality-Aware-Network.

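Motivation and objectives

Motivate robust set-to-set recognition that exploits multiple images per identity while suppressing the influence of low-quality samples.
Develop an end-to-end trainable network that jointly learns a feature and a quality score for each image.
Show that quality-aware aggregation makes set representations more discriminative than simple pooling methods.
Demonstrate state-of-the-art or competitive performance on person re-identification and unconstrained face verification benchmarks.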

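Proposed method

Proposes a two-branch Quality Aware Network (QAN): one branch extracts an appearance feature for each image, and the other predicts a quality score for each image.
Weights each image feature by its learned quality score and aggregates the set embedding through a set pooling unit: $R_a(S) = \frac{\sum_i \mu_i R_{I_i}}{\sum_i \mu_i}$, where $\mu_i = Q(I_i)$.
Trains end to end with a softmax loss on image-level identity and a set-level triplet loss that pulls anchor and positive sets together and pushes negative sets apart.
Derives the gradient through the set pooling unit so that higher-quality samples contribute more to the final representation; $\mu_i$ effectively acts as attention over the images.
Shows that the learned quality correlates with human judgment and, on the recognition task, outperforms human-provided quality labels.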
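The quality-weighted set pooling and the set-level triplet loss described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the squared Euclidean distance, and the default margin of 1.0 are assumptions made for illustration, and the quality scores are taken as given positive numbers rather than produced by a network. The gradient helper follows from applying the quotient rule to the pooling formula, which gives d R_a / d mu_i = (R_{I_i} - R_a) / sum_j mu_j.

```python
def set_pool(features, quality):
    """Quality-weighted average: R_a(S) = sum_i mu_i * R_i / sum_i mu_i.

    features: list of per-image feature vectors (lists of floats).
    quality:  list of per-image quality scores mu_i (assumed positive).
    """
    total = sum(quality)
    if total <= 0:
        raise ValueError("quality scores must sum to a positive value")
    dim = len(features[0])
    return [
        sum(mu * feat[d] for mu, feat in zip(quality, features)) / total
        for d in range(dim)
    ]


def squared_distance(a, b):
    """Squared Euclidean distance (an assumed choice of metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def set_triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on pooled set embeddings; margin is hypothetical."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


def pool_grad_wrt_quality(features, quality):
    """Row i holds d R_a / d mu_i = (R_i - R_a) / sum_j mu_j."""
    pooled = set_pool(features, quality)
    total = sum(quality)
    return [[(x - p) / total for x, p in zip(feat, pooled)] for feat in features]


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0]]
    print(set_pool(feats, [3.0, 1.0]))  # [0.75, 0.25]
    print(set_pool(feats, [1.0, 1.0]))  # [0.5, 0.5], plain average pooling
```

With equal scores the pooling reduces to plain averaging, and raising one image's score pulls the set embedding toward that image's feature. The gradient row for an image points from the pooled embedding toward that image's feature, which is why the score behaves like an attention weight.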

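Experimental results

Research questions

RQ1: Can automatically learned per-image quality scores improve aggregation for set-to-set recognition without explicit quality supervision?
RQ2: Does joint end-to-end training of the feature and quality branches yield better representations than fixed or externally defined quality measures?
RQ3: How does QAN perform on real-world set-to-set benchmarks for face verification and person re-identification, particularly under noisy conditions?
RQ4: Does the quality distribution learned by QAN transfer across datasets (cross-dataset robustness)?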

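Key findings

QAN raises top-1 accuracy by 11.1% on PRID2011 and by 12.21% on iLIDS-VID, a clear improvement over a strong baseline.
In cross-dataset evaluation it also achieves large gains, improving top-1 by 15.6% on PRID2011 and by 8.2% on iLIDS-VID.
In unconstrained face verification, the miss rate falls by 15.6% on YouTube Face and by 29.32% on IJB-A (at FPR = 0.001, compared with the baseline).
Across the four benchmarks, QAN consistently outperforms the baseline and several state-of-the-art methods, is robust to noisy samples, and improves performance at low FPR.
Qualitative analysis shows that the quality learned by QAN agrees with human notions of image quality, and the ablation study shows that mid-level features (Pool3) are the most effective input for quality generation.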
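The face verification findings above are stated as a miss rate at a fixed false positive rate (FPR). As a reminder of what that metric measures, here is a minimal sketch. It is not the paper's evaluation protocol: the function name and the example scores are hypothetical, and it assumes a higher score means "same identity" and that a pair is accepted only when its score is strictly above the threshold.

```python
def miss_rate_at_fpr(genuine_scores, impostor_scores, target_fpr):
    """Fraction of genuine pairs rejected at a threshold whose FPR <= target_fpr.

    genuine_scores:  similarity scores of same-identity pairs.
    impostor_scores: similarity scores of different-identity pairs.
    """
    impostors = sorted(impostor_scores, reverse=True)
    # Number of impostor pairs that may be wrongly accepted (rounded down).
    allowed = int(target_fpr * len(impostors))
    if allowed >= len(impostors):
        threshold = float("-inf")  # every pair is accepted
    else:
        # Accept only scores strictly above the (allowed + 1)-th highest
        # impostor score, so at most `allowed` impostors pass.
        threshold = impostors[allowed]
    misses = sum(1 for s in genuine_scores if s <= threshold)
    return misses / len(genuine_scores)


if __name__ == "__main__":
    genuine = [0.9, 0.8, 0.4, 0.3]      # hypothetical scores
    impostor = [0.5, 0.2, 0.1, 0.05]    # hypothetical scores
    print(miss_rate_at_fpr(genuine, impostor, 0.0))   # 0.5
    print(miss_rate_at_fpr(genuine, impostor, 0.25))  # 0.0
```

A stricter FPR target pushes the threshold up, so more genuine pairs are missed. This is why results at a low FPR such as 0.001 are a demanding test of a verification system.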

This review was generated by AI and checked by a human editor.