QUICK REVIEW

[Paper Review] Relation-Aware Global Attention for Person Re-identification

Zhizheng Zhang, Cuiling Lan | arXiv (Cornell University) | Apr 5, 2019

Video Surveillance and Tracking Methods | References: 67 | Citations: 38

One-Line Summary

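This paper proposes the Relation-Aware Global Attention (RGA) module, which learns the global structural relations of each feature node to generate spatial and channel attention, and reports state-of-the-art re-identification performance on CUHK03, Market1501, and MSMT17.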

ABSTRACT

For person re-identification (re-id), attention mechanisms have become attractive as they aim at strengthening discriminative features and suppressing irrelevant ones, which matches well the key of re-id, i.e., discriminative feature learning. Previous approaches typically learn attention using local convolutions, ignoring the mining of knowledge from global structure patterns. Intuitively, the affinities among spatial positions/nodes in the feature map provide clustering-like information and are helpful for inferring semantics and thus attention, especially for person images where the feasible human poses are constrained. In this work, we propose an effective Relation-Aware Global Attention (RGA) module which captures the global structural information for better attention learning. Specifically, for each feature position, in order to compactly grasp the structural information of global scope and local appearance information, we propose to stack the relations, i.e., its pairwise correlations/affinities with all the feature positions (e.g., in raster scan order), and the feature itself together to learn the attention with a shallow convolutional model. Extensive ablation studies demonstrate that our RGA can significantly enhance the feature representation power and help achieve the state-of-the-art performance on several popular benchmarks. The source code is available at https://github.com/microsoft/Relation-Aware-Global-Attention-Networks.

Motivation and Objectives

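Advance attention learning for person re-id by exploiting global structural information beyond local receptive fields.
Propose a compact mechanism that infers semantics for each feature node from its global relations.
Develop spatial (RGA-S) and channel (RGA-C) relation-aware global attention modules and demonstrate their effectiveness.
Show that combining RGA-S and RGA-C yields state-of-the-art results on major re-id benchmarks.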

Proposed Method

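Model the pairwise relations (affinities) between feature nodes and stack them to form a global relation vector for each node.
For spatial RGA (RGA-S), compute r_{i,j} = f_s(x_i, x_j) using 1x1 convolutional embeddings, form r_i = [R_s(i,:), R_s(:,i)], combine it with x_i, and predict the attention a_i with a small two-layer convnet.
For channel RGA (RGA-C), treat each channel as a node, compute r_{i,j} in the same way from embedded features, form r_i, and derive a_i along the channel dimension in the same manner as the spatial case.
Fuse the local feature and the global relation vector through embeddings, then produce relation-aware features and the attention with a two-layer convnet that ends in a sigmoid.
Integrate the RGA modules into a ResNet-50 backbone (a ResNet-50 variant) and evaluate on CUHK03, Market1501, and MSMT17.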
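
The steps above can be sketched in NumPy as follows. This is a minimal illustration under several assumptions rather than the authors' implementation: f_s is taken to be a dot product of two separately embedded features, each 1x1 convolution is written as a per-node linear map followed by ReLU, the weights are random, and the names `relation_attention`, `init_params`, `rga_s`, and `rga_c` are hypothetical. Batch normalization and any dimension-reduction details are omitted.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(num_nodes, feat_dim, embed_dim=8, hidden_dim=4, seed=0):
    """Random weights for illustration only (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    def w(*shape):
        return rng.normal(0.0, 0.1, shape)
    return {
        "theta": w(feat_dim, embed_dim),    # embedding of x_i for the relation
        "phi": w(feat_dim, embed_dim),      # embedding of x_j for the relation
        "psi": w(feat_dim, embed_dim),      # embedding of the local feature x_i
        "rel": w(2 * num_nodes, embed_dim), # embedding of the relation vector r_i
        "w1": w(2 * embed_dim, hidden_dim), # first layer of the attention net
        "w2": w(hidden_dim, 1),             # second layer, one value per node
    }

def relation_attention(nodes, p):
    """nodes: (num_nodes, feat_dim). Returns one attention value in (0, 1) per node."""
    theta = relu(nodes @ p["theta"])
    phi = relu(nodes @ p["phi"])
    R = theta @ phi.T                            # R[i, j] = r_{i,j} (assumed dot product)
    rel = np.concatenate([R, R.T], axis=1)       # row i is r_i = [R(i,:), R(:,i)]
    local = relu(nodes @ p["psi"])               # embedded local feature
    rel_emb = relu(rel @ p["rel"])               # embedded global relation vector
    y = np.concatenate([local, rel_emb], axis=1) # fused relation-aware feature
    return sigmoid(relu(y @ p["w1"]) @ p["w2"])[:, 0]

def rga_s(x, p):
    """Spatial variant: every position of x (C, H, W) is a node with a C-dim feature."""
    C, H, W = x.shape
    a = relation_attention(x.reshape(C, H * W).T, p).reshape(H, W)
    return a, x * a[None, :, :]

def rga_c(x, p):
    """Channel variant: every channel of x (C, H, W) is a node with an H*W-dim feature."""
    C, H, W = x.shape
    a = relation_attention(x.reshape(C, H * W), p)
    return a, x * a[:, None, None]

x = np.random.default_rng(1).normal(size=(16, 4, 3))
a_s, x_s = rga_s(x, init_params(num_nodes=12, feat_dim=16))
a_c, x_sc = rga_c(x_s, init_params(num_nodes=16, feat_dim=12, seed=1))
print(a_s.shape, a_c.shape)  # (4, 3) (16,)
```

Applying `rga_s` and then `rga_c` in sequence, as in the last lines, is one plausible reading of the RGA-SC label; the ordering is an assumption of this sketch.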

Experimental Results

Research Questions

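RQ1: Can the global structure and pairwise relations among feature positions be used to improve attention for person re-id?
RQ2: Do spatial and channel relation-aware attention complement each other and yield more discriminative features?
RQ3: How does RGA compare with prior attention mechanisms (local attention, non-local, CBAM) on standard re-id benchmarks?
RQ4: How do the choice of embedding and the placement of the module within the backbone affect performance?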

Key Findings

Model	CUHK03(L) R1	CUHK03(L) mAP	Market1501 R1	Market1501 mAP	MSMT17 R1	MSMT17 mAP
Baseline	73.8	69.0	94.2	83.7	-	-
RGA-S w/o Rel.	76.8	72.3	94.3	83.8	-	-
RGA-S w/o Ori.	78.2	74.0	95.4	86.7	-	-
RGA-S	79.3	74.7	96.0	87.5	-	-
RGA-C w/o Rel.	77.8	73.7	94.7	84.8	-	-
RGA-C w/o Ori.	78.1	74.9	95.4	87.1	-	-
RGA-C	79.3	75.6	95.9	87.9	-	-
RGA-S//C	77.3	73.4	95.3	86.6	-	-
RGA-CS	78.6	75.5	95.3	87.8	-	-
RGA-SC	81.1	77.4	96.1	88.4	80.3	57.5
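
For readers unfamiliar with the R1 and mAP columns, the following is a simplified sketch of how Rank-1 accuracy and mean average precision can be computed from a query-to-gallery distance matrix. The function name `rank1_and_map` and the toy data are hypothetical, and benchmark protocols typically add filtering steps (for example by camera) that are omitted here, so this should not be read as the exact evaluation used in the paper.

```python
import numpy as np

def rank1_and_map(dist, query_ids, gallery_ids):
    """dist: (num_query, num_gallery) distances; smaller means more similar."""
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    rank1_hits, average_precisions = [], []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                   # gallery sorted by distance
        matches = gallery_ids[order] == query_ids[q]  # True where the identity agrees
        if not matches.any():
            continue                                  # skip queries with no true match
        rank1_hits.append(float(matches[0]))          # is the top result correct?
        hit_ranks = np.flatnonzero(matches) + 1       # 1-based ranks of correct results
        precisions = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        average_precisions.append(precisions.mean())  # average precision for this query
    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))

# Toy example: two queries of identity 1 against a gallery of three images.
toy_dist = [[0.1, 0.5, 0.3],
            [0.9, 0.2, 0.4]]
r1, mean_ap = rank1_and_map(toy_dist, query_ids=[1, 1], gallery_ids=[1, 2, 1])
print(f"Rank-1 = {r1:.3f}, mAP = {mean_ap:.3f}")
```

The table reports both metrics as percentages, so the fractions returned here would be multiplied by 100 for comparison.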

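RGA-S and RGA-C each improve on the baseline by 5.5 to 6.6 points on CUHK03 (L) and by 1.7 to 4.2 points on Market1501 across Rank-1 and mAP.
Combining spatial and channel attention (RGA-SC) gives the best results: CUHK03 (L) mAP rises by 8.4 points over the baseline (69.0 to 77.4) and Market1501 mAP by 4.7 points (83.7 to 88.4). On MSMT17, RGA-SC reaches 80.3 Rank-1 and 57.5 mAP; the table lists no baseline for that dataset.
RGA-S and RGA-C outperform several attention baselines, including CBAM, FC-S/FC-C, SE, and NL, on both Rank-1 and mAP.
Asymmetric embeddings for relation modeling bring further gains over symmetric embeddings or no embedding.
Among the reported methods, RGA-SC achieves state-of-the-art results on CUHK03, Market1501, and MSMT17, with a notable margin over the second-best method.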
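
The gains over the baseline can be checked directly against the table. The snippet below is a small arithmetic sketch that copies four rows from the table (the MSMT17 columns are left out because the baseline has no entry there); the helper name `gains_over_baseline` is hypothetical.

```python
# (CUHK03(L) R1, CUHK03(L) mAP, Market1501 R1, Market1501 mAP), copied from the table.
results = {
    "Baseline": (73.8, 69.0, 94.2, 83.7),
    "RGA-S": (79.3, 74.7, 96.0, 87.5),
    "RGA-C": (79.3, 75.6, 95.9, 87.9),
    "RGA-SC": (81.1, 77.4, 96.1, 88.4),
}
columns = ("CUHK03(L) R1", "CUHK03(L) mAP", "Market1501 R1", "Market1501 mAP")

def gains_over_baseline(results, baseline="Baseline"):
    """Return per-column differences from the baseline row, rounded to one decimal."""
    base = results[baseline]
    return {
        name: tuple(round(value - ref, 1) for value, ref in zip(values, base))
        for name, values in results.items()
        if name != baseline
    }

gains = gains_over_baseline(results)
for name, deltas in gains.items():
    print(name, dict(zip(columns, deltas)))
```

The largest difference is the 8.4-point mAP gain of RGA-SC on CUHK03 (L).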

This review was generated by AI and checked by a human editor.