QUICK REVIEW

[Paper Review] Deep Representation Learning on Long-tailed Data: A Learnable Embedding Augmentation Perspective

Jialun Liu, Yifan Sun|arXiv (Cornell University)|Feb 25, 2020

Video Surveillance and Tracking Methods|References: 52|Citations: 38

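One-line summary

LEAP transfers intra-class angular diversity from head classes to tail classes in long-tailed data, building a feature cloud around each tail sample to make tail classes more discriminative. This improves discriminability on long-tailed data for person re-identification and face recognition, with notable gains over strong baselines on multiple benchmarks.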

ABSTRACT

This paper considers learning deep features from long-tailed data. We observe that in the deep feature space, the head classes and the tail classes present different distribution patterns. The head classes have a relatively large spatial span, while the tail classes have significantly small spatial span, due to the lack of intra-class diversity. This uneven distribution between head and tail classes distorts the overall feature space, which compromises the discriminative ability of the learned features. Intuitively, we seek to expand the distribution of the tail classes by transferring from the head classes, so as to alleviate the distortion of the feature space. To this end, we propose to construct each feature into a "feature cloud". If a sample belongs to a tail class, the corresponding feature cloud will have relatively large distribution range, in compensation to its lack of diversity. It allows each tail sample to push the samples from other classes far away, recovering the intra-class diversity of tail classes. Extensive experimental evaluations on person re-identification and face recognition tasks confirm the effectiveness of our method.

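Motivation and objectives

Motivate and address the challenge of learning discriminative deep features under a long-tailed class distribution.
Propose a learnable embedding augmentation framework that increases tail-class diversity by expanding it with the intra-class angular variance transferred from head classes.
Model the intra-class angular distribution and build feature clouds to compensate tail samples during training.
Evaluate the method on person re-identification and face recognition benchmarks and show its effectiveness against strong baselines.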

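Proposed method

Model the intra-class feature distribution using the angle between each feature and its class center.
Maintain and update an angle memory per class and estimate a Gaussian angular distribution from it.
Compute the angular variance of the head classes and transfer it to the tail classes by building a feature cloud around each tail sample.
Define tail-feature augmentation as sampling an angle offset alpha ~ N(0, sigma_h^2 - sigma_t^2), where sigma_h^2 and sigma_t^2 are the head and tail angular variances; this enlarges the region occupied by the tail class.
Integrate the augmentation into the CosFace and ArcFace losses, yielding the transformed losses L3/L4 (full version), with L1/L2 as the baselines; the augmented angle is clipped to [0, pi].
Provide a vanilla version (with head/tail labels) and a full version (without head/tail labels), so that tail diversity is adjusted without human intervention.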
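The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the scale s=30.0 and margin m=0.35, and the choice to add the offset directly to the target-class angle of a CosFace-style logit are all hypothetical. Only the offset distribution N(0, sigma_h^2 - sigma_t^2) and the clipping to [0, pi] come from the review.

```python
import numpy as np

def angle_to_center(features, center):
    """Angle in radians between each feature row and a class center."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    c = center / np.linalg.norm(center)
    # Clip the cosine to guard against values slightly outside [-1, 1].
    return np.arccos(np.clip(f @ c, -1.0, 1.0))

def angular_variance(angles):
    """Variance of the angles stored in one class's angle memory."""
    return float(np.var(angles))

def sample_offsets(sigma_h2, sigma_t2, n, rng):
    """Draw n offsets alpha ~ N(0, sigma_h^2 - sigma_t^2).

    Assumption: if the tail variance already matches or exceeds the head
    variance, the gap is treated as zero and no augmentation is applied.
    """
    var_gap = max(sigma_h2 - sigma_t2, 0.0)
    return rng.normal(0.0, np.sqrt(var_gap), size=n)

def augmented_cosface_logit(theta, alpha, s=30.0, m=0.35):
    """CosFace-style target logit on the augmented angle (hypothetical form).

    The augmented angle theta + alpha is clipped to [0, pi] before the cosine.
    """
    theta_aug = np.clip(theta + alpha, 0.0, np.pi)
    return s * (np.cos(theta_aug) - m)
```

In a training loop, one would presumably estimate sigma_h2 from the angle memories of head classes and sigma_t2 from the tail class at hand, then call sample_offsets once per tail sample in the batch. How the full version avoids explicit head/tail labels is not detailed in this review, so the sketch does not attempt it.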

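Experimental results

Research questions

RQ1 How does long-tailed data distort the feature space in deep embedding learning?
RQ2 Can tail classes be compensated by transferring the intra-class angular variance learned from head classes?
RQ3 Does feature-cloud augmentation improve tail-class discriminability without requiring explicit head-tail labeling?
RQ4 How does LEAP perform on person re-identification and face recognition benchmarks under different head-to-tail ratios?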

Key findings

Method	Market-1501 mAP	Market-1501 Rank-1	DukeMTMC mAP	DukeMTMC Rank-1	MSMT17 mAP	MSMT17 Rank-1
HA-CNN	75.7	91.2	63.8	80.5	-	-
PCB	77.4	92.3	66.1	81.8	40.4	68.2
Mancs	82.3	93.1	71.8	84.9	-	-
CosFace	79.5	92.4	73.0	85.6	49.2	75.3
ArcFace	81.1	92.5	73.2	85.8	50.5	75.5
LEAP-CF	84.2	94.4	74.2	87.8	50.8	76.7
LEAP-AF	83.2	93.5	74.2	86.9	51.3	76.3

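LEAP improves discriminability on long-tailed data and achieves higher Rank-1 and mAP than strong baselines on Market-1501 and DukeMTMC-reID.
LEAP-CF reaches Rank-1 94.4% and mAP 84.2% on Market-1501, and Rank-1 87.8% and mAP 74.2% on DukeMTMC-reID.
On MSMT17, LEAP-CF reaches mAP 50.8 and Rank-1 76.7, and LEAP-AF reaches mAP 51.3 and Rank-1 76.3.
Compared with the vanilla baseline, LEAP improves performance consistently across head-tail ratios, most notably in very severe long-tailed settings (e.g., H20/S3, H20/S4).
The full version (no explicit head/tail labeling) is robust to the dataset distribution and matches or exceeds the baseline version.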
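To make the size of the gains concrete, the following short script computes the per-dataset differences between each LEAP variant and a loss baseline, using only the numbers from the table. Pairing LEAP-CF with CosFace and LEAP-AF with ArcFace is an assumption based on the naming; the dictionary layout and the gain helper are illustrative only.

```python
# (mAP, Rank-1) per dataset, copied from the results table.
scores = {
    "CosFace": {"Market-1501": (79.5, 92.4), "DukeMTMC": (73.0, 85.6), "MSMT17": (49.2, 75.3)},
    "ArcFace": {"Market-1501": (81.1, 92.5), "DukeMTMC": (73.2, 85.8), "MSMT17": (50.5, 75.5)},
    "LEAP-CF": {"Market-1501": (84.2, 94.4), "DukeMTMC": (74.2, 87.8), "MSMT17": (50.8, 76.7)},
    "LEAP-AF": {"Market-1501": (83.2, 93.5), "DukeMTMC": (74.2, 86.9), "MSMT17": (51.3, 76.3)},
}

def gain(method, baseline):
    """Return {dataset: (mAP gain, Rank-1 gain)} in percentage points."""
    return {
        ds: (
            round(scores[method][ds][0] - scores[baseline][ds][0], 1),
            round(scores[method][ds][1] - scores[baseline][ds][1], 1),
        )
        for ds in scores[method]
    }

for method, baseline in [("LEAP-CF", "CosFace"), ("LEAP-AF", "ArcFace")]:
    print(method, "vs", baseline, gain(method, baseline))
```

With these pairings, the largest difference in the table is on Market-1501 mAP, where LEAP-CF is 4.7 points above CosFace.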


This review was generated by AI and checked by a human editor.