QUICK REVIEW

[Paper Review] Are all negatives created equal in contrastive instance discrimination?

Tiffany Cai, Jonathan Frankle|arXiv (Cornell University)|Oct 13, 2020

Categorization, perception, and language|References 31|Citations 57

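One-line summary

This paper shows that, for CID with MoCo v2 on ImageNet, the hardest 5% of negatives are both necessary and sufficient to reach nearly full downstream accuracy, while the easiest 95% are unnecessary; the very hardest 0.1% can be detrimental in certain settings.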

ABSTRACT

Self-supervised learning has recently begun to rival supervised learning on computer vision tasks. Many of the recent approaches have been based on contrastive instance discrimination (CID), in which the network is trained to recognize two augmented versions of the same instance (a query and positive) while discriminating against a pool of other instances (negatives). The learned representation is then used on downstream tasks such as image classification. Using methodology from MoCo v2 (Chen et al., 2020), we divided negatives by their difficulty for a given query and studied which difficulty ranges were most important for learning useful representations. We found a minority of negatives -- the hardest 5% -- were both necessary and sufficient for the downstream task to reach nearly full accuracy. Conversely, the easiest 95% of negatives were unnecessary and insufficient. Moreover, the very hardest 0.1% of negatives were unnecessary and sometimes detrimental. Finally, we studied the properties of negatives that affect their hardness, and found that hard negatives were more semantically similar to the query, and that some negatives were more consistently easy or hard than we would expect by chance. Together, our results indicate that negatives vary in importance and that CID may benefit from more intelligent negative treatment.

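Motivation and objectives

Understand the relative importance of negatives in CID.
Quantify how negatives of different difficulty contribute to downstream ImageNet linear-evaluation accuracy.
Identify the semantic properties that distinguish hard negatives from easy ones.
Explore whether some negatives consistently affect learning across queries.
Draw out implications for more intelligent negative sampling in CID.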

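Method

Use MoCo v2 with a ResNet-50 encoder and an MLP projection head.
Define the difficulty of a negative as the dot product between normalized embeddings in the contrastive space.
Remove subsets of negatives and measure downstream accuracy to assess their necessity and sufficiency.
Evaluate at two temperatures (0.07 and 0.20) with three random seeds.
Analyze the semantic similarity of negatives using class labels and WordNet-based similarity measures.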
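
The difficulty definition and the subset-selection step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy 2-D embeddings, and the choice to express ranges as rank fractions are all assumptions made for this example.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def rank_negatives_by_difficulty(query, negatives):
    """Return negative indices sorted hardest first.

    Difficulty is the dot product between the normalized query and each
    normalized negative: higher similarity means a harder negative.
    """
    q = l2_normalize(query)
    sims = [sum(a * b for a, b in zip(q, l2_normalize(n))) for n in negatives]
    return sorted(range(len(negatives)), key=lambda i: sims[i], reverse=True)

def select_difficulty_range(ranked, start_frac, end_frac):
    """Keep negatives whose hardness rank falls in [start_frac, end_frac)."""
    n = len(ranked)
    return ranked[int(start_frac * n):int(end_frac * n)]

# Toy data (hypothetical): one query and four negatives in 2-D.
query = [1.0, 0.0]
negatives = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]

ranked = rank_negatives_by_difficulty(query, negatives)
hardest_half = select_difficulty_range(ranked, 0.0, 0.5)
easiest_half = select_difficulty_range(ranked, 0.5, 1.0)
```

With a real queue, `select_difficulty_range(ranked, 0.0, 0.05)` would keep only the hardest 5%, and `select_difficulty_range(ranked, 0.001, 1.0)` would drop the hardest 0.1%.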

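Experimental results

Research questions

RQ1: Which negatives, by difficulty, are necessary for high downstream accuracy in CID?
RQ2: Are the hardest negatives alone sufficient when only they are used during pretraining?
RQ3: Do the very hardest negatives harm learning, and if so, why?
RQ4: Which semantic properties distinguish easy negatives from hard ones?
RQ5: Do the findings inform curricula or selective negative sampling for CID?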

Key findings

Temperature	Condition	Top-1 Acc (%)	Top-5 Acc (%)
0.07	Baseline (remove none)	64.78 ± 0.31	85.86 ± 0.12
0.07	Remove 0.1% hardest	66.25 ± 0.23	86.98 ± 0.09
0.07	Remove same class	66.61 ± 0.10	86.96 ± 0.07
0.07	Remove 0.1% hardest ∩ same class	66.43 ± 0.04	86.78 ± 0.06
0.07	Remove 0.1% hardest ∩ different class	63.69 ± 0.04	85.44 ± 0.00
0.07	Remove 99.9% easiest ∩ same class	65.06 ± 0.11	85.91 ± 0.01
0.20	Baseline (remove none)	67.48 ± 0.07	87.93 ± 0.05
0.20	Remove 0.1% hardest	67.64 ± 0.22	87.88 ± 0.07
0.20	Remove same class	68.07 ± 0.12	88.30 ± 0.15
0.20	Remove 0.1% hardest ∩ same class	67.67 ± 0.02	88.09 ± 0.18
0.20	Remove 0.1% hardest ∩ different class	67.38 ± 0.06	87.86 ± 0.08
0.20	Remove 99.9% easiest ∩ same class	67.79 ± 0.07	88.05 ± 0.05
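
To make the table easier to read, the change in top-1 accuracy relative to each temperature's baseline can be computed directly. The sketch below copies the mean top-1 values from the table; the dictionary layout and the function name are assumptions for illustration, and the ± spreads are ignored.

```python
BASELINE = "Baseline (remove none)"

# Mean top-1 accuracy (%) per temperature and condition, copied from the table.
TOP1 = {
    0.07: {
        BASELINE: 64.78,
        "Remove 0.1% hardest": 66.25,
        "Remove same class": 66.61,
        "Remove 0.1% hardest ∩ same class": 66.43,
        "Remove 0.1% hardest ∩ different class": 63.69,
        "Remove 99.9% easiest ∩ same class": 65.06,
    },
    0.20: {
        BASELINE: 67.48,
        "Remove 0.1% hardest": 67.64,
        "Remove same class": 68.07,
        "Remove 0.1% hardest ∩ same class": 67.67,
        "Remove 0.1% hardest ∩ different class": 67.38,
        "Remove 99.9% easiest ∩ same class": 67.79,
    },
}

def deltas_vs_baseline(temperature):
    """Top-1 difference (percentage points) of each condition from the baseline."""
    rows = TOP1[temperature]
    base = rows[BASELINE]
    return {
        cond: round(acc - base, 2)
        for cond, acc in rows.items()
        if cond != BASELINE
    }

for temperature in TOP1:
    for condition, delta in deltas_vs_baseline(temperature).items():
        print(f"{temperature:.2f}  {condition:<40} {delta:+.2f}")
```

On these numbers, removing the hardest 0.1% changes top-1 accuracy by +1.47 points at temperature 0.07 but only +0.16 at 0.20, which is within that row's reported ± 0.22.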

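The easiest 95% of negatives are unnecessary and insufficient, while the hardest 5% are necessary and sufficient.
Training on only the hardest 5% of negatives stays within 0.7 percentage points of baseline top-1 accuracy, whereas training on only the easiest 95% degrades performance.
The very hardest 0.1% of negatives are detrimental at the lower temperature (at 0.07, removing them raises top-1 accuracy from 64.78 to 66.25), and removing same-class negatives can also be beneficial.
Hard negatives tend to be more semantically similar to the query than easy negatives, although some easy negatives are anti-correlated with the query yet still semantically similar to it.
Some negatives are consistently hard or consistently easy across queries, which suggests a possible benefit from keeping consistently hard negatives in the queue.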
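
One way to build intuition for the temperature dependence is to look at the InfoNCE loss that MoCo-style methods optimize. The sketch below is an illustration only, not an analysis from the paper: the similarity values are made up, and the function names are hypothetical. It computes the loss for one query and the softmax weight each negative receives, which is proportional to the gradient of the loss with respect to that negative's similarity.

```python
import math

def info_nce_loss(pos_sim, neg_sims, temperature):
    """InfoNCE loss for one query, given positive and negative similarities."""
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]
    m = max(logits)  # subtract the max for numerical stability
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]

def negative_weights(pos_sim, neg_sims, temperature):
    """Softmax probability assigned to each negative.

    The gradient of the loss with respect to a negative's similarity is
    this probability divided by the temperature.
    """
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps[1:]]

def hardest_share(pos_sim, neg_sims, temperature):
    """Fraction of the total negative weight that falls on the hardest negative."""
    weights = negative_weights(pos_sim, neg_sims, temperature)
    return max(weights) / sum(weights)

# Made-up similarities: one positive, then a hard, a medium, and an easy negative.
pos_sim = 0.8
neg_sims = [0.7, 0.2, -0.3]

share_low_temp = hardest_share(pos_sim, neg_sims, 0.07)
share_high_temp = hardest_share(pos_sim, neg_sims, 0.20)
```

In this toy setting the hardest negative takes a larger share of the negative weight at temperature 0.07 than at 0.20, so whatever effect the very hardest negatives have would plausibly be amplified at the lower temperature. Dropping a negative from `neg_sims` corresponds to the removal conditions in the table.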


This review was generated by AI and checked by a human editor.