QUICK REVIEW

[Paper Digest] Hard Negative Mixing for Contrastive Learning

Yannis Kalantidis, Mert Bülent Sarıyıldız|arXiv (Cornell University)|Oct 2, 2020

Domain Adaptation and Few-Shot Learning|References: 99|Cited by: 270

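One-Sentence Summary

This paper proposes MoCHi, a feature-space hard negative mixing strategy for contrastive self-supervised learning that synthesizes hard negatives on the fly to improve representation learning and transfer performance at minimal overhead, yielding consistent gains on linear classification and transfer tasks, especially when pre-training schedules are short.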

ABSTRACT

Contrastive learning has become a key component of self-supervised learning approaches for computer vision. By learning to embed two augmented versions of the same image close to each other and to push the embeddings of different images apart, one can train highly transferable visual representations. As revealed by recent studies, heavy data augmentation and large sets of negatives are both crucial in learning such representations. At the same time, data mixing strategies either at the image or the feature level improve both supervised and semi-supervised learning by synthesizing novel examples, forcing networks to learn more robust features. In this paper, we argue that an important aspect of contrastive learning, i.e., the effect of hard negatives, has so far been neglected. To get more meaningful negative samples, current top contrastive self-supervised learning approaches either substantially increase the batch sizes, or keep very large memory banks; increasing the memory size, however, leads to diminishing returns in terms of performance. We therefore start by delving deeper into a top-performing framework and show evidence that harder negatives are needed to facilitate better and faster learning. Based on these observations, and motivated by the success of data mixing, we propose hard negative mixing strategies at the feature level, that can be computed on-the-fly with a minimal computational overhead. We exhaustively ablate our approach on linear classification, object detection and instance segmentation and show that employing our hard negative mixing procedure improves the quality of visual representations learned by a state-of-the-art self-supervised learning method.

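Motivation and Objectives

Highlight that hard negatives matter in contrastive learning, beyond simply using larger batches or larger memory banks.
Propose a feature-space hard negative mixing mechanism that creates synthetic hard negatives for each query.
Show that hard negative mixing improves transfer learning and utilization of the embedding space across tasks and training lengths (epochs).
Show that MoCHi achieves competitive improvements on ImageNet-100 and on transfer tasks such as PASCAL VOC and COCO, especially under shorter training schedules.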

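Proposed Method

Operates within a MoCo-style momentum contrast framework with a memory bank of negatives.
Identifies the hardest negatives for each query by their similarity to the query.
Generates synthetic hard negatives as convex combinations of the closest negatives (hard negative mixing).
Optionally mixes the query with its hardest negatives to obtain even harder synthetic negatives.
Adds the synthetic negatives to the loss computation at minimal overhead (s + s' extra dot products per query).
Uses the same MLP head as related work and evaluates with standard linear evaluation and transfer tasks.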
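
The steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`mochi_negatives`, `info_nce`), the sampling ranges for the mixing coefficients (0 to 1 for negative pairs, 0 to 0.5 for query mixing), the temperature, and the small default sizes are all choices made here for illustration and are not given in this digest.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Project vectors onto the unit sphere."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

def mochi_negatives(q, queue, n_hard=16, s=4, s_prime=2, rng=None):
    """Synthesize hard negatives for one query (hypothetical helper).

    q:     (d,)   L2-normalized query embedding
    queue: (K, d) L2-normalized negatives from the memory bank
    Returns (s, d) negatives mixed from pairs of hard negatives and
    (s_prime, d) negatives mixed from the query and a hard negative.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Rank negatives by similarity to the query and keep the N hardest.
    sims = queue @ q
    hard = queue[np.argsort(-sims)[:n_hard]]

    # Type 1: convex combinations of random pairs of hard negatives.
    i = rng.integers(0, n_hard, size=s)
    j = rng.integers(0, n_hard, size=s)
    alpha = rng.uniform(0.0, 1.0, size=(s, 1))  # assumed range
    mixed = l2_normalize(alpha * hard[i] + (1.0 - alpha) * hard[j])

    # Type 2: mix the query itself into a hard negative. The query weight
    # is kept below 0.5 (assumed) so the negative still dominates.
    k = rng.integers(0, n_hard, size=s_prime)
    beta = rng.uniform(0.0, 0.5, size=(s_prime, 1))
    mixed_q = l2_normalize(beta * q + (1.0 - beta) * hard[k])
    return mixed, mixed_q

def info_nce(q, k_pos, negatives, tau=0.2):
    """InfoNCE loss for one query; tau=0.2 is an assumed temperature."""
    logits = np.concatenate([[q @ k_pos], negatives @ q]) / tau
    logits -= logits.max()  # numerical stability
    return float(-(logits[0] - np.log(np.exp(logits).sum())))

# Usage: the synthetic negatives are simply appended to the real ones,
# which costs s + s_prime extra dot products for this query.
rng = np.random.default_rng(0)
queue = l2_normalize(rng.normal(size=(256, 32)))
q = l2_normalize(rng.normal(size=32))
k_pos = l2_normalize(q + 0.1 * rng.normal(size=32))
mixed, mixed_q = mochi_negatives(q, queue, n_hard=16, s=8, s_prime=4, rng=rng)
loss = info_nce(q, k_pos, np.concatenate([queue, mixed, mixed_q]))
```

Because the synthetic points only add terms to the denominator of the softmax, the loss for a given query can only go up, which is one way to see why the proxy task becomes harder.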

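Experimental Results

Research Questions

RQ1: Does synthesizing hard negatives in the embedding space lead to faster and more robust learning in a contrastive self-supervised framework?
RQ2: Does hard negative mixing improve transfer performance across visual tasks and datasets, as well as utilization of the embedding space?
RQ3: How do the hyperparameters N (the number of hardest negatives considered), s, and s' (the numbers of synthetic negatives of each type) affect the difficulty of the proxy task and the final representations?
RQ4: Compared with MoCo-v2 and a supervised baseline, how does MoCHi affect the alignment and uniformity of the representations?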

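Key Findings

Hard negative mixing (MoCHi) consistently improves over MoCo-v2 on ImageNet-100 linear classification.
Mixing the hardest negatives with the query yields stronger gains and better utilization of the embedding space than mixing negatives alone.
MoCHi improves the uniformity of the embedding space and boosts transfer performance on PASCAL VOC and COCO, in some cases reaching supervised-level results with shorter pre-training.
On ImageNet-100, MoCHi variants improve top-1 accuracy over MoCo-v2 by roughly +0.7% to +1.0% at 200 epochs, with further gains on transfer tasks.
MoCHi speeds up learning, reaching competitive performance with fewer pre-training epochs than the baseline.
A class-oracle analysis shows that removing negatives that belong to the same class as the query recovers part of the supervised performance, illustrating how hard negatives affect the learning dynamics.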
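
The alignment and uniformity properties referred to above can be measured directly on a batch of embeddings. This digest does not say which definitions were used; the sketch below assumes the commonly used ones from Wang and Isola (2020), where lower is better for both. The function names and the defaults `alpha=2` and `t=2.0` are assumptions for illustration.

```python
import numpy as np

def alignment(x, y, alpha=2):
    """Mean distance between positive pairs (lower = better aligned).

    x, y: (n, d) L2-normalized embeddings; row i of x and row i of y
    are two views of the same image.
    """
    return float(np.mean(np.linalg.norm(x - y, axis=1) ** alpha))

def uniformity(x, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Lower values mean the embeddings are spread more evenly over the
    unit sphere. x: (n, d) L2-normalized embeddings.
    """
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    upper = np.triu_indices(len(x), k=1)  # each unordered pair once
    return float(np.log(np.mean(np.exp(-t * sq_dists[upper]))))

# Toy check: four points spread around the unit circle versus four
# copies of the same point.
spread = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
collapsed = np.tile(np.array([[1.0, 0.0]]), (4, 1))
u_spread = uniformity(spread)
u_collapsed = uniformity(collapsed)
a_same = alignment(spread, spread)
```

In this toy check the spread-out points score lower (better) on uniformity than the collapsed ones, and identical views give an alignment of zero.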


This digest was generated by AI and reviewed by human editors.