QUICK REVIEW

[Paper Review] Intriguing Properties of Contrastive Losses

Ting Chen, Calvin Luo|arXiv (Cornell University)|Nov 5, 2020

Domain Adaptation and Few-Shot Learning | References: 37 | Cited by: 62

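One-Sentence Summary

This paper generalizes the standard contrastive loss to a broader family of losses, studies instance-level learning on images with multiple objects, shows that local features emerge, and reveals a feature suppression phenomenon: easy-to-learn shared features can hinder the learning of other features.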

ABSTRACT

We study three intriguing properties of contrastive learning. First, we generalize the standard contrastive loss to a broader family of losses, and we find that various instantiations of the generalized loss perform similarly under the presence of a multi-layer non-linear projection head. Second, we study if instance-based contrastive learning (with a global image representation) can learn well on images with multiple objects present. We find that meaningful hierarchical local features can be learned despite the fact that these objectives operate on global instance-level features. Finally, we study the phenomenon of feature suppression among competing features shared across augmented views, such as "color distribution" vs "object class". We construct datasets with explicit and controllable competing features, and show that, for contrastive learning, a few bits of easy-to-learn shared features can suppress, and even fully prevent, the learning of other sets of competing features. In scenarios where there are multiple objects in an image, the dominant object would suppress the learning of smaller objects. Existing contrastive learning methods critically rely on data augmentation to favor certain sets of features over others, and could suffer from learning saturation for scenarios where existing augmentations cannot fully address the feature suppression. This poses open challenges to existing contrastive learning techniques.

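Research Motivation and Objectives

Broaden the understanding of contrastive losses by proposing a generalized loss with an alignment term and a distribution term.
Evaluate whether instance-level (global) contrastive objectives can still learn meaningful local features on images that contain multiple objects.
Study the feature suppression caused by features shared across augmented views, and what it implies for data augmentation.
Construct controlled datasets to quantify competing features and analyze how easy-to-learn features affect representation learning.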

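Proposed Method

Proposes a generalized contrastive loss, L_generalized = L_alignment + lambda * L_distribution, and shows that the standard NT-Xent loss is a special case.
Uses several prior distributions for L_distribution, including the uniform hypersphere, the uniform hypercube, and a normal prior, with distribution matching done via LogSumExp or the sliced Wasserstein distance (SWD).
Connects the two terms within a mutual information framework, I(U;V) = H(U) - H(U|V): alignment reduces the conditional entropy H(U|V), while the distribution term maximizes the entropy H(U).
Trains with deep, multi-layer projection heads to compare the different instantiations and to assess sensitivity to batch size.
Introduces an SWD-based implementation that supports diverse priors beyond the uniform hypersphere.
Runs experiments on CIFAR-10 and ImageNet in a SimCLR-style setup, comparing loss variants and projection head depths.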
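
The decomposition above can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the authors' implementation: it uses only the Python standard library, treats each embedding as a plain list of floats, and all function names (`generalized_contrastive_loss`, `nt_xent`, `sliced_wasserstein`) are hypothetical. The alignment term is the negative mean similarity of positive pairs, and the distribution term is a LogSumExp over similarities to all other embeddings, which corresponds to the uniform-hypersphere case. With lambda = 1 the sum should match NT-Xent computed directly. The SWD function is a generic sliced Wasserstein estimate (random 1-D projections, sort, compare) and may differ in detail from the variant used in the paper.

```python
import math
import random


def l2_normalize(v):
    # Assumes v is not the zero vector.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def logsumexp(values):
    # Numerically stable log(sum(exp(values))).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def generalized_contrastive_loss(view_a, view_b, lam=1.0, tau=0.5):
    """Return (total, alignment, distribution) for paired projections.

    view_a[i] and view_b[i] are projections of two augmented views of example i.
    """
    z = [l2_normalize(v) for v in list(view_a) + list(view_b)]
    n = len(view_a)
    total = 2 * n
    alignment = 0.0
    distribution = 0.0
    for i in range(total):
        pos = (i + n) % total  # index of the other view of the same example
        alignment -= dot(z[i], z[pos]) / tau
        others = [dot(z[i], z[k]) / tau for k in range(total) if k != i]
        distribution += logsumexp(others)
    alignment /= total
    distribution /= total
    return alignment + lam * distribution, alignment, distribution


def nt_xent(view_a, view_b, tau=0.5):
    """NT-Xent computed directly as a softmax cross-entropy, for comparison."""
    z = [l2_normalize(v) for v in list(view_a) + list(view_b)]
    n = len(view_a)
    total = 2 * n
    loss = 0.0
    for i in range(total):
        pos = (i + n) % total
        others = [dot(z[i], z[k]) / tau for k in range(total) if k != i]
        loss -= dot(z[i], z[pos]) / tau - logsumexp(others)
    return loss / total


def sliced_wasserstein(samples, prior_samples, num_projections=50, rng=None):
    """Generic squared sliced Wasserstein estimate between two equal-size sets."""
    rng = rng or random.Random(0)
    dim = len(samples[0])
    acc = 0.0
    for _ in range(num_projections):
        direction = l2_normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
        proj_s = sorted(dot(v, direction) for v in samples)
        proj_p = sorted(dot(v, direction) for v in prior_samples)
        acc += sum((a - b) ** 2 for a, b in zip(proj_s, proj_p)) / len(proj_s)
    return acc / num_projections
```

To target a different prior with the SWD variant, one would pass samples drawn from that prior (for example, normalized Gaussian vectors for the uniform hypersphere) as `prior_samples` and use the result in place of the LogSumExp term.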

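Experimental Results

Research Questions

RQ1: With a deep projection head, do the generalized contrastive losses with different priors perform similarly?
RQ2: Can instance-level (global) contrastive objectives learn meaningful local features on images that contain multiple objects?
RQ3: When competing features are shared across augmented views, how does feature suppression affect contrastive learning? Do easy-to-learn features suppress other features?
RQ4: Can controlled datasets reveal the extent of feature suppression and the limits of current augmentations?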

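Key Findings

On CIFAR-10 and ImageNet, the differences among the generalized contrastive losses are small when the projection head is deep.
Instance-level learning still learns meaningful local features from global representations on images that contain multiple objects.
Experiments on datasets with explicit competing features show that a few bits of easy-to-learn shared information can suppress the learning of other features, and sometimes block it entirely.
In multi-object scenes, the dominant object can suppress the learning of smaller objects, an open challenge for contrastive methods in cluttered real-world settings.
Extra easy-to-learn information added to the views (such as random bits) can completely disable contrastive learning, whereas generative models such as VAEs are less affected.
The study stresses that data augmentation design critically determines which features are learned, and identifies feature suppression as a fundamental limitation of current contrastive methods.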
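
The random-bits finding can be illustrated with a toy sketch. This is a simplified stand-in rather than the paper's dataset construction: views are flat lists of floats instead of images, the same random bit string is appended to both views of an example, and the helper names (`append_shared_bits`, `bits_identify_pairs`) are hypothetical. The second helper checks whether the bit suffix alone is enough to match every view with its partner, which is the shortcut that lets a contrastive objective ignore the rest of the input.

```python
import random


def append_shared_bits(view_a, view_b, num_bits, rng):
    """Append the same random bit string (as floats) to both views of one example."""
    bits = [float(rng.randint(0, 1)) for _ in range(num_bits)]
    return list(view_a) + bits, list(view_b) + bits


def bits_identify_pairs(pairs, num_bits):
    """Return True if the bit suffix alone matches each view with its own partner.

    pairs is a list of (view_a, view_b); num_bits must be at least 1.
    """
    codes_a = [tuple(a[-num_bits:]) for a, _ in pairs]
    codes_b = [tuple(b[-num_bits:]) for _, b in pairs]
    return all(
        codes_a[i] == codes_b[i] and codes_a.count(codes_a[i]) == 1
        for i in range(len(pairs))
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical "augmented views": different noisy copies of each example.
    pairs = []
    for _ in range(8):
        base = [rng.random() for _ in range(4)]
        va = [x + rng.gauss(0.0, 0.1) for x in base]
        vb = [x + rng.gauss(0.0, 0.1) for x in base]
        pairs.append(append_shared_bits(va, vb, num_bits=16, rng=rng))
    print(bits_identify_pairs(pairs, num_bits=16))
```

When the bit codes are unique within a batch, the instance discrimination task can be solved from the bits alone, which is consistent with the suppression effect described in the findings above.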

This review was generated by AI and reviewed by a human editor.