QUICK REVIEW

[Paper Review] ReSSL: Relational Self-Supervised Learning with Weak Augmentation

Mingkai Zheng, Shan You | arXiv (Cornell University) | Jul 20, 2021

Domain Adaptation and Few-Shot Learning | References: 47 | Cited by: 41

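ONE-SENTENCE SUMMARY

ReSSL learns visual representations by modeling the relational similarity between augmented views; by using weak augmentation and a momentum-based memory target, it outperforms earlier SSL methods in both efficiency and accuracy.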

ABSTRACT

Self-supervised Learning (SSL), including the mainstream contrastive learning, has achieved great success in learning visual representations without data annotations. However, most methods mainly focus on instance-level information (i.e., the different augmented images of the same instance should have the same feature or cluster into the same class), and pay little attention to the relationships between different instances. In this paper, we introduce a novel SSL paradigm, which we term the relational self-supervised learning (ReSSL) framework, that learns representations by modeling the relationship between different instances. Specifically, our proposed method employs a sharpened distribution of pairwise similarities among different instances as the relation metric, which is then used to match the feature embeddings of different augmentations. Moreover, to boost performance, we argue that weak augmentations matter for representing a more reliable relation, and we leverage a momentum strategy for practical efficiency. Experimental results show that our proposed ReSSL significantly outperforms the previous state-of-the-art algorithms in terms of both performance and training efficiency. Code is available at https://github.com/KyleZheng1997/ReSSL.

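MOTIVATION AND OBJECTIVES

Drive representation learning by preserving the relationships between instances, rather than only enforcing instance-level invariance.
Introduce a relational consistency loss that keeps the similarity distributions of different augmentations aligned.
Use weak augmentation and a momentum-based teacher to provide stable, informative targets while improving training efficiency.
Demonstrate strong empirical gains on small-, medium-, and large-scale vision benchmarks.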

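PROPOSED METHOD

Define the relation metric as a sharpened distribution of pairwise similarities among different instances.
Build two augmented views of each image and compute the similarity-based relation distributions p1 and p2 with a softmax using temperatures tau_t (teacher) and tau_s (student).
Minimize the KL divergence between p1 and p2 to enforce relational consistency (implemented as a cross-entropy with p1 as the target, which differs from the KL divergence only by the entropy of the fixed target p1).
Use a momentum-updated teacher network and a memory queue to emulate large-batch relations and stabilize the targets (without requiring a large amount of memory).
Apply weak augmentation on the teacher side to provide reliable relation targets, and use a contrastive-style student to learn from these relations.
Replace the conventional contrastive loss with the proposed relational consistency loss to reach state-of-the-art results at moderate training cost.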
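
The steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation (see the repository linked in the abstract for that): the function names, the temperature values tau_t=0.04 and tau_s=0.1, and the momentum m=0.99 are placeholders chosen for the example, and the encoders are left out so that only the relation distributions, the momentum update, and the queue update are shown.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def log_softmax(logits, axis=-1):
    """Numerically stable log-softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def relation_loss(z_teacher, z_student, queue, tau_t=0.04, tau_s=0.1):
    """Cross-entropy between teacher and student relation distributions.

    z_teacher: (B, D) embeddings of the weakly augmented views (target side).
    z_student: (B, D) embeddings of the other augmented views.
    queue:     (K, D) embeddings of other instances held in the memory queue.
    The temperature defaults are illustrative assumptions.
    """
    z_teacher, z_student, queue = map(l2_normalize, (z_teacher, z_student, queue))
    # Similarity of every sample to every queued instance, shape (B, K).
    p1 = softmax(z_teacher @ queue.T / tau_t)          # sharpened target
    log_p2 = log_softmax(z_student @ queue.T / tau_s)  # student prediction
    # H(p1, p2) = KL(p1 || p2) + H(p1); H(p1) does not depend on the student.
    return float(-(p1 * log_p2).sum(axis=1).mean())


def momentum_update(teacher_params, student_params, m=0.99):
    """EMA step: teacher <- m * teacher + (1 - m) * student, per array."""
    return [m * t + (1.0 - m) * s for t, s in zip(teacher_params, student_params)]


def enqueue(queue, new_keys):
    """FIFO update: drop the oldest rows, append the newest teacher embeddings.

    Assumes the batch is no larger than the queue.
    """
    new_keys = l2_normalize(new_keys)
    return np.concatenate([queue[len(new_keys):], new_keys], axis=0)
```

In a training loop built on this sketch, each step would compute relation_loss on the current batch, update the student by gradient descent, refresh the teacher with momentum_update, and push the teacher embeddings into the queue with enqueue. The teacher side receives no gradient, which is why the cross-entropy and the KL divergence lead to the same student update.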

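EXPERIMENTAL RESULTS

Research Questions

RQ1: Does preserving the inter-instance relational structure across augmentations improve representation learning beyond conventional instance discrimination?
RQ2: Does using weaker augmentation for the target yield more reliable relation distributions and better performance?
RQ3: How do the memory queue size and the teacher momentum affect the quality of the relation targets and downstream accuracy?
RQ4: How does ReSSL compare with strong baselines on standard SSL benchmarks (ImageNet linear evaluation, transfer tasks)?
RQ5: Is ReSSL more efficient than SSL methods that require multiple backpropagation passes, while maintaining or improving performance?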

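Key Findings

ReSSL reaches 69.9% top-1 under ImageNet linear evaluation (200 epochs, 1x backpropagation with EMA), ahead of MoCoV2 by 2.4%.
With a multi-crop strategy, ReSSL reaches 74.7% top-1 on ImageNet, 1.4% above CLSA-Multi.
Weak teacher augmentation markedly improves performance on CIFAR-10, CIFAR-100, STL-10, and Tiny ImageNet.
A larger memory bank for the relation targets (up to 16384) yields better accuracy, with diminishing returns at larger sizes.
On ImageNet-1k with 2x backpropagation, ReSSL remains competitive with or superior to several baselines; with 4 crops, it surpasses the previous state of the art.
t-SNE visualizations show better class separation for ReSSL than for MoCoV2, indicating that the learned features have a clearer relational structure.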


This review was generated by AI and reviewed by a human editor.