
[Paper Review] Self-Supervised Training Enhances Online Continual Learning

Jhair Gallardo, Tyler L. Hayes|arXiv (Cornell University)|Mar 25, 2021

Domain Adaptation and Few-Shot Learning | References: 72 | Cited by: 23

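One-Sentence Summary

The paper proposes replacing supervised pre-training with self-supervised pre-training for online continual learning (OCL) on ImageNet. It shows that self-supervised features, particularly those from SwAV, generalize better to unseen classes, with larger gains when pre-training data is limited, and its best system achieves a 14.95% relative increase in top-1 accuracy over the prior state of the art for online continual learning.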

ABSTRACT

In continual learning, a system must incrementally learn from a non-stationary data stream without catastrophic forgetting. Recently, multiple methods have been devised for incrementally learning classes on large-scale image classification tasks, such as ImageNet. State-of-the-art continual learning methods use an initial supervised pre-training phase, in which the first 10% - 50% of the classes in a dataset are used to learn representations in an offline manner before continual learning of new classes begins. We hypothesize that self-supervised pre-training could yield features that generalize better than supervised learning, especially when the number of samples used for pre-training is small. We test this hypothesis using the self-supervised MoCo-V2, Barlow Twins, and SwAV algorithms. On ImageNet, we find that these methods outperform supervised pre-training considerably for online continual learning, and the gains are larger when fewer samples are available. Our findings are consistent across three online continual learning algorithms. Our best system achieves a 14.95% relative increase in top-1 accuracy on class incremental ImageNet over the prior state of the art for online continual learning.

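Research Motivation and Objectives

Investigate whether self-supervised pre-training improves generalization in online continual learning (OCL) compared with conventional supervised pre-training.
Evaluate how discriminative self-supervised features are on unseen ImageNet classes, particularly when pre-training data is limited.
Determine whether self-supervised features improve performance across multiple OCL algorithms, especially in data-scarce pre-training regimes.
Establish a new state of the art (SOTA) for online continual learning on ImageNet by replacing supervised pre-training with self-supervised pre-training.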

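Proposed Method

Pre-train ResNet-18 features on subsets of ImageNet classes using three self-supervised methods: MoCo-V2, Barlow Twins, and SwAV.
Measure generalization, and thus feature quality, through offline linear evaluation on ImageNet classes that were not seen during pre-training.
Integrate the self-supervised features into the REMIND OCL framework, in which the pre-trained features are frozen and only the plastic layers are updated online.
Compare self-supervised and supervised pre-training across three OCL algorithms and across different amounts of pre-training data.
Use standard top-1 accuracy on the 1000-class ImageNet dataset as the primary evaluation metric in both the offline and online settings.
Select SwAV as the strongest baseline because it performs consistently in the ablation experiments and yields substantial generalization gains.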
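The frozen-feature setup described above can be illustrated with a minimal sketch. This is not REMIND and not the authors' code: the "backbone" is a fixed random projection standing in for a pre-trained ResNet-18, the data are synthetic Gaussian blobs, and the online learner is a simple streaming nearest-class-mean classifier. All names here (`extract_features`, `StreamingNearestClassMean`, `make_class_data`, `run_stream`) are hypothetical and introduced only for illustration. The point it shows is structural: features stay frozen, classes arrive one after another, and the classifier is updated one sample at a time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frozen, pre-trained backbone: a fixed random
# projection followed by ReLU. In the paper this role is played by ResNet-18.
W_FROZEN = rng.normal(size=(32, 16))


def extract_features(x):
    """Map raw inputs to features; W_FROZEN is never updated."""
    return np.maximum(x @ W_FROZEN, 0.0)


class StreamingNearestClassMean:
    """Online classifier: one running mean per class, updated per sample."""

    def __init__(self, dim):
        self.dim = dim
        self.means = {}
        self.counts = {}

    def fit_one(self, feat, label):
        # Create a new class slot the first time a label appears.
        if label not in self.means:
            self.means[label] = np.zeros(self.dim)
            self.counts[label] = 0
        self.counts[label] += 1
        # Incremental mean update; no past samples are stored.
        self.means[label] += (feat - self.means[label]) / self.counts[label]

    def predict(self, feats):
        labels = sorted(self.means)
        centers = np.stack([self.means[c] for c in labels])
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.array(labels)[dists.argmin(axis=1)]


def make_class_data(class_id, n):
    """Toy data: each class is a Gaussian blob around its own center."""
    center = np.random.default_rng(100 + class_id).normal(scale=3.0, size=32)
    return center + rng.normal(size=(n, 32))


def run_stream(num_classes=5, n_train=50, n_test=20):
    clf = StreamingNearestClassMean(dim=16)
    test_x, test_y = [], []
    # Class-incremental stream: classes arrive in order, each sample seen once.
    for c in range(num_classes):
        for x in make_class_data(c, n_train):
            clf.fit_one(extract_features(x[None, :])[0], c)
        test_x.append(make_class_data(c, n_test))
        test_y.append(np.full(n_test, c))
    # Evaluate on all classes seen so far, after the stream has ended.
    preds = clf.predict(extract_features(np.concatenate(test_x)))
    return float((preds == np.concatenate(test_y)).mean())


accuracy = run_stream()
print(f"top-1 accuracy after the stream: {accuracy:.3f}")
```

In a setup like this, the quality of the frozen features fully determines how separable the new classes are, which is why the choice of pre-training method matters so much for the online phase.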

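Experimental Results

Research Questions

RQ1: When pre-training data is limited, does self-supervised pre-training generalize better to unseen ImageNet classes than supervised pre-training?
RQ2: In offline linear evaluation, how discriminative are self-supervised features compared with supervised features?
RQ3: Do self-supervised features improve performance in the online continual learning setting across multiple OCL algorithms?
RQ4: How does the amount of pre-training data affect the performance gap between self-supervised and supervised pre-training in OCL?
RQ5: Can self-supervised pre-training set a new SOTA for online continual learning on ImageNet?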

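Key Findings

Self-supervised pre-training with MoCo-V2, Barlow Twins, and SwAV outperforms supervised pre-training on ImageNet in both offline linear evaluation and online continual learning.
The performance gap between self-supervised and supervised pre-training grows as fewer classes are used for pre-training, and is largest in the most data-scarce settings.
SwAV features perform best across all settings, consistently matching or outperforming supervised features in both offline and online evaluation.
When pre-training on only 10% of the ImageNet classes, the best self-supervised system achieves a 14.95% relative increase in top-1 accuracy over the prior SOTA for online continual learning.
Self-supervised features generalize better to unseen classes than supervised features, especially in low-data regimes, which is attributed to their more class-agnostic representations.
The results are consistent across three different online continual learning algorithms, supporting self-supervised pre-training as a robust, general-purpose improvement.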
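Note that the 14.95% figure is a relative increase, not a gain in absolute percentage points. A minimal sketch of the difference follows; the helper `relative_increase` and the two accuracy values are hypothetical and chosen only for illustration, not taken from the paper.

```python
def relative_increase(baseline_acc, new_acc):
    """Relative gain of new_acc over baseline_acc, in percent."""
    if baseline_acc <= 0:
        raise ValueError("baseline accuracy must be positive")
    return 100.0 * (new_acc - baseline_acc) / baseline_acc


# Hypothetical accuracies for illustration only; not values from the paper.
baseline, improved = 40.0, 46.0
print(f"absolute gain: {improved - baseline:.2f} points")
print(f"relative gain: {relative_increase(baseline, improved):.2f}%")
```

The same absolute gain corresponds to a larger relative increase when the baseline accuracy is lower, so relative figures should be read alongside the underlying top-1 accuracies.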

This review was generated by AI and reviewed by human editors.