QUICK REVIEW

[Paper Review] Revisiting Self-Supervised Visual Representation Learning

Alexander Kolesnikov, Xiaohua Zhai|arXiv (Cornell University)|Jan 25, 2019

Domain Adaptation and Few-Shot Learning | References: 49 | Cited by: 68

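One-Sentence Summary

This paper presents a large-scale study of self-supervised visual representation learning, showing that CNN architecture choice and network width strongly affect the learned representations, and reaches a new state of the art by optimizing the architecture and the pretext task together.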

ABSTRACT

Unsupervised visual representation learning remains a largely unsolved problem in computer vision research. Among a big body of recently proposed approaches for unsupervised learning of visual representations, a class of self-supervised techniques achieves superior performance on many challenging benchmarks. A large number of the pretext tasks for self-supervised learning have been studied, but other important aspects, such as the choice of convolutional neural networks (CNN), have not received equal attention. Therefore, we revisit numerous previously proposed self-supervised models, conduct a thorough large-scale study and, as a result, uncover multiple crucial insights. We challenge a number of common practices in self-supervised visual representation learning and observe that standard recipes for CNN design do not always translate to self-supervised representation learning. As part of our study, we drastically boost the performance of previously proposed techniques and outperform previously published state-of-the-art results by a large margin.

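Research Motivation and Goals

Assess how the choice of CNN architecture affects the quality of self-supervised visual representations.
Determine whether standard design practices from supervised learning transfer to the self-supervised setting.
Determine how network width and invertibility affect representation quality.
Assess whether linear evaluation is adequate for measuring representations obtained through self-supervised learning.
Provide guidelines for choosing architectures and tasks to improve unsupervised learning performance.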

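Proposed Method

Evaluate six CNN architectures (ResNet family, RevNet, VGG) on self-supervised tasks across different widening factors (k).
Revisit four self-supervised techniques (Rotation, Exemplar, Relative Patch Location, Jigsaw) across different architectures.
Train a linear logistic regression classifier on pre-logits layer representations for the downstream ImageNet and Places205 tasks.
Compare linear and nonlinear (MLP) evaluation to assess whether a linear probe is sufficient.
Analyze the effects of network width and representation size independently.
Examine the SGD training dynamics of linear evaluation to understand its convergence requirements.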
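
To make the Rotation pretext task concrete, the sketch below shows how training pairs are commonly built for it: each image is rotated by 0, 90, 180, and 270 degrees, and the network is asked to predict which rotation was applied. This is a minimal illustration in plain Python on a tiny nested-list "image", not the authors' pipeline; the names `rotate90` and `make_rotation_examples` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a tiny grayscale image stored as rows of pixel values.
Image = List[List[int]]


def rotate90(image: Image) -> Image:
    """Rotate an image by 90 degrees counterclockwise."""
    # Transposing and then reversing the row order gives a CCW rotation.
    return [list(row) for row in zip(*image)][::-1]


def make_rotation_examples(image: Image) -> List[Tuple[Image, int]]:
    """Return four (rotated_image, label) pairs.

    Label k means the image was rotated by k * 90 degrees, so the
    pretext task becomes a 4-way classification problem that needs
    no human annotation.
    """
    examples = []
    current = [list(row) for row in image]  # copy so the input is untouched
    for label in range(4):
        examples.append((current, label))
        current = rotate90(current)
    return examples


def demo() -> List[Tuple[Image, int]]:
    # Hypothetical 2x2 input, chosen only so the rotations are easy to read.
    return make_rotation_examples([[1, 2], [3, 4]])
```

Calling `demo()` yields four labeled copies of the same image. In a real setup, these pairs would be fed to the CNN under study, and the pre-logits features of that CNN would later be reused for the downstream evaluation.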

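Experimental Results

Research Questions

RQ1: How does CNN architecture affect the quality of representations learned through self-supervised tasks?
RQ2: Do CNN design choices that are standard in supervised learning transfer to the self-supervised setting?
RQ3: What effect does increasing network width and representation size have on self-supervised performance?
RQ4: Is linear evaluation sufficient to measure representation quality across architectures and tasks?
RQ5: How do skip connections and invertibility affect the preservation of useful representations in deep networks?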

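Key Findings

Architecture choice significantly affects self-supervised performance, and architecture rankings differ from task to task.
Skip connections help prevent representation quality from degrading toward the deeper layers in self-supervised learning.
Increasing the number of filters (width) and the representation size consistently improves performance.
Linear evaluation is largely sufficient; in this setting, nonlinear evaluation yields only marginal gains.
Context prediction, the technique that originally sparked interest in self-supervised learning, can achieve leading results with a suitable architecture.
Wider models bring gains across datasets (ImageNet and Places205) and in the low-data regime.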
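
The linear evaluation referred to in these findings can be sketched as follows: the self-supervised network is frozen, its features are extracted, and only a logistic regression classifier is trained on top. The code below is a simplified illustration under several assumptions: it uses synthetic 2-D features from a hypothetical `make_toy_features` helper in place of real pre-logits activations, a binary problem in place of ImageNet's many classes, and full-batch gradient descent in place of SGD.

```python
import math
import random
from typing import List, Tuple


def make_toy_features(seed: int = 0, per_class: int = 40) -> Tuple[List[List[float]], List[int]]:
    """Hypothetical stand-in for frozen pre-logits features of two classes."""
    rng = random.Random(seed)
    features, labels = [], []
    for label, center in ((1, 2.0), (0, -2.0)):
        for _ in range(per_class):
            features.append([center + rng.gauss(0.0, 0.5), center + rng.gauss(0.0, 0.5)])
            labels.append(label)
    return features, labels


def train_linear_probe(features, labels, lr: float = 0.5, epochs: int = 200):
    """Fit binary logistic regression on fixed features (the encoder is never updated)."""
    dim = len(features[0])
    weights, bias, n = [0.0] * dim, 0.0, len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = 1.0 / (1.0 + math.exp(-z)) - y  # sigmoid(z) - target
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        # Full-batch gradient step on the mean cross-entropy loss.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(weights, bias, features, labels) -> float:
    """Fraction of examples on the correct side of the linear decision boundary."""
    correct = 0
    for x, y in zip(features, labels):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((z > 0) == (y == 1))
    return correct / len(labels)
```

A nonlinear (MLP) probe would add a hidden layer between the features and the classifier. The finding above is that this extra capacity brings only marginal gains in the authors' setting, which is why a linear probe like this one is treated as an adequate measure.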


This review was generated by AI and reviewed by human editors.