QUICK REVIEW

[Paper Review] SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption

Dara Bahri, Heinrich Jiang|arXiv (Cornell University)|Jun 29, 2021

Domain Adaptation and Few-Shot Learning|Cited by 37

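One-Sentence Summary

SCARF is a simple self-supervised contrastive pre-training method for tabular data that forms views by replacing a random subset of features with draws from their empirical marginal distributions, improving supervised accuracy, robustness to label noise, and semi-supervised performance on the OpenML-CC18 datasets.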

ABSTRACT

Self-supervised contrastive representation learning has proved incredibly successful in the vision and natural language domains, enabling state-of-the-art performance with orders of magnitude less labeled data. However, such methods are domain-specific and little has been done to leverage this technique on real-world tabular datasets. We propose SCARF, a simple, widely-applicable technique for contrastive learning, where views are formed by corrupting a random subset of features. When applied to pre-train deep neural networks on the 69 real-world, tabular classification datasets from the OpenML-CC18 benchmark, SCARF not only improves classification accuracy in the fully-supervised setting but does so also in the presence of label noise and in the semi-supervised setting where only a fraction of the available training data is labeled. We show that SCARF complements existing strategies and outperforms alternatives like autoencoders. We conduct comprehensive ablations, detailing the importance of a range of factors.

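Motivation and Objectives

Motivate and develop a self-supervised, domain-agnostic pre-training method for tabular data.
Form views by corrupting a random subset of features with values drawn from their empirical marginals.
Demonstrate improved downstream classification performance in the fully supervised, label-noise, and semi-supervised settings.
Show robustness to hyperparameters and ablate the design choices to establish SCARF's effectiveness.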

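Proposed Method

Generate a corrupted view by randomly selecting a subset of features and replacing each selected feature with a random draw from its empirical marginal distribution.
Pass both the original view and the corrupted view through the encoder f and the pre-training head g to obtain z and z~.
Train with the InfoNCE loss, which pulls z and z~ from the same example together while pushing apart pairs formed from different examples in the batch.
Fine-tune by attaching a classification head h on top of the encoder f and training end to end on the labeled data.
Optionally, use early stopping on the validation InfoNCE loss to decide how long to pre-train.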
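The corruption and loss steps described in this section can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `corrupt` and `info_nce` are hypothetical, the empirical marginal of each feature is approximated by sampling values from the same column of the current data matrix, and the loss is a common cosine-similarity form of InfoNCE whose normalization may differ from the paper's by a constant. The encoder f and head g are left out; any model that maps rows to embeddings could supply z and z~.

```python
import numpy as np

def corrupt(x, corruption_rate, rng):
    """Return a corrupted copy of x and the boolean mask of corrupted entries.

    Assumption: each row has int(corruption_rate * d) features replaced by
    values drawn from that feature's empirical marginal, i.e. from the values
    observed in the same column of x.
    """
    n, d = x.shape
    num_corrupt = int(corruption_rate * d)
    mask = np.zeros((n, d), dtype=bool)
    for i in range(n):
        # Choose which features to corrupt for this row.
        mask[i, rng.choice(d, size=num_corrupt, replace=False)] = True
    # Pick a random donor row for every entry; x[donor, j] is a draw
    # from the empirical marginal of feature j.
    donors = rng.integers(0, n, size=(n, d))
    sampled = x[donors, np.arange(d)]
    return np.where(mask, sampled, x), mask

def info_nce(z, z_tilde, temperature=1.0, eps=1e-12):
    """InfoNCE over a batch: row i of z should match row i of z_tilde."""
    z = z / (np.linalg.norm(z, axis=1, keepdims=True) + eps)
    z_tilde = z_tilde / (np.linalg.norm(z_tilde, axis=1, keepdims=True) + eps)
    logits = z @ z_tilde.T / temperature          # pairwise cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal; all other columns act as negatives.
    return -np.mean(np.diag(log_prob))

# Usage on synthetic data (values are illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 10))
x_tilde, mask = corrupt(x, corruption_rate=0.6, rng=rng)
w = rng.normal(size=(10, 8))      # hypothetical stand-in for g(f(.))
loss = info_nce(np.tanh(x @ w), np.tanh(x_tilde @ w))
print(float(loss))
```

Sampling a donor row independently for every entry keeps each replaced value within the observed range of its own feature. This is one plausible reading of why the key findings report marginal-sampling corruption as more robust to feature scaling than the alternatives.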

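Experimental Results

Research Questions

RQ1: Does SCARF pre-training improve downstream classification accuracy on tabular data in the fully supervised setting?
RQ2: Is SCARF robust to label noise, and does it help in the semi-supervised setting?
RQ3: How does SCARF interact with other regularization or data augmentation techniques and with changes in its hyperparameters?
RQ4: Are there corruption schemes or losses that work better on tabular data than those SCARF proposes?
RQ5: Which ablations reveal the importance of view construction and the corruption strategy?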

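Key Findings

SCARF pre-training outperforms a baseline without pre-training across the 69 OpenML-CC18 tabular datasets.
SCARF also improves performance under label noise and when only a fraction of the training data is labeled (the semi-supervised setting).
Combining SCARF with other methods (such as mixup, label smoothing, distillation, and dropout) yields additional gains, indicating that their benefits are complementary.
Ablations show that SCARF's marginal-sampling corruption works better than alternative corruption schemes and is more robust to feature scaling.
SCARF is relatively insensitive to batch size, corruption rate, and softmax temperature; reasonable defaults (e.g., a corruption rate of c ≈ 0.6) work well.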


This review was generated by AI and reviewed by a human editor.