QUICK REVIEW

[Paper Review] Self-Supervised Generalisation with Meta Auxiliary Learning

Shikun Liu, Andrew J. Davison|arXiv (Cornell University)|Jan 25, 2019

Domain Adaptation and Few-Shot Learning|References 32|Cited by 66

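ONE-SENTENCE SUMMARY

MAXL automatically learns auxiliary labels that improve the primary task's generalisation without any extra data, using a meta-trained label generator alongside a multi-task learner.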

ABSTRACT

Learning with auxiliary tasks can improve the ability of a primary task to generalise. However, this comes at the cost of manually labelling auxiliary data. We propose a new method which automatically learns appropriate labels for an auxiliary task, such that any supervised learning task can be improved without requiring access to any further data. The approach is to train two neural networks: a label-generation network to predict the auxiliary labels, and a multi-task network to train the primary task alongside the auxiliary task. The loss for the label-generation network incorporates the loss of the multi-task network, and so this interaction between the two networks can be seen as a form of meta learning with a double gradient. We show that our proposed method, Meta AuXiliary Learning (MAXL), outperforms single-task learning on 7 image datasets, without requiring any additional data. We also show that MAXL outperforms several other baselines for generating auxiliary labels, and is even competitive when compared with human-defined auxiliary labels. The self-supervised nature of our method leads to a promising new direction towards automated generalisation. Source code can be found at https://github.com/lorenmt/maxl.
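The double-gradient interaction described in the abstract can be written as a two-stage update. This is only a sketch under assumed notation, none of which appears in the text above: $f_{\theta_1}$ is the multi-task network with a primary and an auxiliary head, $g_{\theta_2}$ is the label-generation network, $\alpha$ and $\beta$ are learning rates, and $\mathcal{L}$ is a classification loss.

```latex
% Stage 1: one step of the multi-task network on the primary labels
% and on the auxiliary labels produced by the generator
\theta_1^{+} = \theta_1 - \alpha \nabla_{\theta_1} \Big[
    \mathcal{L}\big(f^{\mathrm{pri}}_{\theta_1}(x),\, y^{\mathrm{pri}}\big)
  + \mathcal{L}\big(f^{\mathrm{aux}}_{\theta_1}(x),\, g_{\theta_2}(x)\big) \Big]

% Stage 2: update the generator on the primary loss evaluated after that step
\theta_2 \leftarrow \theta_2 - \beta \nabla_{\theta_2}\,
    \mathcal{L}\big(f^{\mathrm{pri}}_{\theta_1^{+}}(x),\, y^{\mathrm{pri}}\big)
```

Because $\theta_1^{+}$ depends on $\theta_2$ only through the auxiliary term of stage 1, the stage 2 gradient has to differentiate through a gradient step, which is where the second-order (double gradient) term comes from.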

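MOTIVATION AND OBJECTIVES

Make auxiliary learning usable for improving generalisation without manually labelled auxiliary data.
Propose a self-supervised framework that generates auxiliary labels automatically.
Show that MAXL improves primary-task accuracy across multiple image datasets.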

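PROPOSED METHOD

A two-network MAXL architecture: a multi-task network trained on the primary and auxiliary tasks, and a label-generation network that predicts the auxiliary labels.
A hierarchical auxiliary label structure in which each primary class has its own set of auxiliary classes, enforced with a Mask SoftMax that restricts each prediction to the auxiliary classes of the sample's primary class.
A meta-learning gradient flow in which the label generator is trained on the primary-task performance of the multi-task network, which involves a second-order derivative (a Hessian-type term).
Entropy regularisation on the auxiliary label distribution to prevent the auxiliary labels from collapsing.
Focal loss on both the primary and auxiliary tasks to focus training on hard examples.
Training alternates between updating the multi-task network with the generated auxiliary labels and updating the label generator based on primary-task performance.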
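Three of the components listed above are small enough to sketch in plain Python: the masked softmax, an entropy measure that detects label collapse, and the focal loss. This is an illustrative sketch and not the authors' implementation. The function names, the contiguous layout of auxiliary classes, the batch-averaged form of the entropy, the hard-target form of the focal loss and the default `gamma=2.0` are all assumptions made for the example.

```python
import math


def mask_softmax(logits, primary_class, psi):
    """Softmax restricted to the auxiliary classes owned by `primary_class`.

    psi[i] is the number of auxiliary classes assigned to primary class i.
    Assumed layout: auxiliary classes are stored contiguously, class by class.
    """
    if len(logits) != sum(psi):
        raise ValueError("expected one logit per auxiliary class")
    start = sum(psi[:primary_class])
    end = start + psi[primary_class]
    # Subtract the largest allowed logit for numerical stability.
    peak = max(logits[start:end])
    exps = [math.exp(z - peak) if start <= k < end else 0.0
            for k, z in enumerate(logits)]
    total = sum(exps)
    return [e / total for e in exps]


def mean_entropy(prob_batch):
    """Entropy of the batch-averaged auxiliary distribution.

    A value near zero means the batch collapsed onto one auxiliary class.
    """
    n = len(prob_batch)
    mean = [sum(col) / n for col in zip(*prob_batch)]
    return -sum(q * math.log(q) for q in mean if q > 0.0)


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample with a hard target index; needs probs[target] > 0."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Two primary classes owning 2 and 3 auxiliary classes respectively.
psi = [2, 3]
logits = [1.0, 2.0, 0.5, 0.5, 0.5]
probs = mask_softmax(logits, primary_class=1, psi=psi)
```

In this example the sample belongs to primary class 1, so the first two auxiliary classes receive zero probability and the remaining three share the mass. A regulariser built on `mean_entropy` would reward batches whose labels spread over several auxiliary classes, which is one simple reading of how collapse can be discouraged.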

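EXPERIMENTAL RESULTS

Research Questions

RQ1: Can a self-generated auxiliary label space improve primary-task generalisation without any additional data?
RQ2: How do automatically generated auxiliary labels compare with random, unsupervised-clustering, or human-defined auxiliary labels?
RQ3: Does a hierarchical auxiliary label structure help or hurt performance across datasets?
RQ4: How does the gradient similarity between the auxiliary loss and the primary loss evolve during training with MAXL?
RQ5: Without any supervision for the auxiliary task, can MAXL approach or match the performance of human-defined auxiliary labels?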

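Key Findings

MAXL outperforms single-task learning on seven image datasets while using the same labelled data.
MAXL outperforms the baseline auxiliary-label generators (Random, K-Means) and is competitive with human-defined auxiliary labels on CIFAR-100.
On CIFAR-100 with a hierarchy, the auxiliary gradient under MAXL stays useful throughout training (positive cosine similarity with the primary gradient), unlike the fixed-label baselines.
In t-SNE visualisations, MAXL separates the primary classes better than single-task learning and approaches the separation achieved with human-defined auxiliary labels.
The method remains robust across a range of hierarchies (psi values) and needs no dataset-specific tuning.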
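The gradient-usefulness finding rests on the cosine similarity between the auxiliary-loss gradient and the primary-loss gradient. As a minimal sketch, assuming both gradients have already been flattened into plain lists of floats over the same shared parameters (the function name and the zero-gradient convention are choices made for this example):

```python
import math


def gradient_cosine_similarity(grad_aux, grad_pri):
    """Cosine of the angle between two flattened gradient vectors.

    Positive: the auxiliary update also moves the shared weights in a
    direction that lowers the primary loss. Negative: the tasks conflict.
    """
    if len(grad_aux) != len(grad_pri):
        raise ValueError("gradients must cover the same parameters")
    dot = sum(a * b for a, b in zip(grad_aux, grad_pri))
    norm = math.sqrt(sum(a * a for a in grad_aux)) * \
        math.sqrt(sum(b * b for b in grad_pri))
    # Assumed convention: a zero gradient is treated as uninformative.
    return dot / norm if norm > 0.0 else 0.0


aligned = gradient_cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = gradient_cosine_similarity([1.0, 0.0], [0.0, 3.0])
conflicting = gradient_cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

Tracking this value over training steps gives the kind of curve the finding describes: values that stay above zero indicate that the auxiliary task keeps helping the primary task.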

This review was generated by AI and reviewed by human editors.