QUICK REVIEW

[Paper Review] Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace

Yoonho Lee, Seungjin Choi | arXiv (Cornell University) | Jan 17, 2018

Domain Adaptation and Few-Shot Learning | Cited by 73

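One-Sentence Summary

MT-nets and T-nets improve gradient-based meta-learning by meta-learning both a subspace and a task-specific metric warping: the meta-learner learns which weights should adapt and how the activation space should be shaped so that adaptation is fast.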

ABSTRACT

Gradient-based meta-learning methods leverage gradient descent to learn the commonalities among various tasks. While previous such methods have been successful in meta-learning tasks, they resort to simple gradient descent during meta-testing. Our primary contribution is the MT-net, which enables the meta-learner to learn on each layer's activation space a subspace that the task-specific learner performs gradient descent on. Additionally, a task-specific learner of an MT-net performs gradient descent with respect to a meta-learned distance metric, which warps the activation space to be more sensitive to task identity. We demonstrate that the dimension of this learned subspace reflects the complexity of the task-specific learner's adaptation task, and also that our model is less sensitive to the choice of initial learning rates than previous gradient-based meta-learning methods. Our method achieves state-of-the-art or comparable performance on few-shot classification and regression tasks.

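Motivation and Objectives

Motivate a form of gradient-based meta-learning that learns where in each layer (that is, in which subspace) task adaptation should take place.
Introduce MT-nets, which learn a subspace for task-specific updates together with a meta-learned metric that warps the activation space.
Show that the dimension of the learned subspace reflects task complexity, and that MT-nets are less sensitive to the initial learning rate.
Demonstrate that MT-nets reach state-of-the-art or competitive performance on few-shot classification and regression tasks.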

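Proposed Method

Introduce Transformation Networks (T-nets), which learn a metric in activation space through a per-layer transformation matrix T.
Extend them to Mask Transformation Networks (MT-nets), which additionally learn a binary gradient mask M that selects which weights are updated for a given task.
MT-nets parameterize M with logits zeta and backpropagate through mask sampling using the Gumbel-Softmax reparameterization.
State the update rule: W ← W - alpha M ∘ ∇_W L(...); in MT-nets, the mask selects the gradient subspace used for task adaptation and the learned metric T is applied to it.
Derive that MT-nets can restrict updates to an arbitrary subspace with an associated metric, so that gradient descent is effectively performed in a low-dimensional, task-aware embedding.
Outline the meta-objective, which is optimized over batches of tasks to minimize L_t(θ̃_W,T, D_train, D_test).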
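
The mask sampling and masked update described above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the helper names (`gumbel_noise`, `sample_row_mask`, `masked_update`) are hypothetical, the mask is assumed to be row-wise, the two-class logits are assumed to be (zeta, 0), and rounding the relaxed sample to 0/1 is an assumed choice for illustration.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one Gumbel(0, 1) sample via the inverse-CDF trick."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def sample_row_mask(zeta, temperature, rng):
    """Two-class Gumbel-Softmax sample per row from logits `zeta`.

    Returns (soft, hard). `soft` is a smooth function of zeta, so an
    autodiff framework could backpropagate through it; `hard` is the
    same sample rounded to 0/1 (an assumption made for this sketch).
    """
    soft, hard = [], []
    for z in zeta:
        # Assumed logits: z for "update this row", 0 for "freeze this row".
        a = (z + gumbel_noise(rng)) / temperature
        b = (0.0 + gumbel_noise(rng)) / temperature
        m = max(a, b)  # subtract the max for numerical stability
        ea, eb = math.exp(a - m), math.exp(b - m)
        p = ea / (ea + eb)
        soft.append(p)
        hard.append(1.0 if p > 0.5 else 0.0)
    return soft, hard


def masked_update(W, grad_W, row_mask, alpha):
    """W <- W - alpha * (M ∘ grad_W), with M constant along each row."""
    return [
        [w - alpha * m * g for w, g in zip(w_row, g_row)]
        for w_row, g_row, m in zip(W, grad_W, row_mask)
    ]


rng = random.Random(0)
zeta = [4.0, -4.0, 0.0]  # hypothetical logits for three rows
soft, hard = sample_row_mask(zeta, temperature=0.5, rng=rng)
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
grad_W = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]  # placeholder gradient
W_new = masked_update(W, grad_W, hard, alpha=0.5)
```

Rows whose mask entry is 0 keep their weights unchanged, so the task-specific update is confined to the rows the mask selects. In a real model the gradient would come from the task loss rather than the placeholder used here.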

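Experimental Results

Research Questions

RQ1: How does learning a layerwise subspace and metric affect the performance of gradient-based meta-learning?
RQ2: Can MT-nets automatically determine which parts of the network to adapt for each task, and by how much?
RQ3: Does the dimension of the learned subspace correlate with task complexity, and does this improve robustness to the choice of learning rate?
RQ4: Do T-nets and MT-nets scale to standard few-shot benchmarks (Omniglot, MiniImagenet) and to regression tasks?
RQ5: In practice, how does the row-wise mask of MT-nets compare with a full per-parameter mask?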

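Key Findings

On sine-wave regression and few-shot classification benchmarks, MT-nets outperform MAML, Meta-SGD, and variants of MT-net.
MT-nets are robust to changes in the learning rate: performance is maintained as alpha varies, because the meta-learned T warps the effective step size.
In MT-nets, the fraction of weights that are updated grows with task complexity, which suggests that the meta-learner allocates just enough degrees of freedom for adaptation.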
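On Omniglot and MiniImagenet 5-way 1-shot classification, MT-nets reach accuracy that matches or exceeds competing methods (for example, 99.4% for T-net and 99.5% for MT-net on Omniglot 5-way 1-shot, and 96.2% for MT-net on Omniglot 20-way 1-shot; MiniImagenet 5-way 1-shot accuracy is far lower, about 51.7% for MT-net).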
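The dimension of the subspace learned by MT-nets reflects task difficulty and acts as an implicit Occam-style regularizer, since only the necessary parameters are updated.
The method generalizes across regression and classification and can be applied to larger architectures, because any feedforward network can be converted into an MT-net.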
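
The learning-rate finding above can be made concrete with a small numerical check. The sketch below assumes a single linear layer of the form y = T W x with a squared-error loss; the specific matrices and helper names (`matmul`, `transpose`, `step`, `grad_wrt_effective`) are made up for illustration. Under that assumption, a plain gradient step on W moves the effective weights A = T W by a step preconditioned with T Tᵀ, which is one way to read the statement that T warps the effective step size.

```python
def matmul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def step(A, G, alpha):
    """Return A - alpha * G elementwise."""
    return [[a - alpha * g for a, g in zip(ra, rg)] for ra, rg in zip(A, G)]


def grad_wrt_effective(A, x, target):
    """Gradient of 0.5 * ||A x - target||^2 with respect to A.

    `x` and `target` are column vectors stored as n-by-1 matrices.
    """
    residual = step(matmul(A, x), target, 1.0)  # A x - target
    return matmul(residual, transpose(x))


# Hypothetical values chosen only for illustration.
T = [[2.0, 0.0], [0.0, 0.5]]
W = [[1.0, -1.0], [0.5, 2.0]]
x = [[1.0], [2.0]]
target = [[0.0], [1.0]]
alpha = 0.1

A = matmul(T, W)                         # effective weights of the layer
G_A = grad_wrt_effective(A, x, target)   # dL/dA
G_W = matmul(transpose(T), G_A)          # chain rule: dL/dW = T^T dL/dA

# Route 1: plain gradient step on W, then recompute the effective weights.
A_from_W_step = matmul(T, step(W, G_W, alpha))

# Route 2: step directly on A, preconditioned by T T^T.
A_preconditioned = step(A, matmul(matmul(T, transpose(T)), G_A), alpha)

# Reference: an unpreconditioned step on A, which differs from both routes.
A_plain = step(A, G_A, alpha)
```

The two routes agree up to floating-point error, while `A_plain` differs from both, so a fixed alpha yields different effective step sizes along different directions depending on T.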


This review was generated by AI and reviewed by human editors.