QUICK REVIEW

[Paper Review] LGM-Net: Learning to Generate Matching Networks for Few-Shot Learning

Huaiyu Li, Weiming Dong | arXiv (Cornell University) | May 15, 2019

Domain Adaptation and Few-Shot Learning | Cited by 50

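One-Sentence Summary

LGM-Net trains a MetaNet that generates the weights of a TargetNet from few-shot task data, enabling fast adaptation to unseen tasks without fine-tuning; a Task Context Encoder and a Weight Generator produce the parameters of a matching-network TargetNet, and intertask normalization shares information across tasks.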

ABSTRACT

In this work, we propose a novel meta-learning approach for few-shot classification, which learns transferable prior knowledge across tasks and directly produces network parameters for similar unseen tasks with training samples. Our approach, called LGM-Net, includes two key modules, namely, TargetNet and MetaNet. The TargetNet module is a neural network for solving a specific task and the MetaNet module aims at learning to generate functional weights for TargetNet by observing training samples. We also present an intertask normalization strategy for the training process to leverage common information shared across different tasks. The experimental results on Omniglot and miniImageNet datasets demonstrate that LGM-Net can effectively adapt to similar unseen tasks and achieve competitive performance, and the results on synthetic datasets show that transferable prior knowledge is learned by the MetaNet module via mapping training data to functional weights. LGM-Net enables fast learning and adaptation since no further tuning steps are required compared to other meta-learning approaches.

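Motivation and Objectives

Address the need for fast adaptation to unseen few-shot tasks by exploiting prior knowledge that transfers across tasks;
Propose a meta-learning framework that generates the functional weights of the TargetNet directly from limited task data;
Introduce an efficient MetaNet architecture consisting of a Task Context Encoder and a Conditional Weight Generator;
Apply intertask normalization during training to exploit information shared across tasks.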

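Proposed Method

Two-module architecture: TargetNet (the task-specific network, which runs with generated weights) and MetaNet (which generates the TargetNet weights from task data)
MetaNet consists of a Task Context Encoder, which encodes the training samples into a fixed-size context, and a Conditional Weight Generator, which maps that context to TargetNet weights
The task context is modeled as a multivariate Gaussian sampled via the reparameterization trick; the weights of each TargetNet layer are produced by a layer-specific generator and then normalized (weight normalization)
TargetNet is a Matching Network whose parameters are generated by MetaNet; classification uses a cosine-distance attention mechanism (attentional metric) over the embedded features
Intertask Normalization (ITN) applies batch normalization across a batch of tasks to share statistics and improve training
Training optimizes over episodic tasks drawn from the meta-training data, using the cross-entropy loss on each task's test set to train MetaNet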
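
The components listed above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the purely linear generator, the single-layer TargetNet, the fixed `log_sigma`, and the use of a softmax over cosine similarities are all hypothetical simplifications. The paper's encoder and TargetNet are convolutional, and here the raw inputs stand in for encoder features.

```python
import math
import random


def encode_task_context(support_features, log_sigma, rng):
    """Average the support features into a Gaussian mean, then sample the
    context with the reparameterization trick: mu + sigma * eps."""
    n = len(support_features)
    mu = [sum(col) / n for col in zip(*support_features)]
    sigma = math.exp(log_sigma)
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mu]


def generate_layer_weights(context, generator):
    """Map the context to one weight row per output unit (a linear generator,
    assumed here), then rescale each row to unit L2 norm."""
    weights = []
    for unit in generator:  # unit has shape in_dim x context_dim
        row = [sum(g * c for g, c in zip(coeffs, context)) for coeffs in unit]
        norm = math.sqrt(sum(w * w for w in row)) or 1.0
        weights.append([w / norm for w in row])
    return weights


def embed(x, weights):
    """One TargetNet layer using the generated weights: linear map + ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def cosine(a, b):
    """Cosine similarity, defined as 0 when either vector is all zeros."""
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def classify(query, support, labels, weights, n_classes):
    """Softmax attention over cosine similarities to the support embeddings;
    attention mass is summed per class to give class probabilities."""
    q = embed(query, weights)
    scores = [math.exp(cosine(q, embed(s, weights))) for s in support]
    total = sum(scores)
    probs = [0.0] * n_classes
    for score, label in zip(scores, labels):
        probs[label] += score / total
    return probs


rng = random.Random(0)
support = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.1]]  # toy 2-way 1-shot task
labels = [0, 1]
# Hypothetical generator parameters: 4 output units x 3 inputs x 3 context dims.
generator = [[[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
             for _ in range(4)]
context = encode_task_context(support, log_sigma=-2.0, rng=rng)
weights = generate_layer_weights(context, generator)
probs = classify([0.9, 0.1, 0.2], support, labels, weights, n_classes=2)
print(probs)
```

Nothing in this forward pass performs a gradient step on the new task: the weights come from a single pass through the generator, which is the property the method relies on for fast adaptation. The generator here is random rather than trained, so the printed probabilities are not meaningful predictions.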

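Experimental Results

Research Questions

RQ1: Can a meta-learner learn to generate functional TargetNet weights from limited task data and generalize to unseen tasks?
RQ2: Does generating TargetNet weights with MetaNet improve few-shot performance over conventional meta-learning methods that rely on learned initializations or update rules?
RQ3: How do the Task Context Encoder and Intertask Normalization affect generalization to new tasks?
RQ4: How are the generated weights distributed across tasks, and what does this imply about transferable prior knowledge?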

Key Findings

Model	5-way 1-shot	5-way 5-shot	20-way 1-shot
Matching networks (Vinyals et al., 2016)	43.56 ± 0.84%	55.31 ± 0.73%	17.31 ± 0.22%
Meta-LSTM (Ravi & Larochelle, 2017)	43.44 ± 0.77%	60.60 ± 0.71%	16.70 ± 0.23%
MetaNet (Munkhdalai & Yu, 2017)	49.21 ± 0.96%	-	-
Prototypical Nets (Snell et al., 2017)	49.42 ± 0.78%	68.20 ± 0.66%	-
MAML (Finn et al., 2017)	48.70 ± 1.84%	63.11 ± 0.92%	16.49 ± 0.58%
Meta-SGD (Li et al., 2017)	50.47 ± 1.87%	64.03 ± 0.94%	17.56 ± 0.64%
Relation Net (Sung et al., 2018)	51.38 ± 0.82%	67.07 ± 0.69%	-
REPTILE (Nichol & Schulman)	49.97 ± 0.32%	65.99 ± 0.58%	-
SNAIL (Mishra et al., 2018)	55.71 ± 0.99%	68.88 ± 0.92%	-
(Gidaris & Komodakis, 2018)	56.20 ± 0.86%	73.00 ± 0.64%	-
LEO (Rusu et al., 2019)	61.76 ± 0.08%	77.59 ± 0.12%	-
LGM-Net (Ours)	69.13 ± 0.35%	71.18 ± 0.68%	26.14 ± 0.34%

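On miniImageNet, LGM-Net reaches 69.13% on 5-way 1-shot and 26.14% on 20-way 1-shot, the highest values in the table above. Its 5-way 5-shot accuracy of 71.18% trails LEO (77.59%) and Gidaris & Komodakis (73.00%) but exceeds the remaining baselines.
On Omniglot, LGM-Net is competitive in both the 5-way and 20-way settings (for example, 99.0% on 5-way 1-shot).
Ablation studies show that ITN substantially improves performance, that the Task Context Encoder contributes beyond a random prior, and that weight normalization stabilizes training.
t-SNE visualizations show the generated weights clustering by task, suggesting that MetaNet learns task-specific weight distributions that transfer to similar tasks.
Compared with a matching network with fixed weights and several meta-learning baselines, LGM-Net adapts better and runs inference faster because it needs no additional fine-tuning.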
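
The ITN ablation above concerns normalization statistics that are pooled over a batch of tasks rather than computed per task. A rough sketch of that idea follows, assuming plain batch normalization without learnable scale and shift; the function name `intertask_normalize` and the toy feature values are hypothetical and not taken from the paper.

```python
import math


def intertask_normalize(task_batch, eps=1e-5):
    """Normalize each feature dimension with the mean and variance pooled
    over every sample of every task in the batch."""
    pooled = [vec for task in task_batch for vec in task]
    n = len(pooled)
    columns = list(zip(*pooled))
    means = [sum(col) / n for col in columns]
    variances = [sum((v - m) ** 2 for v in col) / n
                 for col, m in zip(columns, means)]
    scales = [1.0 / math.sqrt(var + eps) for var in variances]
    return [[[(v - m) * s for v, m, s in zip(vec, means, scales)]
             for vec in task] for task in task_batch]


# Two toy tasks whose features live in different ranges.
task_a = [[0.0, 1.0], [1.0, 3.0]]
task_b = [[4.0, 5.0], [5.0, 7.0]]
normalized = intertask_normalize([task_a, task_b])
mean_a = sum(vec[0] for vec in normalized[0]) / 2
mean_b = sum(vec[0] for vec in normalized[1]) / 2
print(mean_a, mean_b)
```

Because the statistics are shared, the pooled features are centered at zero while `task_a` stays below the pooled mean and `task_b` above it, so the relative offset between tasks survives normalization. Normalizing each task separately would remove that offset.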

This review was generated by AI and reviewed by a human editor.