QUICK REVIEW

[Paper Review] Gaussian Prototypical Networks for Few-Shot Learning on Omniglot

Stanislav Fort|arXiv (Cornell University)|Aug 9, 2017

Domain Adaptation and Few-Shot Learning|References: 16|Cited by: 61

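One-Sentence Summary

This paper proposes Gaussian prototypical networks, which extend prototypical networks by predicting a covariance for each sample's embedding to represent its uncertainty, yielding a covariance-weighted distance metric for few-shot classification on Omniglot that achieves state-of-the-art results.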

ABSTRACT

We propose a novel architecture for $k$-shot classification on the Omniglot dataset. Building on prototypical networks, we extend their architecture to what we call Gaussian prototypical networks. Prototypical networks learn a map between images and embedding vectors, and use their clustering for classification. In our model, a part of the encoder output is interpreted as a confidence region estimate about the embedding point, and expressed as a Gaussian covariance matrix. Our network then constructs a direction and class dependent distance metric on the embedding space, using uncertainties of individual data points as weights. We show that Gaussian prototypical networks are a preferred architecture over vanilla prototypical networks with an equivalent number of parameters. We report state-of-the-art performance in 1-shot and 5-shot classification both in 5-way and 20-way regime (for 5-shot 5-way, we are comparable to previous state-of-the-art) on the Omniglot dataset. We explore artificially down-sampling a fraction of images in the training set, which improves our performance even further. We therefore hypothesize that Gaussian prototypical networks might perform better in less homogeneous, noisier datasets, which are commonplace in real world applications.

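Motivation and Objectives

Enable fast adaptation to unseen classes in the few-shot learning setting.
Extend prototypical networks by predicting an uncertainty (covariance) for each embedding.
Evaluate how a covariance-aware metric affects class prototypes and decision boundaries.
Study robustness to noisy, inhomogeneous data through covariance weighting and dataset down-sampling.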

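Proposed Method

Use a CNN encoder to map each image to an embedding and to predict a covariance (uncertainty) for that embedding.
Three covariance variants: radius (a scalar), diagonal (a vector), and full covariance (not used because of its complexity).
Build each class prototype as a precision-weighted combination of the support embeddings: $p_c = \sum_i (s_i \circ x_i) / \sum_i s_i$, where $\circ$ and the division are element-wise and $s_i$ holds the inverse-covariance weights of embedding $x_i$.
Define the class-level weights $s_c = \sum_i s_i$; the distance from embedding $x_i$ to class prototype $p_c$ is $d_c(i)^2 = (x_i - p_c)^T S_c (x_i - p_c)$, where $S_c = \Sigma_c^{-1}$ is the inverse class covariance built from $s_c$.
Train episodically: sample $N_c$ classes, with $N_s$ support examples and $N_q$ query examples per class, and optimize the cross-entropy of a softmax over the negative distances.
Experiment with embedding dimension and encoder capacity; compare radius and diagonal covariance; evaluate how down-sampling training data encourages the network to use its covariance estimates.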
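
The steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the layout of the encoder output (embedding values first, raw covariance values last), the `1 + softplus` mapping to positive weights, and the use of the un-squared distance as the logit are all choices made for this example. Only the radius and diagonal variants are covered.

```python
import math

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def split_encoder_output(raw, dim):
    """Split a raw encoder output into an embedding x and per-dimension weights s.

    Assumption: the first `dim` values are the embedding; one extra value
    means the radius variant, `dim` extra values the diagonal variant.
    Assumption: s = 1 + softplus(raw) keeps every weight positive.
    """
    x, s_raw = raw[:dim], raw[dim:]
    if len(s_raw) not in (1, dim):
        raise ValueError("expected 1 (radius) or dim (diagonal) covariance values")
    if len(s_raw) == 1:
        s_raw = s_raw * dim  # radius: share one scalar across all dimensions
    return x, [1.0 + softplus(v) for v in s_raw]

def class_prototype(xs, ss):
    """Element-wise p_c = sum_i(s_i * x_i) / sum_i(s_i) and s_c = sum_i(s_i)."""
    dim = len(xs[0])
    s_c = [sum(s[d] for s in ss) for d in range(dim)]
    p_c = [sum(s[d] * x[d] for x, s in zip(xs, ss)) / s_c[d] for d in range(dim)]
    return p_c, s_c

def squared_distance(x, p_c, s_c):
    """d_c^2 = (x - p_c)^T S_c (x - p_c) with S_c = diag(s_c)."""
    return sum(s * (a - b) ** 2 for a, b, s in zip(x, p_c, s_c))

def query_loss(x, label, prototypes):
    """Cross-entropy of a softmax over negative distances for one query."""
    # Assumption: the un-squared distance is the logit; squared is also plausible.
    logits = [-math.sqrt(squared_distance(x, p, s)) for p, s in prototypes]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

# Toy 2-way episode: 2-D embeddings, radius variant (3 raw outputs per image).
support = {
    0: [[0.0, 0.0, 0.0], [0.2, 0.0, 5.0]],  # second point is more "confident"
    1: [[2.0, 2.0, 0.0], [2.0, 1.8, 0.0]],
}
prototypes = []
for c in sorted(support):
    xs, ss = zip(*(split_encoder_output(r, 2) for r in support[c]))
    prototypes.append(class_prototype(xs, ss))

x_q, _ = split_encoder_output([0.1, 0.1, 0.0], 2)
loss = query_loss(x_q, 0, prototypes)
```

In this toy episode the second support point of class 0 carries a larger weight, so the class prototype is pulled toward it instead of sitting at the plain mean of the two embeddings.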

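Experimental Results

Research Questions

RQ1: Does predicting a per-sample covariance improve few-shot classification on Omniglot compared with vanilla prototypical networks?
RQ2: Which way of encoding uncertainty in this framework (radius, diagonal, or full covariance) is most effective for the fewest parameters?
RQ3: How does deliberately down-sampling training data affect the usefulness of the covariance estimates and few-shot accuracy?
RQ4: Across 1-shot and 5-shot, 5-way and 20-way settings, does the covariance-aware metric outperform existing state-of-the-art methods?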

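Key Findings

Gaussian prototypical networks outperform vanilla prototypical networks with an equivalent number of parameters.
Among the covariance variants, predicting a single radius value per embedding (the radius approach) is the most effective on Omniglot.
Down-sampling part of the training data, which makes it noisier and less homogeneous, improves k-shot performance by encouraging the network to use its covariance estimates.
The best configuration, a large model with radius covariance, achieves state-of-the-art results on 1-shot and 5-shot 20-way classification and is competitive on 5-way 5-shot.
Training on partially corrupted data further improves performance in the higher-shot regime, suggesting that covariance weighting is robust to noise.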
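
The down-sampling finding can be illustrated with a small sketch. The summary does not say how images were degraded, so everything specific here is an assumption made for illustration: the degradation method (average pooling followed by pixel repetition back to the original size), the `fraction` and `factor` values, and the function names `degrade` and `degrade_fraction`.

```python
import random

def degrade(image, factor):
    """Average-pool by `factor`, then repeat pixels back to the original size."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r0 in range(0, h, factor):
        for c0 in range(0, w, factor):
            rows = range(r0, min(r0 + factor, h))
            cols = range(c0, min(c0 + factor, w))
            mean = sum(image[r][c] for r in rows for c in cols) / (len(rows) * len(cols))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out

def degrade_fraction(images, fraction, factor, seed=0):
    """Return (new image list, indices degraded); originals are left untouched."""
    rng = random.Random(seed)
    n_bad = round(fraction * len(images))
    chosen = set(rng.sample(range(len(images)), n_bad))
    mixed = [degrade(img, factor) if i in chosen else img
             for i, img in enumerate(images)]
    return mixed, chosen

# Ten 4x4 checkerboard "images"; degrade 30% of them by a factor of 2.
images = [[[float((r + c) % 2) for c in range(4)] for r in range(4)]
          for _ in range(10)]
noisy, chosen = degrade_fraction(images, fraction=0.3, factor=2)
```

Each 2x2 block of a checkerboard averages to 0.5, so the degraded images lose all detail while the untouched ones stay sharp. A training set mixed this way is the kind of inhomogeneous data for which a per-sample uncertainty estimate can be useful.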

This review was generated by AI and checked by human editors.