QUICK REVIEW

[Paper Review] Prototypical Networks for Few-shot Learning

Jake Snell, Kevin Swersky, Richard S. Zemel | arXiv (Cornell University) | Mar 15, 2017

Domain Adaptation and Few-Shot Learning | 5,185 citations

TL;DR

Prototypical networks learn an embedding in which each class is represented by the mean of its embedded support examples (its prototype). A query is classified by its nearest prototype under squared Euclidean distance, which gave state-of-the-art results on few-shot and zero-shot benchmarks at the time of publication.

ABSTRACT

We propose prototypical networks for the problem of few-shot classification, where a classifier must generalize to new classes not seen in the training set, given only a small number of examples of each new class. Prototypical networks learn a metric space in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve excellent results. We provide an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning. We further extend prototypical networks to zero-shot learning and achieve state-of-the-art results on the CU-Birds dataset.

Motivation & Objective

Motivate a simple, data-efficient inductive bias for few-shot classification to mitigate overfitting with limited data.
Propose a metric-based approach where each class is represented by a single prototype in an embedding space.
Show that Euclidean distance to class prototypes yields strong performance and interpret the method via mixture density and clustering concepts.
Extend the approach to zero-shot learning by embedding class meta-data to form prototypes and evaluate on standard benchmarks.

Proposed method

Learn an embedding function f_phi that maps inputs to an M-dimensional space.
Define a prototype c_k for each class k as the mean of embedded support examples: c_k = (1/|S_k|) sum_{(x_i,y_i) in S_k} f_phi(x_i).
Classify a query x by p_phi(y=k|x) proportional to exp(-d(f_phi(x), c_k)) using a distance d (primarily squared Euclidean distance).
Train by minimizing negative log-probability of the true class over episodes that sample a subset of classes and examples as support and query sets.
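Provide a probabilistic interpretation: when d is a regular Bregman divergence (squared Euclidean distance is one), the model is equivalent to mixture density estimation on the support set with an exponential-family density, with the prototypes as the cluster means.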
Extend to zero-shot learning by setting c_k = g_theta(v_k), where v_k is class meta-data and g_theta is a learned embedding; the paper reports that constraining the prototype embedding to unit length helped empirically.
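
The prototype, classification, and loss steps above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the inputs have already been passed through f_phi (so the vectors below stand in for embeddings), and the function names `prototypes`, `class_log_probs`, and `episode_loss` are hypothetical.

```python
import math

def prototypes(support):
    """Class prototype = mean of that class's embedded support vectors.
    `support` maps a class label to a list of embedding vectors (assumed precomputed)."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]
    return protos

def sq_euclidean(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def class_log_probs(query, protos):
    """log p(y=k|x): log-softmax over negative distances to each prototype."""
    logits = {k: -sq_euclidean(query, c) for k, c in protos.items()}
    m = max(logits.values())  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
    return {k: v - log_z for k, v in logits.items()}

def episode_loss(queries, protos):
    """Mean negative log-probability of the true class over (embedding, label) pairs."""
    return -sum(class_log_probs(z, protos)[y] for z, y in queries) / len(queries)

# Toy 2-way, 2-shot episode with 2-D "embeddings"
support = {
    "a": [[0.0, 0.0], [0.0, 2.0]],
    "b": [[4.0, 0.0], [4.0, 2.0]],
}
protos = prototypes(support)  # {"a": [0.0, 1.0], "b": [4.0, 1.0]}
queries = [([1.0, 1.0], "a"), ([3.0, 1.0], "b")]
loss = episode_loss(queries, protos)
```

In this toy episode each query sits much closer to its own prototype, so the loss is close to zero. In actual training the loss would be backpropagated through f_phi, which this sketch leaves out.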

Experimental results

Research questions

RQ1: Can a simple prototype-based embedding with a single prototype per class generalize to unseen classes in few-shot settings?
RQ2: How does the choice of distance metric affect performance in prototype-based classification for few-shot learning?
RQ3: Does training with episodic schemes and higher-way episodes improve generalization in few-shot tasks?
RQ4: Can the prototypical framework be extended effectively to zero-shot learning using class meta-data?
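
RQ3 turns on how episodes are built. As a rough illustration, an N-way episode can be sampled by picking N classes and then splitting a few examples per class into support and query sets. The sketch below assumes the dataset is a dict from class label to a list of examples; `sample_episode` and its parameters are hypothetical names, not taken from the paper's code.

```python
import random

def sample_episode(data, n_way, n_support, n_query, rng=random):
    """Sample one episode: choose n_way classes, then split n_support + n_query
    examples per chosen class into disjoint support and query sets."""
    classes = rng.sample(sorted(data), n_way)
    support, query = {}, {}
    for c in classes:
        picked = rng.sample(data[c], n_support + n_query)
        support[c] = picked[:n_support]
        query[c] = picked[n_support:]
    return support, query

# Toy dataset: 8 classes with 10 placeholder examples each
data = {c: [f"{c}_{i}" for i in range(10)] for c in "abcdefgh"}
support, query = sample_episode(data, n_way=5, n_support=1, n_query=3,
                                rng=random.Random(0))
```

Training with "higher-way" episodes then amounts to calling the sampler with a larger `n_way` at training time than the value used at test time.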

Key findings

On Omniglot, ProtNets with Euclidean distance achieve 98.8% (1-shot) and 99.7% (5-shot) in the 5-way setting, and 96.0% (1-shot) and 98.9% (5-shot) in the 20-way setting.
On miniImageNet, ProtNets achieve 1-shot: 49.42% and 5-shot: 68.20% (5-way setting), outperforming baselines including Matching Networks and Meta-Learner LSTM.
On CUB zero-shot, ProtNets with GoogLeNet features and 312-d attributes reach 54.6% accuracy on the 50-way task, surpassing multiple attribute-based and embedding methods.
The Euclidean distance consistently outperforms cosine distance for this framework, and higher-way training episodes can improve generalization.
The approach is simpler and more efficient than many meta-learning methods while achieving state-of-the-art results across benchmarks.
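
One explanation the paper offers for the Euclidean-versus-cosine gap, as a conjecture, is that cosine distance is not a Bregman divergence, so the class mean is not guaranteed to be the best representative under it. The small numerical check below illustrates that point on made-up 2-D points; the helper names and the data are assumptions for illustration only.

```python
import math

def sq_euclidean(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

def total(dist, points, center):
    """Sum of distances from every point to a candidate class representative."""
    return sum(dist(p, center) for p in points)

points = [[1.0, 0.0], [0.0, 3.0]]
mean = [0.5, 1.5]    # arithmetic mean of the two points
other = [1.0, 1.0]   # an alternative candidate representative

sq_mean = total(sq_euclidean, points, mean)     # 5.0
sq_other = total(sq_euclidean, points, other)   # 6.0
cos_mean = total(cosine_distance, points, mean)
cos_other = total(cosine_distance, points, other)
```

Here the mean has the lower total squared Euclidean distance, while the alternative point has the lower total cosine distance, so averaging embeddings is a natural choice of prototype only in the first case.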

This review was created by AI and reviewed by human editors.