QUICK REVIEW

[Paper Review] Prototypical Networks for Few-shot Learning

Jake Snell, Kevin Swersky, Richard S. Zemel | arXiv (Cornell University) | Mar 15, 2017
Domain Adaptation and Few-Shot Learning | 5,185 citations
TL;DR

Prototypical networks learn a simple embedding where each class is represented by the mean of its examples (prototype); classification is by nearest prototype using Euclidean distance, achieving state-of-the-art results in few-shot and zero-shot tasks.

ABSTRACT

We propose prototypical networks for the problem of few-shot classification, where a classifier must generalize to new classes not seen in the training set, given only a small number of examples of each new class. Prototypical networks learn a metric space in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve excellent results. We provide an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning. We further extend prototypical networks to zero-shot learning and achieve state-of-the-art results on the CU-Birds dataset.

Motivation & Objective

  • Motivate a simple, data-efficient inductive bias for few-shot classification to mitigate overfitting with limited data.
  • Propose a metric-based approach where each class is represented by a single prototype in an embedding space.
  • Show that Euclidean distance to class prototypes yields strong performance and interpret the method via mixture density and clustering concepts.
  • Extend the approach to zero-shot learning by embedding class meta-data to form prototypes and evaluate on standard benchmarks.

Proposed method

  • Learn an embedding function f_phi that maps inputs to an M-dimensional space.
  • Define a prototype c_k for each class k as the mean of embedded support examples: c_k = (1/|S_k|) sum_{(x_i,y_i) in S_k} f_phi(x_i).
  • Classify a query x by p_phi(y=k|x) proportional to exp(-d(f_phi(x), c_k)) using a distance d (primarily squared Euclidean distance).
  • Train by minimizing negative log-probability of the true class over episodes that sample a subset of classes and examples as support and query sets.
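  • Provide a probabilistic interpretation: when d is a regular Bregman divergence (squared Euclidean distance is one), the model is equivalent to mixture density estimation on the support set with an exponential-family density, with the prototypes as the cluster means.
  • Extend to zero-shot learning by setting c_k = g_theta(v_k), where v_k is a class meta-data vector (e.g. attributes) and g_theta is a learned embedding; the paper reports that constraining the prototype embedding to unit length helps empirically, while the query embedding is left unconstrained.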
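
The prototype, classification, and loss steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: it assumes the embeddings f_phi(x) have already been computed, and the function names (prototypes, squared_euclidean, log_probs, episode_loss) and the toy 2-dimensional data are hypothetical.

```python
import numpy as np

def prototypes(support_emb, support_labels, n_classes):
    """c_k = mean of the embedded support examples of class k."""
    return np.stack([support_emb[support_labels == k].mean(axis=0)
                     for k in range(n_classes)])

def squared_euclidean(queries, protos):
    """Pairwise squared Euclidean distances, shape (n_query, n_classes)."""
    diff = queries[:, None, :] - protos[None, :, :]
    return (diff ** 2).sum(axis=-1)

def log_probs(queries, protos):
    """log p(y=k|x): log-softmax over classes of -d(f_phi(x), c_k)."""
    logits = -squared_euclidean(queries, protos)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    return logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

def episode_loss(queries, query_labels, protos):
    """Mean negative log-probability of the true class over the query set."""
    lp = log_probs(queries, protos)
    return -lp[np.arange(len(query_labels)), query_labels].mean()

# Toy episode: 2 classes, 2 support points each, already embedded (assumed values).
support_emb = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]])
support_labels = np.array([0, 0, 1, 1])
query_emb = np.array([[0.5, 1.0], [5.0, 3.0]])
query_labels = np.array([0, 1])

protos = prototypes(support_emb, support_labels, n_classes=2)
pred = log_probs(query_emb, protos).argmax(axis=1)
loss = episode_loss(query_emb, query_labels, protos)
```

In a real training loop the embeddings would come from a neural network and the loss would be backpropagated through both the query embeddings and the prototypes; the sketch covers only the forward computation.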

Experimental results

Research questions

  • RQ1: Can a simple embedding with a single mean prototype per class generalize to unseen classes in few-shot settings?
  • RQ2: How does the choice of distance metric affect performance in prototype-based classification for few-shot learning?
  • RQ3: Does episodic training, and in particular training with higher-way episodes, improve generalization in few-shot tasks?
  • RQ4: Can the prototypical framework be extended effectively to zero-shot learning using class meta-data?

Key findings

  • On Omniglot, ProtNets with Euclidean distance reach 98.8% (1-shot) and 99.7% (5-shot) in the 5-way setting, and 96.0% (1-shot) and 98.9% (5-shot) in the 20-way setting.
  • On miniImageNet, ProtNets achieve 1-shot: 49.42% and 5-shot: 68.20% (5-way setting), outperforming baselines including Matching Networks and Meta-Learner LSTM.
  • On CUB zero-shot, ProtNets with GoogLeNet features and 312-dimensional attribute vectors reach 54.6% accuracy on the 50 test classes, surpassing multiple attribute-based and embedding methods.
  • The Euclidean distance consistently outperforms cosine distance for this framework, and higher-way training episodes can improve generalization.
  • The approach is simpler and more efficient than many meta-learning methods while achieving state-of-the-art or competitive results at the time of publication on these benchmarks.
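
The episodic training referred to above can be illustrated with a small sampler. This is a sketch under assumptions, not the paper's code: sample_episode and its parameters (n_way, n_shot, n_query) are hypothetical names, and the toy label array stands in for a real dataset. Training with "higher way" simply means calling it with a larger n_way during training than the one used at test time.

```python
import numpy as np

def sample_episode(labels, n_way, n_shot, n_query, rng):
    """Sample one episode: pick n_way classes, then n_shot support and
    n_query query examples per class, without overlap between the two sets."""
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    support_idx, query_idx = [], []
    for k in classes:
        idx = rng.permutation(np.flatnonzero(labels == k))
        support_idx.append(idx[:n_shot])
        query_idx.append(idx[n_shot:n_shot + n_query])
    return np.concatenate(support_idx), np.concatenate(query_idx), classes

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 20)  # toy dataset: 10 classes, 20 examples each
support_idx, query_idx, classes = sample_episode(
    labels, n_way=5, n_shot=1, n_query=3, rng=rng)
```

The returned indices select the support and query examples for one episode; the class labels would then be remapped to 0..n_way-1 before computing prototypes and the loss.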

This review was created by AI and reviewed by human editors.