[Paper Review] Prototypical Networks for Few-shot Learning
Prototypical networks learn a simple embedding where each class is represented by the mean of its examples (prototype); classification is by nearest prototype using Euclidean distance, achieving state-of-the-art results in few-shot and zero-shot tasks.
We propose prototypical networks for the problem of few-shot classification, where a classifier must generalize to new classes not seen in the training set, given only a small number of examples of each new class. Prototypical networks learn a metric space in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve excellent results. We provide an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning. We further extend prototypical networks to zero-shot learning and achieve state-of-the-art results on the CU-Birds dataset.
Motivation & Objective
- Motivate a simple, data-efficient inductive bias for few-shot classification to mitigate overfitting with limited data.
- Propose a metric-based approach where each class is represented by a single prototype in an embedding space.
- Show that Euclidean distance to class prototypes yields strong performance and interpret the method via mixture density and clustering concepts.
- Extend the approach to zero-shot learning by embedding class meta-data to form prototypes and evaluate on standard benchmarks.
Proposed method
- Learn an embedding function f_phi that maps inputs to an M-dimensional space.
- Define a prototype c_k for each class k as the mean of embedded support examples: c_k = (1/|S_k|) sum_{(x_i,y_i) in S_k} f_phi(x_i).
- Classify a query x by p_phi(y=k|x) proportional to exp(-d(f_phi(x), c_k)) using a distance d (primarily squared Euclidean distance).
- Train by minimizing negative log-probability of the true class over episodes that sample a subset of classes and examples as support and query sets.
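- Provide a probabilistic interpretation: when d is a regular Bregman divergence (squared Euclidean distance is one), the method is equivalent to mixture density estimation on the support set with an exponential-family density, with the prototypes acting as cluster means.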
- Extend to zero-shot learning by setting c_k = g_theta(v_k), where v_k is the class meta-data vector and g_theta is a learned embedding; the prototype embedding is constrained to unit length, which the authors report helps empirically.
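The steps listed above can be sketched for a single episode in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear map W stands in for the embedding f_phi (the paper uses a convolutional network), the toy data are synthetic, and all function names are hypothetical.

```python
import numpy as np

def embed(x, W):
    """Hypothetical stand-in for f_phi: a single linear map."""
    return x @ W

def prototypes(support_emb, support_labels, n_classes):
    """c_k = mean of the embedded support points of class k."""
    return np.stack([support_emb[support_labels == k].mean(axis=0)
                     for k in range(n_classes)])

def log_probs(query_emb, protos):
    """log p(y=k|x): log-softmax over negative squared Euclidean distances."""
    diff = query_emb[:, None, :] - protos[None, :, :]
    logits = -(diff ** 2).sum(axis=-1)                    # (n_query, n_classes)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    return logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

def episode_loss(query_emb, query_labels, protos):
    """Mean negative log-probability of the true class over the query set."""
    lp = log_probs(query_emb, protos)
    return -lp[np.arange(len(query_labels)), query_labels].mean()

# Toy 3-way, 5-shot episode with 4 query points per class (synthetic data).
rng = np.random.default_rng(0)
n_way, n_shot, n_query, in_dim, emb_dim = 3, 5, 4, 8, 2
W = rng.normal(size=(in_dim, emb_dim))
centers = rng.normal(scale=5.0, size=(n_way, in_dim))
support_labels = np.repeat(np.arange(n_way), n_shot)
query_labels = np.repeat(np.arange(n_way), n_query)
support_x = centers[support_labels] + rng.normal(size=(n_way * n_shot, in_dim))
query_x = centers[query_labels] + rng.normal(size=(n_way * n_query, in_dim))

protos = prototypes(embed(support_x, W), support_labels, n_way)
lp = log_probs(embed(query_x, W), protos)
loss = episode_loss(embed(query_x, W), query_labels, protos)
pred = lp.argmax(axis=1)   # nearest-prototype prediction for each query
```

In actual training, the loss would be backpropagated through the embedding parameters and a fresh episode sampled at each step; the sketch only evaluates one forward pass.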
Experimental results
Research questions
- RQ1: Can a simple prototype-based embedding with a single prototype per class generalize to unseen classes in few-shot settings?
- RQ2: How does the choice of distance metric affect performance in prototype-based classification for few-shot learning?
- RQ3: Does training with episodic schemes and higher-way episodes improve generalization in few-shot tasks?
- RQ4: Can the prototypical framework be extended effectively to zero-shot learning using class meta-data?
Key findings
- On Omniglot, ProtNets with Euclidean distance reach 98.8% (1-shot) and 99.7% (5-shot) accuracy in the 5-way setting, and 96.0% (1-shot) and 98.9% (5-shot) in the 20-way setting.
- On miniImageNet, ProtNets achieve 1-shot: 49.42% and 5-shot: 68.20% (5-way setting), outperforming baselines including Matching Networks and Meta-Learner LSTM.
- On CUB zero-shot, ProtNets with GoogLeNet features and 312-dimensional attribute vectors reach 54.6% 50-way accuracy, surpassing multiple attribute-based and embedding methods.
- The Euclidean distance consistently outperforms cosine distance for this framework, and higher-way training episodes can improve generalization.
- The approach is simpler and more efficient than many meta-learning methods while achieving results that were state of the art at the time of publication on these benchmarks.
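To make the distance-metric comparison concrete, the sketch below shows how the nearest-prototype rule can give different answers under squared Euclidean and cosine distance. The prototypes and query points are hand-picked toy values, and the function names are hypothetical; this illustrates only that the metric changes the decision rule and does not reproduce the reported accuracy gap.

```python
import numpy as np

def sq_euclidean(a, b):
    """Pairwise squared Euclidean distances, shape (len(a), len(b))."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)

def cosine_distance(a, b):
    """Pairwise cosine distances (1 - cosine similarity)."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a_n @ b_n.T

def nearest_prototype(queries, protos, distance):
    """Assign each query to the class whose prototype is closest."""
    return distance(queries, protos).argmin(axis=1)

# Toy values chosen so the two metrics disagree on the first query.
protos = np.array([[1.0, 0.0], [0.0, 5.0]])
queries = np.array([[2.0, 2.5], [0.0, 4.0]])

pred_euclid = nearest_prototype(queries, protos, sq_euclidean)
pred_cosine = nearest_prototype(queries, protos, cosine_distance)
```

The first query is closer to prototype 0 in Euclidean terms but points more in the direction of prototype 1, so the two metrics assign it to different classes; the second query is assigned to class 1 by both.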
This review was created by AI and reviewed by human editors.