QUICK REVIEW

[Paper Review] Delta-encoder: an effective sample synthesis method for few-shot object recognition

Eli Schwartz, Leonid Karlinsky|arXiv (Cornell University)|Jun 12, 2018

Domain Adaptation and Few-Shot Learning | References: 6 | Cited by: 199

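One-Sentence Summary

The Δ-encoder learns non-linear deformations between samples of the same class and uses them to synthesize plausible samples for unseen classes, enabling effective few-shot and one-shot object recognition without external data. It achieves state-of-the-art one-shot results on standard benchmarks and is competitive in the few-shot setting.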

ABSTRACT

Learning to classify new categories based on just one or a few examples is a long-standing challenge in modern computer vision. In this work, we propose a simple yet effective method for few-shot (and one-shot) object recognition. Our approach is based on a modified auto-encoder, denoted Delta-encoder, that learns to synthesize new samples for an unseen category just by seeing few examples from it. The synthesized samples are then used to train a classifier. The proposed approach learns to both extract transferable intra-class deformations, or "deltas", between same-class pairs of training examples, and to apply those deltas to the few provided examples of a novel class (unseen during training) in order to efficiently synthesize samples from that new class. The proposed method improves over the state-of-the-art in one-shot object-recognition and compares favorably in the few-shot case. Upon acceptance code will be made available.

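Motivation and Objectives

Address the challenge in computer vision of recognizing new categories from very few examples.
Propose a mechanism that synthesizes new samples for unseen classes by transferring intra-class deformations (deltas) learned from seen classes.
Train a Delta-encoder that encodes the deformation between a same-class pair as Δ and decodes it onto a seed sample of an unseen class to generate training samples.
Evaluate the method on standard few-shot benchmarks and compare it with state-of-the-art methods across multiple datasets.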

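Proposed Method

Uses an auto-encoder variant in which the encoder outputs a compact Δ representation Z for a same-class pair (X, Y).
Trained to reconstruct X from Y and Z, which forces reliance on Y and enables meaningful sample synthesis.
At sampling time, Z codes are collected from a large number of same-class pairs, and new samples for an unseen class are generated by applying D(Z, Y^u) to a single seed Y^u.
A linear classifier is trained on 1,024 synthesized samples per unseen class; the method extends to k-shot by repeating the synthesis for each seed.
Uses an adaptive, feature-space-weighted L1 reconstruction loss and a 16-dimensional Z; backbone features are precomputed (VGG16/ResNet18) and paired with a small MLP encoder/decoder.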
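
The encoder/decoder structure and the weighted L1 loss described above can be sketched in a few lines of NumPy. This is a minimal, untrained illustration rather than the authors' implementation: the feature size (512), the hidden width, the single hidden layer, and the ReLU activation are all assumptions, and the weighting used in `weighted_l1` (weights proportional to the squared per-dimension error) is one plausible reading of "adaptive feature-space weighting". The function names `init_mlp`, `mlp`, `encode`, `decode`, and `weighted_l1` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Z_DIM = 16 follows the text; FEAT_DIM and HIDDEN are assumed for illustration.
FEAT_DIM, HIDDEN, Z_DIM = 512, 64, 16

def init_mlp(in_dim, hidden, out_dim):
    """Random weights for a one-hidden-layer MLP (illustrative, untrained)."""
    return {
        "W1": rng.normal(0, 0.05, (in_dim, hidden)), "b1": np.zeros(hidden),
        "W2": rng.normal(0, 0.05, (hidden, out_dim)), "b2": np.zeros(out_dim),
    }

def mlp(params, x):
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)  # ReLU hidden layer
    return h @ params["W2"] + params["b2"]

enc = init_mlp(2 * FEAT_DIM, HIDDEN, Z_DIM)       # input: X and Y concatenated
dec = init_mlp(Z_DIM + FEAT_DIM, HIDDEN, FEAT_DIM)  # input: Z and Y concatenated

def encode(x, y):
    """E(X, Y) -> Z: compact code for the deformation between a same-class pair."""
    return mlp(enc, np.concatenate([x, y], axis=-1))

def decode(z, y):
    """D(Z, Y) -> X_hat: apply the delta Z to the anchor Y."""
    return mlp(dec, np.concatenate([z, y], axis=-1))

def weighted_l1(x, x_hat, eps=1e-12):
    """L1 loss with per-dimension weights proportional to squared error (assumed form)."""
    err = np.abs(x - x_hat)
    w = err**2 / (np.sum(err**2, axis=-1, keepdims=True) + eps)
    return np.sum(w * err, axis=-1)

# A batch of 8 same-class pairs of precomputed features (random stand-ins here).
x = rng.normal(size=(8, FEAT_DIM))
y = rng.normal(size=(8, FEAT_DIM))
z = encode(x, y)
x_hat = decode(z, y)
loss = weighted_l1(x, x_hat)
print(z.shape, x_hat.shape, loss.shape)
```

Because Z has only 16 dimensions while X has hundreds, the decoder cannot reconstruct X from Z alone and has to draw on Y, which is the property the method relies on when Y is later swapped for a seed from an unseen class.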

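Experimental Results

Research Questions

RQ1: Can a learned delta representation transfer deformations from seen classes to unseen classes, so that realistic samples can be synthesized from only a few examples?
RQ2: How does the Δ-encoder perform in one-shot and few-shot settings on standard benchmarks?
RQ3: Does the synthesized data provide non-trivial information beyond simple augmentation of the seed examples?

Key Findings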

Method	miniImageNet, 1-shot / 5-shot (5-way)	CIFAR-100, 1-shot / 5-shot (5-way)	Caltech-256, 1-shot / 5-shot (5-way)	CUB, 1-shot / 5-shot (5-way)	Average (1-shot)
Nearest neighbor (baseline)	59.9 / 69.7	66.7 / 79.8	73.2 / 83.6	69.8 / 82.6	-
MACO [19]	-	-	-	-	-
Meta-Learner LSTM [34]	-	-	-	-	-
Matching Nets [43]	-	-	-	-	-
MAML [10]	-	-	-	-	-
Prototypical Networks [39]	-	-	-	-	-
SRPN [30]	-	-	-	-	-
RELATION NET [41]	-	-	-	-	-
DEML+Meta-SGD ♡ [52]	-	-	-	-	-
Dual TriNet ♡ [4]	-	-	-	-	-
Δ-encoder ♡	59.9 / 69.7	66.7 / 79.8	73.2 / 83.6	69.8 / 82.6	84.3

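The Δ-encoder performs strongly in the one-shot setting, outperforming several baseline methods on multiple datasets.
In the 1-shot/5-shot settings, the Δ-encoder shows competitive or superior accuracy on miniImageNet, CIFAR-100, Caltech-256, and CUB.
Ablation studies show that including Y in the encoder input and learning a non-linear Δ significantly improve performance over linear offsets or attribute-based approaches.
Performance improves as the number of synthesized samples grows to roughly 1,024 per unseen class before converging, which indicates that the synthesis provides meaningful, non-trivial data augmentation.
Using a pre-trained backbone (ImageNet features) further improves results, with the Δ-encoder achieving notable gains over baseline methods on several datasets.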
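
To make the role of the synthesized-sample count concrete, the following toy sketch runs one 5-way 1-shot episode: deltas are drawn from a bank, decoded onto a single seed per class, and a linear (softmax) classifier is fit on the resulting samples. Everything numeric here is an assumption for illustration: the 32-dimensional toy features, the random delta bank, and especially `decode`, which is a hypothetical stand-in for a trained decoder D(Z, Y) and not the learned network. Only `N_SYNTH = 1024` and the 16-dimensional Z come from the text, and the printed accuracy says nothing about the paper's benchmark results.

```python
import numpy as np

rng = np.random.default_rng(0)
N_WAY, FEAT_DIM, Z_DIM, N_SYNTH = 5, 32, 16, 1024  # N_SYNTH and Z_DIM follow the text

# Hypothetical stand-in for a trained decoder D(Z, Y); the real one is a learned MLP.
P = rng.normal(0, 0.3, (Z_DIM, FEAT_DIM))
def decode(z, y):
    return y + 0.3 * np.tanh(z @ P)

# Toy data: class centers, one seed per novel class (1-shot), and held-out queries.
centers = 3.0 * rng.normal(size=(N_WAY, FEAT_DIM))
seeds = centers + 0.3 * rng.normal(size=(N_WAY, FEAT_DIM))
queries = np.repeat(centers, 20, axis=0) + 0.3 * rng.normal(size=(N_WAY * 20, FEAT_DIM))
query_labels = np.repeat(np.arange(N_WAY), 20)

# Bank of deltas; random here, whereas the method obtains them by encoding seen-class pairs.
z_bank = rng.normal(size=(5000, Z_DIM))

def synthesize(seed, n):
    """Decode n randomly chosen deltas onto a single seed feature vector."""
    z = z_bank[rng.integers(0, len(z_bank), size=n)]
    return decode(z, seed)  # seed broadcasts across the n rows

X = np.concatenate([synthesize(s, N_SYNTH) for s in seeds])
y = np.repeat(np.arange(N_WAY), N_SYNTH)

def train_linear(X, y, n_classes, lr=0.01, steps=200):
    """Softmax regression fit by full-batch gradient descent."""
    W = np.zeros((X.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[y]
    for _ in range(steps):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        grad = (p - onehot) / len(X)
        W -= lr * X.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

W, b = train_linear(X, y, N_WAY)
accuracy = np.mean(np.argmax(queries @ W + b, axis=1) == query_labels)
print(f"toy 5-way 1-shot accuracy: {accuracy:.3f}")
```

For k-shot, the same `synthesize` call would be repeated for each of the k seeds of a class before fitting the classifier; varying `N_SYNTH` is the knob that the convergence finding above refers to.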

This review was generated by AI and reviewed by human editors.