QUICK REVIEW

[Paper Review] Semantic Feature Augmentation in Few-shot Learning

Zitian Chen, Yanwei Fu|arXiv (Cornell University)|Apr 15, 2018

Domain Adaptation and Few-Shot Learning | References: 80 | Cited by: 48

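Summary in Brief

This paper proposes semantic feature augmentation with a dual TriNet auto-encoder to address data scarcity in few-shot learning by generating diverse instance features in the semantic space. Deep CNN features are projected into the semantic space, augmented there, and reconstructed back into the image feature space; the resulting complex, semantically meaningful feature distributions substantially improve few-shot classification performance.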

ABSTRACT

A fundamental problem with few-shot learning is the scarcity of data in training. A natural solution to alleviate this scarcity is to augment the existing images for each training class. However, directly augmenting samples in image space may not necessarily, nor sufficiently, explore the intra-class variation. To this end, we propose to directly synthesize instance features by leveraging the semantics of each class. Essentially, a novel auto-encoder network, dual TriNet, is proposed for feature augmentation. The encoder TriNet projects multi-layer visual features of deep CNNs into the semantic space. In this space, data augmentation is induced, and the augmented instance representation is projected back into the image feature spaces by the decoder TriNet. Two data augmentation strategies in the semantic space are explored; notably, these seemingly simple augmentations in semantic space result in complex augmented feature distributions in the image feature space, resulting in substantially better performance. The code and models of our paper will be published on: this https URL

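Research Motivation and Objectives

Address the data scarcity challenge in few-shot learning by generating more representative and diverse features.
Investigate whether augmenting features in the semantic space yields better generalization than conventional image-space augmentation.
Develop a deep auto-encoder framework that maps visual features into the semantic space and back, enabling controllable and meaningful feature augmentation.
Evaluate the effectiveness of semantic-space augmentation on standard few-shot learning benchmarks.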

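Proposed Method

A dual TriNet architecture is proposed, in which an encoder TriNet maps multi-layer CNN features into the semantic space.
A decoder TriNet reconstructs the augmented features back into the original image feature space.
Two data augmentation strategies are applied in the semantic space to generate diverse representations while keeping class semantics consistent.
The method exploits the layer hierarchy of deep CNN features so that semantic information is preserved through encoding and decoding.
Augmentation operates directly on the latent semantic features rather than on raw images, which yields more semantically consistent variations.
The framework is trained end to end to minimize reconstruction error while encouraging intra-class diversity among the augmented features.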
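
The encode, augment, decode flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the linear `encode`/`decode` maps stand in for the encoder and decoder TriNets, the dimensions are arbitrary, and the two augmentations shown (Gaussian noise and nearest class vectors in the semantic space) are assumed stand-ins, since this review does not name the paper's two strategies. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM, SEM_DIM = 64, 16  # hypothetical feature / semantic dimensions

# Stand-in linear maps; the paper uses TriNet networks over multi-layer CNN features.
W_enc = rng.normal(scale=0.1, size=(FEAT_DIM, SEM_DIM))
W_dec = rng.normal(scale=0.1, size=(SEM_DIM, FEAT_DIM))


def encode(feats):
    """Project image features (n, FEAT_DIM) into the semantic space (n, SEM_DIM)."""
    return feats @ W_enc


def decode(sem):
    """Project semantic vectors (m, SEM_DIM) back to the feature space (m, FEAT_DIM)."""
    return sem @ W_dec


def augment_gaussian(sem, n_aug, sigma=0.1):
    """Assumed strategy 1: perturb each semantic vector with Gaussian noise."""
    noise = rng.normal(scale=sigma, size=(sem.shape[0], n_aug, sem.shape[1]))
    return (sem[:, None, :] + noise).reshape(-1, sem.shape[1])


def augment_neighbors(sem, class_vectors, k=3):
    """Assumed strategy 2: use the k nearest class semantic vectors (Euclidean)."""
    dists = np.linalg.norm(sem[:, None, :] - class_vectors[None, :, :], axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :k]  # (n, k) indices into class_vectors
    return class_vectors[nearest].reshape(-1, sem.shape[1])


def synthesize(feats, class_vectors, n_aug=4, k=3):
    """Encode features, augment in the semantic space, decode to synthetic features."""
    sem = encode(feats)
    sem_aug = np.concatenate(
        [augment_gaussian(sem, n_aug), augment_neighbors(sem, class_vectors, k)]
    )
    return decode(sem_aug)


feats = rng.normal(size=(5, FEAT_DIM))           # five support features
class_vectors = rng.normal(size=(10, SEM_DIM))   # e.g. word vectors for ten classes
synthetic = synthesize(feats, class_vectors)
print(synthetic.shape)
```

In a trained model the encoder and decoder would be learned jointly with a reconstruction loss, so that decoded vectors land in a meaningful region of the image feature space; the random weights here only demonstrate the data flow and shapes.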

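Experimental Results

Research Questions

RQ1: Does augmenting features in the semantic space yield better few-shot generalization than image-space augmentation?
RQ2: How do different semantic-space augmentation strategies affect the feature distribution learned in the image feature space?
RQ3: To what extent can the proposed auto-encoder architecture preserve semantic identity while generating diverse feature variants?
RQ4: Does semantic feature augmentation improve few-shot classification accuracy on standard benchmarks?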

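Key Findings

The proposed method reaches state-of-the-art performance on standard few-shot learning benchmarks and clearly outperforms baselines that use standard data augmentation.
Semantic-space augmentation produces richer, more diverse, and more discriminative feature distributions in the image feature space, even when the augmentation operations themselves are simple.
The dual TriNet auto-encoder learns feature reconstruction effectively while keeping the augmented samples semantically consistent.
The method is robust across few-shot settings, including 5-way and 10-way classification.
Ablation experiments confirm that semantic-space augmentation improves generalization more than image-space augmentation does.
The authors state that the code and models will be made publicly available to support reproducibility and further research.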
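
For readers unfamiliar with the N-way settings mentioned above, the following sketch shows how a single N-way K-shot episode might be sampled and scored. It is an assumption-laden illustration: the nearest-centroid classifier is a simple stand-in (this review does not say which classifier the paper uses), the data is synthetic, and the helper names are hypothetical. Synthesized features from an augmentation step would simply be appended to the support set before the centroids are computed.

```python
import numpy as np


def sample_episode(features, labels, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode.

    Assumes every class has at least k_shot + n_query examples.
    Labels are remapped to 0..n_way-1 within the episode.
    """
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    sx, sy, qx, qy = [], [], [], []
    for new_label, c in enumerate(classes):
        idx = rng.permutation(np.flatnonzero(labels == c))[: k_shot + n_query]
        sx.append(features[idx[:k_shot]])
        qx.append(features[idx[k_shot:]])
        sy += [new_label] * k_shot
        qy += [new_label] * n_query
    return np.concatenate(sx), np.array(sy), np.concatenate(qx), np.array(qy)


def nearest_centroid_accuracy(sx, sy, qx, qy):
    """Classify each query by its closest class centroid and return the accuracy."""
    classes = np.unique(sy)
    centroids = np.stack([sx[sy == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(qx[:, None, :] - centroids[None, :, :], axis=-1)
    pred = classes[np.argmin(dists, axis=1)]
    return float((pred == qy).mean())


# Synthetic, well-separated data: 10 classes, 20 samples each, 10-dim features.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 20)
features = 10.0 * np.eye(10)[labels] + rng.normal(scale=0.1, size=(200, 10))

sx, sy, qx, qy = sample_episode(features, labels, n_way=5, k_shot=1, n_query=15, rng=rng)
print(sx.shape, qx.shape, nearest_centroid_accuracy(sx, sy, qx, qy))
```

Reported few-shot accuracies are normally averaged over many such episodes; a single episode on toy data only demonstrates the protocol.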

This review was generated by AI and reviewed by a human editor.