QUICK REVIEW

[Paper Review] Few-Shot Learning via Saliency-guided Hallucination of Samples

Hongguang Zhang, Jing Zhang|arXiv (Cornell University)|Apr 6, 2019

Domain Adaptation and Few-Shot Learning | References: 39 | Cited by: 21

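One-Sentence Summary

This paper proposes SalNet, a new few-shot learning framework that uses saliency maps to guide foreground-background mixing in feature space and thereby generate synthetic training samples. By combining a pretrained saliency network with a two-stream mixing network trained with Real Representation Regularization (TriR), the method achieves state-of-the-art performance on miniImageNet, reaching 78.34% accuracy in the 5-way 1-shot setting with 224×224 inputs.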

ABSTRACT

Learning new concepts from a few of samples is a standard challenge in computer vision. The main directions to improve the learning ability of few-shot training models include (i) a robust similarity learning and (ii) generating or hallucinating additional data from the limited existing samples. In this paper, we follow the latter direction and present a novel data hallucination model. Currently, most datapoint generators contain a specialized network (i.e., GAN) tasked with hallucinating new datapoints, thus requiring large numbers of annotated data for their training in the first place. In this paper, we propose a novel less-costly hallucination method for few-shot learning which utilizes saliency maps. To this end, we employ a saliency network to obtain the foregrounds and backgrounds of available image samples and feed the resulting maps into a two-stream network to hallucinate datapoints directly in the feature space from viable foreground-background combinations. To the best of our knowledge, we are the first to leverage saliency maps for such a task and we demonstrate their usefulness in hallucinating additional datapoints for few-shot learning. Our proposed network achieves the state of the art on publicly available datasets.
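
The foreground/background separation described in the abstract can be illustrated with a minimal sketch. It assumes a saliency map is already available as an array of values in [0, 1]; the map below is hand-made for illustration and is not the output of the saliency network used in the paper. The function name `split_by_saliency` is hypothetical, and using the soft map directly as a mask (rather than thresholding it) is an assumption of this sketch.

```python
import numpy as np

def split_by_saliency(image, saliency):
    """Split an H x W x C image into foreground and background parts.

    saliency is an H x W map in [0, 1]; 1 marks salient (foreground) pixels.
    Soft masking is an illustrative assumption, not the paper's stated recipe.
    """
    if image.shape[:2] != saliency.shape:
        raise ValueError("image and saliency map must share height and width")
    mask = np.clip(saliency, 0.0, 1.0)[..., None]  # add a channel axis for broadcasting
    foreground = image * mask
    background = image * (1.0 - mask)
    return foreground, background

# Toy example: a 4 x 4 RGB image whose centre 2 x 2 block is salient.
image = np.ones((4, 4, 3), dtype=np.float32)
saliency = np.zeros((4, 4), dtype=np.float32)
saliency[1:3, 1:3] = 1.0
fg, bg = split_by_saliency(image, saliency)
```

Because the two masks sum to one at every pixel, the foreground and background parts add back up to the original image, which makes it easy to recombine a foreground with a different background later on.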

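Research Motivation and Objectives

Address the core challenge of few-shot learning: a model must generalize from only one or a few labeled samples.
Reduce the dependence on large-scale annotated data for data augmentation by using saliency maps instead of training a dedicated generative adversarial network (GAN).
Improve generalization by generating diverse, realistic synthetic samples through foreground-background mixing in feature space.
Introduce a regularization strategy that keeps generated features close to real, plausible combinations.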

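Proposed Method

A pretrained saliency network segments each input image into foreground and background regions, giving precise control over image composition.
A two-stream network mixes foreground and background features in latent space to generate new, plausible image representations.
Second-order statistics aggregate spatial features into fixed-size descriptors that support robust similarity learning.
Real Representation Regularization (TriR) uses a supervisory network to constrain generated features so that they stay close to real foreground-background combinations.
Two mixing strategies are proposed: intra-class mixing (within the same class) and inter-class mixing (using nearest-neighbor backgrounds from other classes).
A relation network learns the similarity between query and support features to perform few-shot classification.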
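
The mixing, pooling, and regularization steps listed above can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the authors' implementation: element-wise addition stands in for the two-stream mixing network, random arrays stand in for learned encoder outputs, and a mean squared distance stands in for the TriR term. All function names (`mix_features`, `second_order_pool`, `trir_penalty`) are hypothetical.

```python
import numpy as np

def mix_features(fg_feat, bg_feat):
    """Combine foreground and background feature maps (each C x H x W).

    Element-wise addition is an assumption made for illustration; the
    summary only states that the two streams are mixed in latent space.
    """
    return fg_feat + bg_feat

def second_order_pool(feat):
    """Aggregate a C x H x W feature map into a C x C descriptor.

    The descriptor is the autocorrelation of the C-dimensional local
    features, averaged over all H*W positions, so its size does not
    depend on the spatial resolution.
    """
    c = feat.shape[0]
    x = feat.reshape(c, -1)        # C x N, one column per spatial location
    return x @ x.T / x.shape[1]    # C x C, symmetric

def trir_penalty(generated, reference):
    """Mean squared distance between a generated and a reference descriptor.

    A stand-in for the TriR term: it is small when the generated
    descriptor stays close to that of a real combination.
    """
    return float(np.mean((generated - reference) ** 2))

rng = np.random.default_rng(0)
fg_a = rng.standard_normal((8, 5, 5))  # placeholder foreground features, image A
bg_a = rng.standard_normal((8, 5, 5))  # placeholder background features, image A
bg_b = rng.standard_normal((8, 5, 5))  # placeholder background features, image B

real = second_order_pool(mix_features(fg_a, bg_a))  # foreground with its own background
fake = second_order_pool(mix_features(fg_a, bg_b))  # foreground with a borrowed background
penalty = trir_penalty(fake, real)
```

In a trained system the penalty would be added to the classification loss so that borrowed-background descriptors do not drift far from descriptors of genuine images; how the paper weights that term is not stated in this review.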

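Experimental Results

Research Questions

RQ1: Can saliency maps be used effectively to generate realistic and diverse training samples for few-shot learning without requiring large-scale annotated data?
RQ2: How does saliency-guided feature-space mixing compare with conventional image-space data augmentation for few-shot classification?
RQ3: How do the different mixing strategies, in particular intra-class versus inter-class mixing, affect generalization?
RQ4: How effective is Real Representation Regularization (TriR) at improving the realism and quality of generated features?
RQ5: Does the choice of saliency map generator significantly affect final few-shot learning performance?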

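Key Findings

SalNet achieves 78.34% 5-way 1-shot accuracy on miniImageNet with 224×224 input images, surpassing prior state-of-the-art methods.
The intra-class mixing strategy performs best, reaching 77.95% accuracy, which suggests that preserving class consistency during generation helps generalization.
Larger input images (224×224) improve performance over the standard 84×84 resolution, raising 1-shot accuracy by 5.1 percentage points over the baseline.
Real Representation Regularization (TriR) markedly improves generation quality, as shown by consistent performance gains across settings.
Ablation experiments confirm that saliency segmentation and data generation are both key components: disabling either one lowers 1-shot accuracy by more than 10%.
The method is robust to the choice of saliency map generator; performance drops only minimally when other pretrained models are used.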

This review was generated by AI and reviewed by human editors.