QUICK REVIEW

[Paper Review] Zero-shot Learning via Simultaneous Generating and Learning

Hyeonwoo Yu, Beom-Hee Lee|arXiv (Cornell University)|Oct 21, 2019

Domain Adaptation and Few-Shot Learning | Cited by 30

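One-Sentence Summary

This paper proposes Simultaneous Generating and Learning (SGAL), a zero-shot learning strategy built on a variational auto-encoder (VAE) with a class-specific multi-modal prior. By treating unseen-class data as missing variables that are optimized together with the model parameters through an EM-like iterative procedure, the model learns the conditional distribution of both seen and unseen classes and achieves state-of-the-art performance on several benchmarks without an off-the-shelf classifier.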

ABSTRACT

To overcome the absence of training data for unseen classes, conventional zero-shot learning approaches mainly train their model on seen datapoints and leverage the semantic descriptions for both seen and unseen classes. Beyond exploiting relations between classes of seen and unseen, we present a deep generative model to provide the model with experience about both seen and unseen classes. Based on the variational auto-encoder with class-specific multi-modal prior, the proposed method learns the conditional distribution of seen and unseen classes. In order to circumvent the need for samples of unseen classes, we treat the non-existing data as missing examples. That is, our network aims to find optimal unseen datapoints and model parameters, by iteratively following the generating and learning strategy. Since we obtain the conditional generative model for both seen and unseen classes, classification as well as generation can be performed directly without any off-the-shelf classifiers. In experimental results, we demonstrate that the proposed generating and learning strategy makes the model achieve the outperforming results compared to that trained only on the seen classes, and also to the several state-of-the-art methods.

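Motivation and Objectives

Address the fundamental challenge of zero-shot learning (ZSL): no training data exists for unseen classes.
Overcome the limitation of conventional ZSL methods, which train only on seen data and rely on semantic embeddings to generalize indirectly.
Develop a unified generative model that exposes the model to both seen and unseen classes during training, improving generalization.
Classify directly with the VAE encoder, removing the need for an external classifier.
Apply dropout regularization in the generation step to mitigate model uncertainty when generating unseen data.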

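Proposed Method

Treat unseen-class data as missing variables to be optimized jointly with the model parameters, following an EM-like procedure.
Use a VAE with a class-specific multi-modal prior to model the complex, multi-modal data distributions of seen and unseen classes.
Iteratively generate synthetic unseen-class samples with the current model parameters, then retrain the model on these generated samples together with the real seen data.
Apply dropout in the generation step to reduce model uncertainty and make the generated samples more robust.
Train the encoder end-to-end so that it serves as the classifier, with no additional classification head.
Use class embedding vectors as the conditional input of the VAE to generate unseen-class samples during training.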
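
The alternating generate-then-retrain loop described in this list can be sketched on a toy problem. The sketch below is only an illustration of the control flow, not the paper's implementation: it swaps the conditional VAE for a linear map W from class attributes to feature means, fitted by least squares, and every name in it (fit_decoder, generate_and_learn, n_rounds, n_gen, noise) is hypothetical. In such a linear toy the generated samples mostly reinforce the current fit, so it shows the structure of the procedure rather than its benefit.

```python
import numpy as np

def fit_decoder(attr_rows, X):
    """Least-squares fit of W so that attr_rows @ W approximates X."""
    W, *_ = np.linalg.lstsq(attr_rows, X, rcond=None)
    return W

def generate_and_learn(X_seen, y_seen, attrs, unseen_ids,
                       n_rounds=5, n_gen=50, noise=0.1, seed=0):
    """Alternate between generating unseen-class samples and refitting.

    Assumption: a linear attribute-to-feature map stands in for the
    conditional generative model used in the paper.
    """
    rng = np.random.default_rng(seed)
    A_seen = attrs[y_seen]                 # one attribute row per seen sample
    W = fit_decoder(A_seen, X_seen)        # initial fit on seen data only
    y_gen = np.repeat(unseen_ids, n_gen)   # labels of the samples to synthesize
    A_gen = attrs[y_gen]
    for _ in range(n_rounds):
        # Generating step: fill in the "missing" unseen data with the current model.
        eps = rng.standard_normal((len(y_gen), X_seen.shape[1]))
        X_gen = A_gen @ W + noise * eps
        # Learning step: refit on real seen data plus generated unseen data.
        W = fit_decoder(np.vstack([A_seen, A_gen]), np.vstack([X_seen, X_gen]))
    return W

# Toy data: 6 classes with 3-d attributes and 4-d features; classes 4 and 5 are unseen.
data_rng = np.random.default_rng(1)
attrs = data_rng.standard_normal((6, 3))
W_true = data_rng.standard_normal((3, 4))
y_seen = np.repeat(np.arange(4), 100)
X_seen = attrs[y_seen] @ W_true + 0.1 * data_rng.standard_normal((400, 4))

W = generate_and_learn(X_seen, y_seen, attrs, unseen_ids=[4, 5])
print(W.shape)
```

In a real setting the two steps would be a sampling pass through the decoder and a few gradient updates of the full model; the fixed seed here only makes the toy run reproducible.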

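Experimental Results

Research Questions

RQ1: Can a generative model be trained to learn the true data distribution of both seen and unseen classes when no real samples of the unseen classes exist?
RQ2: How can zero-shot learning resolve the chicken-and-egg dilemma in which data is needed to train the model, but a model is needed to generate the data?
RQ3: Does jointly optimizing the model parameters and the synthetic unseen-class samples generalize better than training on seen data alone?
RQ4: Does applying dropout in the generation step improve the model's robustness and its performance on unseen classes?
RQ5: In zero-shot learning, can the VAE encoder be used directly for classification, without an external classifier?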
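
To make RQ5 concrete, here is one way an encoder with class-specific priors could act as a classifier: pick the class whose prior is closest, in KL divergence, to the encoder's posterior. This is a minimal sketch under stated assumptions, not necessarily the paper's exact rule. It assumes each class prior is a Gaussian with identity covariance and the posterior is a diagonal Gaussian; the function names and the example numbers are hypothetical.

```python
import numpy as np

def kl_to_class_priors(mean, log_var, prior_means):
    """KL( N(mean, diag(exp(log_var))) || N(prior_means[c], I) ) for each class c.

    Assumption: unit-covariance class priors; prior_means has shape (C, d).
    """
    var = np.exp(log_var)
    sq_dist = np.sum((prior_means - mean) ** 2, axis=1)
    return 0.5 * (np.sum(var) + sq_dist - mean.size - np.sum(log_var))

def classify(mean, log_var, prior_means):
    """Predict the class whose prior is nearest to the posterior in KL."""
    return int(np.argmin(kl_to_class_priors(mean, log_var, prior_means)))

# Illustrative values: three class priors in a 2-d latent space.
prior_means = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
mean = np.array([2.6, 0.4])        # encoder mean for one input
log_var = np.array([-1.0, -1.0])   # encoder log-variance for the same input

pred = classify(mean, log_var, prior_means)
print(pred)  # 1
```

Under the unit-covariance assumption, only the squared distance between the posterior mean and each prior mean depends on the class, so the rule reduces to a nearest-prior-mean lookup in latent space.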

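Key Findings

The proposed SGAL method reaches a harmonic mean accuracy of 62.2% on AwA1, up from 52.2% for the mmVAE baseline, indicating a marked gain on unseen classes.
On AwA2, the harmonic mean accuracy rises from 26.9% for mmVAE to 65.6% for SGAL, showing strong generalization to unseen classes even when class diversity is very high.
The SGAL-Dropout variant further improves robustness and outperforms plain SGAL on unseen classes by reducing model uncertainty during generation.
The model achieves state-of-the-art results on CUB and SUN, which have 5 and 12 times as many classes as AwA, respectively, indicating strong scalability.
Although performance on unseen classes improves, performance on seen classes drops slightly, a trade-off that arises because the model fits the seen and unseen distributions simultaneously.
t-SNE visualizations confirm that, after SGAL training, the latent-space clusters of unseen classes are better separated and more clearly defined, indicating improved disentanglement and generalization.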
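
The harmonic mean figures above combine seen-class and unseen-class accuracy into a single score that is high only when both are high. A short sketch of the usual formula, H = 2 * S * U / (S + U), follows; the function name and the input accuracies are illustrative and are not values reported in the paper.

```python
def harmonic_mean(seen_acc, unseen_acc):
    """Harmonic mean of seen-class and unseen-class accuracy."""
    total = seen_acc + unseen_acc
    if total == 0:
        return 0.0  # avoid division by zero when both accuracies are 0
    return 2 * seen_acc * unseen_acc / total

# Illustrative accuracies in percent (not taken from the paper).
h = harmonic_mean(80.0, 50.0)
print(round(h, 2))  # 61.54
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model that scores well on seen classes but poorly on unseen ones gets a low score, which is why a small drop on seen classes can still raise it.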

This review was generated by AI and checked by human editors.