QUICK REVIEW

[Paper Review] Small Sample Learning in Big Data Era

Jun Shu, Zongben Xu|arXiv (Cornell University)|Aug 14, 2018

Machine Learning and Algorithms | References: 274 | Cited by: 56

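ONE-SENTENCE SUMMARY

A survey of Small Sample Learning (SSL) techniques that distinguishes experience learning from concept learning and outlines the associated methods, neuroscience basis, challenges, and future directions.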

ABSTRACT

As a promising area in artificial intelligence, a new learning paradigm, called Small Sample Learning (SSL), has been attracting prominent research attention in the recent years. In this paper, we aim to present a survey to comprehensively introduce the current techniques proposed on this topic. Specifically, current SSL techniques can be mainly divided into two categories. The first category of SSL approaches can be called "concept learning", which emphasizes learning new concepts from only few related observations. The purpose is mainly to simulate human learning behaviors like recognition, generation, imagination, synthesis and analysis. The second category is called "experience learning", which usually co-exists with the large sample learning manner of conventional machine learning. This category mainly focuses on learning with insufficient samples, and can also be called small data learning in some literatures. More extensive surveys on both categories of SSL techniques are introduced and some neuroscience evidences are provided to clarify the rationality of the entire SSL regime, and the relationship with human learning process. Some discussions on the main challenges and possible future research directions along this line are also presented.

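RESEARCH MOTIVATION AND OBJECTIVES

Define SSL and explain why it matters in the big data era.
Distinguish and describe the two branches of SSL: concept learning and experience learning.
Summarize representative techniques for concept learning and how they relate to prior work.
Summarize representative techniques for experience learning, and how augmented data and knowledge compensate for small samples.
Discuss the challenges facing SSL, the supporting neuroscience evidence, and future research directions.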

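PROPOSED APPROACH

Gives formal definitions of SSL and its two learning categories.
Describes the general methodology of concept learning, including intension/extension matching and new concept formation.
Surveys intension matching methods that map from visual to semantic space, along with semantic relatedness.
Outlines the role of augmented data and knowledge systems in experience learning.
Connects SSL to concepts from cognitive science and provides neuroscience evidence.
Discusses long-tailed data, data scarcity, and weak/web supervision as motivations for SSL.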

EXPERIMENTAL RESULTS

Research Questions

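RQ1: What are the core definitions and categories in Small Sample Learning (SSL)?
RQ2: Which techniques do concept learning and experience learning comprise within SSL?
RQ3: How does SSL use representations, mappings, and knowledge to work with only a few samples?
RQ4: How does evidence from neuroscience support SSL, and how does SSL relate to human learning?
RQ5: What are the main challenges and future directions for SSL in the big data era?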

Key Findings

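SSL can be divided into concept learning and experience learning, and enables recognition, generation, and reasoning from a small number of samples.
Experience learning uses augmented data and knowledge systems to compensate for limited data, whereas concept learning relies on matching rules between concepts and small samples.
Intension matching establishes mappings between visual and semantic representations, aligning concepts with data and thereby enabling zero-shot and few-shot tasks.
Semantic embedding and semantic relatedness methods enable knowledge transfer from seen to unseen classes in zero-shot/few-shot settings.
Neuroscience concepts such as episodic memory, imagination, and compositionality provide a theoretical basis for fast learning built on prior knowledge.
The paper discusses challenges such as weak supervision, long-tailed distributions, and data scarcity, and positions SSL as a path toward more human-like learning.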
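
To make the findings on visual-to-semantic mapping and seen-to-unseen transfer concrete, here is a minimal sketch of zero-shot classification through a learned mapping into an attribute space. It is not taken from the paper: the class names, the two attributes ([hoofed, striped]), the 2-D "visual" features, and the helper names `fit_linear_map`, `project`, and `classify` are all hypothetical, and a plain least-squares linear map stands in for whatever embedding model a real method would use.

```python
import math

# Hypothetical attribute prototypes: [hoofed, striped].
# "zebra" has a prototype but no training images (the unseen class).
PROTOTYPES = {
    "horse": [1.0, 0.0],
    "tiger": [0.0, 1.0],
    "zebra": [1.0, 1.0],
}

# Toy 2-D "visual features" for the seen classes only (made-up numbers).
SEEN = [
    ([2.0, 0.1], "horse"),
    ([1.9, -0.1], "horse"),
    ([0.1, 3.0], "tiger"),
    ([-0.1, 2.9], "tiger"),
]


def fit_linear_map(features, attributes):
    """Least-squares 2x2 matrix W with W @ x ~ a, via the normal equations."""
    g = [[0.0, 0.0], [0.0, 0.0]]  # sum of x x^T
    c = [[0.0, 0.0], [0.0, 0.0]]  # sum of a x^T
    for x, a in zip(features, attributes):
        for i in range(2):
            for j in range(2):
                g[i][j] += x[i] * x[j]
                c[i][j] += a[i] * x[j]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    # W = C @ G^{-1}
    return [[sum(c[i][k] * g_inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def project(w, x):
    """Map a visual feature vector into the attribute (semantic) space."""
    return [w[i][0] * x[0] + w[i][1] * x[1] for i in range(2)]


def classify(w, x, prototypes):
    """Return the class whose attribute prototype is nearest to the projection."""
    a = project(w, x)
    return min(prototypes, key=lambda name: math.dist(a, prototypes[name]))


# Learn the visual-to-semantic map from seen classes only.
W = fit_linear_map([x for x, _ in SEEN], [PROTOTYPES[y] for _, y in SEEN])

# A sample that looks both "hoofed" and "striped" in this toy feature space.
print(classify(W, [2.1, 2.8], PROTOTYPES))
```

The map is fitted only on horse and tiger samples; the zebra class is reachable at test time solely because its attribute prototype lives in the same semantic space, which is the transfer mechanism the findings above describe.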


This review was generated by AI and reviewed by human editors.