QUICK REVIEW

[Paper Review] Online Learning: A Comprehensive Survey

Steven C. H. Hoi, Doyen Sahoo | arXiv (Cornell University) | Feb 8, 2018

Data Stream Mining Techniques | References: 273 | Cited by: 144

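One-Sentence Summary

This survey systematically reviews the online learning literature, focusing on supervised online learning and partial-feedback settings, and provides theoretical foundations and a taxonomy.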

ABSTRACT

Online learning represents an important family of machine learning algorithms, in which a learner attempts to resolve an online prediction (or any type of decision-making) task by learning a model/hypothesis from a sequence of data instances one at a time. The goal of online learning is to ensure that the online learner would make a sequence of accurate predictions (or correct decisions) given the knowledge of correct answers to previous prediction or learning tasks and possibly additional information. This is in contrast to many traditional batch learning or offline machine learning algorithms that are often designed to train a model in batch from a given collection of training data instances. This survey aims to provide a comprehensive survey of the online machine learning literatures through a systematic review of basic ideas and key principles and a proper categorization of different algorithms and techniques. Generally speaking, according to the learning type and the forms of feedback information, the existing online learning works can be classified into three major categories: (i) supervised online learning where full feedback information is always available, (ii) online learning with limited feedback, and (iii) unsupervised online learning where there is no feedback available. Due to space limitation, the survey will be mainly focused on the first category, but also briefly cover some basics of the other two categories. Finally, we also discuss some open issues and attempt to shed light on potential future research directions in this field.

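Research Motivation and Objectives

Summarize the core ideas, principles, and taxonomy of online learning.
Review the foundations of online learning from the perspectives of learning theory, optimization, and game theory.
Focus on supervised online learning and partial-feedback settings, with brief coverage of unsupervised online learning.
Discuss open problems and future research directions in online learning.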

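Proposed Method

Categorize online learning techniques by feedback type: supervised online learning, online learning with limited feedback, and unsupervised online learning.
Give a standardized problem formulation for online binary classification and regret minimization.
Explain empirical risk minimization, the decomposition of excess risk, and the online convex optimization framework.
Describe the main algorithm families: first-order, second-order, and regularization-based methods (OGD, ONS, FTRL, OMD, EG, AdaGrad).
Outline the connections to game theory, in particular repeated zero-sum games in the online learning setting.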
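
To make the first-order family concrete, here is a minimal sketch of projected online gradient descent (OGD) applied to online binary classification. It is an illustration under stated assumptions, not code from the survey: the hinge loss, the step size eta0 / sqrt(t), the L2-ball constraint, and the names `ogd_hinge`, `project_l2_ball`, and `make_stream` are all choices made for this example.

```python
import numpy as np


def project_l2_ball(w, radius):
    """Project w onto the Euclidean ball of the given radius."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)


def ogd_hinge(X, y, radius=1.0, eta0=1.0):
    """Projected online gradient descent on the hinge loss.

    Instances arrive one at a time. The learner predicts with the current
    weights, then sees the label and updates. Returns the final weights,
    the number of prediction mistakes, and the cumulative hinge loss.
    """
    n_rounds, dim = X.shape
    w = np.zeros(dim)
    mistakes = 0
    cumulative_loss = 0.0
    for t in range(n_rounds):
        x_t, y_t = X[t], y[t]
        margin = y_t * np.dot(w, x_t)  # computed before w is updated
        if margin <= 0:                # a zero margin counts as a mistake
            mistakes += 1
        loss = max(0.0, 1.0 - margin)
        cumulative_loss += loss
        if loss > 0:
            grad = -y_t * x_t                  # subgradient of the hinge loss
            eta_t = eta0 / np.sqrt(t + 1)      # assumed decaying step size
            w = project_l2_ball(w - eta_t * grad, radius)
    return w, mistakes, cumulative_loss


def make_stream(n_rounds=2000, dim=5, seed=0):
    """Hypothetical synthetic stream: labels come from a hidden linear rule."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=dim)
    X = rng.normal(size=(n_rounds, dim))
    y = np.where(X @ w_true >= 0, 1, -1)
    return X, y


if __name__ == "__main__":
    X, y = make_stream()
    w, mistakes, loss = ogd_hinge(X, y)
    print(f"mistakes: {mistakes} / {len(y)}, cumulative hinge loss: {loss:.1f}")
```

The loop mirrors the online protocol described above: each prediction is made before the corresponding label influences the model, so the mistake count is an honest measure of online performance rather than a training error.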

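Experimental Results

Research Questions

RQ1: What are the main categories and feedback models of online learning?
RQ2: Which foundational theories (learning theory, optimization, game theory) underpin online learning?
RQ3: What are the core online convex optimization methods, and what regret guarantees do they offer?
RQ4: How do online learning algorithms relate to classical batch learning and to data streams?
RQ5: What open problems and future directions exist in online learning research?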

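Key Findings

A feedback-based taxonomy that divides online learning methods into three categories: supervised online learning, online learning with limited feedback, and unsupervised online learning.
Regret, empirical risk, and the bias-variance trade-off are defined and discussed in the online learning setting.
First-order, second-order, and regularization-based online optimization algorithms are reviewed together with their theoretical guarantees.
Online learning is connected to game theory, including Nash equilibrium and the minimax concept in repeated zero-sum games.
Open problems and potential future research directions in online learning and online convex optimization are identified.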
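
For readers unfamiliar with the term, regret is usually formalized as follows. This is the standard textbook definition, and the notation is an assumption of this note rather than a quotation from the survey: $w_t$ is the learner's model in round $t$, $\ell_t$ is the convex loss revealed in that round, and $\mathcal{W}$ is the feasible set.

```latex
% Regret after T rounds: the learner's cumulative loss minus that of the
% best fixed model chosen in hindsight.
R_T \;=\; \sum_{t=1}^{T} \ell_t(w_t) \;-\; \min_{w \in \mathcal{W}} \sum_{t=1}^{T} \ell_t(w)

% An algorithm is called "no-regret" when the average regret vanishes:
\lim_{T \to \infty} \frac{R_T}{T} \;=\; 0

% Classical guarantee for online gradient descent with G-Lipschitz convex
% losses, a feasible set of diameter D, and step sizes \eta_t \propto 1/\sqrt{t}:
R_T \;=\; O\!\left( G D \sqrt{T} \right)
```

A sublinear bound such as $O(\sqrt{T})$ means that the learner's average loss per round approaches that of the best fixed model in hindsight as $T$ grows.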

This review was generated by AI and checked by a human editor.