QUICK REVIEW

[Paper Review] Incremental Learning with Unlabeled Data in the Wild

Kibok Lee, Kimin Lee | arXiv (Cornell University) | Jan 1, 2019

Domain Adaptation and Few-Shot Learning | References: 36 | Cited by: 5

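One-Sentence Summary

This paper proposes a novel class-incremental learning framework that uses a continuous stream of unlabeled real-world data ("in the wild") to mitigate catastrophic forgetting in deep neural networks. By combining a global distillation loss, a regularization strategy that keeps the model from overfitting to the most recent task, and an efficient sampling method for external data, the approach achieves up to 9.3% relative performance improvement over the state-of-the-art method on the CIFAR and ImageNet benchmarks.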

ABSTRACT

Deep neural networks are known to suffer from catastrophic forgetting in class-incremental learning, where the performance on previous tasks drastically degrades when learning a new task. To alleviate this effect, we propose to leverage a continuous and large stream of unlabeled data in the wild. In particular, to leverage such transient external data effectively, we design a novel class-incremental learning scheme with (a) a new distillation loss, termed global distillation, (b) a learning strategy to avoid overfitting to the most recent task, and (c) a sampling strategy for the desired external data. Our experimental results on various datasets, including CIFAR and ImageNet, demonstrate the superiority of the proposed methods over prior methods, particularly when a stream of unlabeled data is accessible: we achieve up to 9.3% of relative performance improvement compared to the state-of-the-art method.

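Research Motivation and Goals

Address catastrophic forgetting in class-incremental learning, where performance on previously learned tasks degrades once a new task is introduced.
Make effective use of a continuous stream of unlabeled data from real-world sources ("in the wild") to improve the model's generalization and stability.
Design a learning scheme that keeps the model from overfitting to the most recently learned task, so that performance on earlier tasks is preserved.
Develop a sampling strategy that picks the most valuable unlabeled samples from the external data stream to support continual learning.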

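Proposed Method

Introduce a new distillation loss, global distillation, which preserves knowledge from all previous tasks by aligning feature representations across all task-specific heads.
Adopt a learning strategy that dynamically adjusts how much the most recent task contributes to training, avoiding overfitting while keeping earlier tasks stable.
Design a sampling strategy that prioritizes uncertainty and diversity, selecting the most informative unlabeled samples from the external data stream to maximize knowledge transfer.
Combine the global distillation loss with the sampling strategy and the regularization method into an end-to-end incremental learning framework.
Use a dual-branch network architecture in which the task-specific heads are trained by distilling knowledge from the previous model and from external data.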
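The distillation and sampling steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the temperature value, the convention that the student's first columns correspond to the previously learned classes, and the use of predictive entropy as the uncertainty score are all hypothetical choices made for this example.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def global_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    "Global" here means one softmax over ALL previously learned classes
    jointly, rather than a separate softmax per task.
    Assumption: the student's first n_old columns are the old classes.
    """
    teacher_logits = np.asarray(teacher_logits, dtype=float)
    student_logits = np.asarray(student_logits, dtype=float)
    n_old = teacher_logits.shape[1]
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits[:, :n_old], temperature)
    return float(-(p * np.log(q + 1e-12)).sum(axis=-1).mean())


def select_most_uncertain(probs, k):
    """Indices of the k samples with the highest predictive entropy.

    Assumption: entropy stands in for the uncertainty criterion; the
    diversity criterion mentioned in the text is not modeled here.
    """
    probs = np.asarray(probs, dtype=float)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    return np.argsort(-entropy)[:k]


if __name__ == "__main__":
    teacher = np.array([[2.0, 0.5, -1.0]])      # 3 old classes
    student = np.array([[1.5, 0.7, -0.5, 0.3]])  # 3 old classes + 1 new class
    print("distillation loss:", global_distillation_loss(teacher, student))

    unlabeled_probs = np.array([[0.5, 0.5], [0.9, 0.1], [0.6, 0.4]])
    print("selected samples:", select_most_uncertain(unlabeled_probs, k=2))
```

In a training loop, the loss would be evaluated on both labeled data and the selected unlabeled samples, with the previous model acting as the teacher; how the terms are weighted is left open in this sketch.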

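Experimental Results

Research Questions

RQ1: Can unlabeled real-world data ("in the wild") significantly reduce catastrophic forgetting in class-incremental learning?
RQ2: How does global distillation compare with standard knowledge distillation at preserving performance across all tasks?
RQ3: In the continual learning setting, which sampling strategy maximizes the benefit of external unlabeled data?
RQ4: Does the proposed method keep its performance advantage when the data stream is noisy or non-stationary?

Main Findings

When unlabeled data is available, the proposed method achieves up to 9.3% relative performance improvement over the state-of-the-art method.
Global distillation consistently outperforms standard distillation at preserving knowledge from all previous tasks.
The regularization strategy significantly reduces overfitting to the most recent task, raising accuracy on earlier tasks by up to 7.1%.
The sampling strategy effectively selects informative unlabeled samples, raising average accuracy across the incremental learning stages by 5.8%.
The method performs robustly on both CIFAR-100 and ImageNet-1K, indicating that it scales to large datasets.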
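Because the headline figure is a relative improvement rather than an absolute gain in accuracy points, a short calculation may help show the difference. The accuracy values below are hypothetical, chosen only to make the arithmetic visible; they are not results from the paper, and the function name is likewise an illustration.

```python
def relative_improvement(baseline, proposed):
    """Relative improvement of `proposed` over `baseline`, in percent."""
    return (proposed - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical accuracies (percent), NOT taken from the paper.
    baseline_acc = 50.0
    proposed_acc = 54.65
    gain = relative_improvement(baseline_acc, proposed_acc)
    print(f"absolute gain: {proposed_acc - baseline_acc:.2f} points")
    print(f"relative gain: {gain:.1f}%")
```

With these made-up numbers, an absolute gain of 4.65 accuracy points corresponds to a 9.3% relative improvement, so the two figures should not be read interchangeably.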

This review was generated by AI and checked by a human editor.