QUICK REVIEW

[Paper Review] Reinforced Continual Learning

Ju Xu, Zhanxing Zhu | arXiv (Cornell University) | May 31, 2018

Domain Adaptation and Few-Shot Learning | References: 14 | Cited by: 152

One-Sentence Summary

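This paper proposes Reinforced Continual Learning (RCL), which uses reinforcement learning to adaptively expand a neural network for each new task, aiming to maximize validation accuracy while minimizing model complexity and forgetting. RCL outperforms several baselines on sequential MNIST variants and incremental CIFAR-100 while adding fewer parameters.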

ABSTRACT

Most artificial intelligence models have limiting ability to solve new tasks faster, without forgetting previously acquired knowledge. The recently emerging paradigm of continual learning aims to solve this issue, in which the model learns various tasks in a sequential fashion. In this work, a novel approach for continual learning is proposed, which searches for the best neural architecture for each coming task via sophisticatedly designed reinforcement learning strategies. We name it as Reinforced Continual Learning. Our method not only has good performance on preventing catastrophic forgetting but also fits new tasks well. The experiments on sequential classification tasks for variants of MNIST and CIFAR-100 datasets demonstrate that the proposed approach outperforms existing continual learning alternatives for deep networks.

Motivation and Objectives

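Advance scalable continual learning by adaptively expanding network capacity as tasks arrive.
Prevent forgetting by freezing the parameters of earlier tasks while the newly added parameters are trained.
Use reinforcement learning to automatically search for a near-optimal architecture expansion for each task.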
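
The freezing idea in the second objective can be illustrated with a minimal sketch. It is not the authors' implementation: the `ExpandableLayer` class, its methods, and the dummy gradients are hypothetical names and values chosen for illustration, and the layer is reduced to a plain list of weight rows with no framework.

```python
import random


class ExpandableLayer:
    """A dense layer whose output units can grow; older units stay frozen."""

    def __init__(self, n_inputs, n_units, seed=0):
        self.n_inputs = n_inputs
        self.rng = random.Random(seed)
        self.weights = []    # one weight row per output unit
        self.trainable = []  # parallel flags: True only for newly added units
        self.expand(n_units)

    def expand(self, k):
        """Freeze every existing unit, then append k new trainable units."""
        self.trainable = [False] * len(self.weights)
        for _ in range(k):
            row = [self.rng.uniform(-0.1, 0.1) for _ in range(self.n_inputs)]
            self.weights.append(row)
            self.trainable.append(True)

    def sgd_step(self, grads, lr=0.1):
        """Apply one gradient step, skipping frozen units."""
        for row, grad_row, is_trainable in zip(self.weights, grads, self.trainable):
            if not is_trainable:
                continue
            for j in range(self.n_inputs):
                row[j] -= lr * grad_row[j]


layer = ExpandableLayer(n_inputs=3, n_units=2)   # task 1: two units
old_weights = [row[:] for row in layer.weights]  # snapshot of task-1 weights
layer.expand(1)                                  # task 2: assume the controller chose k = 1
new_before = layer.weights[2][:]                 # snapshot of the new unit
layer.sgd_step([[1.0, 1.0, 1.0]] * 3)            # dummy gradients for all three units
```

After the step, the first two rows still equal `old_weights` and only the third row has moved, so whatever the layer computed for task 1 is unchanged.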

Proposed Method

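A controller (LSTM) generates a sequence of actions that determine how many filters/nodes to add to each layer for the new task.
The task network expands adaptively; only the newly added parameters are trained on the new task, to avoid semantic drift.
The reward combines validation accuracy and network complexity to balance performance and efficiency (R_t = A_t + α C_t).
The controller and value network are updated by policy gradient within an actor-critic framework to maximize the expected reward.
Training proceeds task by task: the network is expanded according to the controller's output and the parameters of previous tasks are frozen.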
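
The controller loop above can be sketched in a few lines under heavy simplifications, all of which are assumptions rather than the paper's design. The LSTM controller is replaced by one independent softmax policy per layer, and the value network by a running-average baseline. C_t is assumed to be the negative number of added units, which this summary does not define. `toy_accuracy` is a hypothetical stand-in for training and validating the expanded network.

```python
import math
import random


def reward(val_accuracy, added_units, alpha):
    """R_t = A_t + alpha * C_t, with C_t assumed to be minus the number of added units."""
    complexity = -sum(added_units)
    return val_accuracy + alpha * complexity


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, rng):
    """Sample an action index from the softmax policy; also return the probabilities."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs)[0], probs


def toy_accuracy(added_units):
    """Hypothetical accuracy curve that saturates as more units are added."""
    return 1.0 - math.exp(-0.5 * sum(added_units))


def train_controller(n_layers=2, max_units=8, alpha=0.02, steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    # Action i for a layer means "add i units to that layer".
    logits = [[0.0] * (max_units + 1) for _ in range(n_layers)]
    baseline = 0.0  # running-average baseline (simplification of the value network)
    for _ in range(steps):
        picks = [sample_action(layer_logits, rng) for layer_logits in logits]
        actions = [a for a, _ in picks]
        r = reward(toy_accuracy(actions), actions, alpha)
        advantage = r - baseline
        baseline += 0.05 * (r - baseline)
        # REINFORCE: d log pi(a) / d logit_i = 1[i == a] - p_i
        for layer_logits, (a, probs) in zip(logits, picks):
            for i in range(len(layer_logits)):
                indicator = 1.0 if i == a else 0.0
                layer_logits[i] += lr * advantage * (indicator - probs[i])
    return logits
```

Calling `train_controller()` returns the per-layer logits. In a real system each reward evaluation would mean training the expanded task network, so the number of controller steps dominates the cost.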

Experimental Results

Research Questions

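RQ1: Can reinforcement learning effectively search for the best architecture expansion to mitigate forgetting in continual learning?
RQ2: How does adaptive expansion affect accuracy and model complexity over a sequence of tasks?
RQ3: Does RCL prevent forgetting better than fixed-size or other expandable architectures while using fewer additional parameters?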

Key Findings

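On MNIST Permutations, MNIST Mix, and Incremental CIFAR-100, RCL achieves better accuracy with a smaller model than PGN and DEN.
RCL substantially reduces the number of added parameters (for example, by 42% and 53% on CIFAR-100 relative to PGN and DEN, respectively).
Forgetting is mitigated in RCL and PGN, whereas fixed-size methods exhibit catastrophic forgetting. DEN does not fully prevent forgetting because it retrains earlier parameters.
Increasing α (the weight on model complexity) reduces the parameter count but may slightly lower accuracy, giving a trade-off between performance and size.
RCL uses fewer hyperparameters than DEN and behaves more stably across settings.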
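
The effect of α in the findings above can be made concrete with a small worked example. The two candidate expansions and their accuracies are invented for illustration and are not results from the paper; the complexity term is again assumed to be the negative number of added units.

```python
def reward(val_accuracy, added_units, alpha):
    """R = A + alpha * C, with C assumed to be minus the number of added units."""
    return val_accuracy + alpha * (-added_units)


# Hypothetical candidates: (validation accuracy, number of added units).
candidates = {"large": (0.95, 20), "small": (0.93, 5)}


def best_candidate(alpha):
    """Return the name of the candidate expansion with the highest reward."""
    return max(candidates, key=lambda name: reward(*candidates[name], alpha))


for alpha in (0.0001, 0.01):
    print(alpha, best_candidate(alpha))
```

With these made-up numbers the break-even point is α = 0.02 / 15 ≈ 0.0013. Below it the larger, slightly more accurate expansion wins; above it the smaller one does.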

This review was generated by AI and reviewed by a human editor.