QUICK REVIEW

[Paper Review] PathNet: Evolution Channels Gradient Descent in Super Neural Networks

Chrisantha Fernando, Dylan Banarse|arXiv (Cornell University)|Jan 30, 2017

Neural Networks and Applications|References: 20|Cited by: 643

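One-Sentence Summary

PathNet evolves pathways through a single giant neural network to channel gradient descent, enabling transfer, continual, and multitask learning by freezing the parameters already used and reusing learned pathways for new tasks.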

ABSTRACT

For artificial general intelligence (AGI) it would be efficient if multiple users trained the same giant neural network, permitting parameter reuse, without catastrophic forgetting. PathNet is a first step in this direction. It is a neural network algorithm that uses agents embedded in the neural network whose task is to discover which parts of the network to re-use for new tasks. Agents are pathways (views) through the network which determine the subset of parameters that are used and updated by the forwards and backwards passes of the backpropagation algorithm. During learning, a tournament selection genetic algorithm is used to select pathways through the neural network for replication and mutation. Pathway fitness is the performance of that pathway measured according to a cost function. We demonstrate successful transfer learning; fixing the parameters along a path learned on task A and re-evolving a new population of paths for task B, allows task B to be learned faster than it could be learned from scratch or after fine-tuning. Paths evolved on task B re-use parts of the optimal path evolved on task A. Positive transfer was demonstrated for binary MNIST, CIFAR, and SVHN supervised learning classification tasks, and a set of Atari and Labyrinth reinforcement learning tasks, suggesting PathNets have general applicability for neural network training. Finally, PathNet also significantly improves the robustness to hyperparameter choices of a parallel asynchronous reinforcement learning algorithm (A3C).

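Motivation and Objectives

Motivate and address the efficient reuse of a single giant network across many tasks, enabling transfer while reducing catastrophic forgetting.
Propose PathNet: a modular network in which embedded agents (pathways) determine which parameters are used and updated.
Demonstrate transfer learning, continual learning, and multitask learning in both supervised and reinforcement learning domains.
Show that evolving pathways alongside gradient-based learning can outperform independent training and simple fine-tuning.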

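Proposed Method

PathNet is a modular deep network with L layers and M modules per layer; a pathway uses at most N modules in each layer.
Pathways are encoded as integer matrices (genotypes) that list the active modules in each layer.
A tournament-selection genetic algorithm evolves the pathways, while gradient descent updates the parameters inside the active pathway.
During transfer, the best pathway for the source task is fixed (its parameters are frozen) and a new population is evolved for the target task.
In reinforcement learning, 64 asynchronous workers evaluate pathways in parallel and share a central pool of genotypes.
Experiments cover serial supervised tasks (binary MNIST, CIFAR, SVHN) and parallel RL tasks (Atari, Labyrinth) trained with A3C.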
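
The genotype encoding and tournament selection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the sizes, the function names, the mutation rule (a small integer shift applied with probability 1/(L×N), wrapped into the valid module range), and `toy_fitness` are all hypothetical. In PathNet itself, fitness is the performance of a pathway measured by a cost function after training it with gradient descent; here a toy score stands in so the sketch runs on its own.

```python
import random

L, M, N = 3, 10, 3  # layers, modules per layer, max active modules per layer (illustrative sizes)

def random_genotype(rng):
    """One pathway: an L x N integer matrix; row l lists the active module indices of layer l."""
    return [[rng.randrange(M) for _ in range(N)] for _ in range(L)]

def active_modules(genotype):
    """Distinct active modules per layer (repeated indices collapse, so at most N are used)."""
    return [sorted(set(row)) for row in genotype]

def mutate(genotype, rng, rate=None):
    """Shift each gene by an integer in [-2, 2] with probability `rate` (assumed rule), wrapping into [0, M)."""
    if rate is None:
        rate = 1.0 / (L * N)
    return [[(g + rng.randint(-2, 2)) % M if rng.random() < rate else g for g in row]
            for row in genotype]

def tournament_step(population, fitness_fn, rng):
    """Binary tournament: the loser is overwritten by a mutated copy of the winner."""
    a, b = rng.sample(range(len(population)), 2)
    if fitness_fn(population[a]) >= fitness_fn(population[b]):
        winner, loser = a, b
    else:
        winner, loser = b, a
    population[loser] = mutate(population[winner], rng)
    return winner, loser

def toy_fitness(genotype):
    # Hypothetical stand-in for "train the pathway, then measure performance":
    # rewards pathways that include module 0 in every layer.
    return sum(0 in row for row in genotype)

rng = random.Random(0)
population = [random_genotype(rng) for _ in range(8)]
for _ in range(200):
    tournament_step(population, toy_fitness, rng)
best = max(population, key=toy_fitness)
```

In a real system `fitness_fn` would train only the modules returned by `active_modules` for a fixed budget before scoring, which is where evolution and gradient descent interact.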

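Experimental Results

Research Questions

RQ1: Can PathNet achieve positive transfer between sequential tasks by evolving reusable subsets of parameters?
RQ2: Does freezing the learned pathway after one task and re-evolving pathways for a new task prevent catastrophic forgetting while enabling faster learning?
RQ3: How effective is PathNet's transfer learning compared with learning from scratch or simple fine-tuning, in supervised and reinforcement learning domains?
RQ4: Which network architectures and evolutionary parameters best support efficient pathway discovery and transfer?
RQ5: To what extent does module duplication or pathway overlap affect transfer performance?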

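Key Findings

PathNet achieves positive transfer on binary MNIST, CIFAR, SVHN, Atari games, and Labyrinth tasks, outperforming both training from scratch and the simple fine-tuning control.
Fixing the best pathway for the source task and evolving new pathways for the target task yields faster learning on the target task than learning from scratch or fine-tuning.
In the Atari experiments, PathNet improves on independent learning by a factor of 1.33 on average and on the fine-tuning control by a factor of 1.16; in Labyrinth, transfer beats the baselines on several task pairs.
Pathway evolution first concentrates training on the early layers while later layers explore more, suggesting hierarchical, task-specific gating and emergent dropout-like dynamics.
Module duplication improves performance for some transfers; in the Labyrinth tasks PathNet can duplicate a useful module within the same layer to improve transfer.
PathNet significantly improves robustness to hyperparameter choices in the A3C reinforcement learning setting.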
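
The freeze-then-re-evolve setup behind these findings can be illustrated with a small sketch. It assumes pathways are stored as lists of module indices per layer; the helper names `trainable_mask` and `reuse_fraction` and the two example pathways are hypothetical and only show how frozen modules and pathway overlap might be tracked.

```python
def trainable_mask(num_layers, modules_per_layer, frozen_paths):
    """True where a module's parameters may still be updated by gradient descent."""
    mask = [[True] * modules_per_layer for _ in range(num_layers)]
    for path in frozen_paths:
        for layer, row in enumerate(path):
            for module in row:
                mask[layer][module] = False  # module lies on a frozen pathway
    return mask

def reuse_fraction(new_path, frozen_path):
    """Share of the new pathway's distinct (layer, module) slots already on the frozen pathway."""
    new = {(l, m) for l, row in enumerate(new_path) for m in row}
    old = {(l, m) for l, row in enumerate(frozen_path) for m in row}
    return len(new & old) / len(new)

# Hypothetical pathways for a 3-layer network with 10 modules per layer.
best_path_task_a = [[0, 1, 2], [3, 3, 4], [5, 6, 7]]
candidate_task_b = [[0, 1, 9], [3, 8, 8], [2, 6, 7]]

mask = trainable_mask(3, 10, [best_path_task_a])
overlap = reuse_fraction(candidate_task_b, best_path_task_a)  # 5 of 8 distinct slots are reused
```

A task B pathway can still read from frozen modules in its forward pass; the mask only controls which modules receive parameter updates.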

This review was generated by AI and reviewed by a human editor.