QUICK REVIEW

[Paper Review] Understanding Deep Architectures using a Recursive Convolutional Network

David Eigen, Jason Tyler Rolfe|arXiv (Cornell University)|Dec 6, 2013

Domain Adaptation and Few-Shot Learning | References: 22 | Cited by: 25

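One-Sentence Summary

Using a recursive architecture with tied weights, this paper studies the independent effects of depth, number of feature maps, and number of parameters in convolutional networks. It finds that adding layers and adding parameters each clearly improve performance, while the dimensionality of the feature maps matters little: most of the gain from extra feature maps comes from the parameters they add, not from a richer representation.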

ABSTRACT

A key challenge in designing convolutional network models is sizing them appropriately. Many factors are involved in these decisions, including number of layers, feature maps, kernel sizes, etc. Complicating this further is the fact that each of these influence not only the numbers and dimensions of the activation units, but also the total number of parameters. In this paper we focus on assessing the independent contributions of three of these linked variables: The numbers of layers, feature maps, and parameters. To accomplish this, we employ a recursive convolutional network whose weights are tied between layers; this allows us to vary each of the three factors in a controlled setting. We find that while increasing the numbers of layers and parameters each have clear benefit, the number of feature maps (and hence dimensionality of the representation) appears ancillary, and finds most of its benefit through the introduction of more weights. Our results (i) empirically confirm the notion that adding layers alone increases computational power, within the context of convolutional layers, and (ii) suggest that precise sizing of convolutional feature map dimensions is itself of little concern; more attention should be paid to the number of parameters in these layers instead.

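Research Motivation and Objectives

Separate the independent contributions of network depth (number of layers), representation dimensionality (number of feature maps), and model capacity (number of parameters) in convolutional networks.
Address a core difficulty in convolutional network design: these factors are linked, which makes them hard to evaluate independently.
Determine whether, under a fixed parameter budget, adding feature maps or adding layers yields the larger performance gain.
Evaluate whether a deeper network with fewer feature maps outperforms a shallower network with more feature maps when the number of parameters is held constant.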
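The fixed-budget trade-off described above can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes an untied stack of square k x k convolutions in which every layer maps F feature maps to F feature maps, and it ignores biases, the first layer, and the classifier. The names `conv_params`, `widest_fit`, and `budget` are hypothetical.

```python
import math


def conv_params(num_layers, feature_maps, kernel=3):
    """Weights in `num_layers` untied k x k conv layers, each mapping
    `feature_maps` channels to `feature_maps` channels (biases ignored)."""
    return num_layers * kernel * kernel * feature_maps * feature_maps


def widest_fit(budget, num_layers, kernel=3):
    """Largest feature-map count whose untied stack stays within `budget`."""
    return math.isqrt(budget // (num_layers * kernel * kernel))


# Illustrative budget: the weights of 2 untied 3x3 layers with 64 maps each.
budget = conv_params(2, 64)
for depth in (2, 4, 8):
    maps = widest_fit(budget, depth)
    print(depth, maps, conv_params(depth, maps))
```

Under these assumptions, each fourfold increase in depth halves the number of feature maps that fit in the same budget, which is the kind of deep-and-narrow versus shallow-and-wide comparison the objectives describe.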

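Proposed Method

Design a recursive convolutional network whose weights are tied between layers, so that the tied layers apply the same filter weights and have an identical architecture.
Use this tied-weight model to vary the number of feature maps while controlling the number of parameters and layers, so that each factor can be analyzed independently.
Train and evaluate tied and untied versions of the model on CIFAR-10 and SVHN to compare their performance under controlled conditions.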
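Run three controlled experiments: (1) vary the number of layers while holding the number of feature maps and parameters fixed; (2) vary only the number of parameters while holding the number of layers and feature maps fixed; (3) vary only the number of feature maps while holding the number of layers and parameters fixed.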
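Use linear regression to quantify the performance difference between tied and untied models in each experiment and to assess the relative impact of each architectural factor.
Apply max pooling after the first layer and use ReLU activations throughout the network, in line with standard convolutional network practice.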
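The tied-weight idea behind the method can be sketched in a few lines of pure Python. This is a minimal illustration, not the authors' implementation: it assumes odd square kernels, zero padding that preserves spatial size, equal numbers of input and output feature maps, and no biases or pooling. The helper names `conv2d_same`, `relu`, `tied_stack`, and `num_weights` are hypothetical.

```python
def conv2d_same(x, w):
    """Zero-padded 'same' cross-correlation in pure Python.

    x: feature maps as nested lists, shape [C_in][H][W]
    w: kernels as nested lists, shape [C_out][C_in][k][k], k odd
    """
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    pad = k // 2
    out = []
    for kernels in w:  # one output map per set of C_in kernels
        plane = [[0.0] * width for _ in range(height)]
        for c in range(c_in):
            for i in range(k):
                for j in range(k):
                    weight = kernels[c][i][j]
                    for y in range(height):
                        src_y = y + i - pad
                        if not 0 <= src_y < height:
                            continue  # this tap falls in the zero padding
                        for z in range(width):
                            src_z = z + j - pad
                            if 0 <= src_z < width:
                                plane[y][z] += weight * x[c][src_y][src_z]
        out.append(plane)
    return out


def relu(x):
    """Element-wise max(v, 0) over [C][H][W] nested lists."""
    return [[[max(v, 0.0) for v in row] for row in plane] for plane in x]


def tied_stack(x, w, num_layers):
    """Apply the SAME kernels `w` num_layers times (tied weights)."""
    for _ in range(num_layers):
        x = relu(conv2d_same(x, w))
    return x


def num_weights(w):
    """Number of scalar weights in w; independent of how often w is applied."""
    return sum(len(row) for kernels in w for kern in kernels for row in kern)


# One 3x3 kernel whose only non-zero tap copies the pixel from the row above.
shift_down = [[[[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]]
image = [[[1.0, 2.0], [3.0, 4.0]]]
print(tied_stack(image, shift_down, 1))  # image shifted down by one row
print(num_weights(shift_down))           # same count at any depth
```

Because `tied_stack` reuses one weight tensor, `num_layers` can change without changing `num_weights(w)`. An untied network would instead hold a separate tensor per layer, so its weight count grows with depth.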

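Experimental Results

Research Questions

RQ1: Does adding layers to a convolutional network improve performance on its own, without changing the number of parameters or feature maps?
RQ2: Does increasing the number of feature maps per layer improve performance when the total number of parameters is held constant?
RQ3: Do the gains from adding feature maps come from a higher-dimensional representation, or from the accompanying increase in parameters?
RQ4: Does spreading parameters across more layers perform better than concentrating them in fewer, wider layers with more feature maps?
RQ5: How do tied-weight recursive networks compare with standard untied networks when architectural factors are controlled?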

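Key Findings

Adding layers significantly improves performance even when the number of parameters and feature maps is held constant, confirming that depth by itself increases representational power.
The number of parameters is strongly positively correlated with performance, and spreading parameters across more layers performs better than concentrating them in fewer layers.
With the number of layers and parameters fixed, tied and untied models that differ in their number of feature maps perform almost identically, indicating that feature map dimensionality has very little independent effect.
When the number of parameters and layers is fixed, the performance gap between tied and untied models is negligible, indicating that the number of feature maps affects model capacity mainly through its effect on the parameter count rather than in its own right.
Overall, performance is driven mainly by the number of layers and the total number of parameters rather than by feature map dimensionality, which suggests that architecture design should prioritize depth and parameter allocation over feature map sizing.
In experiments that spread parameters across more layers, performance improved even as feature map dimensionality shrank, supporting the conclusion that depth matters more than a wide representation.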
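A short parameter count shows how a tied and an untied model can share depth and parameter count while differing only in their number of feature maps. The derivation below is a sketch under simplifying assumptions not stated in this summary: only the k x k convolutional weights of the layers beyond the first are counted, and biases and the classifier are ignored. The symbols M, k, F_u, and F_t are introduced here for illustration.

```latex
% Assumed notation: M layers, k x k kernels,
% F_u feature maps per layer (untied), F_t feature maps per layer (tied).
\begin{align}
P_{\text{untied}} &= M\,k^{2}F_u^{2}, &
P_{\text{tied}}   &= k^{2}F_t^{2}, \\
P_{\text{tied}} = P_{\text{untied}}
  &\iff F_t = \sqrt{M}\,F_u .
\end{align}
```

For example, with M = 4, an untied network with 32 feature maps per layer has as many convolutional weights as a tied network with 64 feature maps, since 4 x 32^2 = 64^2 = 4096 per unit of k^2. Under these assumptions, any performance difference between the two would be attributable to the number of feature maps alone.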

This review was generated by AI and checked by human editors.