QUICK REVIEW

[Paper Review] Densely Connected Convolutional Networks

Gao Huang, Zhuang Liu|arXiv (Cornell University)|Aug 25, 2016

Topic: Advanced Neural Network Applications|References: 37|Cited by: 1,897

One-Sentence Summary

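DenseNet connects each layer to all preceding layers, which enables feature reuse and efficient training. It achieves strong results on CIFAR-10/100 and SVHN with fewer parameters and less computation, and on ImageNet it reaches accuracy comparable to ResNet with fewer parameters.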

ABSTRACT

Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less computation to achieve high performance. Code and pre-trained models are available at https://github.com/liuzhuang13/DenseNet .
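
The connection count quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical and not taken from the paper's code. It assumes layer i receives the network input plus the outputs of the i-1 layers before it, so it has i incoming connections.

```python
def dense_connection_count(num_layers: int) -> int:
    """Count direct connections when every layer receives all earlier outputs.

    Layer i (1-indexed) takes the network input plus the outputs of layers
    1..i-1, so it has i incoming connections.
    """
    return sum(range(1, num_layers + 1))


def chain_connection_count(num_layers: int) -> int:
    """A plain feed-forward chain has one incoming connection per layer."""
    return num_layers


if __name__ == "__main__":
    for num_layers in (4, 12, 100):
        dense = dense_connection_count(num_layers)
        # The explicit sum agrees with the closed form L(L+1)/2.
        assert dense == num_layers * (num_layers + 1) // 2
        print(num_layers, chain_connection_count(num_layers), dense)
```

The count grows quadratically with depth, which is why the summary below stresses that each layer adds only a small number of new feature maps.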

Motivation and Objectives

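Improve information and gradient flow through dense connections so that deeper networks can be trained
Propose a compact, parameter-efficient architecture that reuses features across layers
Demonstrate accuracy gains and reduced parameter counts on CIFAR-10, CIFAR-100, SVHN, and ImageNet
Analyze how dense connections act as implicit deep supervision and as a regularizer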

Proposed Method

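Introduce DenseNet, in which the input to each layer l is the concatenation (not the sum) of the feature maps of all preceding layers
Define H_l as the composite function BN-ReLU-Conv(3x3); the bottleneck variant (DenseNet-B) uses BN-ReLU-Conv(1x1)-BN-ReLU-Conv(3x3)
Divide the network into dense blocks separated by transition layers; a compression factor theta reduces the number of feature maps at each transition (DenseNet-C), and combining compression with bottleneck layers gives DenseNet-BC
Control the growth rate k, which sets how many new feature maps each layer adds to the collective state
Train with SGD, using standard data augmentation on CIFAR and ImageNet; report top-1/top-5 results and parameter counts
Provide evidence that dense connections improve gradient flow, feature reuse, and regularization relative to ResNet-style architectures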
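
The interaction between concatenation, the growth rate k, and the compression factor theta is easiest to see by tracking channel counts. The following is a minimal sketch, not the authors' implementation: all function names are hypothetical, and the defaults (a stem with 2k channels, theta = 0.5, three dense blocks, 16 layers per block for the L=100 DenseNet-BC) are assumptions taken from the paper's CIFAR setup rather than from this summary. Bottleneck layers do not appear in the bookkeeping because each layer still contributes exactly k new maps to the concatenated state.

```python
import math
from typing import List, Optional, Tuple


def dense_block_input_channels(k0: int, k: int, num_layers: int) -> List[int]:
    """Input channels seen by each layer in one dense block.

    Layer l (1-indexed) receives the block input (k0 maps) concatenated with
    the k new maps produced by each of the l-1 earlier layers.
    """
    return [k0 + k * (l - 1) for l in range(1, num_layers + 1)]


def transition_output_channels(m: int, theta: float) -> int:
    """Compression: a transition layer keeps floor(theta * m) feature maps."""
    return math.floor(theta * m)


def trace_channels(
    k: int,
    layers_per_block: int,
    num_blocks: int = 3,          # assumed: CIFAR-style layout
    theta: float = 0.5,           # assumed: compression used for DenseNet-BC
    stem_channels: Optional[int] = None,
) -> List[Tuple[str, int, int, int]]:
    """Follow channel counts through dense blocks and transition layers.

    Returns tuples of (kind, index, channels_in, channels_out).
    """
    channels = 2 * k if stem_channels is None else stem_channels
    trace = []
    for b in range(num_blocks):
        block_in = channels
        # Concatenation: every layer appends k maps; nothing is summed away.
        channels = block_in + k * layers_per_block
        trace.append(("block", b + 1, block_in, channels))
        if b < num_blocks - 1:  # no transition after the last block
            compressed = transition_output_channels(channels, theta)
            trace.append(("transition", b + 1, channels, compressed))
            channels = compressed
    return trace


if __name__ == "__main__":
    for step in trace_channels(k=12, layers_per_block=16):
        print(step)
    # Under these assumptions the last block outputs 342 channels.
```

With k=12 the widest layer input stays in the low hundreds of channels even at 100 layers, which illustrates why a small growth rate keeps the parameter count low despite the dense connectivity.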

Experimental Results

Research Questions

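RQ1: Do dense connections (each layer connected to all subsequent layers) improve information and gradient flow beyond what residual connections provide?
RQ2: How do the growth rate k and the bottleneck/compression design affect parameter efficiency and accuracy on standard vision benchmarks?
RQ3: Do DenseNets reach state-of-the-art results on CIFAR-10, CIFAR-100, SVHN, and ImageNet with fewer parameters and less computation?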

Key Findings

Test error rates (%); "+" denotes standard data augmentation, "-" means not reported
Model	CIFAR-10	CIFAR-10+	CIFAR-100	CIFAR-100+	SVHN
DenseNet-BC (k=12), L=100	5.92	4.51	24.15	22.27	1.76
DenseNet-BC (k=24), L=250	5.19	3.62	19.64	17.60	1.74
DenseNet-BC (k=40), L=190	-	3.46	-	17.18	-

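In most configurations, DenseNets outperform prior state-of-the-art models on CIFAR-10, CIFAR-100, and SVHN without data augmentation, and in some cases match or exceed them with augmentation
DenseNet-BC with L=190 and k=40 reaches 3.46% error on CIFAR-10 and 17.18% on CIFAR-100 (both with augmentation), outperforming FractalNet and Wide ResNets in several settings
DenseNet variants also perform strongly on ImageNet: DenseNet-201 reaches top-1 performance comparable to a 101-layer ResNet with roughly half the parameters, and DenseNet-BC matches ResNet performance with substantially fewer parameters and FLOPs
DenseNets are more parameter-efficient: models with 0.8–7.0M parameters achieve competitive results on CIFAR and SVHN, and the 250-layer DenseNet-BC (k=24) outperforms larger models with more parameters on CIFAR-100+ and SVHN
DenseNets are less prone to overfitting on smaller datasets, and their dense connections provide implicit deep supervision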

This review was generated by AI and reviewed by a human editor.