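[Paper Explained] Backbones-Review: Feature Extraction Networks for Deep Learning and Deep Reinforcement Learning Approaches
A comprehensive review of backbone networks used for feature extraction in deep learning and deep reinforcement learning (e.g. AlexNet, VGG, ResNet, DenseNet, EfficientNet, HRNet), covering their architectures, the tasks they serve, and comparative insights.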
Artificial Intelligence (AI) is now the most widely used technique for understanding the real world from various types of data, and its main task is to find patterns within the analyzed data. This is done through a feature extraction step, traditionally carried out with statistical algorithms or specific filters. However, selecting useful features from large-scale data has been a crucial challenge. With the development of convolutional neural networks (CNNs), feature extraction has become more automatic and easier. CNNs can work on large-scale data and cover different scenarios for a specific task. In computer vision, convolutional networks are used for feature extraction as well as for the other parts of a deep learning (DL) model. Selecting a suitable network for feature extraction, or for the other parts of a DL model, is not arbitrary: the choice depends on the target task and on the computational complexity of the model. Many networks have been proposed and have become widely used across DL models and AI tasks. When such a network is used for feature extraction, or placed at the beginning of a DL model, it is called a backbone. A backbone is a well-known network that has been trained on many other tasks and has demonstrated its effectiveness. This paper gives an overview of existing backbones, e.g. VGGs, ResNets, and DenseNet, with a detailed description of each. It also discusses several computer vision tasks, reviewing the backbones used for each. In addition, it compares performance on each task according to the backbone used.
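Motivation and Objectives
- Survey and categorize the families of feature extraction backbones used in DL (deep learning) and DRL (deep reinforcement learning).
- Discuss how to choose a backbone for different computer vision tasks (classification, detection, segmentation, etc.).
- Provide a comparative discussion of architecture, parameter, and computational considerations.
- Highlight challenges and future directions in backbone design and application.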
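Proposed Method
- Describe and categorize the main backbone architectures (e.g. AlexNet, the VGG family, ResNet, Inception, DenseNet, MobileNet, EfficientNet, HRNet).
- Outline backbone characteristics: parameters, training task, and key architectural features.
- Assess how backbones are deployed in computer vision tasks and DRL settings.
- Provide qualitative comparisons and trends across tasks and backbones.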
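Experimental Results
Research Questions
- RQ1: Which backbone architectures are most commonly used for feature extraction in DL and DRL tasks?
- RQ2: In tasks such as image classification, object detection, crowd counting, and video summarization, how does the choice of backbone affect performance and computational cost?
- RQ3: What trends and gaps are observed in backbone design and application, and what future directions are proposed?
- RQ4: How do backbones perform in DRL settings compared with conventional DL tasks?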
Key Findings
| Backbone | Year | Parameters | Training task |
|---|---|---|---|
| AlexNet | 2012 | 60M | Image classification |
| VGG-16 | 2014 | 138M | Image classification |
| VGG-19 | 2014 | 144M | Image classification |
| Inception-V1 (GoogLeNet) | 2014 | 5M | Image classification |
| ResNet-50 | 2015 | 26M | Image classification |
| ResNet-101 | 2015 | 44.6M | Image classification |
| ResNet-152 | 2015 | 230M | Image classification |
| Inception-V2 | 2015 | 21.8M | Image classification |
| Inception-V3 | 2015 | 21.8M | Image classification |
| Inception-ResNet-V2 | 2015 | 55M | Image classification, object detection |
| Darknet-19 | 2015 | 20.8M | Object detection |
| Xception | 2017 | 22.9M | Image classification |
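
As a rough sanity check on the parameter column, the count for a plain stack of convolutions and fully connected layers can be worked out from layer shapes alone. The sketch below does this for VGG-16 and VGG-19. It assumes the standard VGG configuration (3x3 convolutions with biases, a 224x224 input reduced to 7x7x512 by five poolings, then fully connected layers of 4096, 4096, and 1000 units). The layer lists and helper names are illustrative and are not taken from the paper.

```python
# Parameter counts for VGG-style networks, derived from layer shapes only.
# Assumption (not from the paper): standard VGG configs, 3x3 convs with biases.

def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k convolution."""
    return k * k * c_in * c_out + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


# Output channels of each conv layer; "M" marks a 2x2 max-pool (no parameters).
VGG16 = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]
VGG19 = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
         512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def vgg_params(config, in_channels=3, num_classes=1000):
    """Total trainable parameters of a VGG-style network."""
    total, c_in = 0, in_channels
    for layer in config:
        if layer == "M":
            continue  # pooling layers have no parameters
        total += conv_params(c_in, layer)
        c_in = layer
    # Five 2x2 pools reduce a 224x224 input to a 7x7 feature map.
    total += linear_params(7 * 7 * c_in, 4096)
    total += linear_params(4096, 4096)
    total += linear_params(4096, num_classes)
    return total


vgg16_total = vgg_params(VGG16)
vgg19_total = vgg_params(VGG19)
print(f"VGG-16: {vgg16_total / 1e6:.1f}M parameters")
print(f"VGG-19: {vgg19_total / 1e6:.1f}M parameters")
```

Under these assumptions the totals round to the 138M and 144M listed in the table. Most of the parameters sit in the first fully connected layer, not in the convolutions, which is one reason parameter count alone is a weak proxy for compute cost.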
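- The paper enumerates and describes a wide range of backbones, including AlexNet, VGG-16/19, GoogLeNet/Inception variants, the ResNet family, DenseNet, Darknet, ShuffleNet, DetNet, SqueezeNet, MobileNet, WideResNet, EfficientNet, SWideRNet, Xception, and HRNet.
- Backbones are linked to specific computer vision tasks (image classification, object detection, crowd counting, video summarization, etc.), with attention to their strengths and use cases.
- Performance and complexity considerations (such as parameter count, FLOPs, and suitability for power-constrained devices) are discussed, emphasizing the trade-off between accuracy and efficiency.
- The review traces how design trends have evolved: depth versus width, residual connections, multi-scale and high-resolution-preserving architectures, and efficient convolution strategies for mobile and edge applications.
- A table summarizes backbone characteristics (year, parameters, training task) for quick reference to common networks.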
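
The efficiency side of this trade-off can be made concrete with a small cost calculation. The sketch below compares multiply-accumulate (MAC) counts for a standard convolution and for a depthwise separable convolution, the kind of factorization used in MobileNet and Xception. The layer sizes are arbitrary illustrative values, not figures from the paper, and the function names are hypothetical.

```python
# MAC counts for a standard vs. a depthwise separable convolution.
# Assumption: stride 1 with "same" padding, so the output is h x w.

def standard_conv_macs(h, w, c_in, c_out, k=3):
    """Every output value mixes all c_in channels over a k x k window."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k=3):
    """A per-channel k x k filter followed by a 1x1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # spatial filtering, channel by channel
    pointwise = h * w * c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise


# Illustrative layer: 56x56 feature map, 128 -> 128 channels, 3x3 kernel.
h, w, c_in, c_out, k = 56, 56, 128, 128, 3
std = standard_conv_macs(h, w, c_in, c_out, k)
sep = depthwise_separable_macs(h, w, c_in, c_out, k)
ratio = sep / std

print(f"standard:  {std:,} MACs")
print(f"separable: {sep:,} MACs")
print(f"ratio:     {ratio:.3f}  (equals 1/c_out + 1/k^2)")
```

For this layer the separable form needs about 12% of the multiply-accumulates of the standard convolution, which is why such factorizations suit mobile and edge deployments. Reported FLOPs figures often count each MAC as two operations, so conventions should be checked before comparing numbers across sources.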
This explainer was generated by AI and reviewed by human editors.