QUICK REVIEW

[Paper Review] DetNAS: Backbone Search for Object Detection

Yukang Chen, Tong Yang|arXiv (Cornell University)|Mar 26, 2019

Advanced Neural Network Applications|References: 44|Cited by: 174

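One-Sentence Summary

DetNAS introduces a three-step backbone search framework that uses a one-shot supernet and evolutionary search to tailor backbones to object detectors, achieving higher COCO mmAP with fewer FLOPs than hand-crafted networks.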

ABSTRACT

Object detectors are usually equipped with backbone networks designed for image classification. It might be sub-optimal because of the gap between the tasks of image classification and object detection. In this work, we present DetNAS to use Neural Architecture Search (NAS) for the design of better backbones for object detection. It is non-trivial because detection training typically needs ImageNet pre-training while NAS systems require accuracies on the target detection task as supervisory signals. Based on the technique of one-shot supernet, which contains all possible networks in the search space, we propose a framework for backbone search on object detection. We train the supernet under the typical detector training schedule: ImageNet pre-training and detection fine-tuning. Then, the architecture search is performed on the trained supernet, using the detection task as the guidance. This framework makes NAS on backbones very efficient. In experiments, we show the effectiveness of DetNAS on various detectors, for instance, one-stage RetinaNet and the two-stage FPN. We empirically find that networks searched on object detection shows consistent superiority compared to those searched on ImageNet classification. The resulting architecture achieves superior performance than hand-crafted networks on COCO with much less FLOPs complexity.

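Motivation and Objectives

Motivate the need for backbones designed specifically for object detection, rather than reusing image-classification backbones.
Propose a practical NAS framework that decouples weight training from architecture search via a one-shot supernet.
Show that backbones searched on object detection outperform those searched on ImageNet classification across detectors and datasets.
Show that DetNASNet and DetNASNet (3.8) achieve higher accuracy at lower computational cost on COCO and VOC.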

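Proposed Method

Build a one-shot supernet that covers all candidate backbones in the search space.
Pre-train the supernet on ImageNet with a path-sampling strategy so that it reflects the relative performance of architectures.
Fine-tune the supernet on the detection dataset (COCO/VOC) with SyncBN to handle the statistics of the smaller batches used during fine-tuning.
Search architectures on the trained supernet with an evolutionary algorithm under FLOPs/inference constraints.
Recompute batch statistics when evaluating each path so that BN layers have valid statistics during evaluation.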
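The search step can be illustrated with a minimal, self-contained sketch. It is not the authors' implementation: the block count, the number of candidate operations, the per-operation costs in `CHOICE_MFLOPS`, and every function name are hypothetical, and `evaluate` is a toy stand-in for the real procedure (inherit supernet weights, recompute BN statistics for the sampled path, then measure detection accuracy). Supernet training itself is not shown.

```python
import random

NUM_BLOCKS = 8                   # hypothetical depth of the searchable backbone
NUM_CHOICES = 4                  # hypothetical number of candidate ops per block
CHOICE_MFLOPS = [10, 20, 30, 40] # hypothetical cost of each candidate op


def sample_path(rng):
    """Uniformly pick one candidate op per block (a single path in the supernet)."""
    return tuple(rng.randrange(NUM_CHOICES) for _ in range(NUM_BLOCKS))


def path_mflops(path):
    """Total cost of a path under the assumed per-op costs."""
    return sum(CHOICE_MFLOPS[c] for c in path)


def evaluate(path):
    """Toy fitness. In a real system this would recompute BN statistics for
    the path and return detection accuracy measured with supernet weights."""
    return sum((c + 1) * (i + 1) for i, c in enumerate(path)) / 100.0


def mutate(path, rng, prob=0.1):
    """Resample each block's choice independently with probability `prob`."""
    return tuple(
        rng.randrange(NUM_CHOICES) if rng.random() < prob else c for c in path
    )


def crossover(a, b, rng):
    """Take each block's choice from one of the two parents at random."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def evolutionary_search(max_mflops, population=20, topk=5, generations=10, seed=0):
    """Evolve paths under a cost budget and return the best one found."""
    rng = random.Random(seed)

    def feasible(path):
        return path_mflops(path) <= max_mflops

    # Initial population: random paths that satisfy the cost constraint.
    pop = []
    while len(pop) < population:
        candidate = sample_path(rng)
        if feasible(candidate):
            pop.append(candidate)

    for _ in range(generations):
        # Keep the top-k paths as parents, then refill with their offspring.
        parents = sorted(pop, key=evaluate, reverse=True)[:topk]
        children = []
        while len(children) < population - topk:
            if rng.random() < 0.5:
                child = mutate(rng.choice(parents), rng)
            else:
                child = crossover(rng.choice(parents), rng.choice(parents), rng)
            if feasible(child):  # discard offspring that exceed the budget
                children.append(child)
        pop = parents + children

    return max(pop, key=evaluate)
```

Calling `evolutionary_search(max_mflops=200)` returns a tuple of per-block choices whose assumed cost stays within the budget. Because the parents are carried over unchanged into each new generation, the best score in the population never decreases. With the costs assumed here, the budget must be at least 80 for any path to be feasible.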

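Experimental Results

Research Questions

RQ1: Can a backbone searched directly on object detection outperform one searched on ImageNet classification?
RQ2: Does integrating pre-training into a one-shot NAS framework make backbone search for detectors computationally feasible?
RQ3: What architectural patterns emerge when NAS is optimized for different object detectors (FPN, RetinaNet) and datasets (COCO, VOC)?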

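Key Findings

DetNASNet achieves 40.2 mmAP on COCO at 1.3G FLOPs, outperforming ResNet-50 under the same detector (FPN).
DetNASNet (3.8) reaches 42.0 mmAP at 3.8G FLOPs, outperforming ResNet-50 by 4.7 points and ResNet-101 by 2.0 points.
Compared with the hand-crafted ShuffleNetv2-40 at the same FLOPs (1.3G), DetNASNet is 0.8 mmAP higher.
Across detectors and datasets, networks searched on detection generally outperform those searched on ImageNet classification by more than 3% (VOC) and 1% (COCO).
The DetNAS framework costs about 44 GPU days, roughly twice the cost of standard detector training, which makes backbone search feasible.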

This review was generated by AI and reviewed by human editors.