QUICK REVIEW

[Paper Review] MDSSD: Multi-scale Deconvolutional Single Shot Detector for Small Objects

Lisha Cui, Rui Ma|arXiv (Cornell University)|May 18, 2018

Advanced Neural Network Applications | References: 32 | Cited by: 63

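One-Sentence Summary

MDSSD introduces multi-scale deconvolutional Fusion Blocks that upsample high-level features and fuse them with shallow features to improve small-object detection, achieving state-of-the-art results on TT100K, VOC2007, and COCO.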

ABSTRACT

For most of the object detectors based on multi-scale feature maps, the shallow layers are rich in fine spatial information and thus mainly responsible for small object detection. The performance of small object detection, however, is still less than satisfactory because of the deficiency of semantic information on shallow feature maps. In this paper, we design a Multi-scale Deconvolutional Single Shot Detector (MDSSD), especially for small object detection. In MDSSD, multiple high-level feature maps at different scales are upsampled simultaneously to increase the spatial resolution. Afterwards, we implement the skip connections with low-level feature maps via Fusion Block. The fusion feature maps, named Fusion Module, are of strong feature representational power of small instances. It is noteworthy that these high-level feature maps utilized in Fusion Block preserve both strong semantic information and some fine details of small instances, rather than the top-most layer where the representation of fine details for small objects are potentially wiped out. The proposed framework achieves 77.6% mAP for small object detection on the challenging dataset TT100K with 512 x 512 input, outperforming other detectors with a large margin. Moreover, it can also achieve state-of-the-art results for general object detection on PASCAL VOC2007 test and MS COCO test-dev2015, especially achieving 2 to 5 points improvement on small object categories.

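Motivation and Objectives

Highlight the challenges of small-object detection and the limitations of existing multi-scale detectors.
Develop a multi-scale deconvolutional framework that preserves spatial detail while exploiting rich semantic information.
Introduce Fusion Blocks that merge high-level and low-level features for small-object detection.
Evaluate MDSSD on TT100K, PASCAL VOC2007, and MS COCO to demonstrate its improvement over SSD and related methods.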

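Proposed Method

Apply deconvolution layers to high-level feature maps at different scales to upsample their spatial resolution.
Introduce Fusion Blocks that fuse the upsampled high-level features with the corresponding shallow features through skip connections.
Create three Fusion Modules (Module 1, Module 2, Module 3) that operate before the deepest SSD layer (conv11_2) to recover small-object detail.
Make predictions on the new Fusion Modules in parallel with the original SSD layers.
Train with a weighted sum of a localization loss (Smooth L1) and a confidence loss (Softmax).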
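The upsample-and-fuse step and the training objective listed above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: it works on a single channel, assumes element-wise sum followed by ReLU as the merge operation, and computes the loss for one matched box only (no normalization by the number of matches, no hard negative mining). The function names and the `alpha` weight are hypothetical.

```python
import math

def transposed_conv2d(x, kernel, stride=2):
    """Single-channel transposed convolution (deconvolution), no padding.
    Each input value scatters a scaled copy of the kernel into the output."""
    k = len(kernel)
    h, w = len(x), len(x[0])
    out_h = (h - 1) * stride + k
    out_w = (w - 1) * stride + k
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += x[i][j] * kernel[a][b]
    return out

def fuse(shallow, deep, kernel, stride=2):
    """Upsample `deep`, add it element-wise to `shallow`, then apply ReLU.
    (Assumed merge operation; the summary does not specify it.)"""
    up = transposed_conv2d(deep, kernel, stride)
    if len(up) != len(shallow) or len(up[0]) != len(shallow[0]):
        raise ValueError("upsampled map must match the shallow map's size")
    return [[max(0.0, s + u) for s, u in zip(srow, urow)]
            for srow, urow in zip(shallow, up)]

def smooth_l1(pred, target):
    """Smooth L1 localization loss summed over box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

def softmax_cross_entropy(logits, label):
    """Softmax confidence loss for one box (numerically stable log-sum-exp)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def detection_loss(box_pred, box_target, logits, label, alpha=1.0):
    """Weighted sum: confidence loss + alpha * localization loss."""
    return (softmax_cross_entropy(logits, label)
            + alpha * smooth_l1(box_pred, box_target))

# Usage: a 2x2 deep map is upsampled to 4x4 and fused with a 4x4 shallow map.
deep = [[1.0, 2.0], [3.0, 4.0]]
shallow = [[-2.0] * 4 for _ in range(4)]
ones = [[1.0, 1.0], [1.0, 1.0]]  # with stride 2 this acts like nearest-neighbor upsampling
fused = fuse(shallow, deep, ones)
```

With a 2×2 kernel and stride 2, the output side length is `(h - 1) * 2 + 2`, so each deconvolution doubles the resolution. In a real detector the kernel would be learned and the maps would have many channels.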

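Experimental Results

Research Questions

RQ1: How does feature resolution affect small-object detection in SSD-like architectures?
RQ2: Can multi-scale deconvolutional upsampling combined with feature fusion improve small-object detection without sacrificing performance on large objects?
RQ3: How much does adding Fusion Modules change detection accuracy across datasets (TT100K, VOC2007, COCO)?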

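Key Findings

MDSSD512 reaches 77.6% mAP on TT100K, ahead of SSD512 (68.7%) and RFB Net (74.4%).
Despite its smaller input size (512×512), MDSSD512 outperforms the Faster R-CNN variants on TT100K (52.9% and 61.1%).
MDSSD300 reaches 78.6% mAP on PASCAL VOC2007, on par with DSSD321, and reaches 81.0% with a ResNet-101 backbone (MDSSD512*).
On COCO, MDSSD300 and MDSSD512 reach 10.8% and 13.9% AP on small objects (area < 32^2), respectively, ahead of the SSD, DSSD, and DSOD baselines.
MDSSD also reports higher average recall (AR) on small objects, indicating improved small-object detection ability.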
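To make the "small objects (area < 32^2)" criterion concrete, here is a short sketch of COCO-style size bucketing. The 32^2 threshold comes from the findings above and the 96^2 threshold from the COCO evaluation convention; the function name is hypothetical, and box width times height stands in for the segmentation-mask area that COCO actually uses. Boundary handling is simplified.

```python
SMALL_MAX_AREA = 32 ** 2    # 1024 px^2
MEDIUM_MAX_AREA = 96 ** 2   # 9216 px^2

def size_bucket(width, height):
    """Classify an object as small, medium, or large by pixel area."""
    area = width * height  # assumption: bounding-box area as a proxy for mask area
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"

# Usage: a 20x20 traffic sign falls in the "small" bucket.
bucket = size_bucket(20, 20)
```

Small-object AP and AR are computed only over ground-truth objects in the "small" bucket, which is why they isolate the effect the Fusion Modules target.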

This review was generated by AI and reviewed by human editors.