QUICK REVIEW

[Paper Review] SCRDet: Towards More Robust Detection for Small, Cluttered and Rotated Objects

Xue Yang, Jirui Yang|arXiv (Cornell University)|Nov 17, 2018

Advanced Neural Network Applications|References: 47|Cited by: 145

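ONE-SENTENCE SUMMARY

SCRDet is a multi-category rotation detector for small, cluttered, and arbitrarily oriented objects that combines SF-Net for finer sampling, MDA-Net for supervised attention, and an IoU-based rotation loss for better rotated object detection, achieving state-of-the-art results on remote sensing and general-purpose datasets.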

ABSTRACT

Object detection has been a building block in computer vision. Though considerable progress has been made, there still exist challenges for objects with small size, arbitrary direction, and dense distribution. Apart from natural images, such issues are especially pronounced for aerial images of great importance. This paper presents a novel multi-category rotation detector for small, cluttered and rotated objects, namely SCRDet. Specifically, a sampling fusion network is devised which fuses multi-layer feature with effective anchor sampling, to improve the sensitivity to small objects. Meanwhile, the supervised pixel attention network and the channel attention network are jointly explored for small and cluttered object detection by suppressing the noise and highlighting the objects feature. For more accurate rotation estimation, the IoU constant factor is added to the smooth L1 loss to address the boundary problem for the rotating bounding box. Extensive experiments on two remote sensing public datasets DOTA, NWPU VHR-10 as well as natural image datasets COCO, VOC2007 and scene text data ICDAR2015 show the state-of-the-art performance of our detector. The code and models will be available at https://github.com/DetectionTeamUCAS.

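MOTIVATION AND OBJECTIVES

Advance robust detection of small, cluttered, and arbitrarily oriented objects in aerial and natural images.
Develop a detector that combines tailored sampling, attention that suppresses background noise, and rotation-aware regression.
Demonstrate that the proposed techniques generalize across remote sensing and natural-image datasets.
Achieve state-of-the-art performance on public benchmarks (DOTA, NWPU VHR-10) and competitive performance on COCO, VOC2007, and ICDAR2015.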

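PROPOSED METHOD

Propose SF-Net, which uses a smaller anchor stride and multi-layer feature fusion to sample small objects more finely.
Introduce a supervised multi-dimensional attention network (MDA-Net) with pixel and channel attention to suppress background noise and highlight foreground objects.
Add a rotation-aware branch that uses five-parameter (x, y, w, h, theta) regression and rotated non-maximum suppression (R-NMS) based on skew IoU to produce accurate oriented bounding boxes.
Modify the regression loss by introducing an IoU-based factor into the smooth L1 loss to address the boundary discontinuity of rotated boxes.
Train with a multi-task loss that combines rotated-box regression, attention supervision, and classification losses.
Demonstrate the generality of the method by validating it on remote sensing datasets (DOTA, NWPU VHR-10) and natural-image datasets (COCO, VOC2007, ICDAR2015).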
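
The R-NMS step based on skew IoU in the list above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the paper's code: boxes are (x, y, w, h, theta) with theta in degrees measured counter-clockwise, the overlap is computed by Sutherland-Hodgman clipping of one convex quadrilateral against the other, and the names `box_to_corners`, `skew_iou`, `rotated_nms`, and the 0.3 threshold are hypothetical choices made for illustration.

```python
import math

def box_to_corners(box):
    """Corners of (x, y, w, h, theta_deg) in counter-clockwise order (assumed convention)."""
    cx, cy, w, h, theta = box
    c, s = math.cos(math.radians(theta)), math.sin(math.radians(theta))
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]

def polygon_area(pts):
    """Shoelace formula; positive for counter-clockwise vertex order."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))

def clip_polygon(subject, clip):
    """Sutherland-Hodgman: clip `subject` against a convex, counter-clockwise `clip`."""
    def inside(p, a, b):
        # True when p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        # Intersection of line p-q with line a-b (only called when p, q straddle a-b).
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = subject
    for a, b in zip(clip, clip[1:] + clip[:1]):
        if not output:
            break
        previous, output = output, []
        for p, q in zip(previous, previous[1:] + previous[:1]):
            if inside(q, a, b):
                if not inside(p, a, b):
                    output.append(intersect(p, q, a, b))
                output.append(q)
            elif inside(p, a, b):
                output.append(intersect(p, q, a, b))
    return output

def skew_iou(box_a, box_b):
    """IoU of two rotated rectangles."""
    overlap = clip_polygon(box_to_corners(box_a), box_to_corners(box_b))
    inter = abs(polygon_area(overlap)) if len(overlap) >= 3 else 0.0
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def rotated_nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: keep a box only if its skew IoU with every kept box is <= threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(skew_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Under this convention, `(50, 50, 40, 10, 0)` and `(50, 50, 10, 40, -90)` describe the same rectangle, so `rotated_nms` suppresses whichever of the two has the lower score. A horizontal IoU computed on axis-aligned enclosing boxes would not capture this, which is the reason for using skew IoU with oriented boxes.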

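EXPERIMENTAL RESULTS

Research Questions

RQ1: In multi-category remote sensing scenes, how can small objects be sampled and localized more effectively?
RQ2: Does supervised attention improve detection performance against cluttered, noisy backgrounds?
RQ3: Does an IoU-based rotation loss stabilize and improve the regression of arbitrarily oriented bounding boxes?
RQ4: Do the proposed components generalize to natural-image datasets beyond remote sensing?
RQ5: How much does SCRDet improve overall performance on standard benchmarks for oriented and horizontal bounding boxes?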

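Key Findings

SCRDet achieves state-of-the-art performance on oriented bounding box (OBB) detection on DOTA, with an mAP of 72.61% under the proposed configuration.
On horizontal bounding box detection on NWPU VHR-10, SCRDet achieves the best performance among published methods, with an mAP of 91.75%.
In the ablation experiments, MDA-Net yields a substantial gain by suppressing noise and highlighting object cues (up to about 3.7 percentage points of mAP on DOTA).
SF-Net yields a substantial gain on small objects through finer sampling and feature fusion, reaching the best overall mAP (68.89%) in one ablation.
The IoU-smooth L1 loss resolves the rotation boundary discontinuity and improves detection accuracy (reaching up to 69.83% mAP in the ablations).
On natural-image datasets, baselines augmented with SCRDet components (e.g., R2CNN) achieve higher single-scale results (e.g., 80.08% on ICDAR2015), demonstrating the generality of the approach.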
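
The boundary discontinuity addressed by the IoU-smooth L1 loss in the findings above can be illustrated with a small numeric sketch. It is an illustration under stated assumptions, not the authors' implementation: it assumes the loss keeps the direction of the ordinary smooth L1 term and lets |-log(IoU)| set the magnitude, it uses raw parameter differences with theta in radians instead of anchor-normalized offsets, and it takes the IoU as a precomputed number. The function names and the `beta` and `eps` parameters are hypothetical.

```python
import math

def smooth_l1(diff, beta=1.0):
    """Standard smooth L1 on a single scalar difference."""
    d = abs(diff)
    return 0.5 * d * d / beta if d < beta else d - 0.5 * beta

def plain_smooth_l1(pred, target):
    """Sum of smooth L1 over the (x, y, w, h, theta) differences."""
    return sum(smooth_l1(p - t) for p, t in zip(pred, target))

def iou_smooth_l1(pred, target, iou, eps=1e-6):
    """Assumed form: direction from smooth L1, magnitude from |-log(IoU)|."""
    base = plain_smooth_l1(pred, target)
    if base == 0.0:
        return 0.0
    direction = base / abs(base)                 # unit-magnitude term
    magnitude = abs(-math.log(max(iou, eps)))    # shrinks to 0 as IoU approaches 1
    return direction * magnitude

# Two encodings of the same rectangle: swapping w and h and turning by 90 degrees.
target = (50.0, 50.0, 40.0, 10.0, 0.0)
pred = (50.0, 50.0, 10.0, 40.0, -math.pi / 2)

plain = plain_smooth_l1(pred, target)               # large despite identical boxes
iou_based = iou_smooth_l1(pred, target, iou=1.0)    # IoU of identical boxes is 1
```

Here the plain smooth L1 sum is about 60 even though the two boxes coincide, while the IoU-based value is zero. In an autograd framework the direction term would carry the gradient and the magnitude would typically be treated as a constant; that detail is omitted from this sketch.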

This review was generated by AI and reviewed by human editors.