QUICK REVIEW

[Paper Review] ReDet: A Rotation-equivariant Detector for Aerial Object Detection

Jiaming Han, Jian Ding|arXiv (Cornell University)|Mar 13, 2021
Advanced Neural Network Applications | References: 45 | Cited by: 39
One-sentence summary

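ReDet introduces a rotation-equivariant backbone and Rotation-invariant RoI Align to obtain fully rotation-invariant features for aerial object detection, reaching state-of-the-art mAP while reducing model size.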

ABSTRACT

Recently, object detection in aerial images has gained much attention in computer vision. Different from objects in natural images, aerial objects are often distributed with arbitrary orientation. Therefore, the detector requires more parameters to encode the orientation information, which are often highly redundant and inefficient. Moreover, as ordinary CNNs do not explicitly model the orientation variation, large amounts of rotation-augmented data are needed to train an accurate object detector. In this paper, we propose a Rotation-equivariant Detector (ReDet) to address these issues, which explicitly encodes rotation equivariance and rotation invariance. More precisely, we incorporate rotation-equivariant networks into the detector to extract rotation-equivariant features, which can accurately predict the orientation and lead to a huge reduction of model size. Based on the rotation-equivariant features, we also present Rotation-invariant RoI Align (RiRoI Align), which adaptively extracts rotation-invariant features from equivariant features according to the orientation of RoI. Extensive experiments on several challenging aerial image datasets, DOTA-v1.0, DOTA-v1.5 and HRSC2016, show that our method can achieve state-of-the-art performance on the task of aerial object detection. Compared with previous best results, our ReDet gains 1.2, 3.5 and 2.6 mAP on DOTA-v1.0, DOTA-v1.5 and HRSC2016 respectively while reducing the number of parameters by 60% (313 Mb vs. 121 Mb). The code is available at: https://github.com/csuhan/ReDet.

Motivation and objectives

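  • Motivate and address the problem of arbitrarily oriented objects in aerial images.
  • Incorporate rotation-equivariant networks into the detector backbone.
  • Develop a Rotation-invariant RoI Align that produces fully rotation-invariant RoI features.
  • Demonstrate state-of-the-art performance on DOTA-v1.0, DOTA-v1.5 and HRSC2016.
  • Show improvements in both model size and accuracy over the baselines.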

Proposed method

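  • Use a rotation-equivariant backbone (ReResNet) together with ReFPN to produce rotation-equivariant feature maps over N orientation channels.
  • Introduce RiRoI Align, which performs a spatial RoI transformation plus orientation-channel switching and interpolation to produce rotation-invariant RoI features.
  • Use RoI Transformer to generate rotated RoIs (RRoIs), and apply RiRoI Align for RoI-wise classification and bounding-box regression.
  • Train and fine-tune on aerial datasets with oriented bounding boxes using a standard detection pipeline (RPN, RoI head).
  • Demonstrate parameter efficiency through weight sharing and the reduced parameter count that follows from the rotation-equivariant design.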
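The orientation-channel switching and interpolation step of RiRoI Align can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `align_orientation`, the `(N, H, W)` layout, the direction of the cyclic shift, and the linear weighting between the two nearest orientation channels are all choices made here for illustration, and the spatial RRoI warping is assumed to have been done already.

```python
import numpy as np

def align_orientation(roi_feat, theta):
    """Cyclically shift orientation channels so the channel matching the
    RoI angle `theta` (radians) comes first. Hypothetical helper.

    roi_feat: array of shape (N, H, W), one map per discrete orientation,
              assumed to be already spatially warped to the RoI.
    """
    n = roi_feat.shape[0]
    # Fractional position of theta on the grid of N discrete orientations.
    pos = (theta % (2 * np.pi)) * n / (2 * np.pi)
    r = int(np.floor(pos)) % n
    alpha = pos - np.floor(pos)
    # Channel r moves to index 0 (lower neighbour); r + 1 is the upper one.
    lower = np.roll(roi_feat, -r, axis=0)
    upper = np.roll(roi_feat, -(r + 1), axis=0)
    # Assumed linear blend of the two nearest orientations.
    return (1.0 - alpha) * lower + alpha * upper

# Toy check with N = 8 orientation channels and a 7x7 RoI.
rng = np.random.default_rng(0)
n_orient = 8
base = rng.standard_normal((n_orient, 7, 7))
theta0 = 0.3 * (2 * np.pi / n_orient)
shift = 3

# Model a further rotation by `shift` orientation steps as a cyclic shift of
# the orientation channels (the spatial part is assumed handled by warping).
rotated = np.roll(base, shift, axis=0)
theta1 = theta0 + shift * (2 * np.pi / n_orient)

out_a = align_orientation(base, theta0)
out_b = align_orientation(rotated, theta1)
print(np.allclose(out_a, out_b))
```

In this toy setting the two aligned outputs agree, which is the sense in which the aligned RoI feature no longer depends on the object's orientation.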
Figure 1: Illustration of our method (top) and comparisons of RRoI warping (bottom). CNN features are not equivariant to the rotation $T_{r}$, i.e., feeding a rotated image to CNNs is not the same as rotating feature maps of the original image. Therefore, the corresponding RoI features are not in
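The caption's point about equivariance can be checked numerically. The sketch below is an assumption-laden toy rather than ReResNet: it uses only 90-degree rotations (the group C4), a single random 3x3 filter, and hypothetical helper names (`correlate_valid`, `lifted_c4_conv`). It compares an ordinary correlation with a "lifted" one that applies all four rotated copies of the same filter.

```python
import numpy as np

def correlate_valid(x, k):
    """Plain 2D cross-correlation without padding ('valid' mode)."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for a in range(kh):
        for b in range(kw):
            out += k[a, b] * x[a:a + oh, b:b + ow]
    return out

def lifted_c4_conv(x, k):
    """Apply the four 90-degree rotations of one filter.

    Returns an array of shape (4, H', W'): one orientation channel per
    rotation, all sharing the same weights.
    """
    return np.stack([correlate_valid(x, np.rot90(k, r)) for r in range(4)])

rng = np.random.default_rng(0)
x = rng.standard_normal((9, 9))
k = rng.standard_normal((3, 3))

# Ordinary filter: rotating the input does not simply rotate the output
# (expected to be False for a generic, non-symmetric filter).
plain_equivariant = np.allclose(
    correlate_valid(np.rot90(x), k), np.rot90(correlate_valid(x, k))
)

# Lifted filter bank: rotating the input rotates every output map AND
# cyclically shifts the orientation channels by one step.
feat = lifted_c4_conv(x, k)
feat_rot = lifted_c4_conv(np.rot90(x), k)
expected = np.roll(np.rot90(feat, 1, axes=(1, 2)), 1, axis=0)
lifted_equivariant = np.allclose(feat_rot, expected)

print(plain_equivariant, lifted_equivariant)
```

The cyclic channel shift in the second check is the kind of structure that an orientation-channel switch can later undo; the four channels also share one set of weights, which is a small-scale view of where the parameter savings come from.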

Experimental results

Research questions

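  • RQ1: Can a rotation-equivariant backbone reduce the need for large numbers of orientation-specific parameters in aerial object detectors?
  • RQ2: Can a rotation-invariant RoI Align effectively extract orientation-independent features from a rotation-equivariant backbone?
  • RQ3: How does ReDet compare with strong baselines on the key aerial detection benchmarks (DOTA-v1.0, DOTA-v1.5, HRSC2016)?
  • RQ4: Compared with approaches that do not use rotation data augmentation, how does ReDet affect the trade-off between model size and accuracy?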

Key findings

Method          DOTA-v1.0                  HRSC2016
                AP50    AP75    mAP        AP50    AP75    mAP
baseline        75.62   48.37   46.13      90.18   80.48   68.17
ReDet (Ours)    76.25   50.86   47.11      90.46   89.46   70.41
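  • ReDet reaches 80.10 mAP on DOTA-v1.0, 76.80 mAP on DOTA-v1.5 and 90.46 mAP on HRSC2016, which is 1.2, 3.5 and 2.6 mAP above the previous best results, respectively.
  • ReDet cuts model size by about 60% (313 Mb for the baseline vs. 121 Mb) while delivering competitive to better performance.
  • RiRoI Align outperforms the conventional RRoI Align; in the ablation, orientation interpolation with l=2 gives the best mAP (66.86 mAP).
  • The rotation-equivariant backbone (ReResNet + ReFPN) improves detection performance while greatly reducing the parameter count, especially with the C8 rotation group.
  • Under a similar training schedule, ReDet shows a clear mAP gain over baselines trained with rotation data augmentation, with comparable training time and better parameter efficiency.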
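As a quick sanity check on the figures quoted above, the following snippet recomputes the relative size reduction and the per-metric gains of ReDet over the baseline. All numbers are copied from this summary; the variable names and dictionary layout are illustrative choices only.

```python
# Model sizes as reported in the summary (in "Mb").
baseline_size, redet_size = 313.0, 121.0
size_reduction = (baseline_size - redet_size) / baseline_size

# (AP50, AP75, mAP) per dataset, taken from the results table.
metrics = ("AP50", "AP75", "mAP")
results = {
    "baseline": {
        "DOTA-v1.0": (75.62, 48.37, 46.13),
        "HRSC2016": (90.18, 80.48, 68.17),
    },
    "ReDet": {
        "DOTA-v1.0": (76.25, 50.86, 47.11),
        "HRSC2016": (90.46, 89.46, 70.41),
    },
}

# Absolute gain of ReDet over the baseline for each dataset and metric.
gains = {
    dataset: {
        m: round(redet - base, 2)
        for m, base, redet in zip(
            metrics, results["baseline"][dataset], results["ReDet"][dataset]
        )
    }
    for dataset in results["baseline"]
}

print(f"size reduction: {size_reduction:.1%}")
for dataset, gain in gains.items():
    print(dataset, gain)
```

The size figures work out to a reduction of roughly 61%, consistent with the "about 60%" quoted above, and the largest gain in the table is on AP75 for HRSC2016.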
Figure 2: Model size vs. accuracy (mAP) on DOTA-v1.5. We evaluate RetinaNet OBB [18], Faster R-CNN OBB (FR) [27], Mask R-CNN (Mask) [11] and Hybrid Task Cascade (HTC) [2] with ResNet18 (R18) and ResNet50 (R50) backbones. Note all algorithms are our re-implemented version for DOTA, which is


This review was generated by AI and reviewed by human editors.