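[Paper Explained] Learning RoI Transformer for Detecting Oriented Objects in Aerial Images
This paper proposes the RoI Transformer, a lightweight module that learns rotated RoIs from horizontal RoIs and extracts rotation-invariant features to detect oriented objects in aerial images, achieving state-of-the-art results while keeping inference efficient.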
Object detection in aerial images is an active yet challenging task in computer vision because of the birdview perspective, the highly complex backgrounds, and the variant appearances of objects. Especially when detecting densely packed objects in aerial images, methods relying on horizontal proposals for common object detection often introduce mismatches between the Region of Interests (RoIs) and objects. This leads to the common misalignment between the final object classification confidence and localization accuracy. Although rotated anchors have been used to tackle this problem, the design of them always multiplies the number of anchors and dramatically increases the computational complexity. In this paper, we propose a RoI Transformer to address these problems. More precisely, to improve the quality of region proposals, we first designed a Rotated RoI (RRoI) learner to transform a Horizontal Region of Interest (HRoI) into a Rotated Region of Interest (RRoI). Based on the RRoIs, we then proposed a Rotated Position Sensitive RoI Align (RPS-RoI-Align) module to extract rotation-invariant features from them for boosting subsequent classification and regression. Our RoI Transformer is with light weight and can be easily embedded into detectors for oriented object detection. A simple implementation of the RoI Transformer has achieved state-of-the-art performances on two common and challenging aerial datasets, i.e., DOTA and HRSC2016, with a neglectable reduction to detection speed. Our RoI Transformer exceeds the deformable Position Sensitive RoI pooling when oriented bounding-box annotations are available. Extensive experiments have also validated the flexibility and effectiveness of our RoI Transformer. The results demonstrate that it can be easily integrated with other detector architectures and significantly improve the performances.
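Motivation and Objectives
- Accurately detect oriented, densely packed objects in aerial images, where horizontal RoIs align poorly with the objects.
- Propose a lightweight, end-to-end trainable RoI Transformer that converts HRoIs into RRoIs and extracts rotation-invariant features.
- Reduce computational complexity relative to methods that rely on large sets of rotated anchors, while improving accuracy.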
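Proposed Method
- Introduce an RRoI Learner that converts HRoIs into rotated RoIs through a small fully connected regression head.
- Apply Rotated Position Sensitive RoI Align (RPS-RoI-Align) to pool rotation-invariant features from the RRoIs.
- Use a light-head architecture to keep RoI-level computation efficient.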
- Supervise the RRoI Learner through IoU-based matching between RRoIs and rotated ground truths (RGTs).
- Provide an end-to-end differentiable RoI Transformer that integrates easily with existing detectors.
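The first two steps in the list above can be illustrated with a small sketch. It is not the authors' implementation: the offset parameterization (center shift expressed in the RoI's own rotated frame, log-scaled width and height, angle difference normalized by 2π) is an assumption based on a common rotated-box encoding, and the names `encode_offsets`, `decode_offsets`, and `rotated_bin_centers` are hypothetical. The grid function only shows where a rotated pooling operator would sample; it omits bilinear interpolation and the position-sensitive score maps.

```python
import math
from typing import List, Tuple

# (center x, center y, width, height, angle in radians); an HRoI has angle 0.
Box = Tuple[float, float, float, float, float]
Offsets = Tuple[float, float, float, float, float]


def encode_offsets(roi: Box, gt: Box) -> Offsets:
    """Regression targets of a rotated ground truth relative to an RoI (assumed form)."""
    x, y, w, h, a = roi
    gx, gy, gw, gh, ga = gt
    dx, dy = gx - x, gy - y
    cos_a, sin_a = math.cos(a), math.sin(a)
    t_x = (dx * cos_a + dy * sin_a) / w   # shift along the RoI's local x axis
    t_y = (dy * cos_a - dx * sin_a) / h   # shift along the RoI's local y axis
    t_w = math.log(gw / w)
    t_h = math.log(gh / h)
    t_a = ((ga - a) % (2 * math.pi)) / (2 * math.pi)
    return t_x, t_y, t_w, t_h, t_a


def decode_offsets(roi: Box, t: Offsets) -> Box:
    """Apply predicted offsets to an RoI to obtain a rotated RoI (inverse of encode)."""
    x, y, w, h, a = roi
    t_x, t_y, t_w, t_h, t_a = t
    cos_a, sin_a = math.cos(a), math.sin(a)
    u, v = t_x * w, t_y * h               # shift in the RoI's local frame
    dx = u * cos_a - v * sin_a            # rotate the shift back to image axes
    dy = u * sin_a + v * cos_a
    angle = (a + t_a * 2 * math.pi) % (2 * math.pi)
    return x + dx, y + dy, w * math.exp(t_w), h * math.exp(t_h), angle


def rotated_bin_centers(rroi: Box, k: int) -> List[Tuple[float, float]]:
    """Image coordinates of the centers of a k x k bin grid laid out inside the RRoI."""
    x, y, w, h, a = rroi
    cos_a, sin_a = math.cos(a), math.sin(a)
    centers = []
    for i in range(k):                    # rows along the RRoI's local height
        for j in range(k):                # columns along the RRoI's local width
            lx = (j + 0.5) / k * w - w / 2
            ly = (i + 0.5) / k * h - h / 2
            centers.append((x + lx * cos_a - ly * sin_a,
                            y + lx * sin_a + ly * cos_a))
    return centers


if __name__ == "__main__":
    hroi = (50.0, 40.0, 20.0, 10.0, 0.0)          # horizontal proposal
    gt = (53.0, 42.0, 30.0, 8.0, math.pi / 6)     # rotated ground truth
    rroi = decode_offsets(hroi, encode_offsets(hroi, gt))
    print(rroi)
    print(rotated_bin_centers(rroi, 2))
```

Because the bins are laid out in the RRoI's local frame, the same bin always covers the same part of the object regardless of its orientation in the image, which is the intuition behind pooling rotation-invariant features.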
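Experimental Results
Research Questions
- RQ1: Can a learned transformation from horizontal to rotated RoIs improve alignment with oriented objects in aerial images?
- RQ2: Does Rotated PS RoI Align provide rotation-invariant features that improve classification and localization of oriented objects?
- RQ3: How do the accuracy and efficiency of the RoI Transformer compare with deformable PS RoI pooling and the baseline light-head detector on DOTA and HRSC2016?
Key Findings
| Method | mAP | Training speed | Test speed | Parameters |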
|---|---|---|---|---|
| LR-O | 58.3 | 0.403 s | 0.141 s | 273MB |
| DPSRP | 63.89 | 0.445 s | 0.206 s | 273.2MB |
| RoI Transformer | 67.74 | 0.475 s | 0.17 s | 273MB |
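To make the accuracy and speed trade-off in the table easier to read, the short sketch below computes the mAP gain and the relative change in test time between any two rows. The numbers are copied from the table; the dictionary layout and the helper name `compare` are illustrative choices, and the times are assumed to be seconds, as listed.

```python
# Values copied from the table above (mAP in points, times in seconds).
ROWS = {
    "LR-O": {"mAP": 58.3, "train_s": 0.403, "test_s": 0.141},
    "DPSRP": {"mAP": 63.89, "train_s": 0.445, "test_s": 0.206},
    "RoI Transformer": {"mAP": 67.74, "train_s": 0.475, "test_s": 0.17},
}


def compare(method: str, baseline: str) -> dict:
    """mAP gain (points) and relative test-time change (%) of method vs. baseline."""
    m, b = ROWS[method], ROWS[baseline]
    return {
        "mAP_gain": round(m["mAP"] - b["mAP"], 2),
        "test_time_change_pct": round(100 * (m["test_s"] / b["test_s"] - 1), 1),
    }


if __name__ == "__main__":
    for base in ("LR-O", "DPSRP"):
        print(base, compare("RoI Transformer", base))
```

Read this way, the table row for the RoI Transformer is 9.44 mAP points above LR-O at roughly 21% more test time, and 3.85 points above DPSRP while testing about 17% faster.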
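- The RoI Transformer achieves state-of-the-art or competitive mAP on the DOTA and HRSC2016 datasets.
- Adding the RoI Transformer to the Light-Head OBB baseline improves mAP by up to 4.87 points in the ablation experiments.
- The RoI Transformer handles densely packed and long, thin objects better, with notable gains over prior methods (for example, on ships in DOTA).
- Compared with deformable PS RoI pooling, the RoI Transformer is more accurate, and its regression targets are lighter and better suited to rotated alignment.
- Inference speed and memory remain favorable relative to competing rotated RoI methods (about 0.17 s per image at 1024x1024 on a TITAN X).
- The RoI Transformer can be easily embedded in other detector architectures to improve oriented object detection.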
This explainer was generated by AI and reviewed by a human editor.