QUICK REVIEW

[Paper Digest] Towards Large-Scale Small Object Detection: Survey and Benchmarks

Gong Cheng, Xiang Yuan | arXiv (Cornell University) | Jul 28, 2022

Advanced Neural Network Applications | Cited by 34

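One-Sentence Summary

This paper surveys small object detection (SOD) and introduces two large-scale SOD benchmark datasets, SODA-D for driving scenes and SODA-A for aerial scenes, for evaluating multi-category SOD methods.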

ABSTRACT

With the rise of deep convolutional neural networks, object detection has achieved prominent advances in past years. However, such prosperity could not camouflage the unsatisfactory situation of Small Object Detection (SOD), one of the notoriously challenging tasks in computer vision, owing to the poor visual appearance and noisy representation caused by the intrinsic structure of small targets. In addition, large-scale dataset for benchmarking small object detection methods remains a bottleneck. In this paper, we first conduct a thorough review of small object detection. Then, to catalyze the development of SOD, we construct two large-scale Small Object Detection dAtasets (SODA), SODA-D and SODA-A, which focus on the Driving and Aerial scenarios respectively. SODA-D includes 24828 high-quality traffic images and 278433 instances of nine categories. For SODA-A, we harvest 2513 high resolution aerial images and annotate 872069 instances over nine classes. The proposed datasets, as we know, are the first-ever attempt to large-scale benchmarks with a vast collection of exhaustively annotated instances tailored for multi-category SOD. Finally, we evaluate the performance of mainstream methods on SODA. We expect the released benchmarks could facilitate the development of SOD and spawn more breakthroughs in this field. Datasets and codes are available at: https://shaunyuan22.github.io/SODA.
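The image and instance counts quoted in the abstract imply very different annotation densities for the two datasets. The short sketch below only divides the quoted figures; the dictionary layout and the helper name `instances_per_image` are illustrative choices, not part of the released SODA code.

```python
# Counts taken directly from the abstract.
SODA_D = {"images": 24828, "instances": 278433}
SODA_A = {"images": 2513, "instances": 872069}


def instances_per_image(stats):
    """Average number of annotated instances per image."""
    return stats["instances"] / stats["images"]


if __name__ == "__main__":
    for name, stats in (("SODA-D", SODA_D), ("SODA-A", SODA_A)):
        print(f"{name}: {instances_per_image(stats):.1f} instances per image")
```

By this measure the aerial images are annotated far more densely than the driving images, roughly 347 instances per image against roughly 11.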

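Motivation and Objectives

Review the development of deep-learning-based small object detection across multiple domains.
Identify the challenges specific to small object detection and categorize existing methods.
Introduce two large-scale SODA benchmark datasets, covering driving and aerial scenes, to enable comprehensive evaluation.
Provide baseline evaluations of representative detectors on the SODA benchmarks.
Offer insights to guide future research on small object detection.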

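Proposed Approach

Groups small object detection methods into six categories: sample-oriented, scale-aware, attention-based, feature-imitation, context-modeling, and focus-and-detect.
Discusses data augmentation and optimized label assignment as ways to increase the number of positive samples for small objects.
Describes scale-specific architectures and feature fusion strategies that improve small object representations.
Summarizes attention-based and imitation-based techniques that improve the discrimination and localization of tiny objects.
Presents the construction process, statistics, and annotations of the SODA-D and SODA-A datasets.
Experimentally evaluates mainstream detectors and SOD-specific methods on the proposed benchmarks.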
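To make the feature fusion idea concrete, here is a minimal NumPy sketch of FPN-style top-down fusion, in which coarse, semantically strong maps are upsampled and added to finer maps. It is a simplified illustration rather than any specific method from the survey: it assumes all levels already share the same channel count (a real FPN applies 1x1 lateral convolutions first) and that each level has exactly half the resolution of the previous one. The function names are hypothetical.

```python
import numpy as np


def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)


def top_down_fuse(pyramid):
    """Fuse (C, H, W) maps ordered from finest to coarsest.

    Starting at the coarsest level, each fused map is upsampled and
    added to the next finer input map. Assumes equal channel counts
    and a factor-of-two resolution step between levels.
    """
    fused = [pyramid[-1]]
    for feat in reversed(pyramid[:-1]):
        fused.insert(0, feat + upsample2x(fused[0]))
    return fused


# Toy pyramid: three levels of constant features.
levels = [np.ones((4, 32, 32)), np.ones((4, 16, 16)), np.ones((4, 8, 8))]
outputs = top_down_fuse(levels)
```

The finest output keeps its full resolution while accumulating information from every coarser level, which is why this family of designs is often used when small objects occupy only a few pixels.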

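Experimental Results

Research Questions

RQ1: What are the main challenges specific to small object detection in the driving and aerial domains?
RQ2: How do existing SOD methods perform on large-scale, multi-category small object benchmarks?
RQ3: Which data and architectural strategies improve small object detection performance most effectively?
RQ4: Can the proposed SODA benchmarks drive new progress in SOD research?
RQ5: How do scale, context, and representation affect small object detection across domains?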

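Key Findings

Small objects suffer from information loss, noisy representations, low tolerance to IoU perturbation, and too few positive training samples.
Scale-aware and multi-scale feature fusion methods (such as FPN-like structures, scale-specific detectors, and hierarchical fusion) are critical to improving SOD performance.
Attention-based and feature-imitation methods bring gains but may introduce computational overhead or training difficulties.
The two new large-scale SODA benchmarks (SODA-D and SODA-A) provide exhaustively annotated, multi-category small object data for driving and aerial scenes respectively.
These benchmarks enable in-depth evaluation of detectors across scales and domains, and they are publicly available (datasets and code at https://shaunyuan22.github.io/SODA).
Baseline evaluations on SODA show how effective representative detection methods are and highlight gaps for future SOD research.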
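The low IoU tolerance noted in the findings above can be illustrated with a small calculation: the same localization error of a few pixels costs a small box far more IoU than a large one. The box sizes (16 px and 128 px) and the 4 px diagonal shift are assumptions chosen for illustration, and `shifted_iou` is a hypothetical helper, not something taken from the paper.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted_iou(side, shift):
    """IoU between a square box and a copy shifted diagonally by `shift` pixels."""
    gt = (0, 0, side, side)
    pred = (shift, shift, side + shift, side + shift)
    return iou(gt, pred)


if __name__ == "__main__":
    for side in (16, 128):
        print(f"{side}px box, 4px shift: IoU = {shifted_iou(side, 4):.3f}")
```

Under these assumptions the 16 px box falls below the common 0.5 matching threshold while the 128 px box stays well above it, so an identical prediction error turns a positive match into a negative one only for the small object.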

This digest was generated by AI and reviewed by human editors.