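[Paper Explained] Focal and Efficient IOU Loss for Accurate Bounding Box Regression
This paper proposes the Efficient IOU (EIOU) loss and a regression version of focal loss, then combines them into Focal-EIOU to speed up convergence and improve bounding box localization in object detection.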
In object detection, bounding box regression (BBR) is a crucial step that determines the object localization performance. However, we find that most previous loss functions for BBR have two main drawbacks: (i) Both $\ell_n$-norm and IOU-based loss functions are inefficient to depict the objective of BBR, which leads to slow convergence and inaccurate regression results. (ii) Most of the loss functions ignore the imbalance problem in BBR that the large number of anchor boxes which have small overlaps with the target boxes contribute most to the optimization of BBR. To mitigate the adverse effects caused thereby, we perform thorough studies to exploit the potential of BBR losses in this paper. Firstly, an Efficient Intersection over Union (EIOU) loss is proposed, which explicitly measures the discrepancies of three geometric factors in BBR, i.e., the overlap area, the central point and the side length. After that, we state the Effective Example Mining (EEM) problem and propose a regression version of focal loss to make the regression process focus on high-quality anchor boxes. Finally, the above two parts are combined to obtain a new loss function, namely Focal-EIOU loss. Extensive experiments on both synthetic and real datasets are performed. Notable superiorities on both the convergence speed and the localization accuracy can be achieved over other BBR losses.
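Motivation and Objectives
- Address the slow convergence and inaccurate regression of existing BBR losses by measuring geometric discrepancies more directly.
- Mitigate the imbalance among anchor boxes by designing a regression version of focal loss that emphasizes high-quality anchors.
- Develop the combined Focal-EIOU loss and validate it on both synthetic and real datasets.
- Demonstrate improvements for state-of-the-art detectors on the widely used COCO 2017 dataset.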
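Proposed Method
- Propose the EIOU loss, which decomposes L_EIOU into an IOU loss, a distance component, and an aspect (side-length) component, so that the overlap gap, the center distance, and the width and height differences are each minimized explicitly.
- Design the FocalL1 loss, which reweights regression gradients under the control of the parameters beta and alpha so that high-quality examples are emphasized.
- Combine EIOU with focal weighting into the Focal-EIOU loss, using IOU-based reweighting to focus on informative anchor boxes (L_Focal-EIOU = IOU^gamma * L_EIOU).
- Normalize the weights over the batch to stabilize training (L_Focal-EIOU = sum(W_i * L_EIOU_i) / sum(W_i), with W_i = IOU_i^gamma).
- Evaluate in synthetic settings and on COCO 2017 across several backbones and detectors (Faster R-CNN, Mask R-CNN, RetinaNet, ATSS, PAA, DETR).
- Run ablation experiments to isolate the effects of EIOU, FocalL1, and the reweighting strategy.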
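The EIOU decomposition and the normalized IOU^gamma weighting listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. The following are assumptions made for the sketch: boxes are `(x1, y1, x2, y2)` tuples with positive width and height, the distance and side-length terms are normalized by the smallest enclosing box (as I understand the paper's definition), and the names `eiou_terms`, `focal_eiou`, `eps`, and the fallback for a batch with zero total weight are all hypothetical.

```python
def eiou_terms(pred, target, eps=1e-9):
    """Return (iou, eiou_loss) for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Overlap term: plain IOU.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Smallest box enclosing both boxes; its sides normalize the penalties.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Distance term: squared center distance over squared enclosing diagonal.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    dist = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Side-length term: width and height gaps, each normalized separately.
    side = (pw - tw) ** 2 / (cw * cw + eps) + (ph - th) ** 2 / (ch * ch + eps)

    return iou, (1.0 - iou) + dist + side


def focal_eiou(preds, targets, gamma=0.5):
    """Batch loss sum(W_i * L_EIOU_i) / sum(W_i) with W_i = IOU_i ** gamma."""
    pairs = [eiou_terms(p, t) for p, t in zip(preds, targets)]
    if not pairs:
        return 0.0
    weights = [iou ** gamma for iou, _ in pairs]
    total = sum(weights)
    if total == 0.0:
        # Assumption of this sketch: with no overlapping pair, fall back to
        # the unweighted mean instead of dividing by zero.
        return sum(loss for _, loss in pairs) / len(pairs)
    return sum(w * loss for w, (_, loss) in zip(weights, pairs)) / total
```

With `gamma=0.5`, a pair with IOU 0.81 receives weight 0.9 while a pair with IOU 0.01 receives weight 0.1, so the batch loss is dominated by the better-aligned boxes. In an autograd framework one would also have to decide whether gradients flow through the weights; that choice is outside the scope of this sketch.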
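Experimental Results
Research Questions
- RQ1: Does EIOU outperform other IOU-based losses in convergence speed and localization accuracy?
- RQ2: Can the regression focusing mechanism (EEM) balance the contributions of high-quality and low-quality anchor boxes in BBR?
- RQ3: Does combining FocalL1 with EIOU (Focal-EIOU) bring consistent gains across detectors and backbones on COCO?
- RQ4: How do the hyperparameters (beta, gamma) affect performance and training stability?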
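To make the role of beta in RQ4 concrete, here is a hedged sketch of the FocalL1 loss. The closed forms are reconstructed from my reading of the paper, not from this summary: the gradient magnitude is taken to be g(x) = -alpha * x * ln(beta * x) for 0 < x <= 1 and -alpha * ln(beta) for x > 1, with 1/e <= beta <= 1 and alpha = e * beta so that the gradient peaks at exactly 1. Here x is the absolute regression error. The function names and the default `beta=0.8` are illustrative assumptions.

```python
import math


def _check_beta(beta):
    if not (1.0 / math.e <= beta <= 1.0):
        raise ValueError("beta is assumed to lie in [1/e, 1]")


def focal_l1_grad(x, beta=0.8):
    """Gradient magnitude g(x) for an absolute regression error x >= 0."""
    _check_beta(beta)
    alpha = math.e * beta  # makes the peak of g equal to 1
    if x <= 0.0:
        return 0.0
    if x <= 1.0:
        return -alpha * x * math.log(beta * x)
    return -alpha * math.log(beta)


def focal_l1(x, beta=0.8):
    """Loss obtained by integrating g; the constant keeps it continuous at 1."""
    _check_beta(beta)
    alpha = math.e * beta
    if x <= 0.0:
        return 0.0  # limit of x^2 * ln(beta * x) as x -> 0
    if x <= 1.0:
        return -alpha * x * x * (2.0 * math.log(beta * x) - 1.0) / 4.0
    c = (2.0 * alpha * math.log(beta) + alpha) / 4.0
    return -alpha * math.log(beta) * x + c
```

Under these assumptions the gradient peaks at x = 1 / (e * beta). Raising beta toward 1 moves that peak toward smaller errors and shrinks the constant gradient assigned to large errors (x > 1), which is how the loss shifts emphasis toward higher-quality examples.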
Key Findings
| Method | AP | AP50 | AP75 | AP_S | AP_M | AP_L |
|---|---|---|---|---|---|---|
| Baseline | 35.9 | 55.2 | 38.4 | 21.2 | 39.5 | 48.4 |
| IOU | 36.5 | 55.6 | 38.9 | 20.9 | 40.1 | 48.0 |
| GIOU | 36.5 | 55.6 | 39.0 | 20.7 | 40.2 | 48.2 |
| CIOU | 36.7 | 55.7 | 39.2 | 20.6 | 40.4 | 49.0 |
| FocalL1 | 36.5 | 55.8 | 38.9 | 21.2 | 39.8 | 48.8 |
| EIOU | 37.0 | 55.7 | 39.5 | 20.7 | 40.5 | 49.5 |
| Focal-EIOU (v1) | 36.8 | 55.4 | 39.5 | 20.9 | 40.0 | 49.1 |
| Focal-EIOU | 37.5 | 56.1 | 40.0 | 21.1 | 40.9 | 49.8 |
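- EIOU converges faster and localizes better than the IOU, GIOU, and CIOU losses in both the simulation and the COCO experiments.
- FocalL1 reweighting boosts the gradients coming from high-quality anchor boxes and raises AP from 35.9 (baseline) to 36.5 in the ablation.
- Focal-EIOU consistently improves AP across detectors on COCO 2017; the best row in the table above reaches 37.5 AP (56.1 AP50, 40.0 AP75, 21.1 AP_S, 40.9 AP_M, 49.8 AP_L).
- The ablation table shows Focal-EIOU ahead of the baseline and the other IOU-based losses, with a gain of 1.6 AP over the baseline.
- Focal-EIOU localizes medium and large objects better (AP_M 40.9 vs. 39.5, AP_L 49.8 vs. 48.4 for the baseline), while training remains stable with a suitable gamma (gamma=0.5) and beta.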
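The per-metric gains quoted in these findings can be checked directly against the table. The snippet below copies the table rows verbatim and subtracts the baseline; the names `RESULTS` and `gains_over_baseline` are hypothetical helpers introduced only for this check.

```python
COLUMNS = ("AP", "AP50", "AP75", "AP_S", "AP_M", "AP_L")

# Rows copied from the ablation table above.
RESULTS = {
    "Baseline": (35.9, 55.2, 38.4, 21.2, 39.5, 48.4),
    "IOU": (36.5, 55.6, 38.9, 20.9, 40.1, 48.0),
    "GIOU": (36.5, 55.6, 39.0, 20.7, 40.2, 48.2),
    "CIOU": (36.7, 55.7, 39.2, 20.6, 40.4, 49.0),
    "FocalL1": (36.5, 55.8, 38.9, 21.2, 39.8, 48.8),
    "EIOU": (37.0, 55.7, 39.5, 20.7, 40.5, 49.5),
    "Focal-EIOU (v1)": (36.8, 55.4, 39.5, 20.9, 40.0, 49.1),
    "Focal-EIOU": (37.5, 56.1, 40.0, 21.1, 40.9, 49.8),
}


def gains_over_baseline(results, baseline="Baseline"):
    """Return each method's difference from the baseline row, per column."""
    base = results[baseline]
    return {
        name: tuple(round(v - b, 1) for v, b in zip(row, base))
        for name, row in results.items()
        if name != baseline
    }


gains = gains_over_baseline(RESULTS)
for name, row in gains.items():
    print(name, dict(zip(COLUMNS, row)))
```

The differences show that most of the Focal-EIOU gain comes from AP75, AP_M, and AP_L, while AP_S stays slightly below the baseline (21.1 vs. 21.2).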
This explainer was generated by AI and reviewed by a human editor.