QUICK REVIEW

[Paper Review] Focal Loss for Dense Object Detection

Tsung-Yi Lin, Priya Goyal|arXiv (Cornell University)|Aug 7, 2017

Advanced Neural Network Applications | References: 29 | Cited by: 1,337

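One-Sentence Summary

Introduces the focal loss to address the extreme class imbalance in one-stage detectors, allowing RetinaNet to match the speed of earlier one-stage detectors while surpassing previous state-of-the-art detectors in accuracy.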

ABSTRACT

The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors. Code is at: https://github.com/facebookresearch/Detectron.

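Motivation and Objectives

Identify the main reason why dense one-stage detectors trail two-stage detectors in accuracy.
Propose a loss function that focuses learning on hard examples to counter the foreground-background imbalance.
Design a simple, efficient one-stage detector (RetinaNet) that achieves state-of-the-art accuracy.
Show that the focal loss yields accuracy competitive with or better than existing methods while keeping inference fast.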

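Proposed Method

Define the focal loss as FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), where p_t is the predicted probability of the true class; with gamma = 0 it reduces to alpha-balanced cross entropy.
Use the alpha-balanced variant, in which alpha_t weights foreground and background examples differently, to further address class imbalance.
Initialize the classifier so that it predicts foreground with a small prior probability pi at the start of training, which stabilizes early learning.
Build RetinaNet from an FPN backbone, a classification subnet, and a box regression subnet for one-stage dense detection.
Train end to end with SGD over roughly 100k anchors per image, using focal loss for classification and smooth L1 loss for box regression.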
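
The loss and the prior-based initialization listed above can be sketched in a few lines of plain Python. This is a minimal per-anchor illustration, not the RetinaNet implementation: the function names are hypothetical, the defaults gamma = 2 and alpha = 0.25 are the settings reported as best in this review, and the default pi = 0.01 and the bias formula b = -log((1 - pi) / pi) are assumptions that this review does not state.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def focal_loss(logit: float, target: int,
               gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss for a single anchor and a single class.

    target is 1 for foreground and 0 for background.
    """
    p = sigmoid(logit)
    # p_t is the probability the model assigns to the true label.
    p_t = p if target == 1 else 1.0 - p
    # alpha_t weights foreground by alpha and background by (1 - alpha).
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # Clamp to avoid log(0) for extremely confident wrong predictions.
    p_t = max(p_t, 1e-12)
    # FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def prior_bias(pi: float = 0.01) -> float:
    """Bias b for the final classification layer so that sigmoid(b) == pi.

    The value pi = 0.01 is an assumed default, not taken from this review.
    """
    return -math.log((1.0 - pi) / pi)
```

With gamma = 0 the modulating factor equals 1, so `focal_loss(logit, target, gamma=0.0)` is the alpha-balanced cross entropy. Starting the classifier at `prior_bias()` makes every anchor initially predict a low foreground probability, so the large number of background anchors does not dominate the loss in the first iterations.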

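Experimental Results

Research Questions

RQ1: Can the focal loss mitigate the extreme foreground-background imbalance in dense one-stage detectors?
RQ2: Does focusing learning on hard examples through the focal loss improve COCO AP over the CE and OHEM baselines?
RQ3: Can RetinaNet match or exceed the accuracy of two-stage detectors while keeping its speed?
RQ4: What are the best gamma and alpha settings for the focal loss in this setting?
RQ5: How do the design choices for anchors and the feature pyramid network affect performance when training with the focal loss?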

Key Findings

Method	Backbone	AP	AP_50	AP_75	AP_S	AP_M	AP_L
RetinaNet (ours)	ResNet-101-FPN	39.1	59.1	42.3	21.8	42.7	50.2
RetinaNet (ours)	ResNeXt-101-FPN	40.8	61.1	44.1	24.1	44.2	51.2

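Focal loss gives a clear AP gain over both CE and alpha-balanced CE, with gamma = 2 yielding a substantial improvement.
RetinaNet with ResNet-101-FPN reaches 39.1 AP on COCO test-dev, surpassing previous one-stage methods and many two-stage methods.
Compared with the OHEM baselines, FL-based training achieves higher AP (for example, FL beats the OHEM variants by more than 3 AP points).
A simple one-stage detector trained with focal loss can approach or exceed the accuracy of state-of-the-art two-stage detectors while remaining competitive in speed.
The method is robust across a range of gamma values and anchor configurations, with the best results at around gamma = 2 and alpha = 0.25.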
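
To see why gamma = 2 helps with imbalance, the toy calculation below compares how much of the total loss comes from hard examples under cross entropy and under focal loss. The batch composition (100,000 easy negatives with p_t = 0.99 and 100 hard examples with p_t = 0.30) is a made-up illustration, not data from the paper, and alpha weighting is left out to isolate the effect of gamma.

```python
import math


def cross_entropy(p_t: float) -> float:
    """Cross entropy for an example whose true class has probability p_t."""
    return -math.log(p_t)


def focal(p_t: float, gamma: float = 2.0) -> float:
    """Focal loss without alpha weighting: (1 - p_t)^gamma * CE(p_t)."""
    return (1.0 - p_t) ** gamma * cross_entropy(p_t)


# Hypothetical batch: many easy negatives, few hard examples.
N_EASY, P_EASY = 100_000, 0.99
N_HARD, P_HARD = 100, 0.30


def hard_share(loss_fn) -> float:
    """Fraction of the summed loss contributed by the hard examples."""
    easy_total = N_EASY * loss_fn(P_EASY)
    hard_total = N_HARD * loss_fn(P_HARD)
    return hard_total / (easy_total + hard_total)


ce_share = hard_share(cross_entropy)
fl_share = hard_share(focal)
print(f"hard-example share of loss, CE: {ce_share:.3f}")
print(f"hard-example share of loss, FL (gamma=2): {fl_share:.3f}")
```

Under these assumed numbers the hard examples account for about 11 percent of the cross-entropy total but more than 99 percent of the focal-loss total, because each easy negative is scaled by (1 - 0.99)^2 = 0.0001.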

This review was generated by AI and checked by a human editor.