QUICK REVIEW

[Paper Review] Multiple Instance Detection Network with Online Instance Classifier Refinement

Peng Tang, Xinggang Wang|arXiv (Cornell University)|Apr 1, 2017

Advanced Image and Video Retrieval Techniques | References: 28 | Cited by: 30

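One-Sentence Summary

The paper proposes a multiple instance detection network (MIDN) for weakly supervised object detection and introduces online instance classifier refinement (OICR), training end-to-end with image-level labels only. By iteratively refining the instance classifiers through spatially overlapping proposals and a multi-stream network structure, the method reaches 47% mAP on PASCAL VOC 2007, significantly outperforming the previous state of the art.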

ABSTRACT

Of late, weakly supervised object detection is with great importance in object recognition. Based on deep learning, weakly supervised detectors have achieved many promising results. However, compared with fully supervised detection, it is more challenging to train deep network based detectors in a weakly supervised manner. Here we formulate weakly supervised detection as a Multiple Instance Learning (MIL) problem, where instance classifiers (object detectors) are put into the network as hidden nodes. We propose a novel online instance classifier refinement algorithm to integrate MIL and the instance classifier refinement procedure into a single deep network, and train the network end-to-end with only image-level supervision, i.e., without object location information. More precisely, instance labels inferred from weak supervision are propagated to their spatially overlapped instances to refine instance classifier online. The iterative instance classifier refinement procedure is implemented using multiple streams in deep network, where each stream supervises its latter stream. Weakly supervised object detection experiments are carried out on the challenging PASCAL VOC 2007 and 2012 benchmarks. We obtain 47% mAP on VOC 2007 that significantly outperforms the previous state-of-the-art.

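Motivation and Objectives

Address the challenge of weakly supervised object detection with image-level annotations only, avoiding costly bounding-box annotation.
Improve localization accuracy in an end-to-end deep network by refining the instance classifiers beyond standard weighted pooling.
Enable online, iterative classifier refinement through the spatial overlap between proposals, avoiding slow alternating optimization.
Develop a unified deep network that integrates MIL with classifier refinement for more discriminative localization.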

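Proposed Method

Proposes a multiple instance detection network (MIDN) that treats object proposals as instances and embeds the instance classifiers as hidden nodes in the deep network.
Proposes an online instance classifier refinement (OICR) algorithm that, during training, propagates the labels of high-scoring proposals to other proposals that spatially overlap them.
Adopts a multi-stream network structure in which each stream supervises the next, so the instance classifiers are refined iteratively within end-to-end training.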
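Uses the weighted loss of Eq. (4), in which each proposal's loss term is weighted by the score of the top-scoring proposal its label was propagated from (no ground-truth boxes are available under weak supervision), to stabilize classifier learning.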
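Sets the IoU threshold to 0.5 to identify the spatially overlapping proposals that receive propagated labels; experiments show robustness to small changes in this threshold.
Trains the whole network end-to-end with stochastic gradient descent, with label refinement performed after each forward pass.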
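
The label-propagation step described in the list above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `iou_one_to_many` and `propagate_labels`, the `(x1, y1, x2, y2)` box format, the extra last column for background, and the rule that each proposal takes its weight from the top-scoring proposal it overlaps most are all choices made here for the example.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def propagate_labels(boxes, prev_scores, image_labels, iou_thresh=0.5):
    """Build pseudo labels and loss weights for one refinement stream.

    boxes:        (N, 4) proposals
    prev_scores:  (N, C) per-class proposal scores from the previous stream
    image_labels: (C,) binary image-level labels
    Returns one-hot labels of shape (N, C + 1), with background in the last
    column (an assumption of this sketch), and per-proposal weights (N,).
    """
    n, c = prev_scores.shape
    labels = np.zeros((n, c + 1))
    labels[:, c] = 1.0                      # every proposal starts as background
    weights = np.zeros(n)
    best_iou = np.full(n, -np.inf)          # overlap with the closest top proposal so far

    for cls in np.flatnonzero(image_labels):
        top = int(np.argmax(prev_scores[:, cls]))   # top-scoring proposal for this class
        ious = iou_one_to_many(boxes[top], boxes)
        closer = ious > best_iou
        best_iou[closer] = ious[closer]
        weights[closer] = prev_scores[top, cls]     # weight = confidence of the label source
        positive = closer & (ious > iou_thresh)
        labels[positive] = 0.0
        labels[positive, cls] = 1.0                 # propagate the class label
    return labels, weights

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], dtype=float)
prev_scores = np.array([[0.9, 0.1], [0.2, 0.1], [0.1, 0.05]])
labels, weights = propagate_labels(boxes, prev_scores, np.array([1, 0]))
print(labels)
print(weights)
```

In this toy input the second box overlaps the top-scoring first box with IoU 0.81, so it inherits the class label, while the distant third box stays background. In a multi-stream setup, the scores fed to each stream would come from the stream before it.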

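Experimental Results

Research Questions

RQ1: Does online, iterative instance classifier refinement improve localization accuracy in weakly supervised object detection?
RQ2: How does the spatial overlap between proposals affect label propagation and classifier performance?
RQ3: Can a multi-stream network structure effectively integrate MIL and classifier refinement in a single end-to-end training process?
RQ4: How do the IoU threshold and loss weighting affect the final detection mAP?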

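Key Findings

The proposed OICR method reaches 47% mAP on PASCAL VOC 2007, significantly outperforming the previous state of the art in weakly supervised object detection.
Using the weighted loss during refinement brings a clear performance gain, whereas the unweighted loss brings only a marginal or even negative change.
With the IoU threshold set to 0.5 the method performs well, and results remain stable for thresholds between 0.5 and 0.6.
Combining the predictions of multiple models (OICR-Ens.) raises mAP on VOC 2012 to 38.2%, and further refinement with FRCNN raises it to 42.5% mAP.
The method performs especially well on rigid objects (such as "bicycle", "bus", and "motorbike") but worse on deformable objects (such as "cat", "dog", and "person"), where localization is incomplete and tends to cover only the most representative parts.
Visualizations show that refinement lets the detector progressively cover the whole object, reducing boxes that take in too much or too little of the object region.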
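
To make the weighted-versus-unweighted comparison in the findings concrete, here is a minimal NumPy sketch of a per-proposal cross-entropy loss with an optional weight on each proposal. The function name `refinement_loss`, the array shapes, and the small `eps` added for numerical safety are assumptions of this example; the exact form of the paper's Eq. (4) should be checked against the paper itself.

```python
import numpy as np

def refinement_loss(probs, labels, weights=None, eps=1e-12):
    """Mean cross-entropy over proposals, optionally weighted per proposal.

    probs:   (N, C + 1) softmax outputs of a refinement stream
    labels:  (N, C + 1) one-hot pseudo labels
    weights: (N,) per-proposal weights, or None for the unweighted variant
    """
    per_proposal = -(labels * np.log(probs + eps)).sum(axis=1)
    if weights is not None:
        per_proposal = weights * per_proposal   # down-weight low-confidence pseudo labels
    return per_proposal.mean()

probs = np.array([[0.5, 0.5], [0.25, 0.75]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])
weights = np.array([1.0, 0.0])   # second pseudo label is treated as unreliable

print(refinement_loss(probs, labels))            # both proposals contribute
print(refinement_loss(probs, labels, weights))   # only the first proposal contributes
```

One plausible reading of the finding is that pseudo labels are noisy early in training, so scaling each term by the confidence of its label source limits the damage done by wrong labels.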


This review was generated by AI and reviewed by human editors.