QUICK REVIEW

[Paper Digest] Toward unsupervised, multi-object discovery in large-scale image collections

Huy V. Vo, Patrick Pérez|arXiv (Cornell University)|Jul 6, 2020

Advanced Neural Network Applications | References: 42 | Cited by: 49

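One-sentence summary

The paper proposes an unsupervised pipeline that discovers multiple objects per image in large image collections, combining CNN-based region proposals with a regularized, scalable object discovery framework (rOSD).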

ABSTRACT

This paper addresses the problem of discovering the objects present in a collection of images without any supervision. We build on the optimization approach of Vo et al. (CVPR'19) with several key novelties: (1) We propose a novel saliency-based region proposal algorithm that achieves significantly higher overlap with ground-truth objects than other competitive methods. This procedure leverages off-the-shelf CNN features trained on classification tasks without any bounding box information, but is otherwise unsupervised. (2) We exploit the inherent hierarchical structure of proposals as an effective regularizer for the approach to object discovery of Vo et al., boosting its performance to significantly improve over the state of the art on several standard benchmarks. (3) We adopt a two-stage strategy to select promising proposals using small random sets of images before using the whole image collection to discover the objects it depicts, allowing us to tackle, for the first time (to the best of our knowledge), the discovery of multiple objects in each one of the pictures making up datasets with up to 20,000 images, an over five-fold increase compared to existing methods, and a first step toward true large-scale unsupervised image interpretation.

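Motivation and objectives

Develop an unsupervised method that discovers objects across large image collections without bounding-box supervision.
Improve region proposals by exploiting CNN features trained on an auxiliary classification task.
Introduce a regularized OSD (rOSD) formulation that makes multi-object discovery within each image possible.
Propose a scalable two-stage procedure that applies object discovery to datasets of up to 20,000 images.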

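Proposed method

Generate region proposals directly from CNN feature maps, with no bounding-box annotations, by building a global saliency map and forming proposals around its local maxima.
Introduce regularized OSD (rOSD) by grouping the proposals generated from the same local maximum and constraining the solution to retain at most one region per group.
Scale to large collections with a two-stage strategy: first select promising proposals for each image, then run OSD on the whole collection over the reduced proposal set.
In the large-scale variant, pre-filter image neighbors and run a proxy OSD before optimizing over the full dataset.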
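
To make the proposal and grouping steps concrete, here is a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the saliency map is a normalized channel-wise sum of activations, the proposals for a peak are bounding boxes of the 4-connected region around it at a few relative thresholds (so they are nested and form one group), and the "at most one region per group" rule is applied with a simple greedy pass over precomputed scores rather than inside the OSD optimization. All function names and parameters (`saliency_map`, `local_maxima`, `proposals_for_peak`, `select_one_per_group`, `levels`, `min_value`) are hypothetical.

```python
from collections import deque

import numpy as np


def saliency_map(features):
    """Collapse a (C, H, W) feature map into an (H, W) map in [0, 1].
    Assumption: saliency is the channel-wise sum of activations."""
    s = features.sum(axis=0).astype(float)
    s -= s.min()
    peak = s.max()
    return s / peak if peak > 0 else s


def local_maxima(sal, min_value=0.5):
    """Cells that are >= min_value and strictly above all 8 neighbors."""
    h, w = sal.shape
    padded = np.pad(sal, 1, mode="constant", constant_values=-np.inf)
    is_max = sal >= min_value
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbor = padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
            is_max &= sal > neighbor
    return [(int(r), int(c)) for r, c in zip(*np.nonzero(is_max))]


def proposals_for_peak(sal, peak, levels=(0.9, 0.7, 0.5)):
    """Nested boxes (r0, c0, r1, c1) around one peak: one group.
    Assumption: each box bounds the 4-connected region containing the
    peak whose saliency is at least level * sal[peak]."""
    h, w = sal.shape
    boxes = []
    for level in levels:
        thresh = level * sal[peak]
        seen = {peak}
        queue = deque([peak])
        while queue:  # breadth-first flood fill from the peak
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < h and 0 <= nc < w
                if inside and (nr, nc) not in seen and sal[nr, nc] >= thresh:
                    seen.add((nr, nc))
                    queue.append((nr, nc))
        rows = [r for r, _ in seen]
        cols = [c for _, c in seen]
        boxes.append((min(rows), min(cols), max(rows), max(cols)))
    return boxes


def select_one_per_group(scores, groups, max_regions):
    """Greedy stand-in for the group constraint: take proposals in
    decreasing score order, keeping at most one per group."""
    chosen, used = [], set()
    for idx in np.argsort(scores)[::-1]:
        group = groups[idx]
        if group in used:
            continue
        chosen.append(int(idx))
        used.add(group)
        if len(chosen) == max_regions:
            break
    return chosen
```

In this sketch, every peak returned by `local_maxima` yields one group of nested boxes from `proposals_for_peak`, and `select_one_per_group` then prevents an image from returning several boxes that all describe the same peak. That captures the intuition behind the group constraint, but the actual method enforces it within the discovery optimization.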

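Experimental results

Research questions

RQ1: Do unsupervised CNN-based region proposals outperform conventional unsupervised proposals for object discovery?
RQ2: Does the per-group constraint (at most one region per local-maximum group) improve multi-object discovery?
RQ3: How well does the two-stage large-scale strategy support multi-object discovery in very large image datasets?
RQ4: How do OSD and the proposed rOSD compare with state-of-the-art methods on standard single-object and multi-object discovery benchmarks?
RQ5: How does using pretrained CNN features, without bounding boxes, affect discovery performance?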

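Key findings

CNN-based region proposals outperform off-the-shelf unsupervised proposals for object discovery on several datasets.
rOSD significantly outperforms the original OSD and enables robust multi-object discovery.
The two-stage large-scale approach makes object discovery applicable to datasets of up to 20,000 images while retaining the performance gains.
On OD, VOC_6x2, VOC_all and VOC12, rOSD is competitive with or better than existing methods in the multi-object discovery setting.
On the large-scale datasets, rOSD surpasses competing methods in multi-object colocalization and discovery, with marked gains on VOC_all and VOC12.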

This digest was generated by AI and reviewed by a human editor.