QUICK REVIEW

[Paper Review] Deep Region Hashing for Efficient Large-scale Instance Search from Images

Jingkuan Song, Tao He | arXiv (Cornell University) | Jan 26, 2017

Advanced Image and Video Retrieval Techniques | References: 28 | Cited by: 32

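One-Sentence Summary

This paper proposes Deep Region Hashing (DRH), an end-to-end deep neural network that jointly performs object proposal generation, feature extraction, and binary hash code learning for efficient large-scale instance search. By sharing the full-image convolutional feature map between the region proposal network and the feature extractor, DRH generates region proposals at nearly zero cost. On four benchmark datasets it achieves higher mean average precision (mAP) than state-of-the-art methods while making search nearly 100 times faster.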

ABSTRACT

Instance Search (INS) is a fundamental problem for many applications, while it is more challenging comparing to traditional image search since the relevancy is defined at the instance level. Existing works have demonstrated the success of many complex ensemble systems that are typically conducted by firstly generating object proposals, and then extracting handcrafted and/or CNN features of each proposal for matching. However, object bounding box proposals and feature extraction are often conducted in two separated steps, thus the effectiveness of these methods collapses. Also, due to the large amount of generated proposals, matching speed becomes the bottleneck that limits its application to large-scale datasets. To tackle these issues, in this paper we propose an effective and efficient Deep Region Hashing (DRH) approach for large-scale INS using an image patch as the query. Specifically, DRH is an end-to-end deep neural network which consists of object proposal, feature extraction, and hash code generation. DRH shares full-image convolutional feature map with the region proposal network, thus enabling nearly cost-free region proposals. Also, each high-dimensional, real-valued region features are mapped onto a low-dimensional, compact binary codes for the efficient object region level matching on large-scale dataset. Experimental results on four datasets show that our DRH can achieve even better performance than the state-of-the-arts in terms of MAP, while the efficiency is improved by nearly 100 times.

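Research Motivation and Objectives

Address the inefficiency and degraded performance of existing two-stage instance search pipelines, in which region proposal and feature extraction are carried out separately.
Overcome the computational bottleneck of matching high-dimensional features on large-scale datasets by learning compact binary hash codes.
Enable end-to-end training that jointly optimizes region proposal generation, feature learning, and hash code generation, improving both accuracy and efficiency.
Reach state-of-the-art mAP on large-scale instance retrieval tasks while substantially reducing search time.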

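Proposed Method

DRH is an end-to-end deep neural network that integrates object proposal, feature extraction, and hash code generation into a single architecture.
It shares the full-image convolutional feature map between the region proposal network and the feature extractor, which makes region proposal generation nearly cost-free.
The high-dimensional, real-valued feature of each region is mapped to a low-dimensional, compact binary hash code for efficient similarity search.
The hash code generation layer learns discriminative binary codes that preserve semantic similarity, enabling effective instance-level matching.
The method uses a Siamese-like structure and trains the model in an unsupervised manner, optimizing region localization and hash code quality at the same time.
The framework supports global (gDRH) and local (lDRH) re-ranking strategies, and applies query expansion (QE) to further improve retrieval accuracy.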
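
The region-level matching described above can be illustrated with a minimal NumPy sketch. It is not the DRH implementation: the random `projection` matrix is a hypothetical stand-in for the learned hash layer, the function names `binarize`, `hamming`, and `rank_images` are invented for illustration, and scoring each image by its closest region is an assumption about how region matches could be aggregated to image level.

```python
import numpy as np


def binarize(features, projection):
    """Map real-valued features of shape (n, d) to packed binary codes.

    `projection` has shape (d, bits) and is a hypothetical stand-in for a
    learned hashing layer; `bits` is assumed to be a multiple of 8.
    The result has shape (n, bits // 8) and dtype uint8.
    """
    bits = (features @ projection) > 0      # threshold at zero (sign function)
    return np.packbits(bits, axis=1)        # store 8 bits per byte


def hamming(query_code, db_codes):
    """Hamming distance from one packed code to every row of db_codes."""
    xor = np.bitwise_xor(db_codes, query_code)   # 1-bits mark disagreements
    return np.unpackbits(xor, axis=1).sum(axis=1, dtype=np.int64)


def rank_images(query_code, region_codes, region_image_ids, num_images):
    """Score each image by its closest region, then sort ascending."""
    dists = hamming(query_code, region_codes)
    best = np.full(num_images, np.iinfo(np.int64).max, dtype=np.int64)
    np.minimum.at(best, region_image_ids, dists)  # minimum distance per image
    return np.argsort(best, kind="stable"), best


# Toy data: 6 regions drawn from 3 images, 16-d features, 64-bit codes.
rng = np.random.default_rng(0)
projection = rng.standard_normal((16, 64))
region_feats = rng.standard_normal((6, 16))
region_image_ids = np.array([0, 0, 1, 1, 2, 2])
codes = binarize(region_feats, projection)
order, best = rank_images(codes[3], codes, region_image_ids, 3)
```

Packing the bits keeps each 64-bit code in 8 bytes, and the comparison reduces to an XOR followed by a bit count, which is the property that makes binary codes cheap to match at scale.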

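Experimental Results

Research Questions

RQ1: Can an end-to-end deep learning framework jointly optimize region proposal generation, feature extraction, and hash code generation to improve the efficiency and accuracy of instance search?
RQ2: How does sharing the full-image convolutional features between the region proposal and feature extraction modules affect computational cost and performance?
RQ3: To what extent can the learned binary hash codes reduce search time in large-scale instance search while maintaining or improving retrieval accuracy?
RQ4: How does the proposed DRH method compare with state-of-the-art methods on standard benchmark datasets in terms of mAP and inference speed?
RQ5: Can combining re-ranking and query expansion with deep region hashing further improve retrieval performance?

Key Findings

On the Oxford 105k dataset, DRH achieves a mean average precision (mAP) of 0.825, which is 9.3% higher than the state-of-the-art method Tolias et al. + AML + QE.
On the Paris 106k dataset, DRH achieves an mAP of 0.802, a relative improvement of 9.3% over the baseline method.
With 512-bit hash codes, DRH needs only 3 milliseconds of search time on the Oxford 105k and Paris 106k datasets, a speedup of more than 300 times over the baseline CNN feature method.
Even with 1024-bit hash codes, DRH remains 100 times faster than the baseline method, showing good scalability and efficiency.
Qualitative results show that DRH retrieves the correct instance even when the query image covers only a small region of the target image.
DRH outperforms existing hashing-based methods, which suffer from information loss caused by quantization, and it surpasses non-hashing methods in both speed and accuracy.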
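
For readers unfamiliar with the mAP metric used in these findings, the following is a minimal sketch of the standard non-interpolated definition. It is an illustration only: the function names are hypothetical, each ranked list is assumed to contain every relevant item, and the official Oxford and Paris evaluation protocols (which also handle "junk" images) may compute the value differently.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    `ranked_relevance` is the ranked result list as 0/1 flags, where 1 means
    the item at that rank is relevant. The list is assumed to contain all
    relevant items for the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank   # precision at this cut-off
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked_relevance):
    """Mean of the per-query average precision values."""
    scores = [average_precision(r) for r in all_ranked_relevance]
    return sum(scores) / len(scores) if scores else 0.0
```

A query whose relevant items appear at ranks 1 and 3 gets precision 1/1 at the first hit and 2/3 at the second, so its average precision is the mean of those two values; mAP averages this quantity over all queries.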

This review was generated by AI and checked by human editors.