QUICK REVIEW

[Paper Review] HashNet: Deep Learning to Hash by Continuation

Zhangjie Cao, Mingsheng Long | arXiv (Cornell University) | Feb 2, 2017
Advanced Image and Video Retrieval Techniques | 38 references | 71 citations
TL;DR

HashNet introduces a continuation-based deep learning to hash framework that directly learns exact binary hash codes under imbalanced pairwise supervision, achieving state-of-the-art retrieval on standard benchmarks.

ABSTRACT

Learning to hash has been widely applied to approximate nearest neighbor search for large-scale multimedia retrieval, due to its computation efficiency and retrieval quality. Deep learning to hash, which improves retrieval quality by end-to-end representation learning and hash encoding, has received increasing attention recently. Subject to the ill-posed gradient difficulty in the optimization with sign activations, existing deep learning to hash methods need to first learn continuous representations and then generate binary hash codes in a separated binarization step, which suffer from substantial loss of retrieval quality. This work presents HashNet, a novel deep architecture for deep learning to hash by continuation method with convergence guarantees, which learns exactly binary hash codes from imbalanced similarity data. The key idea is to attack the ill-posed gradient problem in optimizing deep networks with non-smooth binary activations by continuation method, in which we begin from learning an easier network with smoothed activation function and let it evolve during the training, until it eventually goes back to being the original, difficult to optimize, deep network with the sign activation function. Comprehensive empirical evidence shows that HashNet can generate exactly binary hash codes and yield state-of-the-art multimedia retrieval performance on standard benchmarks.

Motivation & Objective

  • Address ill-posed gradient when training sign-activated networks for end-to-end hashing.
  • Mitigate data imbalance in pairwise similarity learning for hashing.
  • Learn exactly binary hash codes without post-binarization loss.
  • Provide convergence guarantees for the continuation-based optimization.
  • Demonstrate superior retrieval performance on standard benchmarks.

Proposed method

  • Use a CNN with a fully-connected hash layer to produce K-dimensional representations.
  • Apply a sign activation to obtain exact binary codes from the hash layer.
  • Adopt a weighted maximum likelihood objective to preserve pairwise similarities under data imbalance.
  • Introduce a continuation strategy that starts with a smoothed tanh activation and gradually sharpens it over training stages until it converges to the sign activation.
  • Define a pairwise logistic likelihood P(s_ij | h_i, h_j) with an adaptive sigmoid to guide learning.
  • Provide convergence results showing stage-to-stage loss stability and SGD-based decrease within stages.
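
The two main ingredients above, the sharpening activation and the weighted pairwise likelihood, can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation: the function names, the `alpha` scale, the inverse-class-frequency weights, the averaging over pairs, and the `beta` schedule are all assumptions made for the example, and no network is trained here.

```python
import math

def smoothed_sign(z, beta):
    """tanh(beta * z): a smooth surrogate that approaches sign(z) as beta grows."""
    return math.tanh(beta * z)

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def weighted_pairwise_loss(codes, sim, alpha=1.0):
    """Weighted negative log-likelihood averaged over all pairs i < j.

    codes: list of K-dimensional code vectors with entries in [-1, 1]
    sim:   sim[i][j] is 1 for a similar pair, 0 for a dissimilar pair
    alpha: scale inside the sigmoid (hypothetical hyperparameter name)
    """
    n = len(codes)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    n_sim = sum(sim[i][j] for i, j in pairs)
    n_dis = len(pairs) - n_sim
    total = 0.0
    for i, j in pairs:
        inner = sum(a * b for a, b in zip(codes[i], codes[j]))
        x = alpha * inner
        # With P(s=1 | h_i, h_j) = sigmoid(x): -log P = log(1 + e^x) - s * x
        nll = softplus(x) - sim[i][j] * x
        # Assumed weighting: rarer pair classes get proportionally larger weights
        count = n_sim if sim[i][j] == 1 else n_dis
        total += (len(pairs) / count) * nll
    return total / len(pairs)

# Toy hash-layer outputs for three items; items 0 and 1 are the only similar pair.
features = [[0.3, -0.2], [0.25, -0.1], [-0.4, 0.5]]
sim = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

# Continuation: the same features pass through ever sharper activations.
for beta in (1.0, 5.0, 50.0):
    codes = [[smoothed_sign(z, beta) for z in row] for row in features]
    print(beta, codes, weighted_pairwise_loss(codes, sim))

# Codes that agree with the similarity labels score a lower loss than codes that do not.
good_codes = [[1, -1], [1, -1], [-1, 1]]
bad_codes = [[1, -1], [-1, 1], [1, -1]]
good_loss = weighted_pairwise_loss(good_codes, sim)
bad_loss = weighted_pairwise_loss(bad_codes, sim)
```

As `beta` grows, every entry of `codes` moves toward +1 or -1, which is why the final stage needs no separate binarization step. In a real training loop each stage would also update the network weights before `beta` is increased.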

Experimental results

Research questions

  • RQ1: Can end-to-end hashing be learned directly with sign activations without a separate binarization step?
  • RQ2: How can one address ill-posed gradients and data imbalance in deep hashing?
  • RQ3: Does a continuation-based optimization improve retrieval quality compared with prior deep hashing methods?
  • RQ4: What is the impact of weighted likelihood and continuation on learned hash codes under imbalanced similarity data?

Key findings

  • HashNet achieves state-of-the-art retrieval performance on ImageNet, NUS-WIDE, and MS COCO across 16–64 bit codes.
  • HashNet shows substantial absolute MAP gains over both shallow hashing baselines (e.g., ITQ, ITQ-CCA) and deep hashing baselines (e.g., DHN) across datasets.
  • Both the weighted maximum likelihood and the continuation strategy contribute large gains: the full HashNet outperforms its ablation variants by notable margins.
  • P@H=2 (precision within Hamming radius 2) is highest for HashNet across datasets, indicating strong ranking with compact codes.
  • t-SNE visualizations indicate more discriminative hash codes from HashNet than DHN, reflecting better category separation in learned codes.
  • Ablation shows continuation and weighting are crucial, with HashNet outperforming variants by up to double-digit MAP gains on some datasets.
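
For readers unfamiliar with the P@H=2 metric mentioned above, the following minimal sketch shows how precision within Hamming radius 2 could be computed for a single query. The function names, the toy 8-bit codes, and the convention of returning 0.0 when no database item falls inside the radius are assumptions for illustration, not details taken from the paper.

```python
def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

def precision_within_radius(query, database, relevant, radius=2):
    """Fraction of database items within `radius` of the query that are relevant.

    relevant: set of database indices that share a label with the query.
    Returns 0.0 if nothing lies inside the Hamming ball (assumed convention).
    """
    hits = [idx for idx, code in enumerate(database)
            if hamming(query, code) <= radius]
    if not hits:
        return 0.0
    return sum(1 for idx in hits if idx in relevant) / len(hits)

# Toy 8-bit codes with entries in {-1, +1}.
query = (1, 1, -1, -1, 1, 1, -1, 1)
database = [
    (1, 1, -1, -1, 1, 1, -1, 1),    # distance 0, relevant
    (1, 1, -1, -1, 1, 1, 1, -1),    # distance 2, relevant
    (-1, 1, -1, -1, 1, 1, -1, 1),   # distance 1, not relevant
    (-1, -1, 1, 1, 1, 1, -1, 1),    # distance 4, relevant but outside the radius
]
relevant = {0, 1, 3}

p_at_h2 = precision_within_radius(query, database, relevant)
print(p_at_h2)  # 2 of the 3 items inside the radius are relevant
```

A benchmark score would average this quantity over all queries. Because only items inside a small Hamming ball are counted, the metric rewards codes that place same-class items within a few bit flips of each other.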

This review was created by AI and reviewed by human editors.