QUICK REVIEW

[Paper Review] HashNet: Deep Learning to Hash by Continuation

Zhangjie Cao, Mingsheng Long | arXiv (Cornell University) | 2017-02-02

Advanced Image and Video Retrieval Techniques | References: 38 | Citations: 71

One-Line Summary

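HashNet introduces a continuation-based deep hashing framework that learns exactly binary hash codes directly under imbalanced pairwise supervision, achieving state-of-the-art retrieval performance on standard benchmarks.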

ABSTRACT

Learning to hash has been widely applied to approximate nearest neighbor search for large-scale multimedia retrieval, due to its computation efficiency and retrieval quality. Deep learning to hash, which improves retrieval quality by end-to-end representation learning and hash encoding, has received increasing attention recently. Subject to the ill-posed gradient difficulty in the optimization with sign activations, existing deep learning to hash methods need to first learn continuous representations and then generate binary hash codes in a separated binarization step, which suffer from substantial loss of retrieval quality. This work presents HashNet, a novel deep architecture for deep learning to hash by continuation method with convergence guarantees, which learns exactly binary hash codes from imbalanced similarity data. The key idea is to attack the ill-posed gradient problem in optimizing deep networks with non-smooth binary activations by continuation method, in which we begin from learning an easier network with smoothed activation function and let it evolve during the training, until it eventually goes back to being the original, difficult to optimize, deep network with the sign activation function. Comprehensive empirical evidence shows that HashNet can generate exactly binary hash codes and yield state-of-the-art multimedia retrieval performance on standard benchmarks.

Motivation and Objectives

Address ill-posed gradient when training sign-activated networks for end-to-end hashing.
Mitigate data imbalance in pairwise similarity learning for hashing.
Learn exactly binary hash codes without post-binarization loss.
Provide convergence guarantees for the continuation-based optimization.
Demonstrate superior retrieval performance on standard benchmarks.
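
To make the first objective concrete, the following is a minimal Python sketch (not taken from the paper's code) of why sign activations are hard to train through. The sign function has zero derivative everywhere except at the origin, so no gradient reaches the layers below it. The smoothed surrogate tanh(beta * z) has a usable gradient for small beta and approaches sign(z) as beta grows. The function names, the sample input, and the convention sign(0) = +1 are assumptions made for illustration.

```python
import math

def sign(z: float) -> float:
    """Sign activation. Mapping z == 0 to +1 is an assumed convention."""
    return 1.0 if z >= 0 else -1.0

def smoothed(z: float, beta: float) -> float:
    """Smoothed surrogate tanh(beta * z); tends to sign(z) as beta grows."""
    return math.tanh(beta * z)

def smoothed_grad(z: float, beta: float) -> float:
    """Derivative of tanh(beta * z) w.r.t. z: beta * (1 - tanh(beta * z)^2)."""
    t = math.tanh(beta * z)
    return beta * (1.0 - t * t)

if __name__ == "__main__":
    z = 0.5  # an arbitrary pre-activation value, chosen for illustration
    for beta in (1.0, 10.0, 100.0):
        gap = abs(sign(z) - smoothed(z, beta))
        print(f"beta={beta:6.1f}  tanh={smoothed(z, beta):.6f}  "
              f"gap_to_sign={gap:.2e}  grad={smoothed_grad(z, beta):.2e}")
```

As beta increases, the gap to the sign output shrinks while the gradient away from zero vanishes. That trade-off is one way to see why a gradual schedule, rather than a large beta from the start, is attractive.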

Proposed Method

Use a CNN with a fully-connected hash layer to produce K-dimensional representations.
Apply a sign activation to obtain exact binary codes from the hash layer.
Adopt a weighted maximum likelihood objective to preserve pairwise similarities under data imbalance.
Introduce a continuation strategy that starts with a smoothed tanh activation and gradually increases non-smoothness to converge to sign activation.
Define a pairwise logistic likelihood P(sij|hi,hj) with an adaptive sigmoid to guide learning.
Provide convergence results showing stage-to-stage loss stability and SGD-based decrease within stages.
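
The steps above can be sketched in plain Python. This is an illustrative reading of the method under stated assumptions, not the authors' implementation. The assumptions are: pair weights of |S|/|S_1| for similar pairs and |S|/|S_0| for dissimilar pairs (one common inverse-frequency choice), an adaptive sigmoid of the form 1 / (1 + exp(-alpha * x)), a hand-picked beta schedule, and toy pre-activation values standing in for the CNN output. All function and variable names are hypothetical.

```python
import math

def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def pair_weights(sims):
    """Assumed weighting: |S|/|S_1| for similar pairs, |S|/|S_0| otherwise.
    Counts are clamped to 1 to avoid division by zero on one-class batches."""
    n = len(sims)
    n_pos = max(sum(sims), 1)
    n_neg = max(n - sum(sims), 1)
    return [n / n_pos if s == 1 else n / n_neg for s in sims]

def weighted_pairwise_loss(z, pairs, sims, beta, alpha=1.0):
    """Weighted negative log-likelihood of pairwise similarity labels.

    z     : list of K-dimensional pre-activation vectors (hash-layer outputs)
    pairs : list of (i, j) index pairs into z
    sims  : list of 0/1 similarity labels, one per pair
    beta  : continuation parameter; codes are tanh(beta * z)
    alpha : scale of the adaptive sigmoid (value here is an assumption)
    """
    h = [[math.tanh(beta * v) for v in row] for row in z]
    total = 0.0
    for (i, j), s, w in zip(pairs, sims, pair_weights(sims)):
        x = alpha * sum(a * b for a, b in zip(h[i], h[j]))
        # -log sigmoid(x) if s == 1, -log(1 - sigmoid(x)) if s == 0
        total += w * (softplus(x) - s * x)
    return total / len(pairs)

# Toy data: items 0 and 1 are similar, item 2 is dissimilar to both.
z = [[1.2, -0.7, 0.3, 2.0], [0.9, -1.1, 0.5, 1.4], [-1.0, 0.8, -0.2, -1.6]]
pairs = [(0, 1), (0, 2), (1, 2)]
sims = [1, 0, 0]

if __name__ == "__main__":
    # In training, each stage would run SGD at a fixed beta before increasing it;
    # here only the loss at each stage is evaluated.
    for beta in (1.0, 10.0, 100.0):
        print(f"beta={beta:6.1f}  loss={weighted_pairwise_loss(z, pairs, sims, beta):.4f}")
```

With a sufficiently large beta the codes in this toy example saturate to exactly ±1, which is the sense in which the smoothed network "goes back to" the sign-activated one.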

Experimental Results

Research Questions

RQ1: Can end-to-end hashing be learned directly with sign activations without a separate binarization step?
RQ2: How can one address ill-posed gradients and data imbalance in deep hashing?
RQ3: Does a continuation-based optimization improve retrieval quality compared with prior deep hashing methods?
RQ4: What is the impact of weighted likelihood and continuation on learned hash codes under imbalanced similarity data?

Key Results

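HashNet achieves state-of-the-art retrieval performance on ImageNet, NUS-WIDE, and MS COCO for code lengths of 16 to 64 bits.
HashNet shows substantial MAP gains over both shallow and deep hashing baselines, for example large absolute MAP improvements over ITQ/ITQ-CCA and DHN across the datasets.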
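The weighted maximum likelihood objective and the continuation approach yield large gains; for example, the HashNet-C and HashNet- continuation variants outperform the other alternatives by a noticeable margin.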
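HashNet achieves the highest P@H=2 (precision within Hamming radius 2) on all datasets, indicating strong ranking with compact codes.
t-SNE visualizations indicate that HashNet produces more discriminative hash codes than DHN, reflecting better category separation in the learned codes.
Ablation experiments show that continuation and weighting are critical; on some datasets HashNet leads its variants by double-digit MAP gains.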
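
For readers unfamiliar with the P@H=2 metric mentioned above, here is a minimal sketch of how precision within Hamming radius 2 could be computed for a single query. The function names, the toy codes and labels, and the choice to score a query with no retrieved items as 0.0 are assumptions for illustration; the reported figure would be an average over many queries.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def precision_within_radius(query_code, query_label, db_codes, db_labels, radius=2):
    """Fraction of database items within the Hamming radius that share the query label.
    Returns 0.0 when nothing falls inside the radius (an assumed convention)."""
    retrieved = [lbl for code, lbl in zip(db_codes, db_labels)
                 if hamming(query_code, code) <= radius]
    if not retrieved:
        return 0.0
    return sum(lbl == query_label for lbl in retrieved) / len(retrieved)

# Toy 8-bit codes in {-1, +1}; labels are hypothetical.
query = [1, -1, 1, 1, -1, 1, -1, -1]
db_codes = [
    [1, -1, 1, 1, -1, 1, -1, -1],    # distance 0
    [1, -1, 1, 1, -1, 1, -1, 1],     # distance 1
    [-1, 1, 1, 1, -1, 1, -1, -1],    # distance 2
    [-1, 1, -1, -1, 1, 1, -1, -1],   # distance 5, outside the radius
]
db_labels = ["cat", "cat", "dog", "dog"]

if __name__ == "__main__":
    p = precision_within_radius(query, "cat", db_codes, db_labels)
    print(f"P@H=2 for this query: {p:.3f}")
```

Because only items within two bit flips of the query are retrieved, this metric rewards codes that place same-category items in nearly identical buckets, which is why it is informative for compact codes.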


This review was generated by AI and reviewed by a human editor.