QUICK REVIEW

[Paper Review] Whitening Sentence Representations for Better Semantics and Faster Retrieval

Jianlin Su, Jiarun Cao|arXiv (Cornell University)|Mar 29, 2021
Topic Modeling | References: 16 | Citations: 204
One-Line Summary

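This paper shows that whitening sentence representations from BERT-based models makes the embedding space isotropic, improves performance on semantic similarity tasks, reduces embedding dimensionality for faster retrieval, and often outperforms the BERT-flow baseline.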

ABSTRACT

Pre-training models such as BERT have achieved great success in many natural language processing tasks. However, how to obtain better sentence representation through these pre-training models is still worthy to exploit. Previous work has shown that the anisotropy problem is an critical bottleneck for BERT-based sentence representation which hinders the model to fully utilize the underlying semantic features. Therefore, some attempts of boosting the isotropy of sentence distribution, such as flow-based model, have been applied to sentence representations and achieved some improvement. In this paper, we find that the whitening operation in traditional machine learning can similarly enhance the isotropy of sentence representations and achieve competitive results. Furthermore, the whitening technique is also capable of reducing the dimensionality of the sentence representation. Our experimental results show that it can not only achieve promising performance but also significantly reduce the storage cost and accelerate the model retrieval speed.

Motivation and Objectives

  • Investigate the anisotropy problem in BERT-based sentence embeddings and its impact on semantic similarity tasks.
  • Propose a whitening post-processing method to transform sentence embeddings to a standard orthogonal basis.
  • Explore dimensionality reduction (k) during whitening to balance performance and storage/speed benefits.
  • Evaluate the method on multiple semantic textual similarity (STS) benchmarks, both with and without NLI supervision.

Proposed Method

  • Apply whitening to a set of sentence embeddings: center to zero mean and transform via W where W^T Σ W = I, with Σ the covariance of embeddings.
  • Compute whitening matrix W from Σ using SVD: Σ = U Λ U^T and W = U sqrt(Λ^{-1}).
  • Optionally reduce dimensionality by keeping only the first k columns of W, enabling Whitening-k (PCA-like reduction).
  • Evaluate performance using cosine similarity on STS benchmarks with and without NLI supervision.
  • Compare against BERT-flow and SBERT baselines to assess isotropy improvement and retrieval efficiency.
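
The whitening steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' released code: the function names `compute_whitening` and `apply_whitening` are hypothetical, the embeddings are assumed to arrive as an `(n, d)` array with more sentences than dimensions, and the covariance is estimated with `np.cov`.

```python
import numpy as np

def compute_whitening(embeddings, k=None):
    """Estimate the mean mu and whitening matrix W from an (n, d) array.

    Hypothetical helper: W = U * sqrt(Lambda^{-1}) with Sigma = U Lambda U^T,
    optionally truncated to its first k columns (Whitening-k).
    """
    mu = embeddings.mean(axis=0, keepdims=True)   # (1, d) mean vector
    cov = np.cov(embeddings.T)                    # (d, d) covariance Sigma
    u, s, _ = np.linalg.svd(cov)                  # s is sorted in descending order
    W = u @ np.diag(1.0 / np.sqrt(s))             # satisfies W^T Sigma W = I
    if k is not None:
        W = W[:, :k]                              # keep the k largest-variance directions
    return mu, W

def apply_whitening(embeddings, mu, W):
    """Center the embeddings and project them with W."""
    return (embeddings - mu) @ W
```

In use, `mu` and `W` would be estimated once from a set of sentence embeddings and then applied to both queries and candidates before computing cosine similarity. Because the singular values come back in descending order, truncating to the first `k` columns keeps the directions with the largest variance, which is why the reduction behaves like PCA.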

Experimental Results

Research Questions

  • RQ1: Can whitening transform BERT-based sentence embeddings into an isotropic space to improve cosine-based similarity measurements?
  • RQ2: Does whitening (with or without dimensionality reduction) improve STS task performance compared to flow-based baselines?
  • RQ3: What is the effect of embedding dimensionality k on performance and retrieval efficiency?
  • RQ4: Do whitening-based embeddings maintain gains under supervised (NLI) training settings?

Key Findings

  • Whitening improves Spearman correlation over BERT-flow on several STS benchmarks, reaching near state-of-the-art results on multiple datasets with 256- or 384-dimensional embeddings.
  • Dimensionality reduction (Whitening-k) often maintains or improves performance while substantially reducing storage and speeding up retrieval.
  • Using whitening with NLI supervision yields competitive or superior results to flow-based methods across several datasets.
  • Performance gains are observed for both BERT-base and BERT-large configurations across various STS tasks.
  • Whitening provides a simpler alternative to flow-based approaches for isotropy and compact representations.
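
To make the storage and retrieval claim concrete, the following sketch shows brute-force cosine retrieval and a simple storage estimate. It is illustrative only: the helper names `top_k_cosine` and `storage_bytes` are hypothetical, vectors are assumed to be stored as 4-byte floats, and the one-million-sentence corpus in the usage note is an invented example, not a figure from the paper.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row of a 2-D array to unit length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def top_k_cosine(query, corpus, top_k=3):
    """Return indices of the top_k corpus rows most similar to query (cosine)."""
    q = l2_normalize(query[None, :])[0]   # unit-length query vector
    scores = l2_normalize(corpus) @ q     # one cosine score per corpus row
    return np.argsort(-scores)[:top_k]    # highest score first

def storage_bytes(n_vectors, dim, bytes_per_value=4):
    """Bytes needed to store n_vectors dense vectors (float32 assumed)."""
    return n_vectors * dim * bytes_per_value
```

For example, under these assumptions one million 768-dimensional BERT-base vectors take `storage_bytes(1_000_000, 768)` = 3,072,000,000 bytes, while the same corpus reduced to 256 dimensions takes 1,024,000,000 bytes, a threefold saving. The matrix-vector product in `top_k_cosine` shrinks by the same factor, since its cost is proportional to the embedding dimension.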

This review was generated by AI and checked by a human editor.