QUICK REVIEW

[Paper Review] Offline bilingual word vectors, orthogonal transformations and the inverted softmax

Samuel Smith, David H. P. Turban|arXiv (Cornell University)|Feb 13, 2017
Natural Language Processing Techniques|Cited by 263
One-sentence summary

The paper proves that offline bilingual word mappings should be orthogonal, retrieves translations using SVD-based alignment plus an inverted softmax, achieves a marked improvement over prior offline methods, and enables pseudo-dictionary and sentence-level translation without expert bilingual signals.

ABSTRACT

Usually bilingual word vectors are trained "online". Mikolov et al. showed they can also be found "offline", whereby two pre-trained embeddings are aligned with a linear transformation, using dictionaries compiled from expert knowledge. In this work, we prove that the linear transformation between two spaces should be orthogonal. This transformation can be obtained using the singular value decomposition. We introduce a novel "inverted softmax" for identifying translation pairs, with which we improve the precision @1 of Mikolov's original mapping from 34% to 43%, when translating a test set composed of both common and rare English words into Italian. Orthogonal transformations are more robust to noise, enabling us to learn the transformation without expert bilingual signal by constructing a "pseudo-dictionary" from the identical character strings which appear in both languages, achieving 40% precision on the same test set. Finally, we extend our method to retrieve the true translations of English sentences from a corpus of 200k Italian sentences with a precision @1 of 68%.

Motivation and objectives

  • Motivate the use of offline bilingual word representations to enable cross-language transfer without joint training.
  • Propose and justify an orthogonal linear mapping via SVD for aligning monolingual word vectors across languages.
  • Introduce the inverted softmax to mitigate hubness and improve translation precision.
  • Demonstrate robustness through pseudo-dictionaries built from identical strings and through sentence-level translation using aligned corpora.

Proposed method

  • Prove that the self-consistent linear mapping between word-vector spaces should be orthogonal, and obtain the orthogonal map O via SVD from a bilingual dictionary (Y_D^T X_D = U Σ V^T, O = U V^T).
  • Compute cross-language similarity with S = Y U V^T X^T and apply V^T to the source and U^T to the target languages.
  • Introduce the inverted softmax P_{j→i} = e^{β S_{ij}} / (α_j Σ_n e^{β S_{in}}), which addresses hubness by normalizing over source words n (α_j then normalizes the result over target words i); the denominator can optionally be approximated with a subset of sampled source words.
  • Construct pseudo-dictionaries from identical character strings across languages, and use aligned sentences (Europarl) as weak dictionaries.
  • Demonstrate that the SVD-based orthogonal mapping, combined with inverted softmax and dimensionality reduction, yields higher translation precision than prior offline methods.
  • Show that sentence vectors formed by simple sums and normalisation can retrieve translations, and that retrieval from a corpus of 200k Italian sentences reaches a sentence-level precision @1 of 68%.
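
The alignment and retrieval steps listed above can be sketched in NumPy. This is a minimal illustration on synthetic data, not the authors' code: the function names, the toy vectors, and the hand-picked value of β are all assumptions made for the example. Rows of X are source-word vectors, rows of Y are target-word vectors, and the dictionary rows X_D, Y_D are paired by index.

```python
import numpy as np

def normalize_rows(M):
    """Scale each row to unit length so dot products are cosine similarities."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def learn_orthogonal_map(X_D, Y_D):
    """SVD of Y_D^T X_D = U Sigma V^T; the orthogonal map is O = U V^T."""
    U, _, Vt = np.linalg.svd(Y_D.T @ X_D)
    return U, Vt

def similarity(X, Y, U, Vt):
    """S = Y U V^T X^T: apply V^T to source rows and U^T to target rows."""
    return (Y @ U) @ (X @ Vt.T).T

def inverted_softmax(S, beta):
    """P[i, j] = exp(beta S_ij) / (alpha_j * sum_n exp(beta S_in)).

    Rows index target words i, columns index source words j.
    """
    # Subtracting each row's maximum leaves the ratio unchanged but avoids overflow.
    E = np.exp(beta * (S - S.max(axis=1, keepdims=True)))
    P = E / E.sum(axis=1, keepdims=True)      # normalize over source words n
    return P / P.sum(axis=0, keepdims=True)   # alpha_j: each column sums to 1

# Toy check (assumed setup): the "target language" is an exact rotation of the source.
rng = np.random.default_rng(0)
n_words, dim = 50, 20
X = normalize_rows(rng.standard_normal((n_words, dim)))
Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))  # hidden orthogonal map
Y = X @ Q.T                                           # y_i = Q x_i

U, Vt = learn_orthogonal_map(X, Y)
O = U @ Vt
S = similarity(X, Y, U, Vt)
P = inverted_softmax(S, beta=30.0)   # beta chosen by hand for this toy example
predicted = P.argmax(axis=0)         # best target index for each source word
```

In this noiseless setting the recovered O matches the hidden rotation Q. With real embeddings the dictionary is noisy, so O is only the best orthogonal fit and β would need to be tuned rather than fixed by hand.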

Experimental results

Research questions

  • RQ1: Can an orthogonal linear transformation learned from bilingual data align word vectors across languages for offline bilingual word representations?
  • RQ2: Does the inverted softmax reduce hubness and improve word-level translation accuracy compared to the standard softmax or nearest-neighbor baselines?
  • RQ3: How robust is the orthogonal mapping when using weak/pseudo dictionaries such as identical strings or aligned sentences instead of expert bilingual dictionaries?
  • RQ4: To what extent can simple sentence-level representations enable cross-lingual sentence translation and retrieval without complex models?

Key findings

  • Orthogonal mappings are optimal for self-consistent cross-language vector alignment and can be computed efficiently via a single SVD.
  • Inverted softmax substantially improves translation precision over Mikolov’s original mapping (English→Italian 34%→43%; Italian→English 25%→37%).
  • Using identical character-string pseudo-dictionaries (no expert bilingual signal) yields translation precision of ~40% (English→Italian) and ~34% (Italian→English), exceeding Mikolov’s original mapping trained on an expert dictionary (34% and 25% respectively).
  • Sentence-level methods using simple sum-and-normalize vectors plus an SVD-derived alignment achieve translation precision comparable to word-level approaches (around 43% for English→Italian) and enable sentence retrieval with 68% precision on 200k candidate Italian sentences.
  • Using Europarl-aligned sentences as a phrase dictionary for SVD alignment achieves ~42.8% precision @1 for English→Italian and ~37.5% for Italian→English, which is competitive with online methods.
  • The approach demonstrates robust cross-language alignment even with weaker resources, and allows retrieval of true translations from large candidate sets.
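
The two weak-supervision ideas in these findings are simple enough to sketch. The following is an illustrative example under stated assumptions, not the paper's implementation: the helper names, the tiny vocabularies, and the random vectors are all hypothetical. A pseudo-dictionary pairs words whose character strings are identical in both vocabularies, and a sentence vector is the normalised sum of its word vectors.

```python
import numpy as np

def pseudo_dictionary(src_vocab, tgt_vocab):
    """Return (source_index, target_index) pairs for strings found in both vocabularies."""
    tgt_index = {word: i for i, word in enumerate(tgt_vocab)}
    return [(i, tgt_index[word]) for i, word in enumerate(src_vocab) if word in tgt_index]

def sentence_vector(tokens, vocab_index, vectors):
    """Sum the vectors of in-vocabulary tokens and scale the result to unit length."""
    rows = [vocab_index[t] for t in tokens if t in vocab_index]
    if not rows:
        return np.zeros(vectors.shape[1])  # no known tokens: return a zero vector
    total = vectors[rows].sum(axis=0)
    return total / np.linalg.norm(total)

# Hypothetical toy vocabularies; real ones would come from pre-trained embeddings.
en_vocab = ["the", "pizza", "dog", "internet", "euro"]
it_vocab = ["il", "cane", "pizza", "euro", "internet"]
pairs = pseudo_dictionary(en_vocab, it_vocab)

# The paired rows would play the role of the dictionary matrices X_D and Y_D.
rng = np.random.default_rng(0)
en_vectors = rng.standard_normal((len(en_vocab), 4))
src_rows = [i for i, _ in pairs]
X_D = en_vectors[src_rows]

en_index = {word: i for i, word in enumerate(en_vocab)}
v = sentence_vector(["the", "dog", "barks"], en_index, en_vectors)  # "barks" is skipped
```

Identical strings are a noisy signal, since some shared strings are not translations of each other, which is why the robustness of the orthogonal fit matters in this setting.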

This review was generated by AI and checked by a human editor.