QUICK REVIEW

[Paper Review] XNLI: Evaluating Cross-lingual Sentence Representations

Alexis Conneau, Guillaume Lample|arXiv (Cornell University)|Sep 13, 2018

Topic Modeling | References: 49 | Cited by: 152

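One-line summary

XNLI extends the development and test sets of MultiNLI to 15 languages to evaluate cross-lingual sentence representations and multilingual transfer. It compares machine-translation baselines, multilingual sentence encoders, and alignment-based methods.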

ABSTRACT

State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models. These models are generally trained on data in a single language (usually English), and cannot be directly used beyond that language. Since collecting data in every language is not realistic, there has been a growing interest in cross-lingual language understanding (XLU) and low-resource cross-language transfer. In this work, we construct an evaluation set for XLU by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus (MultiNLI) to 15 languages, including low-resource languages such as Swahili and Urdu. We hope that our dataset, dubbed XNLI, will catalyze research in cross-lingual sentence understanding by providing an informative standard evaluation task. In addition, we provide several baselines for multilingual sentence understanding, including two based on machine translation systems, and two that use parallel data to train aligned multilingual bag-of-words and LSTM encoders. We find that XNLI represents a practical and challenging evaluation suite, and that directly translating the test data yields the best performance among available baselines.

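Motivation and objectives

Define a large-scale cross-lingual natural language inference (NLI) benchmark spanning 15 languages, including low-resource ones.
Evaluate translation baselines and multilingual sentence encoders for cross-lingual transfer in NLI.
Propose and evaluate alignment-based multilingual sentence embeddings for transferring English NLI models to other languages.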

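Proposed method

Extend English NLI data to 15 languages by having the premises and hypotheses professionally translated.
Evaluate two translation baselines: TRANSLATE TRAIN (translate the English training data into the target language and train a classifier on it) and TRANSLATE TEST (translate the target-language test data into English and apply the English classifier).
Evaluate multilingual sentence encoders, X-CBOW (CBOW) and X-BiLSTM (BiLSTM), trained with an alignment loss.
Propose an alignment loss, L_align, that uses parallel data to bring English and target-language sentence embeddings together.
Compare against baselines that pair a classifier trained on English with multilingual encoders.
Train the alignment on parallel corpora (e.g., UN, Europarl, OpenSubtitles, IIT Bombay).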
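
The review names the alignment loss L_align and mentions negative sampling, but does not reproduce the formula. The sketch below assumes one common contrastive form: pull a sentence and its translation together, and push each away from an unrelated (negative) sentence in the other language. The function names, the choice of Euclidean distance, and the weight `lam=0.25` are assumptions made for illustration and should be checked against the paper before reuse.

```python
import math


def l2_dist(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def align_loss(x, y, x_neg, y_neg, lam=0.25):
    """Contrastive alignment loss for one parallel sentence pair (assumed form).

    x, y         : embeddings of an English sentence and its translation
    x_neg, y_neg : embeddings of unrelated sentences used as negatives
    lam          : weight of the negative terms (assumed value)
    """
    positive = l2_dist(x, y)                            # should become small
    negative = l2_dist(x_neg, y) + l2_dist(x, y_neg)    # should become large
    return positive - lam * negative


# Toy 2-d embeddings: a well-aligned translation vs. a poorly aligned one.
en = [1.0, 0.0]
fr_good = [1.0, 0.0]
fr_bad = [0.0, 1.0]
neg_en = [-1.0, 0.0]
neg_fr = [0.0, -1.0]

loss_good = align_loss(en, fr_good, neg_en, neg_fr)
loss_bad = align_loss(en, fr_bad, neg_en, neg_fr)
print(loss_good < loss_bad)
```

In this toy setup the well-aligned pair receives the lower loss, which is the behaviour an alignment objective of this kind is meant to reward. In practice the embeddings would come from the target-language encoder being trained, with the English encoder typically held fixed so the English-trained classifier can be reused.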


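Experimental results

Research questions

RQ1: How well do translation-based approaches perform on NLI across 15 languages?
RQ2: Can multilingual sentence encoders, aligned to English with a simple alignment loss, transfer NLI from English to other languages without any translation at inference time?
RQ3: How do the alignment loss and negative sampling affect cross-lingual transfer performance?
RQ4: How do low-resource languages (Urdu, Swahili) fare under translation-based and alignment-based cross-lingual NLI?
RQ5: What are the practical trade-offs between translation baselines and multilingual encoders at deployment time?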

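Key findings

The TRANSLATE TEST baseline achieves the best cross-lingual performance among the baselines.
On XNLI transfer, multilingual sentence encoders are competitive with the TRANSLATE TRAIN baseline but trail TRANSLATE TEST by up to a few points, depending on the language.
The BiLSTM encoder (BiLSTM-max) outperforms the CBOW baseline across languages.
Alignment-based multilingual embeddings are promising: improvements in the alignment loss correlate with higher XNLI accuracy.
For Urdu and Swahili, parallel data is limited, which constrains the gains from alignment-based methods and highlights the impact of resource availability.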
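
The contrast between the CBOW and BiLSTM-max encoders comes down partly to how per-token vectors are reduced to a single sentence vector. The sketch below shows only that pooling step, assuming CBOW averages its word vectors and "BiLSTM-max" takes an element-wise maximum over hidden states. The recurrent network itself is omitted, and the toy vectors and function names are hypothetical.

```python
def mean_pool(vectors):
    """CBOW-style pooling: average each dimension over all tokens."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def max_pool(vectors):
    """Max pooling: keep the largest value per dimension over all tokens."""
    dim = len(vectors[0])
    return [max(v[i] for v in vectors) for i in range(dim)]


# Three toy token vectors standing in for word embeddings or BiLSTM states.
token_vectors = [
    [0.2, -1.0],
    [0.8, 0.5],
    [-0.4, 0.1],
]

sentence_mean = mean_pool(token_vectors)
sentence_max = max_pool(token_vectors)
print(sentence_mean, sentence_max)
```

Either pooled vector has a fixed size regardless of sentence length, which is what lets a single classifier trained on English sentence vectors be applied to aligned vectors from other languages.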


This review was generated by AI and checked by a human editor.