QUICK REVIEW
[Paper Review] Does Neural Machine Translation Benefit from Larger Context?
Sébastien Jean, Stanislas Lauly|arXiv (Cornell University)|Apr 17, 2017
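Natural Language Processing Techniques | References: 2 | Citations: 128
One-line summary
This paper adds a larger-context encoder and attention mechanism to NMT so that the model can draw on surrounding sentences. On small corpora this improves BLEU, RIBES, and pronoun prediction, but the gains largely disappear with large training data.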
ABSTRACT
We propose a neural machine translation architecture that models the surrounding text in addition to the source sentence. These models lead to better performance, both in terms of general translation quality and pronoun prediction, when trained on small corpora, although this improvement largely disappears when trained with a larger corpus. We also discover that attention-based neural machine translation is well suited for pronoun prediction and compares favorably with other approaches that were specifically designed for this task.
Motivation and objectives
- Investigate whether larger discourse context improves neural machine translation quality.
- Extend an attention-based NMT model to encode and attend over neighboring sentences.
- Evaluate effects of larger context on standard translation metrics and pronoun prediction tasks.
- Assess how training data size modulates benefits from larger context.
Proposed method
- Extend Bahdanau-style attention NMT with an additional context encoder and second attention mechanism for surrounding sentences.
- Compute two context vectors at each decoding step: a source-sentence vector (s_t') from attention over the source annotations h_t, and a larger-context vector (c_t') from attention over the context annotations h_t^c.
- Update decoder to condition on both s_t' and c_t' in the next-token distribution.
- Train by maximizing log-likelihood with Adadelta, with early stopping on BLEU.
- Evaluate using BLEU and RIBES for translation quality and macro-average recall for cross-lingual pronoun prediction.
- Report results across training set sizes (5%, 10%, 20%, 40%, 100%).
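The two-attention decoder step described in the list above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`attend`, `dual_context_step`, `init_params`), the parameter names, the dimensions, and the choice to condition the context attention on the decoder state concatenated with s_t' are all hypothetical, and the recurrent decoder update is omitted.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(query, annotations, W_q, W_h, v):
    """Additive (Bahdanau-style) attention.

    query: (d_q,), annotations: (T, d_h). Returns (weights, context vector).
    """
    # score[i] = v . tanh(query W_q + annotations[i] W_h)
    scores = np.tanh(query @ W_q + annotations @ W_h) @ v
    weights = softmax(scores)
    return weights, weights @ annotations

def init_params(rng, d_z, d_h, d_a, vocab):
    """Random parameters for the sketch (hypothetical names and shapes)."""
    shapes = {
        "Wq_s": (d_z, d_a), "Wh_s": (d_h, d_a), "v_s": (d_a,),
        "Wq_c": (d_z + d_h, d_a), "Wh_c": (d_h, d_a), "v_c": (d_a,),
        "W_out": (d_z + 2 * d_h, vocab),
    }
    return {name: rng.normal(scale=0.1, size=s) for name, s in shapes.items()}

def dual_context_step(z_prev, H_src, H_ctx, params):
    """One decoding step that conditions on both s_t' and c_t'."""
    # Attention over the source sentence annotations h_t gives s_t'.
    a_src, s = attend(z_prev, H_src, params["Wq_s"], params["Wh_s"], params["v_s"])
    # Attention over the context annotations h_t^c gives c_t'.
    # Assumption of this sketch: the query also includes s_t'.
    q_ctx = np.concatenate([z_prev, s])
    a_ctx, c = attend(q_ctx, H_ctx, params["Wq_c"], params["Wh_c"], params["v_c"])
    # Next-token distribution conditioned on decoder state, s_t', and c_t'.
    probs = softmax(np.concatenate([z_prev, s, c]) @ params["W_out"])
    return probs, a_src, a_ctx

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng, d_z=4, d_h=6, d_a=5, vocab=7)
    H_src = rng.normal(size=(3, 6))   # 3 source-sentence annotations
    H_ctx = rng.normal(size=(8, 6))   # 8 annotations from a neighboring sentence
    probs, a_src, a_ctx = dual_context_step(rng.normal(size=4), H_src, H_ctx, params)
    print(probs.shape, a_src.shape, a_ctx.shape)
```

The only structural change relative to a single-attention decoder is the second `attend` call and the wider input to the output layer, which is why the extension adds relatively few parameters.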
Experimental results
Research questions
- RQ1: Does incorporating preceding/following source sentences improve translation quality as measured by BLEU and RIBES?
- RQ2: Does larger-context MT improve pronoun prediction performance on cross-lingual pronoun tasks?
- RQ3: How does the size of the training corpus affect the benefits of larger-context modeling?
- RQ4: Is the pronoun-prediction performance due to larger context or other factors such as lemmatization?
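For RQ2, pronoun prediction is scored with macro-averaged recall, which averages per-class recall so that rare pronouns count as much as frequent ones. The sketch below shows one way to compute it; the function name and the toy French pronoun labels are illustrative assumptions, not data from the paper.

```python
from collections import defaultdict

def macro_average_recall(gold, predicted):
    """Mean of per-class recall, computed over the classes present in `gold`."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equal in length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, p in zip(gold, predicted):
        total[g] += 1
        if g == p:
            correct[g] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)

if __name__ == "__main__":
    # Toy example (made-up labels): recall is 2/3 for "il", 1/1 for "elle", 0/1 for "ce".
    gold = ["il", "il", "il", "elle", "ce"]
    pred = ["il", "il", "elle", "elle", "il"]
    print(macro_average_recall(gold, pred))
```

In this toy case plain accuracy is 3/5, while macro-averaged recall is (2/3 + 1 + 0) / 3 = 5/9, because the missed rare class "ce" weighs as much as the frequent class "il".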
Key findings
| Metric, language pair (model) | 5% | 10% | 20% | 40% | 100% |
|---|---|---|---|---|---|
| BLEU En-Fr (NMT) | 27.6 | 32.7 | 35.7 | 38.2 | 39.9 |
| BLEU En-Fr (LC-NMT) | 28.8 | 33.9 | 36.7 | 38.6 | 39.0 |
| BLEU En-De (NMT) | 16.3 | 19.8 | 22.1 | 24.3 | 25.6 |
| BLEU En-De (LC-NMT) | 17.4 | 20.9 | 22.7 | 23.9 | 25.1 |
| RIBES En-Fr (NMT) | 82.0 | 84.0 | 85.0 | 85.9 | 86.9 |
| RIBES En-Fr (LC-NMT) | 82.4 | 84.8 | 85.6 | 86.0 | 86.4 |
| RIBES En-De (NMT) | 76.6 | 78.9 | 80.4 | 81.4 | 81.7 |
| RIBES En-De (LC-NMT) | 77.3 | 79.5 | 80.6 | 81.5 | 81.7 |
- Larger-context NMT generally improves BLEU and RIBES over vanilla NMT at small to moderate data sizes.
- As training data increases to larger corpora, the advantage of larger-context modeling diminishes and largely disappears.
- On IWSLT En-De (non-lemmatized, ~10% of pronoun corpus size), LC-NMT still outperforms NMT, indicating benefits beyond lemmatization.
- LC-NMT yields higher pronoun prediction macro-average recall than vanilla NMT on pronoun tasks with smaller training sets.
- Pronoun prediction performance from LC-NMT approaches or matches top shared-task systems, showing strong potential for discourse-aware MT in targeted evaluation.
- Overall, larger-context gains are modest and context-dependent, suggesting that a more focused evaluation metric may be needed to capture discourse effects.
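To make the data-size trend concrete, the BLEU rows of the table above can be turned into per-size gains (LC-NMT minus NMT). The snippet is a small arithmetic sketch; the variable and function names are illustrative, and the numbers are copied from the table as reported in this review.

```python
SIZES = ["5%", "10%", "20%", "40%", "100%"]

# BLEU scores copied from the table in this review.
BLEU = {
    "En-Fr": {"NMT": [27.6, 32.7, 35.7, 38.2, 39.9],
              "LC-NMT": [28.8, 33.9, 36.7, 38.6, 39.0]},
    "En-De": {"NMT": [16.3, 19.8, 22.1, 24.3, 25.6],
              "LC-NMT": [17.4, 20.9, 22.7, 23.9, 25.1]},
}

def bleu_gains(scores):
    """Per-size BLEU difference (LC-NMT minus NMT), rounded to one decimal."""
    return [round(lc - base, 1) for base, lc in zip(scores["NMT"], scores["LC-NMT"])]

if __name__ == "__main__":
    for pair, scores in BLEU.items():
        gains = dict(zip(SIZES, bleu_gains(scores)))
        print(pair, gains)
    # En-Fr gains: +1.2, +1.2, +1.0, +0.4, -0.9
    # En-De gains: +1.1, +1.1, +0.6, -0.4, -0.5
```

On these figures the gain is about one BLEU point at 5% and 10% of the data, shrinks at 20%, and is slightly negative at the full corpus for both language pairs, which matches the summary that the advantage largely disappears with more training data.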
This review was generated by AI and checked by a human editor.