QUICK REVIEW

[Paper Review] Multilingual Alignment of Contextual Word Representations

Steven Cao, Nikita Kitaev|arXiv (Cornell University)|Feb 10, 2020

Topic: Topic Modeling | References: 29 | Citations: 152

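One-Sentence Summary

The paper proposes a contextual alignment procedure for multilingual BERT, shows that it improves zero-shot XNLI transfer, and introduces a contextual word retrieval metric that correlates with downstream performance. Alignment yields substantial gains, matching translate-train performance for Bulgarian and Greek, and the paper analyzes where alignment is strong and where it is weak.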

ABSTRACT

We propose procedures for evaluating and strengthening contextual embedding alignment and show that they are useful in analyzing and improving multilingual BERT. In particular, after our proposed alignment procedure, BERT exhibits significantly improved zero-shot performance on XNLI compared to the base model, remarkably matching pseudo-fully-supervised translate-train models for Bulgarian and Greek. Further, to measure the degree of alignment, we introduce a contextual version of word retrieval and show that it correlates well with downstream zero-shot transfer. Using this word retrieval task, we also analyze BERT and find that it exhibits systematic deficiencies, e.g. worse alignment for open-class parts-of-speech and word pairs written in different scripts, that are corrected by the alignment procedure. These results support contextual alignment as a useful concept for understanding large multilingual pre-trained models.

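Motivation and Objectives

Understand multilingual BERT beyond its raw zero-shot performance.
Define contextual alignment and measure it with a word retrieval task over word-aligned parallel corpora.
Propose an alignment method that fine-tunes multilingual BERT on parallel data.
Evaluate the effect of alignment on zero-shot cross-lingual transfer (XNLI) and compare it against baselines.
Analyze the linguistic factors that affect alignment, including part-of-speech category and script differences.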

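Proposed Method

Define a contextual alignment objective over a parallel corpus with word alignments.
Use CSLS-based similarity for retrieval, but optimize a regularized squared-distance loss.
Regularize so that embeddings stay close to their initial pre-trained values while non-English embeddings are pulled toward English.
Train the alignment jointly on several language pairs using parallel Europarl data.
Compare against a sentence-based fastText baseline and a rotation-based contextual alignment method.
Evaluate zero-shot XNLI transfer by fine-tuning on English MultiNLI and testing on Bulgarian, German, Greek, Spanish, and French.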
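The regularized squared-distance objective listed above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each aligned word pair contributes the squared Euclidean distance between its two contextual embeddings, and that the regularizer penalizes how far the English-side embeddings drift from their frozen pre-trained values. The function names, the `reg_weight` parameter, and the restriction of the regularizer to aligned English tokens are hypothetical simplifications.

```python
from typing import Sequence

Vector = Sequence[float]


def squared_distance(u: Vector, v: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def alignment_loss(
    src_vecs: Sequence[Vector],
    tgt_vecs: Sequence[Vector],
    tgt_init_vecs: Sequence[Vector],
    reg_weight: float = 1.0,  # hypothetical weight, not taken from the paper
) -> float:
    """Sketch of a regularized alignment loss.

    src_vecs[k] and tgt_vecs[k] are the current embeddings of the k-th
    aligned (non-English, English) word pair. tgt_init_vecs[k] is the
    embedding of the same English word under the frozen pre-trained model.
    """
    # Alignment term: pull each non-English embedding toward its English match.
    align = sum(squared_distance(s, t) for s, t in zip(src_vecs, tgt_vecs))
    # Regularization term: keep English embeddings near their initial values.
    reg = sum(squared_distance(t, t0) for t, t0 in zip(tgt_vecs, tgt_init_vecs))
    return align + reg_weight * reg
```

In an actual fine-tuning setup the vectors would be model outputs and the loss would be minimized with a gradient-based optimizer; plain Python lists are used here only to make the arithmetic explicit.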

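Experimental Results

Research Questions

RQ1: Does contextual alignment improve the cross-lingual transfer ability of multilingual BERT beyond the base model?
RQ2: How does the proposed alignment method compare with the rotation baseline and with non-contextual methods?
RQ3: How much parallel data is needed to obtain a meaningful improvement?
RQ4: Which linguistic factors (part of speech, lexical overlap, script differences) affect alignment quality?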

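Key Findings

Contextual alignment substantially improves zero-shot XNLI accuracy across languages, by at least 1% in each, with larger gains for Bulgarian and Greek.
With alignment on parallel data, zero-shot accuracy improves by 2.78% on average, approaching the translate-train ceiling for Bulgarian and Greek.
Fine-tuned contextual alignment outperforms both the fastText-based baseline and the rotation-based method, highlighting the value of context in alignment.
Alignment correlates strongly with downstream transfer, which supports contextual word retrieval as an evaluation metric that predicts cross-lingual performance.
Alignment mitigates systematic deficiencies of the base model, notably the weaker alignment of open-class relative to closed-class parts of speech and of word pairs written in different scripts.
Most of the gain is obtained with 50K parallel sentences per language pair; even 10K sentences yield an improvement, indicating that the method is data-efficient.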
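The contextual word retrieval metric referred to in these findings can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's evaluation code: it uses the standard CSLS score, 2·cos(x, y) minus the mean cosine similarity of x to its k nearest targets and of y to its k nearest sources, and it assumes that source embedding i and target embedding i form a gold word-aligned pair. The function names and the default `k=10` are hypothetical choices, and all vectors are assumed to be non-zero.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_top_k(values: Sequence[float], k: int) -> float:
    """Mean of the k largest values (or of all values if fewer than k)."""
    top = sorted(values, reverse=True)[:k]
    return sum(top) / len(top)


def csls_retrieval_accuracy(
    src: Sequence[Vector], tgt: Sequence[Vector], k: int = 10
) -> float:
    """Fraction of source words whose CSLS nearest target is the gold match.

    Assumes src[i] and tgt[i] are the embeddings of an aligned word pair.
    """
    cos = [[cosine(s, t) for t in tgt] for s in src]
    # Neighborhood density of each source word among the targets.
    r_src = [mean_top_k(row, k) for row in cos]
    # Neighborhood density of each target word among the sources.
    r_tgt = [
        mean_top_k([cos[i][j] for i in range(len(src))], k)
        for j in range(len(tgt))
    ]
    correct = 0
    for i, row in enumerate(cos):
        scores = [2 * row[j] - r_src[i] - r_tgt[j] for j in range(len(tgt))]
        best = max(range(len(tgt)), key=scores.__getitem__)
        correct += best == i
    return correct / len(src)
```

The density terms penalize "hub" vectors that are close to many candidates, which is why CSLS is often preferred over plain cosine nearest-neighbor search for retrieval.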

This review was generated by AI and checked by a human editor.