QUICK REVIEW

[Paper Review] Controlling the Output Length of Neural Machine Translation

Surafel M. Lakew, Mattia Di Gangi|arXiv (Cornell University)|Oct 23, 2019

Natural Language Processing Techniques | References: 36 | Cited by: 33

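One-Sentence Summary

The paper proposes two methods for controlling NMT output length in a Transformer: (1) length-token conditioning, which prepends a length-class token to the source sentence; and (2) length encoding, which adds length information to the decoder's positional embeddings. Both aim to produce translations no longer than the source sentence, but each trades BLEU against length accuracy.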

ABSTRACT

The recent advances introduced by neural machine translation (NMT) are rapidly expanding the application fields of machine translation, as well as reshaping the quality level to be targeted. In particular, if translations have to fit some given layout, quality should not only be measured in terms of adequacy and fluency, but also length. Exemplary cases are the translation of document files, subtitles, and scripts for dubbing, where the output length should ideally be as close as possible to the length of the input text. This paper addresses for the first time, to the best of our knowledge, the problem of controlling the output length in NMT. We investigate two methods for biasing the output length with a transformer architecture: i) conditioning the output to a given target-source length-ratio class and ii) enriching the transformer positional embedding with length information. Our experiments show that both methods can induce the network to generate shorter translations, as well as acquiring interpretable linguistic skills.

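Motivation and Objectives

Explain why controlling output length matters in layout-constrained machine translation tasks (documents, subtitles, dubbing).
Study two methods for biasing a Transformer NMT model toward shorter or longer translations.
Evaluate how the length-control methods affect translation quality (BLEU) and length metrics at different data scales.
Show that combining the two methods achieves length control with only a small loss in quality.
Explore practical training strategies for length-controlled models (training from scratch vs. fine-tuning).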

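Proposed Methods

Tag the source text with a length-class token, introducing three length groups (short, normal, long), and train a single model that handles all groups.
Develop length encodings in the Transformer decoder that represent either the remaining length or the proportional target length, in absolute (len-pos) and relative (quantized len/pos) variants.
Combine the length-token and length-encoding methods to get both coarse-grained and fine-grained length control.
Optionally fine-tune a pretrained NMT model with length information, decoupling baseline quality from length control.
Evaluate on En-It and En-De with BLEU and BLEU* (which accounts for brevity); report the average length ratios LR_src and LR_ref.
Run experiments under small-data (TED MuST-C) and large-data conditions to test robustness across data regimes.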
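The two mechanisms listed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the token strings, the thresholds `t_min` and `t_max`, the use of character counts, and the bin count `n_bins` are all hypothetical, and the relative variant is assumed to quantize the ratio pos/len, which the summary's "len/pos" label does not pin down.

```python
import math

# Hypothetical token strings; the summary does not give the actual ones.
LENGTH_TOKENS = {"short": "<short>", "normal": "<normal>", "long": "<long>"}


def length_class(source, target, t_min=0.95, t_max=1.05):
    """Bucket a training pair by its target/source character-length ratio.

    The thresholds are illustrative defaults, not values from the summary.
    """
    ratio = len(target) / max(len(source), 1)
    if ratio < t_min:
        return "short"
    if ratio > t_max:
        return "long"
    return "normal"


def tag_source(source, group):
    """Prepend the length-class token to the source sentence."""
    return f"{LENGTH_TOKENS[group]} {source}"


def length_encoding(pos, target_len, d_model, variant="absolute", n_bins=10):
    """Sinusoidal encoding driven by length information rather than raw position.

    absolute: encodes the remaining length (target_len - pos).
    relative: encodes the quantized ratio pos / target_len (assumed form).
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if variant == "absolute":
        value = target_len - pos
    elif variant == "relative":
        value = math.floor(n_bins * pos / target_len)
    else:
        raise ValueError(f"unknown variant: {variant}")

    encoding = []
    for i in range(0, d_model, 2):
        angle = value / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding[:d_model]


# Training time: derive the group from the reference translation.
src, ref = "Thank you very much for coming.", "Grazie mille."
print(tag_source(src, length_class(src, ref)))

# Inference time: the user picks the group, e.g. to request a short output.
print(tag_source(src, "short"))
```

In the absolute variant the encoded value counts down to zero as decoding approaches the requested length, which is what gives the decoder a fine-grained signal for when to stop; the length token, by contrast, only selects one of three coarse regimes.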

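Experimental Results

Research Questions

RQ1: Can a Transformer NMT model be steered to produce length-controlled translations without a large loss in quality?
RQ2: Do length-token conditioning and length encoding offer complementary gains for length control?
RQ3: How does combining the two methods affect translation quality and length predictability across language pairs and data scales?
RQ4: How does the effect of the length-control methods differ between fine-tuning a pretrained model and training from scratch?
RQ5: How close can the output length get to the source length (LR_src ≈ 1.0) while BLEU is maintained?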

Key Findings

Model	Strategy	BLEU	BLEU*	LR_src	LR_ref
Baseline	standard	32.33	32.33	1.05	1.03
Baseline	penalty	32.45	32.45	1.04	1.02
Training from scratch	normal	32.54	32.54	1.04	1.02
Len-Tok	short	31.62	32.90	0.97	0.95
Len-Tok	long	31.16	31.16	1.10	1.08
Len-Enc Rel	match	30.96	30.96	1.03	1.01
Len-Enc Abs	match	30.26	30.26	1.01	1.04
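For readers who want to reproduce the LR_src and LR_ref columns on their own outputs, the helper below is a rough sketch. It assumes the ratios are computed per sentence on character counts and then averaged; the summary only says they are average length ratios, so both choices are assumptions, and the function name and toy sentences are hypothetical.

```python
def mean_length_ratio(hypotheses, others):
    """Average per-sentence character-length ratio len(hyp) / len(other).

    Pass source sentences as `others` for LR_src, or reference
    translations for LR_ref. Pairs with an empty `other` are skipped.
    """
    ratios = [len(h) / len(o) for h, o in zip(hypotheses, others) if len(o) > 0]
    if not ratios:
        raise ValueError("no valid sentence pairs")
    return sum(ratios) / len(ratios)


# Toy data for illustration only; not taken from the paper.
sources = ["Thank you very much.", "See you tomorrow."]
references = ["Grazie mille.", "Ci vediamo domani."]
hypotheses = ["Grazie mille.", "A domani."]

lr_src = mean_length_ratio(hypotheses, sources)
lr_ref = mean_length_ratio(hypotheses, references)
print(f"LR_src={lr_src:.2f} LR_ref={lr_ref:.2f}")
```

Under this definition, a value below 1.0 in LR_src means the system output is on average shorter than the source, which is the regime the short-token rows in the table aim for.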

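Length-token conditioning gives coarse-grained length control at a small BLEU cost, and for some language pairs brings LR_src close to 1.0.
Length encoding enables finer-grained length control, but the absolute encoding can reduce translation quality because of truncation; the relative encoding offers a compromise.
Combining the two (Tok+Enc) yields distinct length styles (short, normal, long) and lets the target length be adjusted with limited quality degradation.
Fine-tuning the baseline model with length information generally preserves BLEU while achieving output lengths close to the source length.
In the large-data condition, the short-token setting can reduce length to LR_src ≈ 1.00 with BLEU of roughly 34–36 depending on the language pair, while the normal-token setting retains higher BLEU.
Human evaluation indicates that shorter translations incur a small but statistically significant drop in quality, and that they rely on recognizable paraphrasing and simplification strategies.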

This review was generated by AI and reviewed by a human editor.