QUICK REVIEW

[Paper Review] xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning

Linzheng Chai, Jian Yang|arXiv (Cornell University)|Jan 13, 2024

Topic Modeling|Cited by 7

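One-Sentence Summary

xCoT introduces cross-lingual instruction tuning to transfer reasoning from high-resource to low-resource languages, using xICL, Random-CoT, and cross-lingual distillation to improve multilingual chain-of-thought reasoning.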

ABSTRACT

Chain-of-thought (CoT) has emerged as a powerful technique to elicit reasoning in large language models and improve a variety of downstream tasks. CoT mainly demonstrates excellent performance in English, but its usage in low-resource languages is constrained due to poor language generalization. To bridge the gap among different languages, we propose a cross-lingual instruction fine-tuning framework (xCoT) to transfer knowledge from high-resource languages to low-resource languages. Specifically, the multilingual instruction training data (xCoT-Instruct) is created to encourage the semantic alignment of multiple languages. We introduce cross-lingual in-context few-shot learning (xICL) to accelerate multilingual agreement in instruction tuning, where some fragments of source languages in examples are randomly substituted by their counterpart translations of target languages. During multilingual instruction tuning, we adopt a random online CoT strategy to enhance the multilingual reasoning ability of the large language model by first translating the query to another language and then answering in English. To further facilitate the language transfer, we leverage the high-resource CoT to supervise the training of low-resource languages with cross-lingual distillation. Experimental results on previous benchmarks demonstrate the superior performance of xCoT, highlighting its potential to reduce the cross-lingual gap.

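Research Motivation and Objectives

Narrow the cross-lingual gap in chain-of-thought reasoning for low-resource languages.
Create multilingual instruction data that aligns reasoning across languages.
Develop training strategies (xICL, Random-CoT, xDistill) that strengthen cross-lingual transfer.
Demonstrate improvements on the multilingual benchmarks MGSM and MSVAMP.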

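Proposed Method

Build xCoT-Instruct, a multilingual instruction dataset, by translating English data into 10 languages while keeping the English outputs.
Introduce cross-lingual in-context few-shot learning (xICL), which code-switches the demonstration queries across languages to align representations.
Apply Random-CoT during multilingual instruction tuning: translate the query into a randomly chosen intermediate language, then answer in English.
Use cross-lingual distillation (xDistill) to supervise low-resource outputs at the token level with the high-resource CoT distribution.
Train with multilingual fine-tuning on D = {D^Lk} under a joint objective that aligns outputs across languages, P(a^Lj|c^Li,q^Li;M).
Optionally augment the data with D', generated by the fine-tuned model, to reinforce correct reasoning paths.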
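The two data-side steps above, xICL code-switching and Random-CoT, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the word-level bilingual lexicon, the substitution ratio, and the prompt wording are all hypothetical and chosen only to show the idea.

```python
import random


def code_switch(tokens, lexicon, ratio, rng):
    """Swap source-language tokens for target-language translations.

    Each token found in `lexicon` (a hypothetical word-level dictionary)
    is replaced with probability `ratio`; all other tokens are kept.
    """
    switched = []
    for token in tokens:
        key = token.lower()
        if key in lexicon and rng.random() < ratio:
            switched.append(lexicon[key])
        else:
            switched.append(token)
    return switched


def random_cot_prompt(query, source_lang, languages, rng):
    """Build a Random-CoT style instruction (assumed wording).

    A pivot language different from the source is drawn at random; the
    model is asked to translate the query into it, then reason in English.
    """
    candidates = [lang for lang in languages if lang != source_lang]
    pivot = rng.choice(candidates)
    prompt = (
        f"Translate the following {source_lang} question into {pivot}, "
        f"then answer step by step in English.\n"
        f"Question: {query}"
    )
    return pivot, prompt


if __name__ == "__main__":
    rng = random.Random(0)
    lexicon = {"apples": "Äpfel", "has": "hat"}  # toy English-to-German entries
    demo = "Tom has 3 apples".split()
    print(" ".join(code_switch(demo, lexicon, 0.5, rng)))
    pivot, prompt = random_cot_prompt(
        "Tom hat 3 Äpfel. Wie viele hat er nach 2 weiteren?",
        "German",
        ["German", "French", "Chinese"],
        rng,
    )
    print(prompt)
```

Passing a seeded `random.Random` instance keeps the sampling reproducible, which helps when comparing ablations over the substitution ratio.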

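Experimental Results

Research Questions

RQ1: How does cross-lingual instruction tuning improve chain-of-thought reasoning in low-resource languages?
RQ2: Does combining multilingual in-context learning with code-switching strengthen the cross-lingual alignment of reasoning?
RQ3: Can CoT supervision from high-resource languages be transferred effectively to low-resource languages through distillation?
RQ4: How do Random-CoT and multilingual data augmentation affect multilingual reasoning accuracy?
RQ5: How do the xCoT components perform on the MGSM and MSVAMP multilingual benchmarks?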

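Key Findings

xCoT achieves state-of-the-art performance on the MGSM and MSVAMP benchmarks, which cover 11 and 10 languages, respectively.
Cross-lingual in-context learning with code-switching (xICL) significantly improves multilingual alignment.
Random-CoT, which first translates the query into an intermediate language and then answers in English, improves multilingual reasoning.
Cross-lingual distillation (xDistill) uses the high-resource CoT to supervise low-resource languages at the token level.
Compared with the baseline, multilingual representations become more closely aligned in a shared space after training on xCoT-Instruct.
Ablation studies show cumulative gains from xICL, mSampling, Random-CoT, and xDistill, with the full xCoT achieving the best overall performance.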
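The token-level supervision in xDistill can be pictured as a divergence between two next-token distributions over the same English CoT positions: one from the model conditioned on the high-resource query (teacher) and one conditioned on the low-resource query (student). The sketch below assumes a forward KL divergence averaged over positions; the summary does not state the exact loss, its direction, or any temperature, so those choices and all function names are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one vocabulary-sized list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_level_kl(teacher_logits, student_logits, eps=1e-12):
    """Mean KL(teacher || student) over aligned output positions.

    Both arguments are lists of per-position logit lists of equal shape.
    The teacher is treated as a fixed target (an assumption of this sketch).
    """
    if len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student must cover the same positions")
    total = 0.0
    for t_row, s_row in zip(teacher_logits, student_logits):
        p = softmax(t_row)
        q = softmax(s_row)
        total += sum(
            pi * (math.log(max(pi, eps)) - math.log(max(qi, eps)))
            for pi, qi in zip(p, q)
            if pi > 0.0
        )
    return total / len(teacher_logits)


if __name__ == "__main__":
    teacher = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]  # high-resource query
    student = [[1.0, 1.0, -0.5], [0.0, 0.5, 2.0]]  # low-resource query
    print(token_level_kl(teacher, student))
```

The loss is zero when both queries induce identical distributions at every position and grows as the low-resource predictions drift from the high-resource ones. In practice this term would be added to the usual fine-tuning loss with a tunable weight.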


This review was generated by AI and reviewed by a human editor.