QUICK REVIEW

[Paper Review] xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning

Linzheng Chai, Jian Yang | arXiv (Cornell University) | 2024-01-13

Topic Modeling | Citations: 7

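One-line summary

xCoT introduces cross-lingual instruction tuning to transfer reasoning from high-resource to low-resource languages, and improves multilingual chain-of-thought reasoning with xICL, Random-CoT, and cross-lingual distillation.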

ABSTRACT

Chain-of-thought (CoT) has emerged as a powerful technique to elicit reasoning in large language models and improve a variety of downstream tasks. CoT mainly demonstrates excellent performance in English, but its usage in low-resource languages is constrained due to poor language generalization. To bridge the gap among different languages, we propose a cross-lingual instruction fine-tuning framework (xCoT) to transfer knowledge from high-resource languages to low-resource languages. Specifically, the multilingual instruction training data (xCoT-Instruct) is created to encourage the semantic alignment of multiple languages. We introduce cross-lingual in-context few-shot learning (xICL) to accelerate multilingual agreement in instruction tuning, where some fragments of source languages in examples are randomly substituted by their counterpart translations of target languages. During multilingual instruction tuning, we adopt the randomly online CoT strategy to enhance the multilingual reasoning ability of the large language model by first translating the query to another language and then answering in English. To further facilitate the language transfer, we leverage the high-resource CoT to supervise the training of low-resource languages with cross-lingual distillation. Experimental results on previous benchmarks demonstrate the superior performance of xCoT in reducing the gap among different languages, highlighting its potential to reduce the cross-lingual gap.

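Motivation and Goals

Close the cross-lingual gap in chain-of-thought reasoning for low-resource languages.
Create multilingual instruction data whose reasoning is consistent across languages.
Develop training strategies (xICL, Random-CoT, xDistill) that improve cross-lingual transfer.
Demonstrate improvements on the multilingual benchmarks MGSM and MSVAMP.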

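Proposed Method

Builds xCoT-Instruct, a multilingual instruction dataset, by translating English data into 10 languages while keeping the outputs in English.
Introduces cross-lingual in-context few-shot learning (xICL), which code-switches fragments of the demonstration examples across languages to align representations.
Applies Random-CoT during multilingual instruction tuning: the query is translated into a randomly chosen intermediate language and then answered in English.
Uses cross-lingual distillation (xDistill), which applies the high-resource CoT distribution as token-level supervision for low-resource outputs.
Trains with a joint objective over the multilingual fine-tuning data D = {D^Lk} so that outputs are aligned across languages, modeling P(a^Lj | c^Li, q^Li; M).
Selectively augments the data with D', generated by the fine-tuned model, to reinforce correct reasoning paths.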
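
The two prompt-level steps above (xICL code-switching and Random-CoT) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the function names `code_switch` and `random_cot_prompt`, the word-level substitution, and the prompt wording are all assumptions made for illustration, since the review does not specify how fragments are selected or how the instruction is phrased.

```python
import random

# Hypothetical toy English -> Spanish lexicon. A real setup would rely on a
# dictionary or translation system, which this review does not specify.
TOY_LEXICON = {"apples": "manzanas", "has": "tiene", "many": "muchas"}


def code_switch(text, lexicon, ratio, rng):
    """Swap each lexicon-covered word for its translation with probability `ratio`."""
    out = []
    for word in text.split():
        if word in lexicon and rng.random() < ratio:
            out.append(lexicon[word])
        else:
            out.append(word)
    return " ".join(out)


def random_cot_prompt(query, languages, rng):
    """Pick a random intermediate language and ask for translation into it,
    followed by a step-by-step answer in English (assumed prompt wording)."""
    pivot = rng.choice(languages)
    prompt = (
        f"Translate the question into {pivot}, "
        f"then answer it step by step in English.\n"
        f"Question: {query}"
    )
    return prompt, pivot


rng = random.Random(0)
demo = code_switch("Tom has many apples", TOY_LEXICON, ratio=0.5, rng=rng)
prompt, pivot = random_cot_prompt("How many apples are left?", ["German", "Thai"], rng)
```

Passing an explicit `random.Random` instance keeps the substitutions reproducible, which helps when comparing different code-switching ratios.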

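Experimental Results

Research Questions

RQ1: How can cross-lingual instruction tuning improve chain-of-thought reasoning in low-resource languages?
RQ2: Does introducing multilingual in-context learning and code-switching improve the cross-lingual alignment of the reasoning process?
RQ3: Can CoT supervision from high-resource languages be transferred effectively to low-resource languages through distillation?
RQ4: How do Random-CoT and multilingual data augmentation affect multilingual reasoning accuracy?
RQ5: How do the xCoT components perform on the multilingual benchmarks MGSM and MSVAMP?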

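Key Results

xCoT achieves state-of-the-art performance on the MGSM and MSVAMP benchmarks, which cover 11 and 10 languages respectively.
Cross-lingual in-context learning with code-switching (xICL) substantially improves multilingual alignment.
Random-CoT, which translates the query into an intermediate language and then answers in English, strengthens multilingual reasoning.
Cross-lingual distillation (xDistill) uses high-resource CoT to supervise low-resource languages at the token level.
After training on xCoT-Instruct, multilingual representations are better aligned in a shared space.
Ablation studies confirm cumulative gains from xICL, mSampling, Random-CoT, and xDistill, with the full xCoT achieving the highest overall performance.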
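
The token-level supervision behind xDistill can be sketched as a divergence between two per-token distributions. This is a hedged illustration only: the review does not state the exact loss, so the use of KL(teacher || student), the averaging over positions, and the names `softmax` and `token_level_kl` are assumptions. The sketch also assumes that the teacher (high-resource query) and student (low-resource query) produce the same English answer tokens, so that positions line up.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_level_kl(teacher_logits, student_logits):
    """Average KL(teacher || student) over aligned token positions.

    Each argument is a list of per-position logit vectors over the same
    vocabulary (an assumed setup, not taken from the paper).
    """
    if len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student must cover the same positions")
    total = 0.0
    for t_row, s_row in zip(teacher_logits, student_logits):
        p = softmax(t_row)  # teacher distribution (high-resource query)
        q = softmax(s_row)  # student distribution (low-resource query)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return total / len(teacher_logits)


# Toy logits: two answer positions over a three-token vocabulary.
teacher = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
student = [[1.0, 1.0, -0.5], [0.0, 1.0, 1.0]]
loss = token_level_kl(teacher, student)
```

The loss is zero when the student matches the teacher at every position and grows as the distributions diverge, which is the behavior one would expect from a distillation term added to the instruction-tuning objective.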


This review was generated by AI and reviewed by a human editor.