QUICK REVIEW

[Paper Review] EuroLLM-22B: Technical Report

Miguel Moura Ramos, Duarte Alves|arXiv (Cornell University)|Feb 5, 2026

Natural Language Processing Techniques|Citations: 0

One-Line Summary

TL;DR: EuroLLM-22B is a large, open, multilingual LLM focused on European languages. It is trained from scratch to cover all 24 official EU languages plus 11 additional languages, with a 32K context window and improved post-training data. It achieves competitive multilingual and instruction-following performance, and the authors release the base and instruct models, data, and code.

ABSTRACT

This report presents EuroLLM-22B, a large language model trained from scratch to support the needs of European citizens by covering all 24 official European Union languages and 11 additional languages. EuroLLM addresses the issue of European languages being underrepresented and underserved in existing open large language models. We provide a comprehensive overview of EuroLLM-22B's development, including tokenizer design, architectural specifications, data filtering, and training procedures. Across a broad set of multilingual benchmarks, EuroLLM-22B demonstrates strong performance in reasoning, instruction following, and translation, achieving results competitive with models of comparable size. To support future research, we release our base and instruction-tuned models, our multilingual web pretraining data and updated EuroBlocks instruction datasets, as well as our pre-training and evaluation codebases.

Motivation and Objectives

Address underrepresentation of European languages in open LLMs by developing an open model native to all 24 official EU languages and 11 additional languages.
Improve multilingual reasoning, instruction following, and translation through a high-quality, filtered pre-training corpus and an extended context window.
Provide openly accessible resources (models, data, and code) to support researchers in multilingual AI development in Europe.

Proposed Method

Design and train EuroLLM-22B with a 32K context window using a multi-phase training schedule on curated multilingual data.
Extend the tokenizer and architecture (SwiGLU, RoPE, RMSNorm, layered configurations) and adopt grouped query attention and pre-layer normalization.
Curate and filter pre-training data (EuroWeb) with language-aware filtering and quality scoring (EuroFilter) across multilingual sources.
Augment post-training with EuroBlocks v2, using multiple generations and reward-model-based selection for high-quality instruction data.
Fine-tune for instruction-following to create EuroLLM-22B-Instruct using 32K contexts and efficient training tooling (Axolotl + Liger-Kernel).
Publish the base and instruction-tuned models, the multilingual web data (EuroWeb), and the post-training dataset (EuroBlocks), along with the codebases for pre-training and evaluation.
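
The selection step described for EuroBlocks v2 (multiple generations per prompt, followed by reward-model-based selection) can be sketched as a best-of-n loop. This is a minimal illustration under stated assumptions, not the authors' pipeline: `select_best_response`, `make_toy_generator`, and `toy_reward` are hypothetical names, and the length-based reward is only a stand-in for a real reward model.

```python
import itertools
from typing import Callable, Iterable, Tuple


def select_best_response(
    prompt: str,
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    num_candidates: int = 4,
) -> Tuple[str, float]:
    """Sample several candidates for one prompt and keep the top-scoring one."""
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    # Draw independent generations for the same prompt.
    candidates = [generate(prompt) for _ in range(num_candidates)]
    # Score each (prompt, response) pair with the reward function.
    scored = [(candidate, score(prompt, candidate)) for candidate in candidates]
    # Keep the highest-scoring pair; ties resolve to the earliest candidate.
    return max(scored, key=lambda pair: pair[1])


def make_toy_generator(responses: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: cycles through fixed responses."""
    pool = itertools.cycle(list(responses))
    return lambda prompt: next(pool)


def toy_reward(prompt: str, response: str) -> float:
    """Hypothetical stand-in for a reward model: longer answers score higher."""
    return float(len(response))


if __name__ == "__main__":
    generator = make_toy_generator(["a", "abc", "ab"])
    best, reward = select_best_response("q", generator, toy_reward, num_candidates=3)
    print(best, reward)
```

In a real setting, `generate` would call one or more language models and `score` would call a trained reward model; only the selected response would be kept as instruction data.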

Figure 1: Scheme of the learning rate scheduler.
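
The figure is not reproduced in this review, and the review does not state the shape of the schedule. Purely as an illustration of what a multi-phase learning rate scheduler can look like, the sketch below assumes three phases: linear warmup, a constant plateau, and linear decay. The function name `lr_at_step` and every default value (`peak_lr`, `final_lr`, `warmup_frac`, `decay_frac`) are hypothetical and are not taken from the report.

```python
def lr_at_step(
    step: int,
    total_steps: int,
    peak_lr: float = 1e-4,      # assumed value, not from the report
    final_lr: float = 1e-5,     # assumed value, not from the report
    warmup_frac: float = 0.1,   # assumed fraction of steps spent warming up
    decay_frac: float = 0.1,    # assumed fraction of steps spent decaying
) -> float:
    """Return the learning rate for a warmup / constant / linear-decay schedule."""
    if not 0 <= step <= total_steps:
        raise ValueError("step must lie in [0, total_steps]")
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    decay_start = total_steps - decay_steps

    # Phase 1: ramp linearly from 0 up to the peak learning rate.
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Phase 2: hold the peak learning rate (also covers a zero-length decay).
    if step < decay_start or decay_steps == 0:
        return peak_lr
    # Phase 3: interpolate linearly from the peak down to the final value.
    progress = (step - decay_start) / decay_steps
    return peak_lr + (final_lr - peak_lr) * progress


if __name__ == "__main__":
    for s in (0, 50, 100, 500, 900, 950, 1000):
        print(s, lr_at_step(s, total_steps=1000))
```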

Experimental Results

Research Questions

RQ1: How does EuroLLM-22B perform on multilingual benchmarks across instruction following, reasoning, and translation tasks compared to similarly sized open models?
RQ2: What is the impact of a 32K context window and enhanced post-training data on multilingual reasoning and instruction-following capabilities?
RQ3: Do European-language-focused data curation and quality filtering improve performance across EU languages without sacrificing translation quality?
RQ4: How do base and instruction-tuned variants compare within the EuroLLM family in terms of multilingual capabilities and efficiency?

Key Findings

EuroLLM-22B achieves competitive results with models of comparable size on multilingual benchmarks.
The instruction-tuned EuroLLM-22B generally outperforms the 9B counterpart and shows strong gains in instruction following and STEM/problem-solving tasks.
Extending the context length to 32K enables longer-input handling and improves results on long-context benchmarks without harming translation quality.
Post-training improvements (EuroBlocks v2) yield significant gains over previous EuroLLM checkpoints across English and multilingual evaluations.
The EuroLLM family remains the strongest fully open European alternative among its peers, given the comparative sizes and training regimes.
The base model (EuroLLM-22B-Base) shows consistent gains over the 9B base model and is competitive with larger open baselines.

This review was generated by AI and checked by a human editor.