QUICK REVIEW

[Paper Review] Diffusion LLMs can think EoS-by-EoS

Sarah Breckner, Sebastian Schuster|arXiv (Cornell University)|Mar 5, 2026

Topic Modeling | Citations: 0

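One-Sentence Summary

This paper investigates whether diffusion LLMs treat trailing end-of-sequence (EoS) tokens as a hidden scratchpad that improves reasoning, and presents evidence from behavioral experiments and causal interventions across multiple tasks.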

ABSTRACT

Diffusion LLMs have been proposed as an alternative to autoregressive LLMs, excelling especially at complex reasoning tasks with interdependent sub-goals. Curiously, this is particularly true if the generation length, i.e., the number of tokens the model has to output, is set to a much higher value than is required for providing the correct answer to the task, and the model pads its answer with end-of-sequence (EoS) tokens. We hypothesize that diffusion models think EoS-by-EoS, that is, they use the representations of EoS tokens as a hidden scratchpad, which allows them to solve harder reasoning problems. We experiment with the diffusion models LLaDA1.5, LLaDA2.0-mini, and Dream-v0 on the tasks Addition, Entity Tracking, and Sudoku. In a controlled prompting experiment, we confirm that adding EoS tokens improves the LLMs' reasoning capabilities. To further verify whether they serve as space for hidden computations, we patch the hidden states of the EoS tokens with those of a counterfactual generation, which frequently changes the generated output to the counterfactual. The success of the causal intervention underscores that the EoS tokens, which one may expect to be devoid of meaning, carry information on the problem to solve. The behavioral experiments and the causal interventions indicate that diffusion LLMs can indeed think EoS-by-EoS.

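Motivation and Objectives

Examine how generation length affects the performance of diffusion LLMs on reasoning tasks.
Disentangle the effect of decoding steps from that of trailing EoS tokens in diffusion LLMs.
Provide causal evidence that EoS token representations contribute to reasoning.
Compare EoS-by-EoS reasoning with explicit chain-of-thought prompting in diffusion models.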

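Proposed Method

Study three instruction-tuned diffusion LLMs (LLaDA1.5, LLaDA2.0-mini, Dream-v0) alongside autoregressive baselines (Llama3.1, Qwen3).
Use controlled prompting to vary the generation length and the number of trailing EoS tokens, and observe the effect on reasoning performance.
Patch the hidden states of the EoS tokens with those of a counterfactual generation to assess their causal effect on the output.
Evaluate on the Addition, Entity Tracking, and Sudoku tasks to test reasoning ability at different difficulty levels.
Compare the diffusion models' EoS-by-EoS reasoning with chain-of-thought prompting under different token budgets.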
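
The hidden-state patching step above can be illustrated with a toy sketch. It is not the authors' implementation: hidden states are plain Python lists rather than model activations, and the helper names `eos_positions` and `patch_hidden_states`, the token ids, and the two-dimensional vectors are all hypothetical. It only shows the bookkeeping of locating the trailing EoS run and overwriting those positions with states from a counterfactual run.

```python
from typing import List, Sequence

Vector = List[float]


def eos_positions(token_ids: Sequence[int], eos_id: int) -> List[int]:
    """Return the indices of the trailing run of EoS tokens, in ascending order."""
    positions = []
    i = len(token_ids) - 1
    while i >= 0 and token_ids[i] == eos_id:
        positions.append(i)
        i -= 1
    return positions[::-1]


def patch_hidden_states(
    base: List[Vector],
    counterfactual: List[Vector],
    positions: Sequence[int],
) -> List[Vector]:
    """Copy counterfactual hidden states into a copy of the base run at `positions`."""
    if len(base) != len(counterfactual):
        raise ValueError("base and counterfactual runs must have the same length")
    patched = [list(vec) for vec in base]  # leave the base run untouched
    for pos in positions:
        patched[pos] = list(counterfactual[pos])
    return patched


# Toy data (assumed for illustration): 3 answer tokens followed by 3 EoS tokens.
EOS_ID = 0
tokens = [5, 7, 9, EOS_ID, EOS_ID, EOS_ID]
base_h = [[float(i), 0.0] for i in range(len(tokens))]
counterfactual_h = [[float(i), 1.0] for i in range(len(tokens))]

eos_idx = eos_positions(tokens, EOS_ID)
patched_h = patch_hidden_states(base_h, counterfactual_h, eos_idx)
print(eos_idx)
print(patched_h)
```

In a real experiment the patched states would be fed back into the model's forward pass, and the decoded answer would be compared against the counterfactual answer; that part depends on the model's internals and is omitted here.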

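Experimental Results

Research Questions

RQ1: Does increasing the generation length improve the reasoning performance of diffusion LLMs across tasks?
RQ2: Do trailing EoS tokens contribute to reasoning independently of the number of decoding steps?
RQ3: Are EoS token representations causally involved in producing the model's answer?
RQ4: How does EoS-by-EoS reasoning compare with conventional chain-of-thought prompting in diffusion and autoregressive models?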
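
RQ2 asks for conditions in which only the number of trailing EoS tokens changes while the decoding-step budget stays fixed. The sketch below shows one way such a condition grid could be laid out. Everything in it is an assumption made for illustration: the `MASK_ID` and `EOS_ID` values, the canvas layout (prompt, masked answer slots, pre-filled EoS tokens), and the helper names are hypothetical, and the paper's actual prompting setup may differ.

```python
from typing import Dict, List, Sequence

MASK_ID = -1  # hypothetical id for a masked slot that the model must fill
EOS_ID = 0    # hypothetical id for the end-of-sequence token


def build_canvas(prompt_ids: Sequence[int], answer_len: int, n_trailing_eos: int) -> List[int]:
    """Lay out prompt tokens, masked answer slots, then pre-filled trailing EoS tokens."""
    return list(prompt_ids) + [MASK_ID] * answer_len + [EOS_ID] * n_trailing_eos


def make_conditions(
    prompt_ids: Sequence[int],
    answer_len: int,
    eos_counts: Sequence[int],
    decoding_steps: int,
) -> List[Dict[str, object]]:
    """One condition per EoS count; the decoding-step budget is identical in all of them."""
    return [
        {
            "n_trailing_eos": n,
            "decoding_steps": decoding_steps,
            "canvas": build_canvas(prompt_ids, answer_len, n),
        }
        for n in eos_counts
    ]


conditions = make_conditions([11, 12, 13], answer_len=4, eos_counts=[0, 8, 32], decoding_steps=4)
for cond in conditions:
    print(cond["n_trailing_eos"], cond["decoding_steps"], len(cond["canvas"]))
```

Because `decoding_steps` is constant across the grid, any accuracy difference between conditions can be attributed to the trailing EoS tokens rather than to extra denoising iterations.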

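Key Findings

Longer generation lengths improve diffusion LLM performance on several tasks; with sufficient length, diffusion LLMs can even outperform autoregressive models.
With the number of decoding steps held fixed, adding trailing EoS tokens raises accuracy, which suggests that the EoS tokens act as a hidden scratchpad.
Intervening on the EoS token representations (replacing them with those of a counterfactual generation) frequently changes the output, indicating that EoS tokens carry information needed to solve the problem.
In autoregressive models, chain-of-thought (CoT) prompting yields gains and, with larger token budgets, can approach or even exceed diffusion models, especially on easier tasks.
LLaDA2.0 shows only limited gains from trailing EoS tokens because of its block-wise causal attention design.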
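
The last finding can be made concrete with a small attention-mask sketch. It assumes that "block-wise causal attention" means tokens attend freely within their own block and to earlier blocks, but not to later blocks; the review does not spell out the exact scheme, so this reading, the block size, and the positions used are illustrative assumptions. Under that reading, an answer token in an early block cannot read from EoS tokens placed in later blocks, whereas fully bidirectional attention allows it.

```python
from typing import List


def block_causal_mask(seq_len: int, block_size: int) -> List[List[bool]]:
    """mask[q][k] is True when query position q may attend to key position k.

    Assumed scheme: full attention inside a block, causal attention across blocks.
    """
    return [
        [(k // block_size) <= (q // block_size) for k in range(seq_len)]
        for q in range(seq_len)
    ]


def full_mask(seq_len: int) -> List[List[bool]]:
    """Fully bidirectional attention: every position sees every other position."""
    return [[True] * seq_len for _ in range(seq_len)]


# Toy layout (assumed): block 0 holds the answer, block 1 holds trailing EoS tokens.
seq_len, block_size = 8, 4
answer_pos = 1
trailing_eos = [4, 5, 6, 7]

bc = block_causal_mask(seq_len, block_size)
fm = full_mask(seq_len)

visible_block_causal = [k for k in trailing_eos if bc[answer_pos][k]]
visible_bidirectional = [k for k in trailing_eos if fm[answer_pos][k]]
print(visible_block_causal)
print(visible_bidirectional)
```

With these toy settings the answer position sees none of the trailing EoS positions under the block-causal mask and all of them under the bidirectional mask, which is one plausible reason trailing EoS tokens would help less in a block-causal model.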

This review was generated by AI and checked by a human editor.