QUICK REVIEW

[Paper Review] Thinking in Latents: Adaptive Anchor Refinement for Implicit Reasoning in LLMs

Disha Sheshanarayana, Rajat Subhra Pal | arXiv (Cornell University) | Mar 16, 2026

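One-Sentence Summary

AdaAnchor performs silent latent refinement by iteratively updating a small set of anchor vectors and uses an adaptive halting rule that stops once the anchor dynamics converge. This gives a better efficiency-accuracy trade-off than fixed-step latent refinement and far fewer output tokens than token-level reasoning.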

ABSTRACT

Token-level Chain-of-Thought (CoT) prompting has become a standard way to elicit multi-step reasoning in large language models (LLMs), especially for mathematical word problems. However, generating long intermediate traces increases output length and inference cost, and can be inefficient when the model could arrive at the correct answer without extensive verbalization. This has motivated latent-space reasoning approaches that shift computation into hidden representations and only emit a final answer. Yet, many latent reasoning methods depend on a fixed number of latent refinement steps at inference, adding another hyperparameter that must be tuned across models and datasets to balance accuracy and efficiency. We introduce AdaAnchor, a latent reasoning framework that performs silent iterative computation by refining a set of latent anchor vectors attached to the input. AdaAnchor further incorporates an adaptive halting mechanism that monitors anchor stability across iterations and terminates refinement once the anchor dynamics converge, allocating fewer steps to easier instances while reserving additional refinement steps for harder ones under a shared maximum-step budget. Our empirical evaluation across three mathematical word-problem benchmarks shows that AdaAnchor with adaptive halting yields accuracy gains of up to 5% over fixed-step latent refinement while reducing average latent refinement steps by 48-60% under the same maximum-step budget. Compared to standard reasoning baselines, AdaAnchor achieves large reductions in generated tokens (92-93%) by moving computation into silent latent refinement, offering a different accuracy-efficiency trade-off with substantially lower output-token usage.

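Motivation and Objectives

Reduce the cost of token-level reasoning by shifting computation from explicit step-by-step reasoning into latent space.
Introduce AdaAnchor, a latent reasoning framework that refines a compact set of anchor vectors during inference.
Develop a stability-based adaptive halting mechanism that terminates refinement once the anchor dynamics converge.
Evaluate AdaAnchor on mathematical word problems, comparing its efficiency and accuracy against fixed-step latent methods and explicit reasoning baselines.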

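Proposed Method

Augment the input by prepending m learnable anchor vectors to the embedding sequence.
Iteratively refine the anchors: run a forward pass, then update the anchor slots from the backbone's hidden states using a smoothing parameter β.
Terminate refinement with a stability-based halting rule that detects convergence of the anchor dynamics across iterations.
After refinement ends, decode the final answer in an answer-only output format.
Compare adaptive halting with fixed-step latent refinement under a shared maximum latent budget Kmax.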
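The steps above can be sketched as a short refinement loop. This is a minimal illustration under stated assumptions, not the paper's implementation. The review does not give the update formula or the stability test, so the interpolation `(1 - beta) * anchor + beta * hidden`, the relative L2 change metric, the threshold `eps`, and every function name here are hypothetical. `toy_backbone` is a stand-in for a real transformer forward pass.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def prepend_anchors(anchors: List[Vector], token_embeddings: List[Vector]) -> List[Vector]:
    """Place the m anchor vectors in front of the input embedding sequence."""
    return [list(a) for a in anchors] + [list(e) for e in token_embeddings]


def anchor_change(old: List[Vector], new: List[Vector]) -> float:
    """Relative L2 change of the whole anchor set (assumed stability metric)."""
    diff = math.sqrt(sum((n - o) ** 2 for a, b in zip(old, new) for o, n in zip(a, b)))
    base = math.sqrt(sum(o ** 2 for a in old for o in a))
    return diff / (base + 1e-8)


def refine_anchors(
    anchors: List[Vector],
    token_embeddings: List[Vector],
    backbone: Callable[[List[Vector]], List[Vector]],
    beta: float = 0.5,
    k_max: int = 8,
    eps: float = 1e-2,
) -> Tuple[List[Vector], int]:
    """Refine anchors for at most k_max steps; return (anchors, steps used)."""
    m = len(anchors)
    for step in range(1, k_max + 1):
        # One forward pass over [anchors; tokens]; no tokens are generated.
        hidden = backbone(prepend_anchors(anchors, token_embeddings))
        # Assumed smoothed update of each anchor slot from its hidden state.
        updated = [
            [(1 - beta) * a + beta * h for a, h in zip(anchor, slot)]
            for anchor, slot in zip(anchors, hidden[:m])
        ]
        change = anchor_change(anchors, updated)
        anchors = updated
        # Assumed halting rule: stop once the anchors barely move.
        if change < eps:
            return anchors, step
    return anchors, k_max


def toy_backbone(sequence: List[Vector]) -> List[Vector]:
    """Hypothetical stand-in for a transformer: each position sees the sequence mean."""
    dim = len(sequence[0])
    mean = [sum(v[i] for v in sequence) / len(sequence) for i in range(dim)]
    return [list(mean) for _ in sequence]
```

Calling `refine_anchors(anchors, tokens, toy_backbone, k_max=50, eps=1e-3)` returns the refined anchors together with the number of steps actually used, so the step count can be compared against `k_max`. In a real system the answer-only decoding would start from the refined anchors, which this sketch does not model.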

Figure 1: Comparison of AdaAnchor with explicit Chain-of-Thought (CoT) reasoning. CoT generates long intermediate reasoning tokens, whereas AdaAnchor performs implicit multi-step computation by refining latent anchor vectors and uses stability-based early stopping before answer-only decoding.

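Experimental Results

Research Questions

RQ1: Can latent anchor refinement provide multi-step implicit reasoning without emitting reasoning tokens?
RQ2: Under a fixed compute budget, does adaptive halting based on anchor stability improve the efficiency-accuracy trade-off?
RQ3: How does AdaAnchor perform on standard mathematical word-problem benchmarks compared with token-based and fixed-step latent methods?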

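Key Findings

Under the same maximum-step budget, AdaAnchor with adaptive halting improves accuracy by up to 5% over fixed-step latent refinement.
Adaptive halting reduces the average number of latent refinement steps by 48-60% relative to fixed-step refinement.
By moving computation into latent space, AdaAnchor sharply reduces output-token usage (92-93% fewer generated tokens than token-based reasoning baselines).
Compared with No-CoT and explicit CoT baselines, AdaAnchor achieves better efficiency on GSM8K, SVAMP, and MultiArith while maintaining or improving accuracy.
Fixed-step budgets show diminishing returns, which motivates an adaptive halting strategy.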
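The percentage figures above are relative reductions against a baseline. The sketch below shows one way such numbers could be computed. The helper names and the sample values are hypothetical and are not data from the paper.

```python
from typing import Sequence


def average_steps(steps_per_instance: Sequence[int]) -> float:
    """Mean number of latent refinement steps used across instances."""
    if not steps_per_instance:
        raise ValueError("need at least one instance")
    return sum(steps_per_instance) / len(steps_per_instance)


def relative_reduction(baseline: float, value: float) -> float:
    """Fractional reduction of value relative to baseline (0.5 means 50% fewer)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - value) / baseline


# Made-up halting steps for six instances under a budget of k_max = 8.
# A fixed-step method would spend 8 steps on every instance.
example_steps = [2, 3, 4, 3, 8, 4]
step_saving = relative_reduction(8, average_steps(example_steps))

# Made-up token counts: a 200-token reasoning trace vs. a 15-token answer.
token_saving = relative_reduction(200, 15)
```

With these illustrative inputs the average is 4 steps, so `step_saving` is 0.5, a 50% reduction against the fixed budget. Easy instances halt early while the hard one still uses all 8 steps.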

Figure 2: Overview of AdaAnchor. AdaAnchor prepends $m$ learnable latent anchor vectors to the input embedding sequence (left), iteratively refines them via repeated forward passes and anchor-slot updates (middle), and uses a stability-based criterion to halt early before performing answer-only deco

This review was generated by AI and checked by a human editor.