QUICK REVIEW

[Paper Review] Capabilities and Fundamental Limits of Latent Chain-of-Thought

Jiaxuan Zou, Yaozhong Xiong | arXiv (Cornell University) | Feb 1, 2026
Explainable Artificial Intelligence (XAI) | Cited by 0
One-Sentence Summary

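The paper analyzes the exploration-execution trade-off between Latent CoT and explicit CoT, proposes the Symbolic Index to quantify decisional certainty, proves that curriculum learning is theoretically necessary, and provides a unified framework linking certainty to reasoning performance.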

ABSTRACT

Latent Chain-of-Thought (Latent CoT) models promise efficient reasoning via continuous representations, yet exhibit puzzling performance inconsistencies: excelling at exploration (ProsQA: 97.0%) but failing at computation (GSM8K: 34.1%). We reveal that this trade-off is governed by decisional certainty. Our contributions are threefold: (1) We theoretically characterize the fundamental Exploration-Execution Trade-off, proving that high certainty enables precise execution but inhibits exploration, while low certainty facilitates search but causes error accumulation. (2) We introduce the Symbolic Index--quantifying decisional commitment--as the core mechanism governing this trade-off and establish its causal relationship with both execution stability and exploration capability. (3) We prove that curriculum learning is theoretically necessary, as direct training provably fails due to distributional mismatch. Our framework shifts the design paradigm from binary architectural choices toward adaptive systems that dynamically regulate decisional certainty based on task demands.

Motivation and Objectives

  • Motivate and formalize why explicit CoT and latent CoT exhibit complementary failure modes in reasoning tasks.
  • Characterize the exploration-execution trade-off through decisional certainty and introduce the Symbolic Index as a regulating metric.
  • Show that curriculum learning is theoretically necessary to train Latent CoT and bridge distributional gaps.
  • Propose a framework for adaptive systems that regulate decisional certainty based on task demands.

Proposed Method

  • Model CoT as discrete token generation and Latent CoT as continuous latent state evolution.
  • Formalize the Coconut training objective and show its equivalence to the Conditional Information Bottleneck (CIB) via a duality (Theorem 4.1).
  • Define and analyze the Symbolic Index (I_S) as the top-token probability to regulate certainty.
  • Derive the Exploration-Execution Trade-off bound linking I_S to KL divergence from uniform exploration (Theorem 4.12).
  • Analyze robustness to noise through logit margins (Theorem 4.11) and sub-decisional perturbations (Theorem 4.8).
  • Prove curriculum learning is necessary (Theorem 5.1) and sufficient for convergence under standard learning conditions (Theorem 5.2).
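
The quantities named in the list above can be sketched numerically. The following is a minimal illustration, assuming only what the summary states: the Symbolic Index is the top-token probability, exploration is measured as KL divergence from the uniform distribution, and robustness is related to the logit margin. The function names and the two example logit vectors are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def symbolic_index(probs):
    """Symbolic Index as described in the summary: the top-token probability."""
    return max(probs)

def kl_from_uniform(probs):
    """KL(p || uniform) over the same vocabulary; 0 means maximally dispersed."""
    n = len(probs)
    return sum(p * math.log(p * n) for p in probs if p > 0.0)

def logit_margin(logits):
    """Gap between the largest and second-largest logit."""
    top, second = sorted(logits, reverse=True)[:2]
    return top - second

# Hypothetical logits: one peaked (explicit-CoT-like), one dispersed (latent-CoT-like).
peaked_logits = [8.0, 1.0, 0.5, 0.0]
dispersed_logits = [1.0, 0.8, 0.6, 0.4]

for name, logits in [("peaked", peaked_logits), ("dispersed", dispersed_logits)]:
    probs = softmax(logits)
    print(name,
          "I_S=%.3f" % symbolic_index(probs),
          "KL=%.3f" % kl_from_uniform(probs),
          "margin=%.1f" % logit_margin(logits))
```

In this toy setting the peaked distribution has a Symbolic Index close to 1, a large KL divergence from uniform, and a large margin, while the dispersed one stays near the uniform distribution. This matches the qualitative direction of the trade-off described above, not the paper's exact bounds.
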
Figure 1: Symbolic Index on GSM8K. Latent CoT (shown) maintains a low Symbolic Index ($\mathcal{I}_{\text{S}} \in [0.2, 0.5]$), indicating a dispersed probability distribution. It lacks the probability concentration ($\mathcal{I}_{\text{S}} \approx 1.0$) observed in Explicit CoT.

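Experimental Results

Research Questions

  • RQ1: Why do explicit CoT and latent CoT exhibit complementary strengths and weaknesses across tasks?
  • RQ2: How does decisional certainty regulate exploration and execution in reasoning models?
  • RQ3: Is curriculum learning theoretically necessary for Latent CoT, and can it guarantee convergence?
  • RQ4: Is there a unified framework (the Symbolic Index) that guides adaptive reasoning systems in switching between exploration and execution?

Key Findings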

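| Method | GSM8K Acc. (%) | GSM8K Tokens | ProntoQA Acc. (%) | ProntoQA Tokens | ProsQA Acc. (%) | ProsQA Tokens |
|---|---|---|---|---|---|---|
| CoT | 42.9±0.2 | 25.0 | 98.8±0.8 | 92.5 | 77.5±1.9 | 49.4 |
| No-CoT | 16.5±0.5 | 2.2 | 93.8±0.7 | 3.0 | 76.7±1.0 | 8.2 |
| COCONUT | 34.1±1.5 | 8.2 | 99.8±0.2 | 9.0 | 97.0±0.3 | 14.2 |
| - w/o curriculum | 14.4±0.8 | 8.2 | 52.4±0.4 | 9.0 | 76.1±0.2 | 14.2 |
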
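  • Explicit CoT achieves high execution accuracy, but its high decisional certainty limits exploration.
  • Latent CoT explores by keeping certainty low, but noise accumulation tends to degrade its symbolic precision.
  • The Symbolic Index I_S governs the trade-off: when I_S is high, decision margins are large and execution is robust but exploration is restricted; when I_S is low, there is room to explore but the model is vulnerable to perturbations.
  • Curriculum learning is theoretically necessary to avoid distributional mismatch and to achieve convergence toward expert-level reasoning (Theorems 5.1 and 5.2).
  • Empirically, latent CoT maintains a low I_S (0.2–0.5) on ProsQA and lacks discretization on GSM8K, consistent with the theory; explicit CoT concentrates probability mass (I_S close to 1).
  • The noise-robustness analysis shows that CoT's discretization reset shields it from perturbations, whereas latent CoT degrades continuously under noise (Theorem 4.8).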
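
The last finding can be pictured with a toy simulation. This sketch is not the paper's construction or proof: it assumes a hypothetical one-dimensional "reasoning state" that should advance by exactly 1 per step, with bounded noise smaller than half the spacing between valid states (loosely mirroring a sub-decisional perturbation). Rounding to the nearest integer stands in for the discretization reset of explicit CoT; the function name and parameters are illustrative only.

```python
import random

def run_chain(steps, eps, discretize, seed=0):
    """Advance a scalar state by 1 per step under bounded noise.

    discretize=True  -> snap to the nearest integer after every step
                        (stand-in for emitting a discrete token).
    discretize=False -> carry the noisy continuous state forward
                        (stand-in for a latent reasoning step).
    Returns the absolute error of the final state versus the ideal value.
    """
    rng = random.Random(seed)
    state = 0.0
    for _ in range(steps):
        state += 1.0 + rng.uniform(-eps, eps)  # intended update plus noise
        if discretize:
            state = float(round(state))        # noise below 0.5 is erased
    return abs(state - steps)

steps, eps = 20, 0.3  # hypothetical settings; eps < 0.5 keeps noise sub-decisional
discrete_err = run_chain(steps, eps, discretize=True)
latent_err = run_chain(steps, eps, discretize=False)
print("error with discretization reset:", discrete_err)
print("error with continuous state:    ", latent_err)
```

Under these assumptions the discretized chain ends with zero error because every perturbation is removed before the next step, while the continuous chain carries the sum of all perturbations to the end. If the noise exceeded half the grid spacing, rounding could lock in a wrong state, which is a case this sketch deliberately excludes.
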
Figure 2: Symbolic Index on ProsQA. Latent CoT exhibits a stable, low $\mathcal{I}_{\text{S}}$ distribution across reasoning steps. This validates Theorem 4.5, showing that the model distributes probability mass across multiple latent paths rather than converging to a single token.

This review was generated by AI and reviewed by human editors.