QUICK REVIEW

[Paper Review] Entropy-Tree: Tree-Based Decoding with Entropy-Guided Exploration

Longxuan Wei, Yubo Zhang|arXiv (Cornell University)|Jan 2, 2026

Topic Modeling | Cited by 0

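One-Sentence Summary

Entropy-Tree branches at high-entropy tokens to explore diverse reasoning paths, improving both accuracy and uncertainty calibration on reasoning tasks relative to random sampling.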

ABSTRACT

Large language models achieve strong reasoning performance, yet existing decoding strategies either explore blindly (random sampling) or redundantly (independent multi-sampling). We propose Entropy-Tree, a tree-based decoding method that exploits entropy as a signal for branching decisions--expanding the search tree only at positions where the model exhibits genuine uncertainty. Entropy-Tree shows superior accuracy and calibration in reasoning tasks: it achieves better pass@k than Multi-chain across multiple models and datasets, and its predictive entropy demonstrates better AUROC compared to several traditional metrics. Entropy-Tree unifies efficient structured exploration and reliable uncertainty estimation within a single decoding procedure.

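Motivation and Goals

Improve decoding on reasoning tasks by targeting the positions where the model is genuinely uncertain.
Develop a tree-structured decoding framework that branches at high-entropy positions and shares prefixes across branches.
Demonstrate pass@k gains over random independent sampling across multiple models and datasets.
Show that entropy-based uncertainty derived from the tree's leaves is better calibrated (by AUROC) than traditional metrics.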

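Proposed Method

Compute the per-token (pointwise) entropy H_t at each decoding step.
Identify high-entropy tokens, those with H_t at or above a threshold tau, as branching candidates.
Filter the candidates using a self-attention-based importance score I_t and an importance threshold delta.
Expand the tree by branching at the selected tokens, using breadth-first expansion with a cap of N_tree leaves.
Use the leaf outputs to estimate p(a|x) and the predictive entropy H for uncertainty quantification.
Compare Entropy-Tree with Multi-chain, evaluating both pass@k and AUROC-based calibration.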
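
The selection and uncertainty steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the importance scores I_t are taken as given inputs (the summary does not say how they are derived from self-attention), the filter is assumed to keep tokens with I_t at or above delta, and p(a|x) is assumed to be the empirical frequency of each final answer among the leaves.

```python
import math
from collections import Counter


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def branch_candidates(step_probs, importance, tau, delta):
    """Return positions t with H_t >= tau and I_t >= delta.

    step_probs: list of next-token distributions, one per decoding step.
    importance: list of importance scores I_t (assumed precomputed).
    The direction of the delta comparison is an assumption.
    """
    picked = []
    for t, (probs, imp) in enumerate(zip(step_probs, importance)):
        if token_entropy(probs) >= tau and imp >= delta:
            picked.append(t)
    return picked


def predictive_entropy(leaf_answers):
    """Entropy of the empirical answer distribution over tree leaves.

    Assumes p(a|x) is estimated by counting how often each final
    answer appears among the leaves.
    """
    counts = Counter(leaf_answers)
    n = len(leaf_answers)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Toy example: step 0 is confident, steps 1 and 2 are uncertain,
# but step 2 has low importance and is filtered out.
step_probs = [
    [0.97, 0.01, 0.01, 0.01],
    [0.40, 0.30, 0.20, 0.10],
    [0.25, 0.25, 0.25, 0.25],
]
importance = [0.9, 0.8, 0.1]
print(branch_candidates(step_probs, importance, tau=1.0, delta=0.5))  # [1]
print(predictive_entropy(["42", "42", "17", "8"]))
```

When all leaves agree on one answer the predictive entropy is zero, and it grows as the leaves disagree, which is what makes it usable as an uncertainty score.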

Figure 1: Entropy-Tree: Branching at high entropy tokens to form multiple decoding paths.

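Experimental Results

Research Questions

RQ1: Does entropy-guided branching improve pass@k on reasoning tasks relative to random sampling?
RQ2: Is the predictive distribution over Entropy-Tree's leaves better calibrated (by AUROC) than existing uncertainty metrics?
RQ3: How does Entropy-Tree perform across model sizes and reasoning datasets?
RQ4: How do branching position and high-entropy guidance affect decoding quality?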

Key Findings

Scores by model and method (all values in %):

Model	Method	SVAMP	MATH-500	SciBench	GPQA-main	GPQA-diamond	AIME24	AIME25
Qwen2.5-7B-Instruct	Multi-chain	94.37	75.41	57.52	71.56	70.83	9.98	17.99
Qwen2.5-7B-Instruct	Entropy-Tree	94.77	78.24	58.27	72.07	74.81	11.64	23.84
Qwen2.5-14B-Instruct	Multi-chain	96.40	84.78	71.05	72.11	73.25	11.66	24.64
Qwen2.5-14B-Instruct	Entropy-Tree	96.62	82.96	71.31	73.53	74.10	12.54	26.20
Qwen2.5-32B-Instruct	Multi-chain	95.37	77.56	70.58	76.70	78.29	14.86	21.34
Qwen2.5-32B-Instruct	Entropy-Tree	95.46	79.55	72.23	75.93	77.98	18.33	21.78

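Entropy-Tree achieves better pass@k than Multi-chain on most model-dataset pairs: it scores higher on 18 of the 21 pairs in the table above, the exceptions being Qwen2.5-14B-Instruct on MATH-500 and Qwen2.5-32B-Instruct on GPQA-main and GPQA-diamond.
For Qwen2.5-7B-Instruct on MATH-500, Entropy-Tree reaches comparable pass@k with fewer leaves (for example, its pass@13 roughly matches Multi-chain's pass@20).
Predictive entropy computed from Entropy-Tree's leaves (ET-PE) gives better AUROC calibration across multiple models and datasets.
In the AUROC comparison, Entropy-Tree often outperforms traditional entropy-based measures, including semantic entropy.
Ablations show that earlier branching (a lower percentile threshold) improves performance, and that entropy-guided branching outperforms random branching.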
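
For readers unfamiliar with the two evaluation metrics, the sketch below shows one common way to compute them. Both functions are illustrative assumptions rather than the paper's evaluation code: `pass_at_k` is the standard combinatorial estimator used in code-generation benchmarks, which treats samples as independent draws (tree leaves that share prefixes may not satisfy this, and the summary does not state which estimator the authors use), and `auroc` is a plain pairwise implementation that asks how often an incorrect answer receives a higher uncertainty score than a correct one.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate pass@k from n samples of which c are correct.

    Standard estimator: 1 - C(n - c, k) / C(n, k), i.e. one minus the
    probability that k samples drawn without replacement are all wrong.
    """
    if n - c < k:
        return 1.0  # fewer than k wrong samples: any k-subset has a hit
    return 1.0 - comb(n - c, k) / comb(n, k)


def auroc(uncertainty, is_wrong):
    """AUROC of an uncertainty score for detecting wrong answers.

    Counts, over all (wrong, correct) pairs, how often the wrong answer
    has the higher uncertainty; ties count as half a win.
    """
    pos = [u for u, w in zip(uncertainty, is_wrong) if w]
    neg = [u for u, w in zip(uncertainty, is_wrong) if not w]
    if not pos or not neg:
        raise ValueError("need at least one wrong and one correct example")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy usage: 20 samples with 5 correct, and four questions whose
# uncertainty scores perfectly separate wrong from correct answers.
print(pass_at_k(20, 5, 1))  # 0.25
print(auroc([0.1, 0.9, 0.2, 0.8], [False, True, False, True]))  # 1.0
```

An AUROC of 0.5 means the uncertainty score is no better than chance at flagging wrong answers; values closer to 1.0 mean higher uncertainty reliably accompanies errors.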

Figure 2: The complete decoding process of Entropy-Tree.


This review was generated by AI and checked by a human editor.