QUICK REVIEW

[Paper Review] GeoSteer: Faithful Chain-of-Thought Steering via Latent Manifold Gradients

Kentaro Kazama, Daiki Shirafuji|arXiv (Cornell University)|Jan 15, 2026

Topic Modeling | Cited by: 0

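One-Sentence Summary

GeoSteer improves the quality of intermediate reasoning by steering hidden states along a learned latent manifold, without hurting final-answer accuracy. The effect is demonstrated on GSM8k with Qwen3 models, using a VAE latent space and gradient-based steering.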

ABSTRACT

Recent advances in Large Language Models (LLMs) have demonstrated remarkable progress in their reasoning capabilities, such as Chain-of-Thought (CoT). Most approaches rely on CoT rationales. Previous studies have shown that LLMs often generate logically inconsistent reasoning steps even when their final answers are correct. These inconsistencies reduce the reliability of the reasoning process. We propose GeoSteer, a manifold-based framework that improves the quality of intermediate reasoning. The method consists of: (1) constructing a CoT dataset with step-level scores, (2) training a Variational Autoencoder (VAE) model and a quality estimation model to learn a low-dimensional manifold of high-quality CoT trajectories, and (3) steering hidden states of target LLMs toward higher-quality regions in the latent space. This last step enables steering of the hidden states by following gradients along the learned manifold. It facilitates geometrically coherent steering. Evaluation experiments were conducted on the GSM8k dataset using the Qwen3 series. We evaluated performance using two metrics: answer accuracy and overall reasoning quality. GeoSteer improved the accuracy by 0.9 points and enhanced the reasoning quality by 4.5 points on average, compared with those of original LLMs. These results indicate that GeoSteer provides an effective and controllable mechanism for improving the quality of intermediate reasoning in LLMs.

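Research Motivation and Objectives

Intermediate CoT reasoning quality needs to be evaluated and controlled reliably, rather than looking only at the final answer.
Propose a manifold-based activation steering method that guides hidden states toward high-quality CoT regions.
Build a dataset of high-quality CoT trajectories and learn a latent manifold with a VAE plus a quality predictor.
At inference time, improve reasoning coherence by following gradients in the latent space and pulling them back to the hidden states.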

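Proposed Method

Build a CoT dataset containing high-quality and low-quality trajectories with step-level quality scores.
Train a variational autoencoder on hidden states to learn a low-dimensional latent manifold of CoT trajectories.
Train a differentiable quality function Rψ that scores reasoning quality from the latent vector.
At inference time, compute the latent vector z_t, pull the gradient back to the hidden state through the encoder Jacobian, and update h_t with a normalized gradient step of size β: h'_t = h_t + β ∇_{h_t} Rψ(z_t) / ||∇_{h_t} Rψ(z_t)||.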
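
The update rule above can be illustrated with a minimal sketch in pure Python. Everything here is an assumption made for illustration, not the authors' implementation: the encoder is a hypothetical linear map z = W h + b standing in for the VAE encoder mean, the quality function is a hypothetical sigmoid head R(z) = sigmoid(w · z + c), and the names `encode`, `quality`, `grad_wrt_hidden`, and `steer` are invented. With a linear encoder the Jacobian is simply W, so the pullback of the latent gradient is W^T ∇_z R.

```python
import math

def encode(W, b, h):
    """Hypothetical linear encoder z = W h + b (stand-in for a VAE encoder mean)."""
    return [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i
            for row, b_i in zip(W, b)]

def quality(w, c, z):
    """Hypothetical quality head R(z) = sigmoid(w . z + c)."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + c)))

def grad_wrt_hidden(W, b, w, c, h):
    """Chain rule: dR/dh = J^T dR/dz, where J = W is the encoder Jacobian."""
    r = quality(w, c, encode(W, b, h))
    dR_dz = [r * (1.0 - r) * wi for wi in w]  # derivative of the sigmoid head
    # Pull back through the Jacobian: (W^T dR/dz)_j = sum_i W[i][j] * dR_dz[i]
    return [sum(W[i][j] * dR_dz[i] for i in range(len(W)))
            for j in range(len(h))]

def steer(W, b, w, c, h, beta, eps=1e-12):
    """One normalized step: h' = h + beta * g / ||g||."""
    g = grad_wrt_hidden(W, b, w, c, h)
    norm = math.sqrt(sum(gj * gj for gj in g))
    if norm < eps:  # no usable direction, so leave h unchanged
        return list(h)
    return [hj + beta * gj / norm for hj, gj in zip(h, g)]

# Toy values, chosen only for illustration.
W = [[1.0, 0.0, -1.0], [0.5, 2.0, 0.0]]
b = [0.0, 0.1]
w = [1.0, -0.5]
c = 0.0
h = [0.2, -0.3, 0.4]

h_new = steer(W, b, w, c, h, beta=0.1)
r_old = quality(w, c, encode(W, b, h))
r_new = quality(w, c, encode(W, b, h_new))
step_len = math.sqrt(sum((a - o) ** 2 for a, o in zip(h_new, h)))
```

Because the gradient is normalized, β is exactly the Euclidean length of the step in hidden-state space (`step_len` equals `beta` here), which is one way to read why β acts as a steering strength. In a real setting the encoder and Rψ would be neural networks and the pullback would typically come from automatic differentiation rather than a hand-written Jacobian product.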

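Experimental Results

Research Questions

RQ1: Can latent-space steering improve the quality and coherence of intermediate CoT reasoning without sacrificing final-answer accuracy?
RQ2: Does a geometry-aware, manifold-based steering method preserve reasoning consistency better than linear activation steering in Euclidean space?

Key Findings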

β	Qwen3-0.6B Baseline EM	Qwen3-0.6B Steered EM	Qwen3-1.7B Baseline EM	Qwen3-1.7B Steered EM	Qwen3-4B Baseline EM	Qwen3-4B Steered EM	Qwen3-8B Baseline EM	Qwen3-8B Steered EM
1	60.0	58.7	82.3	82.4	90.6	90.5	90.7	90.4
10	60.0	60.0	82.3	82.9	90.6	90.5	90.7	90.6
50	60.0	58.5	82.3	83.1	90.6	90.3	90.7	90.4
100	60.0	55.0	82.3	83.5	90.6	89.5	90.7	90.8
125	60.0	52.0	82.3	83.5	90.6	89.8	90.7	91.3
150	60.0	50.9	82.3	84.9	90.6	89.8	90.7	91.3
200	60.0	46.2	82.3	84.1	90.6	89.9	90.7	91.4
300	60.0	28.7	82.3	84.7	90.6	88.9	90.7	91.3

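The effect on final-answer accuracy (EM) depends on model size: in the table above, Qwen3-1.7B peaks at 84.9 (β = 150) against a baseline of 82.3 and Qwen3-8B peaks at 91.4 (β = 200) against a baseline of 90.7, while Qwen3-0.6B and Qwen3-4B never exceed their baselines at any β.
In comparative evaluation judged by GPT-4o, the steered models were preferred over the baselines consistently across model sizes.
Across many configurations, steering generally improves reasoning quality (coherence, structure, step-by-step consistency) with little or no drop in EM; the exception in the table is Qwen3-0.6B, whose EM falls steadily for β ≥ 50, down to 28.7 at β = 300.
Latent-space trajectories show semantically meaningful shifts at key reasoning turning points, suggesting that steering acts on internal representations rather than surface text.
The best steering strength β depends on model capacity; larger models benefit more at moderate to high β values.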
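
To make the dependence of β on model size concrete, the short script below finds, for each model, the β with the highest steered EM in the table above. The numbers are transcribed from that table, with each baseline taken from the β = 1 row. The helper name `best_beta` and the tie-breaking rule (smallest β wins) are choices made for this illustration only.

```python
# Steered EM per beta, transcribed from the results table.
betas = [1, 10, 50, 100, 125, 150, 200, 300]
baseline = {
    "Qwen3-0.6B": 60.0,
    "Qwen3-1.7B": 82.3,
    "Qwen3-4B": 90.6,
    "Qwen3-8B": 90.7,
}
steered = {
    "Qwen3-0.6B": [58.7, 60.0, 58.5, 55.0, 52.0, 50.9, 46.2, 28.7],
    "Qwen3-1.7B": [82.4, 82.9, 83.1, 83.5, 83.5, 84.9, 84.1, 84.7],
    "Qwen3-4B": [90.5, 90.5, 90.3, 89.5, 89.8, 89.8, 89.9, 88.9],
    "Qwen3-8B": [90.4, 90.6, 90.4, 90.8, 91.3, 91.3, 91.4, 91.3],
}

def best_beta(model):
    """Return (beta, steered EM, EM delta vs. baseline) at the highest steered EM.

    Ties are broken in favor of the smallest beta, because max() keeps
    the first maximal index it encounters.
    """
    scores = steered[model]
    i = max(range(len(betas)), key=lambda k: scores[k])
    return betas[i], scores[i], round(scores[i] - baseline[model], 1)

best = {model: best_beta(model) for model in baseline}

for model, (beta, em, delta) in best.items():
    print(f"{model}: best beta={beta}, EM={em}, delta={delta:+.1f}")
```

On these numbers the best setting is β = 10 for Qwen3-0.6B (no change from baseline), β = 150 for Qwen3-1.7B (+2.6), β = 1 for Qwen3-4B (-0.1), and β = 200 for Qwen3-8B (+0.7). This covers EM only; the reasoning-quality scores are not part of the table and are not modeled here.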

This review was generated by AI and reviewed by human editors.