QUICK REVIEW

[Paper Review] Opening the Black Box: A Survey on the Mechanisms of Multi-Step Reasoning in Large Language Models

Liangming Pan, Jason Liang|arXiv (Cornell University)|Jan 2, 2026

Topic Modeling | Citations: 0

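One-line summary

A survey that comprehensively organizes how LLMs carry out multi-step reasoning, distinguishes implicit from explicit reasoning, and examines internal mechanisms, training dynamics, and the faithfulness of Chain-of-Thought explanations.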

ABSTRACT

Large Language Models (LLMs) have demonstrated remarkable abilities to solve problems requiring multiple reasoning steps, yet the internal mechanisms enabling such capabilities remain elusive. Unlike existing surveys that primarily focus on engineering methods to enhance performance, this survey provides a comprehensive overview of the mechanisms underlying LLM multi-step reasoning. We organize the survey around a conceptual framework comprising seven interconnected research questions, from how LLMs execute implicit multi-hop reasoning within hidden activations to how verbalized explicit reasoning remodels the internal computation. Finally, we highlight five research directions for future mechanistic studies.

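Motivation and objectives

Explain the internal mechanisms of implicit multi-hop reasoning in LLMs.
Examine how multi-step reasoning ability emerges during training (e.g., grokking).
Analyze how explicit Chain-of-Thought prompting reshapes internal computation.
Assess the prevalence and causes of shortcuts in multi-step reasoning.
Discuss future directions for causal analysis and faithful CoT explanations.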

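Approach

Review mechanistic studies based on causal probing, mechanistic tracing, and representation analysis.
Synthesize findings on layer specialization and information flow in implicit reasoning.
Summarize evidence on explicit reasoning involving iteration heads, CoT as external memory, and state maintenance.
Analyze how factors such as prompt structure and demonstrations affect CoT effectiveness.
Evaluate whether CoT is faithful as an explanation and which factors undermine its faithfulness.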
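
As one illustration of the causal probing mentioned above, the sketch below runs activation patching on a toy NumPy network. Everything in it is hypothetical and not taken from the survey (the random weights, the `run` helper, the layer names `h1` and `h2`, the `restored_fraction` metric); it only shows the general recipe: cache activations from a clean run, splice them into a corrupted run, and measure how much of the clean output comes back.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))   # input -> hidden layer h1 (toy weights, assumed)
W2 = rng.normal(size=(8, 8))   # h1 -> hidden layer h2
W3 = rng.normal(size=(8, 1))   # h2 -> scalar output (linear readout)


def run(x, patch=None):
    """Forward pass; `patch` maps a layer name to activations to splice in."""
    patch = patch or {}
    acts = {}
    h1 = patch.get("h1", np.tanh(x @ W1))
    acts["h1"] = h1
    h2 = patch.get("h2", np.tanh(h1 @ W2))
    acts["h2"] = h2
    return (h2 @ W3).item(), acts


clean_x = rng.normal(size=4)
corrupt_x = rng.normal(size=4)
clean_out, clean_acts = run(clean_x)
corrupt_out, corrupt_acts = run(corrupt_x)


def restored_fraction(patched_out):
    """0.0 = no effect, 1.0 = clean output fully restored."""
    return (patched_out - corrupt_out) / (clean_out - corrupt_out)


# Patch the whole h2 layer from the clean run into the corrupted run.
full_out, _ = run(corrupt_x, patch={"h2": clean_acts["h2"]})

# Patch one h2 unit at a time to attribute the effect to individual units.
unit_effects = []
for i in range(clean_acts["h2"].size):
    mixed = corrupt_acts["h2"].copy()
    mixed[i] = clean_acts["h2"][i]
    unit_out, _ = run(corrupt_x, patch={"h2": mixed})
    unit_effects.append(restored_fraction(unit_out))

print(f"full-layer patch restores {restored_fraction(full_out):.2f} of the clean output")
print("per-unit restored fractions:", np.round(unit_effects, 2))
```

In this toy the readout is linear in `h2`, so the per-unit effects sum to 1. In a real transformer, later nonlinear layers make components interact, so per-component effects generally do not add up this cleanly.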

Figure 1: The cognitive framework and organizational structure of this survey. We explore the mechanisms of multi-step reasoning through two distinct paradigms: Implicit Reasoning and Explicit Reasoning, through seven interconnected Research Questions. The bottom panel highlights five strategic di

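Results

Research questions

RQ1: What mechanisms enable latent multi-step reasoning within the model's internal activations?
RQ2: How does latent multi-step reasoning ability emerge during training (e.g., grokking)?
RQ3: Do models rely on shortcuts (superficial heuristics) rather than genuine multi-step reasoning, and what forms do those shortcuts take?
RQ4: How does Chain-of-Thought prompting reshape internal computation and enable extended reasoning?
RQ5: Does CoT provide a faithful explanation of the model's decision process?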

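Key findings

Implicit reasoning is a staged, structured process with functional specialization across layers.
Reasoning ability can emerge through a phase transition during training (the grokking phenomenon).
Models frequently rely on shortcuts that mimic reasoning (factual or surface-pattern heuristics).
Chain-of-Thought prompting creates a reasoning mode with external memory and state tracking.
CoT improves performance by extending computational depth and supporting modular reasoning and generalization, but CoT explanations are often unfaithful.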
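
The idea that CoT acts as external memory and extends computational depth can be made concrete with a toy analogy. The sketch below is not a model of a transformer and is not taken from the survey; `forward_pass`, `answer_directly`, `answer_with_cot`, and the parity task are all assumptions chosen for illustration. A "forward pass" may apply only `depth` sequential updates; writing the intermediate state to a scratchpad and re-reading it lets repeated passes finish a computation that a single pass cannot.

```python
def forward_pass(state, bits, depth):
    """One bounded-depth pass: apply at most `depth` sequential XOR updates."""
    for b in bits[:depth]:
        state ^= b
    return state, bits[depth:]


def answer_directly(bits, depth):
    """Single pass, no scratchpad: inputs beyond `depth` are never processed."""
    state, _ = forward_pass(0, list(bits), depth)
    return state


def answer_with_cot(bits, depth):
    """Repeated passes; each pass re-reads the last state written to the scratchpad."""
    scratchpad = [0]                 # initial state, written as the first "token"
    remaining = list(bits)
    while remaining:
        state = scratchpad[-1]       # external memory: read back the previous step
        state, remaining = forward_pass(state, remaining, depth)
        scratchpad.append(state)     # write the new intermediate state
    return scratchpad[-1], scratchpad


bits = [1, 1, 1, 0, 0, 0, 0]
true_parity = sum(bits) % 2

direct = answer_directly(bits, depth=2)
cot, trace = answer_with_cot(bits, depth=2)
print("true parity:", true_parity)
print("direct answer:", direct)
print("CoT answer:", cot, "scratchpad:", trace)
```

With `depth=2` the direct answer only sees the first two bits, while the scratchpad version reaches the correct parity after four passes. The analogy covers only the depth and state-tracking point; it says nothing about whether a written-out trace faithfully reflects the underlying computation.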

This review was generated by AI and checked by a human editor.