QUICK REVIEW

[Paper Review] Evidence-Decision-Feedback: Theory-Driven Adaptive Scaffolding for LLM Agents

Clayton Cohn, Siyuan Guo | arXiv (Cornell University) | Feb 1, 2026

Intelligent Tutoring Systems and Adaptive Learning | Citations: 0

Summary in Brief

We introduce Evidence-Decision-Feedback (EDF), a theory-driven framework for adaptive scaffolding in LLM-based pedagogical agents, instantiated by Copa and evaluated in authentic classrooms. EDF aligns feedback with learner understanding, enables scaffold fading, and supports interpretable, evidence-grounded explanations.

ABSTRACT

Multi-agent LLM architectures offer opportunities for pedagogical agents to help students construct domain knowledge and develop critical-thinking skills, yet many operate on a "one-size-fits-all" basis, limiting their ability to provide personalized support. To address this, we introduce Evidence-Decision-Feedback (EDF), a theoretical framework for adaptive scaffolding using LLMs. EDF integrates elements of intelligent tutoring systems and agentic behavior by organizing interactions around evidentiary inference, pedagogical decision-making, and adaptive feedback. We instantiate EDF through Copa, an agentic collaborative peer agent for STEM+C problem-solving. In an authentic high school classroom study, we show that EDF-guided interactions align feedback with students' demonstrated understanding and task mastery; promote gradual scaffold fading; and support interpretable, evidence-grounded explanations without fostering overreliance.

Motivation and Objectives

Motivate the need for adaptive, theory-grounded LLM pedagogical agents that personalize scaffolding.
Propose EDF as a modular framework grounded in Evidence-Centered Design, Stealth Assessment, SCT, ZPD, and social constructivism.
Instantiate EDF with Copa, a collaborative peer agent for STEM+C learning, and map EDF modules to Copa’s architecture.
Demonstrate EDF-generated, interpretable feedback in a high school classroom and assess its impact on learning and autonomy.

Proposed Method

Define EDF with three semi-autonomous modules: Evidence (learner model from data), Decision (dialogue policy), and Feedback (adaptive scaffolding).
Embed EDF in Copa, a four-sub-agent collaborative agent within the C2STEM environment, leveraging log and chat data, and Retrieval-Augmented Generation (RAG) for domain knowledge.
Use CoT prompting (CoTAL) to make the agent’s reasoning interpretable and to link data to policy decisions and feedback.
Evaluate across 33 sophomore dyads in authentic classrooms over three tasks with logged actions, dialogues, and CoT reasoning.
Compare multiple LLMs (Gemini, Claude, GPT families) and select GPT-5/GPT-5-Chat for asynchronous/synchronous reasoning.
Analyze interpretability via Grounding, Alignment, and Faithfulness links using keyword recall and SBERT-based semantic similarity.
Assess adaptivity, understanding-mastery alignment, reliance on agent, and interpretability through predefined research questions.
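
The Evidence → Decision → Feedback flow described above can be illustrated with a minimal sketch. This is not Copa's implementation: in the paper the modules are driven by LLM sub-agents, whereas the rule-based `decide` function, the `Evidence` fields, the 0.7 threshold, and the message templates below are all hypothetical stand-ins chosen only to show how the three modules hand data to one another. The policy names are the three that appear in the review's Table 1.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    # Names taken from the review's Table 1; the paper may define more policies.
    PROBE_UNDERSTANDING = "PROBE_UNDERSTANDING"
    SUGGEST_ACTION = "SUGGEST_ACTION"
    PUSH_LIMIT = "PUSH_LIMIT"


@dataclass
class Evidence:
    # Hypothetical learner-model fields (assumption, not from the paper).
    mastery: float                    # assumed task-mastery estimate in [0, 1]
    demonstrates_understanding: bool  # assumed flag inferred from the dialogue


def decide(evidence: Evidence) -> Policy:
    """Decision module stand-in: map evidence to a dialogue policy.

    The rules and the 0.7 threshold are illustrative assumptions.
    """
    if not evidence.demonstrates_understanding:
        return Policy.PROBE_UNDERSTANDING
    if evidence.mastery < 0.7:
        return Policy.SUGGEST_ACTION
    return Policy.PUSH_LIMIT


def feedback(policy: Policy) -> str:
    """Feedback module stand-in: render a policy as a placeholder message."""
    templates = {
        Policy.PROBE_UNDERSTANDING: "Can you explain what this part of your model does?",
        Policy.SUGGEST_ACTION: "You might try adjusting this block next.",
        Policy.PUSH_LIMIT: "What would happen if you extended the model further?",
    }
    return templates[policy]


def edf_turn(evidence: Evidence) -> tuple[Policy, str]:
    """Run one Evidence -> Decision -> Feedback pass and return both outputs."""
    policy = decide(evidence)
    return policy, feedback(policy)
```

Keeping the decision separate from the rendered message is what makes each step inspectable on its own, which is the property the interpretability analysis relies on.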

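Experimental Results

Research Questions

RQ1: Do Copa's scaffolds adapt appropriately as students' task mastery increases?
RQ2: Does students' demonstrated understanding align with their task mastery when they interact with Copa?
RQ3: Do students rely less on Copa as their mastery increases?
RQ4: Is Copa's feedback interpretable with respect to its evidentiary reasoning?

Key Findings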

Table 1: Dialogue policy adaptation
Dialogue Policy	Spearman’s ρ	Trend	p-value
PROBE_UNDERSTANDING	-0.34	Decreasing	0.034
SUGGEST_ACTION	0.33	Increasing	0.039
PUSH_LIMIT	0.42	Increasing	0.007

Table 2: Interpretability links (Grounding, Alignment, Faithfulness)
Link	Metric	Score	p-value
Grounding (Data → Evidence)	Keyword Recall	0.43	<0.001
Alignment (Evidence → Decision)	SBERT Similarity	0.64	<0.001
Faithfulness (Decision → Feedback)	SBERT Similarity	0.48	<0.001
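
For readers unfamiliar with the statistic in Table 1, Spearman's ρ is the Pearson correlation computed on ranks, so it measures monotonic rather than linear association. The sketch below is a dependency-free illustration of that definition, with ties given average ranks. The function names and the toy data are assumptions for illustration; the paper's analysis code and data are not shown in this review, and the sketch does not compute p-values.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy data (invented): mastery decile vs. count of probing turns.
    deciles = [1, 2, 3, 4, 5, 6]
    probe_counts = [9, 7, 8, 4, 5, 2]
    print(round(spearman_rho(deciles, probe_counts), 2))
```

A negative value, as reported for PROBE_UNDERSTANDING, means the policy tends to be used less often at higher mastery levels.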

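Copa's dialogue policy shifts with mastery: PROBE_UNDERSTANDING decreases, while SUGGEST_ACTION and PUSH_LIMIT increase (ρ = -0.34, p = 0.034; ρ = 0.33, p = 0.039; ρ = 0.42, p = 0.007, respectively).
Students' verbal demonstrations align with task mastery: higher mastery deciles are more strongly associated with DEMONSTRATES_UNDERSTANDING and probing success (e.g., ρ = 0.40, p = 0.014; ρ = 0.34, p = 0.030).
Students' reliance on Copa declines as mastery increases: the agent-assistance ratio falls across mastery deciles (ρ = -0.26, p < 0.001).
The interpretability of Copa's feedback is supported by Grounding, Alignment, and Faithfulness links that are significant and non-random (Grounding 0.43 vs. baseline 0.21; Alignment 0.64 vs. baseline 0.39; Faithfulness 0.48 vs. baseline 0.24; all p < 0.001).
Students viewed Copa's prompting and thought-provoking role positively, but rated perceived understanding and feedback usefulness somewhat lower, suggesting a tension between scaffolded guidance and the desire for direct answers.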
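
The Grounding comparison above (a keyword-recall score set against a baseline) can be sketched as follows. The review does not state how the paper extracts keywords or builds its baseline, so everything here is an assumption: recall is taken as the fraction of keywords that appear as tokens in the target text, and the baseline re-pairs each keyword set with a different item's text. The function names and example strings are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def keyword_recall(keywords, text):
    """Fraction of keywords that occur as tokens in text (assumed definition)."""
    wanted = {k.lower() for k in keywords}
    if not wanted:
        raise ValueError("keywords must not be empty")
    return len(wanted & tokenize(text)) / len(wanted)


def mean_recall(pairs):
    """Average keyword recall over (keywords, text) pairs."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    return sum(keyword_recall(k, t) for k, t in pairs) / len(pairs)


def mismatched_baseline(pairs):
    """Assumed baseline: score each keyword set against the NEXT item's text."""
    texts = [t for _, t in pairs]
    rotated = texts[1:] + texts[:1]
    return mean_recall([(k, t) for (k, _), t in zip(pairs, rotated)])


if __name__ == "__main__":
    # Invented examples: keywords from logged data vs. the agent's evidence text.
    pairs = [
        (["loop", "velocity"], "Velocity is updated inside the loop."),
        (["reset", "position"], "The position is reset at the start."),
    ]
    print(mean_recall(pairs), mismatched_baseline(pairs))
```

If matched pairs score well above mismatched ones, the evidence text is drawing on its own source data rather than on generic vocabulary, which is the sense of "non-random" in the finding above.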

This review was generated by AI and checked by a human editor.