QUICK REVIEW

[Paper Review] SentGraph: Hierarchical Sentence Graph for Multi-hop Retrieval-Augmented Question Answering

Junli Liang, Pengfei Zhou | arXiv (Cornell University) | Jan 6, 2026

Topic Modeling | Cited by: 0

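ONE-SENTENCE SUMMARY

SentGraph builds a sentence-level hierarchical graph using a refined Rhetorical Structure Theory and uses it for graph-guided multi-hop retrieval and answer generation, outperforming chunk-based methods and other graph-based methods.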

ABSTRACT

Traditional Retrieval-Augmented Generation (RAG) effectively supports single-hop question answering with large language models but faces significant limitations in multi-hop question answering tasks, which require combining evidence from multiple documents. Existing chunk-based retrieval often provides irrelevant and logically incoherent context, leading to incomplete evidence chains and incorrect reasoning during answer generation. To address these challenges, we propose SentGraph, a sentence-level graph-based RAG framework that explicitly models fine-grained logical relationships between sentences for multi-hop question answering. Specifically, we construct a hierarchical sentence graph offline by first adapting Rhetorical Structure Theory to distinguish nucleus and satellite sentences, and then organizing them into topic-level subgraphs with cross-document entity bridges. During online retrieval, SentGraph performs graph-guided evidence selection and path expansion to retrieve fine-grained sentence-level evidence. Extensive experiments on four multi-hop question answering benchmarks demonstrate the effectiveness of SentGraph, validating the importance of explicitly modeling sentence-level logical dependencies for multi-hop reasoning.

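RESEARCH MOTIVATION AND OBJECTIVES

Highlight why multi-hop question answering needs retrieval that goes beyond chunk-level methods.
Propose a hierarchical sentence graph framework that models fine-grained relationships between sentences.
Move graph construction to an offline stage so that online retrieval and reasoning stay efficient.
Demonstrate improved performance on four multi-hop QA benchmarks across several large language models.
Evaluate token efficiency and provide ablations that verify the contribution of each component.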

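PROPOSED METHOD

Build a hierarchical sentence logic graph offline using an adapted Rhetorical Structure Theory, which defines nucleus-satellite and nucleus-nucleus relations.
Three-layer graph: topic nodes (Vt), core sentence nodes (Vc), and supplementary sentence nodes (Vs); edges capture topic-topic, topic-core, core-core, and core-supplementary relations.
Cross-document bridging through entity-concept links connects topics across different documents.
Online retrieval uses coarse-to-fine anchor selection, adaptive evidence refinement, and graph-guided path expansion to assemble a concise, evidence-rich context.
In the generation stage, the LLM uses the retrieved sentence-level evidence to produce the final answer.
Ablation analysis quantifies the impact of anchor selection, evidence refinement, and path expansion.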
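
The offline graph and online retrieval steps listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the class `SentenceGraph`, the functions `select_anchors` and `expand_paths`, and the demo sentences are all hypothetical, word overlap stands in for the BM25 or BGE scorer, coarse-to-fine anchor selection is collapsed into a single ranking, and evidence refinement is omitted.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class SentenceGraph:
    """Toy three-layer graph: topic, core, and supplementary nodes."""
    kind: dict = field(default_factory=dict)   # node id -> "topic" | "core" | "supp"
    text: dict = field(default_factory=dict)   # node id -> sentence or topic label
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def add_node(self, node_id, kind, text):
        self.kind[node_id] = kind
        self.text[node_id] = text

    def add_edge(self, a, b):
        # Undirected edge; the relation type is not modeled in this sketch.
        self.edges[a].add(b)
        self.edges[b].add(a)


def overlap(query, text):
    """Assumed scorer: fraction of query words that appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def select_anchors(graph, query, k=2):
    """Rank sentence nodes by the assumed scorer and keep the top k."""
    sentences = [n for n, kind in graph.kind.items() if kind != "topic"]
    ranked = sorted(sentences, key=lambda n: (-overlap(query, graph.text[n]), n))
    return ranked[:k]


def expand_paths(graph, anchors, max_hops=2):
    """Breadth-first expansion from the anchors; returns sentence nodes only."""
    hops = {a: 0 for a in anchors}
    queue = deque(anchors)
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue
        for nxt in sorted(graph.edges[node]):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return [n for n in hops if graph.kind[n] != "topic"]


def build_demo_graph():
    """Two toy documents joined by a topic-topic bridge on the entity 'Warsaw'."""
    g = SentenceGraph()
    g.add_node("t1", "topic", "Marie Curie")
    g.add_node("t2", "topic", "Warsaw")
    g.add_node("c1", "core", "Marie Curie was born in Warsaw")
    g.add_node("s1", "supp", "She later moved to Paris")
    g.add_node("c2", "core", "Warsaw is the capital of Poland")
    g.add_edge("t1", "c1")   # topic-core
    g.add_edge("c1", "s1")   # core-supplementary
    g.add_edge("t2", "c2")   # topic-core
    g.add_edge("t1", "t2")   # cross-document bridge between topics
    return g
```

With the query "Which country's capital was Marie Curie born in", `select_anchors(g, query, k=1)` picks `c1`, and `expand_paths(g, ["c1"], max_hops=3)` follows the topic bridge to reach `c2`, so both hops of evidence end up in the context as individual sentences rather than whole chunks.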

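EXPERIMENTAL RESULTS

Research Questions

RQ1: Can a sentence-level graph with explicit logical relations outperform chunk-based graphs at multi-hop evidence retrieval?
RQ2: Can offline construction of the hierarchical sentence graph reduce online computation while maintaining or improving QA performance?
RQ3: How do fine-grained evidence selection and structured reasoning paths affect the accuracy and efficiency of multi-hop QA?
RQ4: How do the gains from SentGraph differ across base LLMs under different retrieval settings (BM25 and BGE)?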

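Key Findings

SentGraph achieves state-of-the-art performance on four multi-hop QA benchmarks under both sparse (BM25) and dense (BGE) retrieval settings.
Sentence-level retrieval combined with graph reasoning significantly outperforms passage-based retrieval and other graph methods that operate on coarse-grained chunks.
Under BM25, SentGraph's sentence-level retrieval reaches 48.80 EM and 61.98 F1 on HotpotQA, 44.40 EM and 52.53 F1 on 2Wiki, 25.00 EM and 35.09 F1 on MuSiQue, and 68.80% accuracy on MultiHop.
Under BGE, SentGraph's sentence-level retrieval reaches 57.60 EM and 68.74 F1 on HotpotQA, 54.20 EM and 63.05 F1 on 2Wiki, 38.80 EM and 52.01 F1 on MuSiQue, and 73.00% accuracy on MultiHop.
Ablations show that anchor selection, evidence refinement, and guided path expansion each contribute substantially to the gains, with diminishing returns beyond roughly 20 anchors.
Compared with graph baselines such as KGP, SentGraph reduces the number of input and output tokens, reflecting the higher efficiency gained from finer-grained evidence selection and structured reasoning.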
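
For readers unfamiliar with the EM and F1 figures reported above, the following sketch shows how these answer-level metrics are commonly computed in extractive QA evaluation. The normalization steps (lowercasing, stripping punctuation and English articles) follow the widely used SQuAD-style convention and are an assumption here; this summary does not state which evaluation script the paper uses.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction, gold):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

A benchmark score is then the mean of these per-question values expressed as a percentage. F1 gives partial credit for overlapping tokens, which is why it is never lower than EM for the same set of predictions.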


This review was generated by AI and reviewed by human editors.