QUICK REVIEW

[Paper Review] Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation

Shengjie Ma, Chengjin Xu|arXiv (Cornell University)|Jul 15, 2024

Topic Modeling | Cited by 8

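One-Sentence Summary

Think-on-Graph 2.0 (ToG 2.0) is a knowledge-graph-guided retrieval-augmented generation framework that integrates knowledge graphs with unstructured documents to enable deep, interpretable reasoning and improve accuracy on complex question-answering tasks.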

ABSTRACT

Retrieval-augmented generation (RAG) has improved large language models (LLMs) by using knowledge retrieval to overcome knowledge deficiencies. However, current RAG methods often fall short of ensuring the depth and completeness of retrieved information, which is necessary for complex reasoning tasks. In this work, we introduce Think-on-Graph 2.0 (ToG-2), a hybrid RAG framework that iteratively retrieves information from both unstructured and structured knowledge sources in a tight-coupling manner. Specifically, ToG-2 leverages knowledge graphs (KGs) to link documents via entities, facilitating deep and knowledge-guided context retrieval. Simultaneously, it utilizes documents as entity contexts to achieve precise and efficient graph retrieval. ToG-2 alternates between graph retrieval and context retrieval to search for in-depth clues relevant to the question, enabling LLMs to generate answers. We conduct a series of well-designed experiments to highlight the following advantages of ToG-2: 1) ToG-2 tightly couples the processes of context retrieval and graph retrieval, deepening context retrieval via the KG while enabling reliable graph retrieval based on contexts; 2) it achieves deep and faithful reasoning in LLMs through an iterative knowledge retrieval process of collaboration between contexts and the KG; and 3) ToG-2 is training-free and plug-and-play compatible with various LLMs. Extensive experiments demonstrate that ToG-2 achieves overall state-of-the-art (SOTA) performance on 6 out of 7 knowledge-intensive datasets with GPT-3.5, and can elevate the performance of smaller models (e.g., LLAMA-2-13B) to the level of GPT-3.5's direct reasoning. The source code is available on https://github.com/IDEA-FinAI/ToG-2.

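Motivation and Objectives

Use knowledge-graph-guided RAG to address knowledge gaps and hallucination in large language models (LLMs).
Propose a graph-guided retrieval framework that aligns questions with the knowledge graph to enable deep reasoning.
Integrate structured KG information with unstructured document context to improve accuracy and interpretability.
Demonstrate performance gains over baselines on multi-hop question-answering datasets.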

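Proposed Method

Introduce ToG 2.0 as an enhanced RAG framework that uses the knowledge graph as a navigation tool.
Iteratively perform relation pruning, entity pruning, and an examine-and-reason step to build multi-hop reasoning paths.
Use Topic Prune (TP) to select starting entities, Relation Prune (RP) to select relations between entities, and DPR-based (dense passage retrieval) entity ranking to select candidate entities from their knowledge-graph contexts.
Fuse clues from the KG with unstructured document context to guide LLM reasoning, while bounding the retrieval scope for efficiency.
Supply clue queries to guide the LLM, and score candidate entities with chunk-level relevance.
Evaluate with ablation studies to quantify each component's contribution to accuracy on different benchmarks.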
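The iterative loop described in the method list above (topic pruning, relation pruning, chunk-level entity ranking, clue accumulation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `KG` and `DOCS` data, the function names, and above all the token-overlap score, which stands in for the DPR and LLM scoring that the paper uses. The examine-and-reason step is replaced by a fixed depth limit. It is not the authors' implementation.

```python
import re

# Hypothetical toy knowledge graph: (head entity, relation) -> tail entities.
KG = {
    ("Marie Curie", "spouse"): ["Pierre Curie"],
    ("Marie Curie", "field"): ["Physics"],
    ("Pierre Curie", "born_in"): ["Paris"],
    ("Pierre Curie", "field"): ["Physics"],
}

# Hypothetical document store: entity -> text chunks describing it.
DOCS = {
    "Marie Curie": ["Marie Curie was a physicist married to Pierre Curie."],
    "Pierre Curie": ["Pierre Curie was born in Paris in 1859."],
    "Physics": ["Physics is the natural science of matter and energy."],
    "Paris": ["Paris is the capital of France."],
}


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query, text):
    """Stand-in relevance score (shared tokens); NOT DPR or an LLM."""
    return len(tokens(query) & tokens(text))


def topic_prune(question, entities):
    """Keep entities literally mentioned in the question as starting points."""
    q = question.lower()
    return [e for e in entities if e.lower() in q]


def relation_prune(query, entity, width):
    """Keep the `width` relations of `entity` most relevant to the query."""
    relations = [rel for (head, rel) in KG if head == entity]
    return sorted(relations, key=lambda r: overlap(query, r), reverse=True)[:width]


def rank_entities(query, triples, width):
    """Rank candidate tail entities by their best-scoring document chunk."""
    def best_chunk(triple):
        return max((overlap(query, c) for c in DOCS.get(triple[2], [])), default=0)
    return sorted(triples, key=best_chunk, reverse=True)[:width]


def tog2_sketch(question, width=1, max_depth=2):
    frontier = topic_prune(question, DOCS.keys())
    path, clues = [], []
    for _ in range(max_depth):
        # Clue query: the question plus the chunks gathered so far.
        query = " ".join([question] + clues)
        candidates = [
            (head, rel, tail)
            for head in frontier
            for rel in relation_prune(query, head, width)
            for tail in KG[(head, rel)]
        ]
        if not candidates:
            break
        kept = rank_entities(query, candidates, width)
        path.extend(kept)
        clues.extend(c for _, _, tail in kept for c in DOCS.get(tail, []))
        frontier = [tail for _, _, tail in kept]
        # A real system would ask the LLM here whether the clues suffice.
    return path, clues


path, clues = tog2_sketch("Where was the spouse of Marie Curie born?")
for triple in path:
    print(triple)
```

The `width` parameter bounds how many relations and entities survive each hop, which is one simple way to keep the retrieval scope under control as the path grows.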

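Experimental Results

Research Questions

RQ1: Does knowledge-graph-guided retrieval improve long-range reasoning and consistency in LLM-based question answering?
RQ2: Does combining structured knowledge-graph navigation with unstructured document retrieval improve the accuracy and efficiency of multi-hop question answering?
RQ3: How do topic pruning, relation pruning, and the clue-query strategy affect reasoning performance?
RQ4: How does ToG 2.0 compare with Vanilla RAG, CoT, CoK, and the earlier ToG on standard question-answering benchmarks?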

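Key Findings

With GPT-3.5-turbo, ToG 2.0 outperforms the baselines on WebQSP, HotpotQA, and QALD-10-en (EM scores of 54.05, 40.91, and 54.05, respectively) and reaches an accuracy of 58.54 on FEVER.
Compared with the original ToG, ToG 2.0 improves markedly on HotpotQA (14.6%) and also gains on WebQSP (4.93%), QALD-10-en (3.85%), and FEVER (5.84%).
Ablations show that Topic Prune improves performance on WebQSP and that Relation Prune reduces the number of inference calls and latency, though some settings may involve a trade-off; clue-query prompting improves results on all datasets.
The gains from ToG 2.0 are more pronounced with a weaker LLM (Llama-2-13B), suggesting that KG plus context helps most when model capacity is limited.
ToG 2.0 with GPT-3.5-turbo achieves higher EM than Vanilla RAG with Llama-2-13B on WebQSP, HotpotQA, QALD-10-en, and FEVER.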
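For readers unfamiliar with the EM (exact match) metric cited in the findings above, the sketch below shows one common way to compute it: normalize both strings, then count a prediction as correct if it equals any gold answer. The normalization rules (lowercasing, stripping punctuation and English articles) and the function names are assumptions borrowed from common QA evaluation practice; the exact procedure used in the paper is not stated in this review and may differ.

```python
import re
import string


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """1.0 if the prediction equals any gold answer after normalization."""
    return float(any(normalize(prediction) == normalize(g) for g in gold_answers))


def em_score(predictions, references):
    """Percentage of predictions that exactly match a gold answer."""
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)


# Hypothetical predictions and gold answers, for illustration only.
preds = ["Paris.", "the Seine", "London"]
golds = [["Paris"], ["Seine"], ["Berlin"]]
print(round(em_score(preds, golds), 2))
```

Under this definition a score such as 54.05 would mean that roughly 54 of every 100 predictions match a gold answer exactly.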

This review was generated by AI and checked by human editors.