QUICK REVIEW

[Paper Review] GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning

Costas Mavromatis, George Karypis|arXiv (Cornell University)|May 30, 2024
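Topic Modeling | Cited by 13
ONE-SENTENCE SUMMARY

GNN-RAG uses a graph neural network to retrieve information from dense knowledge graph subgraphs and verbalizes the shortest reasoning paths to guide a tuned large language model in retrieval-augmented knowledge graph question answering; with an additional retrieval augmentation (RA) technique, it achieves state-of-the-art results on WebQSP and CWQ.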

ABSTRACT

Knowledge Graphs (KGs) represent human-crafted factual knowledge in the form of triplets (head, relation, tail), which collectively form a graph. Question Answering over KGs (KGQA) is the task of answering natural questions grounding the reasoning to the information provided by the KG. Large Language Models (LLMs) are the state-of-the-art models for QA tasks due to their remarkable ability to understand natural language. On the other hand, Graph Neural Networks (GNNs) have been widely used for KGQA as they can handle the complex graph information stored in the KG. In this work, we introduce GNN-RAG, a novel method for combining language understanding abilities of LLMs with the reasoning abilities of GNNs in a retrieval-augmented generation (RAG) style. First, a GNN reasons over a dense KG subgraph to retrieve answer candidates for a given question. Second, the shortest paths in the KG that connect question entities and answer candidates are extracted to represent KG reasoning paths. The extracted paths are verbalized and given as input for LLM reasoning with RAG. In our GNN-RAG framework, the GNN acts as a dense subgraph reasoner to extract useful graph information, while the LLM leverages its natural language processing ability for ultimate KGQA. Furthermore, we develop a retrieval augmentation (RA) technique to further boost KGQA performance with GNN-RAG. Experimental results show that GNN-RAG achieves state-of-the-art performance in two widely used KGQA benchmarks (WebQSP and CWQ), outperforming or matching GPT-4 performance with a 7B tuned LLM. In addition, GNN-RAG excels on multi-hop and multi-entity questions, outperforming competing approaches by 8.9–15.5% points at answer F1.

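MOTIVATION AND OBJECTIVES

  • Ground question answering in up-to-date, accurate knowledge graph information to reduce LLM hallucinations.
  • Develop a retrieval-augmented method that uses a GNN for dense subgraph reasoning over KG data.
  • Bridge graph-based reasoning and LLM natural language processing by verbalizing KG reasoning paths as LLM input.
  • Explore retrieval augmentation to further improve KGQA performance without a large number of LLM calls.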

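PROPOSED METHOD

  • Feed a dense subgraph retrieved from the KG into a GNN to identify candidate answers for a given question.
  • Extract the shortest KG paths that connect the question entities to the GNN-proposed answers; these paths represent the KG reasoning traces.
  • Verbalize the extracted reasoning paths and pass them as a prompt to an LLM-based RAG pipeline, which produces the answer.
  • Experiment with two GNNs paired with different language models for question-relation matching, to diversify the retrieved information.
  • Fine-tune a lightweight LLM (e.g., LLaMA2-Chat-7B) on reasoning-path prompts to perform the final KGQA reasoning.
  • Introduce retrieval augmentation (RA) to improve recall by combining GNN-derived paths with LLM-based retrieval (RoG) or with multiple GNN/LM retrievers.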
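The path-extraction and verbalization steps in the list above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' code: the toy triplets, the hard-coded candidate answers (standing in for the GNN's output), the directed-edge breadth-first search, the `->` verbalization format, and the prompt wording.

```python
from collections import deque

# Toy KG as (head, relation, tail) triplets. Hypothetical data for illustration only.
TRIPLETS = [
    ("Jamaica", "official_language", "English"),
    ("Jamaica", "language_spoken", "Jamaican Patois"),
    ("Jamaica", "located_in", "Caribbean"),
    ("Caribbean", "contains", "Cuba"),
    ("Cuba", "official_language", "Spanish"),
]


def build_adjacency(triplets):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adj = {}
    for head, rel, tail in triplets:
        adj.setdefault(head, []).append((rel, tail))
    return adj


def shortest_path(adj, source, target):
    """Breadth-first search over directed edges.

    Returns one shortest path as a list of (head, relation, tail) hops,
    an empty list if source == target, or None if target is unreachable.
    """
    if source == target:
        return []
    parent = {source: None}  # entity -> (previous entity, relation used)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for rel, nxt in adj.get(node, []):
            if nxt in parent:
                continue
            parent[nxt] = (node, rel)
            if nxt == target:
                hops, cur = [], target
                while parent[cur] is not None:
                    prev, r = parent[cur]
                    hops.append((prev, r, cur))
                    cur = prev
                return hops[::-1]
            queue.append(nxt)
    return None


def verbalize(hops):
    """Turn hops into a string such as 'A -> r1 -> B -> r2 -> C' (assumed format)."""
    if not hops:
        return ""
    parts = [hops[0][0]]
    for _, rel, tail in hops:
        parts += [rel, tail]
    return " -> ".join(parts)


def build_prompt(question, question_entities, candidates, adj):
    """Collect verbalized paths from question entities to candidates into a prompt."""
    lines = []
    for entity in question_entities:
        for candidate in candidates:
            hops = shortest_path(adj, entity, candidate)
            if hops:
                lines.append(verbalize(hops))
    # The instruction wording below is a placeholder, not the paper's template.
    return (
        "Based on the reasoning paths, please answer the given question.\n"
        "Reasoning Paths:\n" + "\n".join(lines) + "\nQuestion: " + question
    )


if __name__ == "__main__":
    adjacency = build_adjacency(TRIPLETS)
    # Candidate answers would come from the GNN; here they are hard-coded.
    print(build_prompt("Which language is official in Jamaica?",
                       ["Jamaica"], ["English", "Spanish"], adjacency))
```

The sketch returns a single shortest path per entity-candidate pair and follows edges only in their stored direction; a fuller implementation might keep all shortest paths or also traverse inverse relations.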
Figure 2: The landscape of existing KGQA methods. GNN-based methods reason on dense subgraphs as they can handle complex and multi-hop graph information. LLM-based methods employ the same LLM for both retrieval and reasoning due to its ability to understand natural language.

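EXPERIMENTAL RESULTS

Research Questions

  • RQ1: Does GNN-based retrieval over dense KG subgraphs improve KGQA reasoning compared with pure LLM retrieval?
  • RQ2: Do shortest-path reasoning traces from question entities to candidate answers provide reliable input for LLM reasoning?
  • RQ3: Does retrieval augmentation (RA), which combines GNN-derived and LLM-derived paths, yield better KGQA performance and faithfulness?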

Key Findings

  • GNN-RAG achieves state-of-the-art performance on multiple metrics on WebQSP and CWQ, surpassing several baselines.
  • GNN-based retrieval handles multi-hop KGQA effectively and retrieves the necessary reasoning paths; on multi-hop and multi-entity questions, GNN-RAG outperforms competing approaches by 8.9–15.5 percentage points in answer F1.
  • Retrieval augmentation (RA) further boosts performance: GNN-RAG+RA often surpasses RoG and matches or outperforms GPT-4 while using a smaller, tuned 7B LLM.
  • GNN reasoning provides better answer recall and path diversity for multi-hop questions, while the LLM contributes language understanding for the final reasoning step.
  • GNN-RAG improves faithfulness by supplying correct multi-hop facts and reducing hallucinated or irrelevant information in the reasoning paths.
  • GNN-RAG substantially improves weaker LLMs and can be integrated with various LLMs without retraining.
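One simple way to picture retrieval augmentation is as a union of the verbalized reasoning paths returned by different retrievers, with duplicates dropped before prompting the LLM. The Python sketch below assumes exactly that; the function name `merge_reasoning_paths` and the example paths are hypothetical, and the summary does not confirm that the combination is a plain union.

```python
def merge_reasoning_paths(*path_lists):
    """Union verbalized paths from several retrievers, keeping first-seen order."""
    seen = set()
    merged = []
    for paths in path_lists:
        for path in paths:
            if path not in seen:
                seen.add(path)
                merged.append(path)
    return merged


# Hypothetical outputs of a GNN-based retriever and an LLM-based retriever.
gnn_paths = [
    "Jamaica -> official_language -> English",
    "Jamaica -> language_spoken -> Jamaican Patois",
]
llm_paths = [
    "Jamaica -> official_language -> English",
    "Jamaica -> located_in -> Caribbean",
]

if __name__ == "__main__":
    for path in merge_reasoning_paths(gnn_paths, llm_paths):
        print(path)
```

Merging at the path level means the downstream LLM is still called once per question, which is consistent with the stated goal of improving recall without many extra LLM calls.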
Figure 3: GNN-RAG: The GNN reasons over a dense subgraph to retrieve candidate answers, along with the corresponding reasoning paths (shortest paths from question entities to answers). The retrieved reasoning paths, optionally combined with retrieval augmentation (RA), are verbalized and given to


This review was generated by AI and reviewed by human editors.