QUICK REVIEW

[Paper Review] DA-RAG: Dynamic Attributed Community Search for Retrieval-Augmented Generation

Xingyuan Zeng, Zuohan Wu|arXiv (Cornell University)|Feb 9, 2026
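Information Retrieval and Search Behavior|Cited by 0
One-Sentence Summary

DA-RAG introduces Embedding-Attributed Community Search (EACS) and a chunk-layer oriented graph index to dynamically retrieve high-order, query-specific subgraphs for RAG, outperforming baselines in both accuracy and efficiency.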

ABSTRACT

Owing to their unprecedented comprehension capabilities, large language models (LLMs) have become indispensable components of modern web search engines. From a technical perspective, this integration represents retrieval-augmented generation (RAG), which enhances LLMs by grounding them in external knowledge bases. A prevalent technical approach in this context is graph-based RAG (G-RAG). However, current G-RAG methodologies frequently underutilize graph topology, predominantly focusing on low-order structures or pre-computed static communities. This limitation affects their effectiveness in addressing dynamic and complex queries. Thus, we propose DA-RAG, which leverages attributed community search (ACS) to extract relevant subgraphs based on the queried question dynamically. DA-RAG captures high-order graph structures, allowing for the retrieval of self-complementary knowledge. Furthermore, DA-RAG is equipped with a chunk-layer oriented graph index, which facilitates efficient multi-granularity retrieval while significantly reducing both computational and economic costs. We evaluate DA-RAG on multiple datasets, demonstrating that it outperforms existing RAG methods by up to 40% in head-to-head comparisons across four metrics while reducing index construction time and token overhead by up to 37% and 41%, respectively.

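Motivation and Objectives

  • Improve RAG retrieval by exploiting high-order graph structures beyond static communities.
  • Propose Embedding-Attributed Community Search, which dynamically extracts structurally cohesive, query-relevant subgraphs.
  • Design a chunk-layer oriented graph index that enables efficient multi-granularity retrieval without relying on extensive clustering.
  • Show empirically that DA-RAG outperforms baselines while reducing indexing and token costs.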

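Proposed Method

  • Define Embedding-Attributed Community Search (EACS), which maximizes query relevance within a connected k-truss subgraph and mitigates the free-rider effect.
  • Build a three-layer offline index: a semantic chunk layer (L_C), a knowledge graph layer (L_KG), and a similarity layer (L_S), with the layers connected by inter-layer and intra-layer edges.
  • Retrieve online in a coarse-to-fine manner: first identify H_C in L_C, then prune to the working graphs G_KG^work and G_S^work, and finally apply EACS to the refined subgraphs to obtain H_KG and H_S.
  • Determine the k of the k-truss dynamically through a candidate-evaluation loop that combines LLM-assisted scoring with a context budget constraint.
  • Propose the Q-Peel heuristic to solve EACS efficiently, together with an NP-hardness result and a complexity analysis.
  • Provide an end-to-end evaluation against multiple baselines on the UltraDomain datasets, measuring both effectiveness and efficiency.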
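
To make the EACS idea concrete, the following is a minimal sketch of query-aware peeling on a k-truss. It is not the paper's Q-Peel algorithm, whose details are not given in this review. It assumes a simple greedy rule (drop the least relevant node, restore the k-truss, keep the connected candidate with the highest mean relevance), and the names `build_adj`, `k_truss`, `components`, `greedy_peel`, and the toy relevance scores are all hypothetical.

```python
from itertools import combinations


def build_adj(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_truss(adj, k):
    """k-truss: drop edges until every remaining edge lies in >= k-2 triangles."""
    adj = {u: set(vs) for u, vs in adj.items()}  # copy; the input is not mutated
    changed = True
    while changed:
        changed = False
        for u in adj:
            for v in list(adj[u]):
                # Common neighbours of u and v = triangles containing edge (u, v).
                if u < v and len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {u: vs for u, vs in adj.items() if vs}  # isolated nodes are dropped


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def greedy_peel(adj, relevance, k):
    """Assumed greedy heuristic: return the best connected k-truss seen while peeling."""
    def mean(nodes):
        return sum(relevance[n] for n in nodes) / len(nodes)

    best, best_score = None, float("-inf")
    current = k_truss(adj, k)
    while current:
        top = max(components(current), key=mean)  # most relevant component
        if mean(top) > best_score:
            best, best_score = set(top), mean(top)
        worst = min(top, key=relevance.get)       # least relevant node
        keep = top - {worst}
        current = k_truss({u: current[u] & keep for u in keep}, k)
    return best, best_score


# Toy graph: two 4-cliques sharing node "d"; only the first matches the query.
edges = list(combinations("abcd", 2)) + list(combinations("defg", 2))
adj = build_adj(edges)
relevance = {"a": 0.9, "b": 0.8, "c": 0.9, "d": 0.7, "e": 0.1, "f": 0.2, "g": 0.1}
community, mean_rel = greedy_peel(adj, relevance, k=4)
print(sorted(community), round(mean_rel, 3))
```

In this toy run the whole graph is a connected 4-truss, so a purely structural search would return all seven nodes. Peeling removes the low-relevance clique {e, f, g}, which is the kind of free rider the EACS objective is meant to exclude, and returns {a, b, c, d}.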
Figure 1. Differences between existing methods and our method. (a) Methods w/o community concern are limited to low-order graph topology, capturing only partial aspects. (b) Methods with static community partition could return a diverging and unfocused response. (c) Our method retrieves a query-rele

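Experimental Results

Research Questions

  • RQ1: How does DA-RAG compare with state-of-the-art baselines in retrieval quality and answer accuracy?
  • RQ2: How efficient is DA-RAG in index construction and online retrieval?
  • RQ3: Are the subgraphs retrieved by DA-RAG more structurally cohesive and semantically relevant than those of other methods?
  • RQ4: How does EACS's adaptive determination of k affect performance across different queries?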

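Key Findings

  • DA-RAG leads the baselines in head-to-head comparisons across four metrics, by up to 40%.
  • Index construction time and token overhead are reduced by up to 37% and 41%, respectively.
  • In online retrieval, token consumption drops by 73.8% on average (up to 88.76% on some datasets), with latency comparable to GraphRAG-Global.
  • The coarse-to-fine retrieval strategy combined with EACS yields high-quality, cohesive subgraphs.
  • The adaptive k determination and the Q-Peel heuristic provide efficient, query-specific subgraph extraction with provable cohesiveness and a bounded number of reasoning hops.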
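
The adaptive choice of k can be pictured as a small selection loop. The sketch below is one plausible reading of the "candidate evaluation with a context budget" idea, not the paper's procedure: `choose_k`, its callable parameters, the flat per-node token cost, and the use of mean relevance in place of an LLM-assisted score are all assumptions made for illustration.

```python
def choose_k(candidate_ks, extract, score, token_cost, budget):
    """Return (k, score, subgraph) for the best in-budget candidate, or None."""
    best = None
    for k in candidate_ks:
        subgraph = extract(k)
        if not subgraph or token_cost(subgraph) > budget:
            continue  # empty community, or too large for the context budget
        s = score(subgraph)
        if best is None or s > best[1]:
            best = (k, s, subgraph)
    return best


# Toy stand-ins (all hypothetical): precomputed communities per k, a flat
# per-node token cost, and mean relevance in place of an LLM-assisted score.
communities = {3: {"a", "b", "c", "d", "e", "f"}, 4: {"a", "b", "c", "d"}, 5: set()}
relevance = {"a": 0.9, "b": 0.8, "c": 0.9, "d": 0.7, "e": 0.1, "f": 0.2}

chosen = choose_k(
    candidate_ks=[3, 4, 5],
    extract=lambda k: communities[k],
    score=lambda nodes: sum(relevance[n] for n in nodes) / len(nodes),
    token_cost=lambda nodes: 100 * len(nodes),
    budget=500,
)
print(chosen[0])
```

Here k = 3 exceeds the 500-token budget and k = 5 yields no community, so the loop settles on k = 4. Passing the extractor and scorer as callables keeps the loop independent of how communities are actually searched or scored.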
Figure 2. Overview of the DA-RAG framework: (a) Offline Indexing creates a novel graph index from source documents, comprising a high-level layer ( $L_{C}$ ) and two granular layers ( $L_{KG}$ and $L_{S}$ ). (b) Online Retrieval employs a coarse-to-fine strategy. (c) EACS Formulation defines the sub
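
The coarse-to-fine step, in which a chunk-level selection H_C prunes a granular layer down to a working graph, can be sketched as an induced-subgraph operation over inter-layer edges. This is a simplified illustration under assumed data structures: the dictionary layout, the function name `working_subgraph`, and the toy entities are hypothetical and not taken from the paper.

```python
def working_subgraph(selected_chunks, chunk_to_nodes, layer_adj):
    """Induce the part of a granular layer reachable from the selected chunks.

    selected_chunks: chunk ids chosen in the chunk layer (H_C)
    chunk_to_nodes:  inter-layer edges, chunk id -> set of granular-layer nodes
    layer_adj:       intra-layer edges of the granular layer, node -> neighbours
    """
    keep = set()
    for chunk in selected_chunks:
        keep |= chunk_to_nodes.get(chunk, set())
    # Keep only edges whose two endpoints both survive the pruning.
    return {u: layer_adj[u] & keep for u in keep if u in layer_adj}


# Toy index (hypothetical): three chunks linked to knowledge-graph entities.
chunk_to_entities = {
    "c1": {"LLM", "RAG"},
    "c2": {"RAG", "k-truss"},
    "c3": {"PageRank"},
}
kg_adj = {
    "LLM": {"RAG"},
    "RAG": {"LLM", "k-truss", "PageRank"},
    "k-truss": {"RAG"},
    "PageRank": {"RAG"},
}

work = working_subgraph({"c1", "c2"}, chunk_to_entities, kg_adj)
print(sorted(work))
```

Only entities attached to the selected chunks remain, so a finer-grained search such as EACS would run on a much smaller graph than the full layer. The same function could be applied to a similarity layer by swapping in its own inter-layer and intra-layer edges.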


This review was generated by AI and reviewed by a human editor.