QUICK REVIEW

[Paper Review] Seeing the Trees for the Forest: Leveraging Tree-Shaped Substructures in Property Graphs

Daniel Aarao Reis Arturi, Christoph Kohnen | arXiv (Cornell University) | Mar 12, 2026
Graph Theory and Algorithms | Cited by: 0
One-Sentence Summary

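The paper argues that tree-shaped substructures in property graphs should be treated as first-class citizens, shows that XML-inspired structural indexes can substantially accelerate path queries on relational backends, and puts forward a vision and research agenda for end-to-end tree-aware graph querying.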

ABSTRACT

Property graphs often contain tree-shaped substructures, yet they are not captured by existing proposals for graph schemas; likewise, query languages and query engines offer little-to-no native support for managing them systematically. As a first contribution, we report on a micro experiment that demonstrates the optimization potential of treating tree-shaped substructures as first class citizens in graph database systems. In particular, we show that in systems backed by relational engines, we can achieve substantial speedups by leveraging structural indexes, as originally developed for XML databases, to accelerate path queries. Based on our findings, we put forward a vision in which tree-shaped substructures are systematically managed throughout the graph query lifecycle, from modeling and schema design to indexing and query processing, and outline arising research questions.

Motivation and Goals

  • Real-world property graphs contain significant tree-shaped substructures.
  • Demonstrate potential performance gains by applying structural indexes (PrePost, Dewey) to tree patterns on relational backends.
  • Propose a research agenda for modeling, indexing, and processing trees within graph query lifecycles.
  • Outline challenges for schema design, updates, and end-to-end query optimization in tree-aware GDBMSs.

Proposed Method

  • Prototype implementation of PrePost and Dewey tree indexes on three GDBMS backends: Neo4j, Kuzu, and Apache AGE.
  • Evaluation of three tree-based queries (descendants, leaves, ancestor/descendant) across synthetic trees/forests and LDBC SNB data.
  • Comparison of baseline queries against index-augmented queries to measure speedups and slowdowns.
  • Analysis of performance variability across graph sizes, tree shapes, and edge directions.
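
To make the two labeling schemes concrete, the following is a minimal sketch of how PrePost and Dewey labels can be computed for a rooted tree and then used for an ancestor/descendant test. It is an illustration under stated assumptions, not the prototype described in the review: the function names (`label_tree`, `is_ancestor_prepost`, `is_ancestor_dewey`) and the adjacency-dict input format are hypothetical.

```python
from typing import Dict, List, Tuple

PrePost = Tuple[int, int]      # (preorder rank, postorder rank)
Dewey = Tuple[int, ...]        # path of child positions from the root


def label_tree(children: Dict[str, List[str]], root: str):
    """Assign PrePost and Dewey labels to every node reachable from root.

    `children` maps a node to its ordered list of children (hypothetical
    input format chosen for this sketch).
    """
    prepost: Dict[str, PrePost] = {}
    dewey: Dict[str, Dewey] = {}
    pre_of: Dict[str, int] = {}
    pre_counter = 0
    post_counter = 0
    # Iterative DFS; each entry is (node, dewey path, children already done?)
    stack = [(root, (1,), False)]
    while stack:
        node, path, done = stack.pop()
        if done:
            # All descendants are labeled, so the postorder rank is known.
            prepost[node] = (pre_of[node], post_counter)
            post_counter += 1
            continue
        pre_of[node] = pre_counter
        pre_counter += 1
        dewey[node] = path
        stack.append((node, path, True))
        kids = children.get(node, [])
        # Push in reverse so the first child is visited first.
        for i in range(len(kids), 0, -1):
            stack.append((kids[i - 1], path + (i,), False))
    return prepost, dewey


def is_ancestor_prepost(a: PrePost, d: PrePost) -> bool:
    """a is a proper ancestor of d iff a starts earlier and finishes later."""
    return a[0] < d[0] and a[1] > d[1]


def is_ancestor_dewey(a: Dewey, d: Dewey) -> bool:
    """a is a proper ancestor of d iff a's label is a proper prefix of d's."""
    return len(a) < len(d) and d[: len(a)] == a


tree = {"a": ["b", "c"], "b": ["d", "e"]}
prepost, dewey = label_tree(tree, "a")
```

With either scheme the ancestor/descendant check reduces to comparing two labels, with no traversal of the edges between the nodes.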

Experimental Results

Research Questions

  • RQ1: Can structural indexes for tree-shaped substructures accelerate query evaluation in property graphs backed by relational engines?
  • RQ2: How do PrePost and Dewey indexes perform across different tree shapes, sizes, and graph datasets?
  • RQ3: What are the practical challenges and research directions for integrating tree-aware indexing into schema design, updates, and end-to-end query processing?

Key Findings

  • Structural indexes (PrePost, Dewey) yield substantial speedups for tree-oriented queries on relational backends across most tested graphs and queries, reaching several orders of magnitude in the best cases.
  • Kuzu shows strong speedups (up to ~33x) for ancestor/descendant queries on LDBC SNB and synthetic trees, while Apache AGE often achieves even higher speedups (up to >10^3x).
  • Neo4j’s native graph engine sees little to no improvement from these tree-based indexes, suggesting limited benefit for native graph engines in this setup.
  • PrePost generally outperforms or matches Dewey in most configurations, though Dewey can outperform in some scenarios depending on the data layout and updates.
  • Speedups increase with larger graphs and deeper trees, with substantial gains when queries avoid full scans and structural joins.
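
As a rough illustration of why a structural index lets a relational engine avoid recursive traversal, the sketch below stores a synthetic tree in SQLite and answers the descendants query two ways: with a recursive CTE over parent pointers, and with a single range predicate over PrePost labels. SQLite, the `node` table, its column names, and the helper functions are assumptions made for this example only; they are not the systems or schemas evaluated in the paper, and the sketch checks only that both queries agree, not how fast they run.

```python
import sqlite3


def build_demo_db(depth: int = 4, fanout: int = 3) -> sqlite3.Connection:
    """Create an in-memory table holding a complete tree with PrePost labels."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
        "pre INTEGER, post INTEGER)"
    )
    rows = []
    counters = {"pre": 0, "post": 0}

    def visit(parent, level):
        # Node ids follow preorder here, purely for convenience.
        node_id = counters["pre"]
        counters["pre"] += 1
        if level < depth:
            for _ in range(fanout):
                visit(node_id, level + 1)
        rows.append((node_id, parent, node_id, counters["post"]))
        counters["post"] += 1

    visit(None, 0)
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)
    conn.execute("CREATE INDEX node_parent ON node(parent)")
    conn.execute("CREATE INDEX node_pre_post ON node(pre, post)")
    return conn


def descendants_recursive(conn: sqlite3.Connection, node_id: int):
    """Baseline: follow parent pointers level by level with a recursive CTE."""
    sql = """
        WITH RECURSIVE sub(id) AS (
            SELECT id FROM node WHERE parent = ?
            UNION ALL
            SELECT n.id FROM node n JOIN sub s ON n.parent = s.id
        )
        SELECT id FROM sub ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, (node_id,))]


def descendants_prepost(conn: sqlite3.Connection, node_id: int):
    """Index-based: one range predicate on the PrePost labels, no recursion."""
    sql = """
        SELECT d.id
        FROM node a JOIN node d ON d.pre > a.pre AND d.post < a.post
        WHERE a.id = ?
        ORDER BY d.id
    """
    return [row[0] for row in conn.execute(sql, (node_id,))]


conn = build_demo_db()
same = descendants_recursive(conn, 1) == descendants_prepost(conn, 1)
```

The recursive variant needs one join step per tree level, while the labeled variant is a single indexed range condition, which is consistent with the finding that gains grow with deeper trees.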


This review was generated by AI and reviewed by human editors.