Skip to main content
QUICK REVIEW

[论文解读] Text2Cypher: Bridging Natural Language and Graph Databases

Makbule Gülçin Özsoy, Leila Messallem|arXiv (Cornell University)|Dec 13, 2024
Semantic Web and Ontologies被引用 5
一句话总结

论文通过合并公开来源,构建了一个大型、清洁的 Text2Cypher 数据集,基于多模型基线进行评估,并展示微调可提升 Cypher 查询翻译准确性。

ABSTRACT

Knowledge graphs use nodes, relationships, and properties to represent arbitrarily complex data. When stored in a graph database, the Cypher query language enables efficient modeling and querying of knowledge graphs. However, using Cypher requires specialized knowledge, which can present a challenge for non-expert users. Our work Text2Cypher aims to bridge this gap by translating natural language queries into Cypher query language and extending the utility of knowledge graphs to non-technical expert users. While large language models (LLMs) can be used for this purpose, they often struggle to capture complex nuances, resulting in incomplete or incorrect outputs. Fine-tuning LLMs on domain-specific datasets has proven to be a more promising approach, but the limited availability of high-quality, publicly available Text2Cypher datasets makes this challenging. In this work, we show how we combined, cleaned and organized several publicly available datasets into a total of 44,387 instances, enabling effective fine-tuning and evaluation. Models fine-tuned on this dataset showed significant performance gains, with improvements in Google-BLEU and Exact Match scores over baseline models, highlighting the importance of high-quality datasets and fine-tuning in improving Text2Cypher performance.

研究动机与目标

  • 鼓励将自然语言翻译为 Cypher,以便让非专业人士更容易访问图数据库。
  • 通过整合公开来源,创建一个大型、干净且可用的 Text2Cypher 数据集。
  • 对 Text2Cypher 任务基线模型与微调模型进行基准测试。
  • 证明微调在性能上优于基线模型。

提出的方法

  • 汇总并统一 16 个公开的 Text2Cypher 数据集为单一格式,字段包括:question, schema, cypher, data_source, database_reference, instance_id。
  • 通过人工检查、移除无效查询,以及在本地 Neo4j 数据库中使用 EXPLAIN 进行语法验证来清洗数据。
  • 将数据分为训练集(约 39,554 条)和测试集(约 4,833 条),共 44,387 条,并分析分布。
  • 使用翻译基准(Google-BLEU)和执行基准(Exact Match)指标,对一系列基线模型与微调模型进行基准测试。
  • 在新数据集上对选定模型进行微调,并与基线进行比较以量化提升。
Figure 1: User wants to write a Cypher query for ‘What are the movies of Tom Hanks‘. A Text2Cypher model translates the input natural language question into Cypher, i.e., ‘MATCH (actor:Person {name: "Tom Hanks"})-[:ACTED_IN]->(movie:Movie) RETURN movie.title AS movies‘
Figure 1: User wants to write a Cypher query for ‘What are the movies of Tom Hanks‘. A Text2Cypher model translates the input natural language question into Cypher, i.e., ‘MATCH (actor:Person {name: "Tom Hanks"})-[:ACTED_IN]->(movie:Movie) RETURN movie.title AS movies‘

实验结果

研究问题

  • RQ1一个庞大、统一的 Text2Cypher 数据集是否能提升自然语言到 Cypher 翻译模型的性能?
  • RQ2微调模型在 Text2Cypher 的翻译和执行指标上是否优于其基线?
  • RQ3哪些模型族(open-weighted、closed-foundational)在 Text2Cypher 微调中受益最大?
  • RQ4Text2Cypher 模型的翻译评估与执行评估如何比较?

主要发现

  • 最终数据集包含 44,387 条实例,其中 39,554 条用于训练,4,833 条用于测试。
  • 微调模型在 Google-BLEU 和 Exact Match 指标上始终优于其基线版本。
  • 在基线模型中,OpenAI/GPT-4o 与 Gemini-1.5-Pro-001 在某些设置下表现领先,通常较大的模型表现更好。
  • 在微调模型中,改进包括最高约 0.34 的 Google-BLEU 和约 0.11 的 Exact Match,相对于基线。
  • 最佳微调结果由 Finetuned-OpenAI/Gpt4o、Finetuned-OpenAI/Gpt4o-mini,以及 Finetuned-GoogleAIStudio/Gemini-1.5-Flash-001 实现。
  • 该数据集和微调方法凸显了高质量数据和微调对 Text2Cypher 的重要性。
Figure 2: Relational databases uses SQL-based query languages, while Graph databases commonly uses Cypher query language. The figure shows an example representation of Person, Location, Gender and Marriage entities and relationships on a relational and graph database.
Figure 2: Relational databases uses SQL-based query languages, while Graph databases commonly uses Cypher query language. The figure shows an example representation of Person, Location, Gender and Marriage entities and relationships on a relational and graph database.

更好的研究,从现在开始

从论文设计到论文写作,大幅缩短您的研究时间。

无需绑定信用卡

本解读由 AI 生成,并经人工编辑审核。