QUICK REVIEW

[Paper Review] CHESS: Contextual Harnessing for Efficient SQL Synthesis

Shayan Talaei, Mohammadreza Pourreza|arXiv (Cornell University)|May 27, 2024

Service-Oriented Architecture and Web Services | Cited by 5

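One-Sentence Summary

CHESS is an end-to-end, LLM-based text-to-SQL pipeline that retrieves context, prunes the schema, and generates SQL. It achieves state-of-the-art results on the BIRD dataset and strong results on Spider with open-source models.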

ABSTRACT

Translating natural language questions into SQL queries, known as text-to-SQL, is a long-standing research problem. Effective text-to-SQL synthesis can become very challenging due to (i) the extensive size of database catalogs (descriptions of tables and their columns) and database values, (ii) reasoning over large database schemas, (iii) ensuring the functional validity of the generated queries, and (iv) navigating the ambiguities of natural language questions. We introduce CHESS, a Large Language Model (LLM) based multi-agent framework for efficient and scalable SQL synthesis, comprising four specialized agents, each targeting one of the aforementioned challenges: the Information Retriever (IR) extracts relevant data, the Schema Selector (SS) prunes large schemas, the Candidate Generator (CG) generates high-quality candidates and refines queries iteratively, and the Unit Tester (UT) validates queries through LLM-based natural language unit tests. Our framework offers configurable features that adapt to various deployment constraints, including 1) Supporting industrial-scale databases: leveraging the Schema Selector agent, CHESS efficiently narrows down very large database schemas into manageable sub-schemas, boosting system accuracy by approximately $2\%$ and reducing the number of LLM tokens by $\times 5$. 2) State-of-the-Art privacy-preserving performance: Among the methods using open-source models, CHESS achieves state-of-the-art performance, resulting in a high-performing, privacy-preserving system suitable for industrial deployment. 3) Scalability with additional compute budget: In settings with high computational budgets, CHESS achieves $71.10\%$ accuracy on the BIRD test set, within $2\%$ of the leading proprietary method, while requiring approximately $83\%$ fewer LLM calls.

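Research Motivation and Goals

Address the challenge of translating natural language questions into SQL for real-world databases with large schemas and catalogs.
Develop a scalable retrieval mechanism that brings database values and catalog descriptions into SQL generation.
Propose an efficient schema pruning method that reduces the input to the minimum the SQL generator needs.
Demonstrate end-to-end performance gains through ablation studies and comparisons with open-source and proprietary models.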

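Proposed Method

A three-stage pipeline: entity and context retrieval, schema selection, and SQL generation.
Hierarchical retrieval (keyword extraction, locality-sensitive hashing (LSH), and a vector database) fetches relevant values and catalog descriptions.
Adaptive, multi-stage schema pruning (column filtering, table selection, final column selection) yields a minimal yet sufficient schema.
Candidate SQL is generated and then revised using model feedback, with self-consistency selecting the most frequent answer.
Preprocessing builds an LSH index over database values and a vector database over the catalog, enabling efficient retrieval within a limited context window.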
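
To make the value-retrieval step more concrete, here is a minimal sketch of a banded MinHash LSH index over database values, written with the Python standard library only. It is an illustration under stated assumptions, not the CHESS implementation: the class name `ValueIndex`, the character 3-gram shingling, the signature size, and the band count are all hypothetical choices made for this example.

```python
import hashlib
from collections import defaultdict

NUM_PERM = 64  # number of MinHash functions (assumed value)
BANDS = 16     # number of LSH bands; rows per band = NUM_PERM // BANDS


def shingles(text, n=3):
    """Return the set of lowercase character n-grams of a string."""
    s = text.lower()
    if len(s) <= n:
        return {s}
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token, salted by the seed."""
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8,
                             salt=seed.to_bytes(8, "little")).digest()
    return int.from_bytes(digest, "little")


def minhash(text):
    """MinHash signature: for each seed, the minimum hash over all shingles."""
    grams = shingles(text)
    return tuple(min(_hash(seed, g) for g in grams) for seed in range(NUM_PERM))


def jaccard(a, b):
    """Exact Jaccard similarity of the shingle sets of two strings."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


class ValueIndex:
    """Hypothetical banded MinHash LSH index over (table, column, value) triples."""

    def __init__(self):
        self.buckets = defaultdict(set)

    def _keys(self, signature):
        # Split the signature into bands; each band is one bucket key.
        rows = NUM_PERM // BANDS
        for band in range(BANDS):
            yield (band, signature[band * rows:(band + 1) * rows])

    def add(self, table, column, value):
        """Index one database value (done once, during preprocessing)."""
        for key in self._keys(minhash(value)):
            self.buckets[key].add((table, column, value))

    def query(self, keyword, top_k=3):
        """Return up to top_k bucket-sharing candidates, ranked by Jaccard."""
        candidates = set()
        for key in self._keys(minhash(keyword)):
            candidates |= self.buckets.get(key, set())
        ranked = sorted(candidates,
                        key=lambda c: jaccard(keyword, c[2]), reverse=True)
        return ranked[:top_k]


if __name__ == "__main__":
    index = ValueIndex()
    index.add("schools", "county", "Alameda County")
    index.add("schools", "county", "Los Angeles")
    print(index.query("Alameda Count"))  # a misspelled keyword from a question
```

The index is built once per database, so a keyword extracted from the question only has to be hashed and looked up at query time. Only the few values that share a bucket are re-ranked and passed on, which is what keeps the retrieved context small.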

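Experimental Results

Research Questions

RQ1: How does retrieving values and catalog metadata improve text-to-SQL accuracy on real-world databases?
RQ2: Can adaptive schema pruning reduce input size without losing the information needed to generate correct SQL?
RQ3: How does combining the retrieval, pruning, and generation modules affect end-to-end SQL accuracy compared with prior methods?
RQ4: How does CHESS perform on challenging benchmarks such as BIRD and Spider compared with open-source and proprietary LLMs?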

Main Findings

Method	Test EX (%)	Dev EX (%)
CHESS + proprietary (ours)	66.69	65.00
MCS-SQL + GPT-4	65.45	63.36
CHESS + Open LLMs (ours)	–	61.50
SFT CodeS-15B	60.37	58.47
DTS-SQL + DeepSeek 7B	60.31	55.80
MAC-SQL + GPT-4	57.56	59.59

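With proprietary models, CHESS achieves state-of-the-art execution accuracy (EX) on both the BIRD dev and test sets: 65.00% dev EX and 66.69% test EX.
With open-source LLMs, CHESS reaches 61.5% EX on the BIRD dev set, the highest among existing open-source methods.
On the Spider test set, CHESS reaches 87.2% EX, ranking second among the published methods compared.
Ablation studies show that the entity and context retrieval module contributes roughly a 5% accuracy gain, and that table selection and the revision step also have a significant effect on performance.
The end-to-end open-source CHESS pipeline delivers strong results for privacy-preserving deployments, narrowing the gap with closed-source methods.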
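
The EX figures above are execution accuracy: a prediction counts as correct when running it returns the same result as the reference query. Here is a rough sketch of that comparison using Python's built-in `sqlite3`, along with the related self-consistency vote mentioned in the method section. The function names are hypothetical, and comparing results as unordered sets of rows is an assumption of this example, not a description of the official benchmark scripts.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Run a query; return its rows as an unordered set, or None if it fails."""
    try:
        return frozenset(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Percentage of (predicted_sql, gold_sql) pairs whose result sets match."""
    if not pairs:
        return 0.0
    correct = 0
    for predicted, gold in pairs:
        predicted_rows = run_query(conn, predicted)
        gold_rows = run_query(conn, gold)
        # A prediction that fails to execute never counts as correct.
        if predicted_rows is not None and predicted_rows == gold_rows:
            correct += 1
    return 100.0 * correct / len(pairs)


def most_consistent(conn, candidates):
    """Self-consistency: return the first candidate whose result set is the
    most common one among all candidates that execute successfully."""
    results = [run_query(conn, sql) for sql in candidates]
    votes = Counter(r for r in results if r is not None)
    if not votes:
        return None
    winner, _ = votes.most_common(1)[0]
    return next(sql for sql, r in zip(candidates, results) if r == winner)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     [(1, "a"), (2, "b"), (3, "c")])
    pairs = [
        ("SELECT name FROM t WHERE id > 1", "SELECT name FROM t WHERE id >= 2"),
        ("SELECT name FROM t", "SELECT name FROM t WHERE id = 1"),
    ]
    print(execution_accuracy(conn, pairs))
```

Voting on result sets rather than on SQL text means that syntactically different but equivalent candidates reinforce each other, while candidates that fail to execute are ignored.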

This review was generated by AI and reviewed by a human editor.