QUICK REVIEW

[Paper Review] SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended)

Ruoxi Sun, Sercan Ö. Arık | arXiv (Cornell University) | May 26, 2023

Semantic Web and Ontologies | Cited by 25

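One-Sentence Summary

SQL-PaLM applies PaLM-2 to Text-to-SQL, combining few-shot prompting with execution-based consistency decoding and fine-tuning to reach state-of-the-art test-suite accuracy on Spider and to demonstrate robustness on Spider variants.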

ABSTRACT

Text-to-SQL, the process of translating natural language into Structured Query Language (SQL), represents a transformative application of large language models (LLMs), potentially revolutionizing how humans interact with data. This paper introduces the SQL-PaLM framework, a comprehensive solution for understanding and enhancing Text-to-SQL using LLMs in the learning regimes of few-shot prompting and instruction fine-tuning. With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error filtering. With instruction fine-tuning, we delve deep into understanding the critical paradigms that influence the performance of tuned LLMs. In particular, we investigate how performance can be improved through expanded training data coverage and diversity, synthetic data augmentation, and integrating query-specific database content. We propose a test-time selection method to further refine accuracy by integrating SQL outputs from multiple paradigms with execution feedback as guidance. Additionally, we tackle the practical challenge of navigating intricate databases with a significant number of tables and columns, proposing efficient techniques for accurately selecting relevant database elements to enhance Text-to-SQL performance. Our holistic approach yields substantial advancements in Text-to-SQL, as demonstrated on two key public benchmarks, Spider and BIRD. Through comprehensive ablations and error analyses, we shed light on the strengths and weaknesses of our framework, offering valuable insights into Text-to-SQL's future work.

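Research Motivation and Objectives

Motivate and address Text-to-SQL with large language models (LLMs) by using PaLM-2 for both few-shot prompting and fine-tuning.
Introduce an execution-based self-consistency method to improve few-shot SQL generation.
Demonstrate fine-tuning of a large LLM on Spider data and evaluate robustness to distribution shift across Spider variants.
Compare against strong baselines, including state-of-the-art (SOTA) in-context-learning and fine-tuned methods.
Analyze prompt design choices and the interplay between self-consistency, execution filtering, and model adaptation.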

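Proposed Method

Use PaLM-2 as the backbone model for Text-to-SQL in both the few-shot and fine-tuned settings.
Design an execution-based self-consistency decoding scheme for few-shot prompting that selects the SQL whose execution result is consistent across samples.
Fine-tune PaLM-2 on Spider training data, taking the database schema and the natural-language question as input and generating the target SQL.
Evaluate on Spider and its variants using execution accuracy (EX) and test-suite accuracy (TS).
Run ablation experiments to measure the effect of self-consistency, execution filtering, and prompt design on performance.
Compare Few-shot SQL-PaLM and Fine-tuned SQL-PaLM against baselines, including PICARD, RASAT, RESDSQL, and various in-context-learning prompts.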
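
The execution-based self-consistency step described above can be illustrated with a minimal sketch. It assumes the candidate SQL strings have already been sampled from a model, and it uses an in-memory SQLite database as a stand-in for a Spider database. The function names (`execute`, `select_by_execution_consistency`), the demo schema, and the tie-breaking rule are illustrative assumptions, not the paper's implementation.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run a query; return a hashable, order-insensitive result, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        # Execution-based error filtering: candidates that fail to run are dropped.
        return None
    # Simplification: row order is ignored, which is not right for ORDER BY queries.
    return tuple(sorted(rows, key=repr))


def select_by_execution_consistency(conn, candidates):
    """Return the first candidate whose execution result is the most common one."""
    executed = [(sql, execute(conn, sql)) for sql in candidates]
    valid = [(sql, result) for sql, result in executed if result is not None]
    if not valid:
        return None
    votes = Counter(result for _, result in valid)
    winning_result, _ = votes.most_common(1)[0]
    return next(sql for sql, result in valid if result == winning_result)


def build_demo_db():
    """Hypothetical toy database used only for this example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
        INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Bo', 45), (3, 'Cy', 28);
        """
    )
    return conn


conn = build_demo_db()
candidates = [
    "SELECT name FROM singer WHERE age > 30",
    "SELECT name FROM singers WHERE age > 30",  # wrong table name, fails to execute
    "SELECT name FROM singer WHERE age >= 31",  # same rows as the first candidate
    "SELECT name FROM singer WHERE age > 40",   # different rows
]
chosen = select_by_execution_consistency(conn, candidates)
print(chosen)
```

Voting on execution results rather than on SQL text lets syntactically different but equivalent queries support each other, while queries that raise an error never receive a vote.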

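Experimental Results

Research Questions

RQ1: How does few-shot SQL-PaLM with execution-based consistency perform on Text-to-SQL relative to state-of-the-art fine-tuned and in-context-learning methods?
RQ2: What effect do execution filtering and self-consistency have on SQL-PaLM's accuracy?
RQ3: How does fine-tuning PaLM-2 on Spider data compare with few-shot prompting in terms of robustness and generalization to Spider variants?
RQ4: Does SQL-PaLM maintain its performance across different prompt designs and SQL difficulty levels?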

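Key Findings

Few-shot SQL-PaLM reaches 77.3% TS on Spider dev, exceeding the fine-tuned SOTA by 3.8% and the in-context-learning SOTA by 3.1%.
Fine-tuned SQL-PaLM reaches 78.2% TS on Spider dev, a 4.7% improvement over the previous fine-tuned SOTA.
Ablations show that self-consistency and execution filtering substantially improve performance; removing them lowers TS.
SQL-PaLM shows strong robustness on Spider variants (Spider-SYN, Spider-Realistic, Spider-DK) and generalizes better than the baselines.
With a simple prompt, few-shot SQL-PaLM outperforms zero-shot ChatGPT and few-shot GPT-4 on Spider and remains competitive across difficulty levels.
Compared with baselines (RESDSQL-3B+NatSQL, RASAT+PICARD, PICARD, DIN-SQL, CodeX, GPT-4, and others), SQL-PaLM achieves the highest or near-highest TS while using a simple prompting approach.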


This review was generated by AI and reviewed by human editors.