QUICK REVIEW

[Paper Review] Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

Dawei Gao, Haibin Wang|arXiv (Cornell University)|Aug 29, 2023

Topic Modeling | Citations: 23

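One-Line Summary

This paper benchmarks prompt design for LLM-based Text-to-SQL and proposes DAIL-SQL, which reaches state-of-the-art execution accuracy on Spider with token-efficient prompts; it also analyzes open-source LLMs with supervised fine-tuning.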

ABSTRACT

Large language models (LLMs) have emerged as a new paradigm for Text-to-SQL task. However, the absence of a systematical benchmark inhibits the development of designing effective, efficient and economic LLM-based Text-to-SQL solutions. To address this challenge, in this paper, we first conduct a systematical and extensive comparison over existing prompt engineering methods, including question representation, example selection and example organization, and with these experimental results, we elaborate their pros and cons. Based on these findings, we propose a new integrated solution, named DAIL-SQL, which refreshes the Spider leaderboard with 86.6% execution accuracy and sets a new bar. To explore the potential of open-source LLM, we investigate them in various scenarios, and further enhance their performance with supervised fine-tuning. Our explorations highlight open-source LLMs' potential in Text-to-SQL, as well as the advantages and disadvantages of the supervised fine-tuning. Additionally, towards an efficient and economic LLM-based Text-to-SQL solution, we emphasize the token efficiency in prompt engineering and compare the prior studies under this metric. We hope that our work provides a deeper understanding of Text-to-SQL with LLMs, and inspires further investigations and broad applications.

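Motivation and Objectives

Systematically study prompt design for LLM-based Text-to-SQL across question representation, example selection, and example organization.
Evaluate how effective open-source LLMs are for Text-to-SQL under both in-context learning and supervised fine-tuning.
Propose an integrated, efficient solution (DAIL-SQL) that balances performance against token cost.
Compare prompts on token efficiency to guide practical Text-to-SQL deployment.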

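Proposed Method

Examine five question representations and identify their strengths and weaknesses in zero-shot Text-to-SQL.
Analyze in-context learning under different example selection and organization strategies.
Introduce DAIL-SQL, which consists of DAIL Selection (selects examples by considering the question and the query together) and DAIL Organization (preserves the question-to-SQL (Q-S) mapping while reducing token cost).
Adopt the Code Representation Prompt (CRP) as the question representation, since it carries rich schema and key information.
Explore supervised fine-tuning of open-source LLMs for Text-to-SQL and compare it with in-context learning.
Evaluate token efficiency across the prompt design choices to support cost-effective deployment.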
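
A minimal sketch of how the selection and organization ideas listed above could fit together with a code-style schema representation. Everything concrete here is an assumption made for illustration and is not taken from the paper: the helper names (`select_examples`, `build_prompt`), the Jaccard token overlap used as a stand-in similarity measure, the `query_threshold` value, the toy schema, and the prompt wording.

```python
import re

def tokens(text):
    """Lowercased word tokens; a crude stand-in for an embedding model."""
    return set(re.findall(r"[a-z_]+", text.lower()))

def jaccard(a, b):
    """Jaccard overlap between two token sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def select_examples(question, draft_sql, pool, k=2, query_threshold=0.5):
    """Rank candidates by question similarity, preferring candidates whose
    SQL also resembles a draft prediction for the target question."""
    q_tok, s_tok = tokens(question), tokens(draft_sql)
    scored = []
    for ex in pool:
        q_sim = jaccard(q_tok, tokens(ex["question"]))
        s_sim = jaccard(s_tok, tokens(ex["sql"]))
        # Candidates that pass the query-similarity threshold sort first.
        scored.append((s_sim >= query_threshold, q_sim, ex))
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [ex for _, _, ex in scored[:k]]

def build_prompt(schema_ddl, question, examples):
    """Examples appear as question-SQL pairs only (no per-example schema),
    which keeps the mapping visible while spending fewer tokens. The target
    schema is given once, as CREATE TABLE statements."""
    lines = ["/* Example questions and their SQL queries: */"]
    for ex in examples:
        lines.append(f"/* Answer the following: {ex['question']} */")
        lines.append(ex["sql"])
    lines.append("/* Given the following database schema: */")
    lines.append(schema_ddl)
    lines.append(f"/* Answer the following: {question} */")
    lines.append("SELECT")
    return "\n".join(lines)

# Hypothetical toy data, for illustration only.
pool = [
    {"question": "How many singers are there?",
     "sql": "SELECT count(*) FROM singer"},
    {"question": "List the names of all stadiums.",
     "sql": "SELECT name FROM stadium"},
    {"question": "How many concerts are there?",
     "sql": "SELECT count(*) FROM concert"},
]
schema_ddl = "CREATE TABLE student (\n  id INTEGER PRIMARY KEY,\n  name TEXT\n);"
question = "How many students are there?"
draft_sql = "SELECT count(*) FROM student"  # assumed preliminary prediction

chosen = select_examples(question, draft_sql, pool)
prompt = build_prompt(schema_ddl, question, chosen)
print(prompt)
```

In this toy run the two counting questions are selected and the stadium example is dropped, because it matches neither the target question nor the draft query. A real system would likely replace the token overlap with a learned similarity measure.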

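Experimental Results

Research Questions

RQ1: Which question representation and prompt design choices maximize Text-to-SQL accuracy and efficiency across LLMs?
RQ2: How do open-source LLMs perform on Text-to-SQL under in-context learning and supervised fine-tuning?
RQ3: What is the trade-off between the amount of information in a prompt and its token cost when aiming for high execution accuracy?
RQ4: How does the proposed DAIL-SQL approach compare with the existing state of the art on Spider and related benchmarks?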

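Key Findings

DAIL-SQL achieves 86.2% execution accuracy on Spider, surpassing the previous state of the art (85.3%).
With self-consistency, execution accuracy reaches 86.6%, at additional computational cost.
Open-source LLMs show considerable potential for Text-to-SQL, especially when combined with supervised fine-tuning.
Question representation and example organization strategy strongly affect both performance and token efficiency.
Token-efficient prompt design can deliver strong performance with fewer tokens, which can guide practical deployment.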
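
The self-consistency finding above can be pictured as sampling several candidate queries and keeping the one whose execution result is most common. The sketch below is one plausible reading, not the paper's implementation: voting on execution results, the helper names (`execute`, `vote`), the in-memory SQLite table, and the hard-coded candidate list are all assumptions for illustration. In practice the candidates would come from repeated LLM sampling, which is where the additional cost arises.

```python
import sqlite3
from collections import Counter

def execute(conn, sql):
    """Return an order-insensitive, hashable view of the result rows,
    or None if the query fails to run."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))

def vote(conn, candidates):
    """Return the first candidate whose execution result is the most
    common one among all candidates that executed successfully."""
    results = [execute(conn, sql) for sql in candidates]
    valid = [r for r in results if r is not None]
    if not valid:
        return None
    winner, _ = Counter(valid).most_common(1)[0]
    return candidates[results.index(winner)]

# Hypothetical toy database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO singer (name) VALUES (?)",
                 [("Ann",), ("Bo",), ("Cy",)])

# Stand-ins for several sampled LLM outputs.
candidates = [
    "SELECT count(*) FROM singer",
    "SELECT count(id) FROM singer",
    "SELECT count(*) FROM singers",    # fails: no such table
    "SELECT max(id) - 1 FROM singer",  # runs, but gives a different answer
]
best = vote(conn, candidates)
print(best)
```

Two of the four candidates agree on the result, so the first of them is returned; the failing query is ignored and the outlier is outvoted.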

This review was generated by AI and checked by a human editor.