5 months ago

CHESS: Contextual Harnessing for Efficient SQL Synthesis

Shayan Talaei, Mohammadreza Pourreza, Yu-Chen Chang, Azalia Mirhoseini, Amin Saberi

Abstract

Translating natural language questions into SQL queries, known as text-to-SQL, is a long-standing research problem. Effective text-to-SQL synthesis can become very challenging due to (i) the extensive size of database catalogs (descriptions of tables and their columns) and database values, (ii) reasoning over large database schemas, (iii) ensuring the functional validity of the generated queries, and (iv) navigating the ambiguities of natural language questions. We introduce CHESS, a Large Language Model (LLM) based multi-agent framework for efficient and scalable SQL synthesis, comprising four specialized agents, each targeting one of the aforementioned challenges: the Information Retriever (IR) extracts relevant data, the Schema Selector (SS) prunes large schemas, the Candidate Generator (CG) generates high-quality candidates and refines queries iteratively, and the Unit Tester (UT) validates queries through LLM-based natural language unit tests. Our framework offers configurable features that adapt to various deployment constraints, including: 1) Supporting industrial-scale databases: leveraging the Schema Selector agent, CHESS efficiently narrows down very large database schemas into manageable sub-schemas, boosting system accuracy by approximately $2\%$ and reducing the number of LLM tokens by a factor of $5$. 2) State-of-the-art privacy-preserving performance: among the methods using open-source models, CHESS achieves state-of-the-art performance, resulting in a high-performing, privacy-preserving system suitable for industrial deployment. 3) Scalability with additional compute budget: in settings with high computational budgets, CHESS achieves $71.10\%$ accuracy on the BIRD test set, within $2\%$ of the leading proprietary method, while requiring approximately $83\%$ fewer LLM calls.
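The agent flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function name below is hypothetical, schema selection is replaced by simple keyword matching (CHESS uses an LLM for this), the candidate generator is an injected callable standing in for an LLM, unit tests are plain Python predicates over result rows rather than LLM-based natural language tests, and the Information Retriever step is omitted for brevity.

```python
import sqlite3
from typing import Callable, Dict, List, Optional

Schema = Dict[str, List[str]]  # table name -> column names


def read_schema(conn: sqlite3.Connection) -> Schema:
    """Collect table and column names from a SQLite database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    return {t: [col[1] for col in conn.execute(f"PRAGMA table_info({t})")]
            for t in tables}


def select_schema(question: str, schema: Schema) -> Schema:
    """Toy schema selection (assumption): keep a table if its name or
    one of its column names appears as a word in the question."""
    words = set(question.lower().replace("?", " ").split())
    return {t: cols for t, cols in schema.items()
            if t.lower() in words or any(c.lower() in words for c in cols)}


def pick_query(conn: sqlite3.Connection,
               candidates: List[str],
               tests: List[Callable[[list], bool]]) -> Optional[str]:
    """Toy unit testing (assumption): return the first candidate that
    executes without error and satisfies every predicate in `tests`."""
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # invalid SQL: skip this candidate
        if all(test(rows) for test in tests):
            return sql
    return None


def answer(question: str,
           conn: sqlite3.Connection,
           generate: Callable[[str, Schema], List[str]],
           tests: List[Callable[[list], bool]]) -> Optional[str]:
    """Prune the schema, generate candidates, then validate them."""
    sub_schema = select_schema(question, read_schema(conn))
    candidates = generate(question, sub_schema)
    return pick_query(conn, candidates, tests)
```

In this sketch, `generate` would wrap a call to whichever LLM is in use and receives only the pruned sub-schema, which is where the token savings described in the abstract would come from. Candidates that fail to execute are skipped, so a misspelled column name in one candidate does not prevent a later valid candidate from being chosen.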

Code Repositories

shayantalaei/chess (Official, pytorch)
yeounoh/lc_nl2sql (Mentioned in GitHub)

Benchmarks

Benchmark: text-to-sql-on-bird-big-bench-for-large-scale
Methodology: CHESS
Metrics: Execution Accuracy % (Dev): 65; Execution Accuracy % (Test): 66.69
