3 months ago

The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models

Karime Maamari, Fadhil Abubaker, Daniel Jaroslawicz, Amine Mhedhbi

Abstract

Schema linking is a crucial step in Text-to-SQL pipelines. Its goal is to retrieve the tables and columns of a target database that are relevant to a user's query while disregarding irrelevant ones. However, imperfect schema linking can often exclude columns that are required for accurate query generation. In this work, we revisit schema linking when using the latest generation of large language models (LLMs). We find empirically that newer models are adept at utilizing relevant schema elements during generation, even in the presence of large numbers of irrelevant ones. As such, our Text-to-SQL pipeline entirely forgoes schema linking whenever the schema fits within the model's context window, in order to minimize errors caused by filtering out required schema elements. Furthermore, instead of filtering contextual information, we highlight techniques such as augmentation, selection, and correction, and adopt them to improve the accuracy of our Text-to-SQL pipeline. Our approach ranks first on the BIRD benchmark, achieving an accuracy of 71.83%.
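The decision described in the abstract (pass the full schema when it fits the context window, filter only otherwise) can be illustrated with a minimal sketch. This is not the authors' pipeline: the names `Table`, `render_schema`, `estimate_tokens`, and `build_prompt` are hypothetical, the 4-characters-per-token estimate is a crude stand-in for a real tokenizer, and the keyword filter is only a placeholder for an actual schema-linking step.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    columns: list[str]


def render_schema(tables: list[Table]) -> str:
    """Serialize every given table and column; nothing is filtered here."""
    return "\n".join(f"{t.name}({', '.join(t.columns)})" for t in tables)


def estimate_tokens(text: str) -> int:
    """Assumption: roughly 4 characters per token (not a real tokenizer)."""
    return (len(text) + 3) // 4


def build_prompt(question: str, tables: list[Table], context_budget: int) -> tuple[str, bool]:
    """Return (prompt, used_full_schema).

    If the full schema plus the question fits the budget, include the
    schema unfiltered. Otherwise fall back to a naive keyword filter,
    which stands in for a real schema-linking step.
    """
    full = render_schema(tables)
    if estimate_tokens(full) + estimate_tokens(question) <= context_budget:
        return f"Schema:\n{full}\n\nQuestion: {question}\nSQL:", True

    # Placeholder linking: keep tables whose name appears in the question.
    words = {w.strip("?.,").lower() for w in question.split()}
    kept = [t for t in tables if t.name.lower() in words]
    return f"Schema:\n{render_schema(kept)}\n\nQuestion: {question}\nSQL:", False


if __name__ == "__main__":
    demo = [
        Table("users", ["id", "name"]),
        Table("orders", ["id", "user_id", "total"]),
    ]
    prompt, used_full = build_prompt("How many orders are there?", demo, 1000)
    print(used_full)
    print(prompt)
```

With a generous budget the prompt carries both tables, including the irrelevant `users` table; with a very small budget only the keyword-matched table survives, which is exactly where a required column could be lost.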

Benchmarks

Benchmark: text-to-sql-on-bird-big-bench-for-large-scale
Methodology: Distillery + GPT-4o
Metrics:
Execution Accuracy % (Dev): 67.21
Execution Accuracy % (Test): 71.83
