KaggleDBQA: Realistic Evaluation of Text-to-SQL Parsers

Chia-Hsuan Lee, Oleksandr Polozov, Matthew Richardson

Abstract

The goal of database question answering is to enable natural language querying of real-life relational databases in diverse application domains. Recently, large-scale datasets such as Spider and WikiSQL facilitated novel modeling techniques for text-to-SQL parsing, improving zero-shot generalization to unseen databases. In this work, we examine the challenges that still prevent these techniques from practical deployment. First, we present KaggleDBQA, a new cross-domain evaluation dataset of real Web databases, with domain-specific data types, original formatting, and unrestricted questions. Second, we re-examine the choice of evaluation tasks for text-to-SQL parsers as applied in real-life settings. Finally, we augment our in-domain evaluation task with database documentation, a naturally occurring source of implicit domain knowledge. We show that KaggleDBQA presents a challenge to state-of-the-art zero-shot parsers, but that a more realistic evaluation setting and creative use of associated database documentation boost their accuracy by over 13.2%, doubling their performance.

Benchmarks

Benchmark                    Methodology    Metrics
text-to-sql-on-kaggledbqa    RAT-SQL        Exact Match (EM): 26.77
text-to-sql-on-kaggledbqa    Edit-SQL       Exact Match (EM): 11.73
