4 months ago

Evaluating and Enhancing LLMs for Multi-turn Text-to-SQL with Multiple Question Types

Guo Ziming ; Ma Chao ; Sun Yinggang ; Zhao Tiancheng ; Wang Guangyao ; Huang Hai

Abstract

Recent advancements in large language models (LLMs) have significantly advanced text-to-SQL systems. However, most LLM-based methods often narrowly focus on SQL generation, neglecting the complexities of real-world conversational queries. This oversight can lead to unreliable responses, particularly for ambiguous questions that cannot be directly addressed with SQL. To bridge this gap, we propose MMSQL, a comprehensive test suite designed to evaluate the question classification and SQL generation capabilities of LLMs by simulating real-world scenarios with diverse question types and multi-turn Q&A interactions. Using MMSQL, we assessed the performance of popular LLMs, including both open-source and closed-source models, and identified key factors impacting their performance in such scenarios. Moreover, we introduce an LLM-based multi-agent framework that employs specialized agents to identify question types and determine appropriate answering strategies. Our experiments demonstrate that this approach significantly enhances the model's ability to navigate the complexities of conversational dynamics, effectively handling the diverse and complex nature of user queries. Our dataset and code are publicly available at https://mcxiaoxiao.github.io/MMSQL.
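The classify-then-answer idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' framework: the label names (other than "ambiguous", which the abstract mentions), the `Reply` type, the keyword-based `toy_classifier`, and `stub_sql_generator` are all hypothetical stand-ins for LLM-backed agents, used only to show how a question type might select an answering strategy across turns.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical question-type labels; only "ambiguous" appears in the abstract.
ANSWERABLE = "answerable"
AMBIGUOUS = "ambiguous"
UNANSWERABLE = "unanswerable"


@dataclass
class Reply:
    question_type: str
    strategy: str  # "sql", "clarify", or "explain"
    content: str


def toy_classifier(question: str, history: List[str]) -> str:
    """Keyword stand-in for an LLM-based classification agent (assumption)."""
    text = question.lower()
    if "best" in text or "recent" in text:
        return AMBIGUOUS
    if "weather" in text:
        return UNANSWERABLE
    return ANSWERABLE


def stub_sql_generator(question: str, history: List[str]) -> str:
    """Placeholder for an LLM-based SQL generation agent (assumption)."""
    return "SELECT 1"


def answer_turn(
    question: str,
    history: List[str],
    classify: Callable[[str, List[str]], str],
    generate_sql: Callable[[str, List[str]], str],
) -> Reply:
    """Classify the question first, then pick an answering strategy."""
    question_type = classify(question, history)
    if question_type == ANSWERABLE:
        reply = Reply(question_type, "sql", generate_sql(question, history))
    elif question_type == AMBIGUOUS:
        reply = Reply(question_type, "clarify", "Could you clarify what you mean?")
    else:
        reply = Reply(
            question_type, "explain", "This cannot be answered from the database."
        )
    # Record the turn so later questions can be interpreted in context.
    history.append(question)
    return reply
```

In a real system both callables would wrap model calls; keeping them as injected functions makes the routing logic testable without any model access.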

Code Repositories

mcxiaoxiao/MMSQL (official; mentioned in GitHub)

Benchmarks

Benchmark                     Methodology         Metrics
mmsql-performance-on-mmsql    SQLCoder-8B         TDEX: 30.7
mmsql-performance-on-mmsql    Gemini-1.5 Flash    TDEX: 65.8
mmsql-performance-on-mmsql    Llama3-8B           TDEX: 64.0
mmsql-performance-on-mmsql    GPT-4 Turbo         TDEX: 67.0
mmsql-performance-on-mmsql    Llama3-70B          TDEX: 62.8
mmsql-performance-on-mmsql    GPT-3.5 Turbo       TDEX: 64.1
