Text-to-SQL on BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation)
Evaluation Metrics
Execution Accuracy % (Dev)
Execution Accuracy % (Test)
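
Execution accuracy is usually understood as the share of questions for which the predicted SQL returns the same result as the reference SQL when both are run against the database. The following is a minimal sketch of that idea, not the official BIRD evaluation script. The helper names `execution_match` and `execution_accuracy`, the toy `singer` table, and the order-insensitive set comparison of rows are all assumptions made for illustration.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same set of rows (assumed criterion)."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A predicted query that fails to execute is counted as incorrect.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Order-insensitive comparison; duplicates are collapsed by set().
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn, pairs):
    """Percentage of (predicted, gold) query pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * hits / len(pairs)


# Hypothetical toy database, used only to exercise the functions above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?, ?)",
    [(1, "Ann", 30), (2, "Bob", 45), (3, "Cy", 52)],
)

pairs = [
    # Different SQL text, same result: counted as correct.
    ("SELECT name FROM singer WHERE age > 40",
     "SELECT name FROM singer WHERE age >= 41"),
    # Runs, but returns extra rows: incorrect.
    ("SELECT name FROM singer",
     "SELECT name FROM singer WHERE age > 40"),
    # Misspelled column, fails to execute: incorrect.
    ("SELECT nme FROM singer",
     "SELECT name FROM singer"),
    # Equivalent aggregates: correct.
    ("SELECT COUNT(*) FROM singer",
     "SELECT COUNT(id) FROM singer"),
]

score = execution_accuracy(conn, pairs)
print(f"{score:.2f}")
```

Because the comparison is on query results rather than SQL text, two differently written queries can both count as correct. A real evaluation harness would typically also add per-query timeouts and run each question against its own database file, which this sketch omits.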
Evaluation Results
Performance of each model on this benchmark
| Model | Execution Accuracy % (Dev) | Execution Accuracy % (Test) | Paper Title | Repository |
|---|---|---|---|---|
| DSAIR + GPT-4o | 74.32 | 74.12 | - | - |
| XiYan-SQL | 73.34 | 75.63 | A Preview of XiYan-SQL: A Multi-Generator Ensemble Framework for Text-to-SQL | |
| CHASE-SQL + Gemini | 73.14 | 74.06 | CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL | - |
| ExSL + granite-34b-code | 72.43 | 73.17 | - | - |
| Insights AI | 72.16 | 70.26 | - | - |
| OpenSearch-SQL+ v2 + GPT-4o | 69.3 | 72.28 | - | - |
| PURPLE + RED + GPT-4o | 68.12 | 70.21 | - | - |
| Arcwise + GPT-4o | 67.99 | 66.21 | - | - |
| Distillery + GPT-4o | 67.21 | 71.83 | The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models | - |
| RECAP + Gemini | 66.95 | 69.03 | - | - |
| MSL-SQL + DeepSeek-V2.5 | 66.82 | 64.00 | - | - |
| MSc-SQL | 65.6 | - | MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation | |
| ByteBrain | 65.45 | 68.87 | - | - |
| ExSL + granite-20b-code | 65.38 | 67.86 | - | - |
| CHESS | 65 | 66.69 | CHESS: Contextual Harnessing for Efficient SQL Synthesis | |
| SCL-SQL | 64.73 | 65.23 | - | - |
| SFT CodeS-15B + SQLFixAgent | 64.62 | - | - | - |
| MCS-SQL + GPT-4 | 63.36 | 65.45 | - | - |
| PURPLE + GPT-4o | 62.97 | 64.51 | - | - |
| GRA-SQL | 62.58 | 63.22 | - | - |