
UNITE: A Unified Benchmark for Text-to-SQL Evaluation (2305.16265v3)

Published 25 May 2023 in cs.CL

Abstract: A practical text-to-SQL system should generalize well on a wide variety of natural language questions, unseen database schemas, and novel SQL query structures. To comprehensively evaluate text-to-SQL systems, we introduce a UNIfied benchmark for Text-to-SQL Evaluation (UNITE). It is composed of publicly available text-to-SQL datasets, containing natural language questions from more than 12 domains, SQL queries from more than 3.9K patterns, and 29K databases. Compared to the widely used Spider benchmark, we introduce $\sim$120K additional examples and a threefold increase in SQL patterns, such as comparative and boolean questions. We conduct a systematic study of six state-of-the-art (SOTA) text-to-SQL parsers on our new benchmark and show that: 1) Codex performs surprisingly well on out-of-domain datasets; 2) specially designed decoding methods (e.g., constrained beam search) can improve performance in both in-domain and out-of-domain settings; 3) explicitly modeling the relationship between questions and schemas further improves the Seq2Seq models. More importantly, our benchmark presents key challenges in compositional generalization and robustness, which these SOTA models cannot address well. Our code and data processing script are available at https://github.com/awslabs/unified-text2sql-benchmark

Authors (18)
  1. Wuwei Lan (12 papers)
  2. Zhiguo Wang (100 papers)
  3. Anuj Chauhan (3 papers)
  4. Henghui Zhu (24 papers)
  5. Alexander Li (6 papers)
  6. Jiang Guo (22 papers)
  7. Sheng Zhang (212 papers)
  8. Chung-Wei Hang (14 papers)
  9. Joseph Lilien (2 papers)
  10. Yiqun Hu (8 papers)
  11. Lin Pan (23 papers)
  12. Mingwen Dong (6 papers)
  13. Jun Wang (990 papers)
  14. Jiarong Jiang (8 papers)
  15. Stephen Ash (3 papers)
  16. Vittorio Castelli (24 papers)
  17. Patrick Ng (29 papers)
  18. Bing Xiang (74 papers)
Citations (7)