Solid-SQL: Enhanced Schema-linking based In-context Learning for Robust Text-to-SQL (2412.12522v1)

Published 17 Dec 2024 in cs.CL and cs.AI

Abstract: Recently, LLMs have significantly improved the performance of text-to-SQL systems. Nevertheless, many state-of-the-art (SOTA) approaches have overlooked the critical aspect of system robustness. Our experiments reveal that while LLM-driven methods excel on standard datasets, their accuracy is notably compromised when faced with adversarial perturbations. To address this challenge, we propose a robust text-to-SQL solution, called Solid-SQL, designed to integrate with various LLMs. We focus on the pre-processing stage, training a robust schema-linking model enhanced by LLM-based data augmentation. Additionally, we design a two-round, structural similarity-based example retrieval strategy for in-context learning. Our method achieves SOTA SQL execution accuracy levels of 82.1% and 58.9% on the general Spider and Bird benchmarks, respectively. Furthermore, experimental results show that Solid-SQL delivers an average improvement of 11.6% compared to baselines on the perturbed Spider-Syn, Spider-Realistic, and Dr. Spider benchmarks.
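The abstract describes the retrieval strategy only at a high level. The sketch below is a minimal, hypothetical illustration of what a two-round, structural-similarity-based example retrieval for in-context text-to-SQL could look like; the skeletonization heuristic, the `draft_sql_fn` and `question_sim_fn` hooks, and the shortlist sizes `k1`/`k2` are assumptions introduced for illustration, not the paper's actual implementation.

```python
import re
from difflib import SequenceMatcher

# Keywords kept when skeletonizing a query; everything else is masked.
SQL_KEYWORDS = {
    "select", "from", "where", "group", "by", "order", "join", "on",
    "and", "or", "not", "in", "limit", "having", "count", "avg", "max",
    "min", "sum", "distinct", "as", "desc", "asc", "like", "between",
}

def sql_skeleton(sql: str) -> str:
    """Reduce a SQL query to a structural skeleton: keep keywords and
    punctuation, mask table/column names and literals with '_'."""
    tokens = re.findall(r"\w+|[^\w\s]", sql.lower())
    return " ".join(
        t if (t in SQL_KEYWORDS or not t[0].isalnum()) else "_" for t in tokens
    )

def structural_similarity(sql_a: str, sql_b: str) -> float:
    """Similarity in [0, 1] between the skeletons of two SQL queries."""
    return SequenceMatcher(None, sql_skeleton(sql_a), sql_skeleton(sql_b)).ratio()

def two_round_retrieve(question, pool, draft_sql_fn, question_sim_fn, k1=16, k2=4):
    """Round 1: shortlist candidate examples by question similarity.
    Round 2: re-rank the shortlist by structural similarity between a
    draft SQL for the target question and each example's gold SQL.
    (Hypothetical sketch; hooks and sizes are illustrative assumptions.)"""
    round1 = sorted(
        pool, key=lambda ex: question_sim_fn(question, ex["question"]), reverse=True
    )[:k1]
    draft = draft_sql_fn(question)  # e.g. a zero-shot LLM-generated draft query
    round2 = sorted(
        round1, key=lambda ex: structural_similarity(draft, ex["sql"]), reverse=True
    )[:k2]
    return round2  # few-shot demonstrations for the final prompt
```

The intuition behind such a scheme is that examples whose gold SQL shares a skeleton (same clauses, joins, and aggregations) with a rough draft query tend to be more informative demonstrations than examples that merely resemble the question wording.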

Authors (8)
  1. Geling Liu (1 paper)
  2. Yunzhi Tan (3 papers)
  3. Ruichao Zhong (3 papers)
  4. Yuanzhen Xie (8 papers)
  5. Lingchen Zhao (13 papers)
  6. Qian Wang (453 papers)
  7. Bo Hu (110 papers)
  8. Zang Li (15 papers)