$R^3$: "This is My SQL, Are You With Me?" A Consensus-Based Multi-Agent System for Text-to-SQL Tasks (2402.14851v2)

Published 20 Feb 2024 in cs.CL, cs.AI, and cs.DB

Abstract: LLMs have demonstrated strong performance on various tasks. To unleash their power on the Text-to-SQL task, we propose $R^3$ (Review-Rebuttal-Revision), a consensus-based multi-agent system for Text-to-SQL tasks. $R^3$ outperforms the existing single LLM Text-to-SQL systems as well as the multi-agent Text-to-SQL systems by $1.3\%$ to $8.1\%$ on Spider and Bird. Surprisingly, we find that for Llama-3-8B, $R^3$ outperforms chain-of-thought prompting by over 20\%, even outperforming GPT-3.5 on the development set of Spider.
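The abstract describes a Review-Rebuttal-Revision loop in which a writer agent drafts a SQL query and reviewer agents critique it until consensus is reached. The paper's exact prompts, agent roles, and stopping rule are not reproduced here, so the following is only a minimal sketch of how such a loop might be organized: `call_llm` is a hypothetical stand-in for any chat-completion client, and the prompt wording, reviewer count, and consensus criterion are assumptions rather than the authors' implementation.

```python
# Minimal sketch of a Review-Rebuttal-Revision consensus loop for Text-to-SQL.
# Assumptions: `call_llm` is a hypothetical wrapper around a chat-completion
# API; the prompts, number of reviewers, and consensus rule below are
# illustrative only and are NOT taken from the paper.
from typing import Callable


def call_llm(prompt: str) -> str:
    """Hypothetical LLM client; replace with a real chat-completion call."""
    raise NotImplementedError


def r3_sketch(question: str, schema: str,
              llm: Callable[[str], str] = call_llm,
              n_reviewers: int = 3, max_rounds: int = 3) -> str:
    # Writer agent drafts an initial SQL query from the question and schema.
    sql = llm(f"Schema:\n{schema}\n\nQuestion: {question}\nWrite a SQL query.")

    for _ in range(max_rounds):
        # Review: each reviewer independently accepts or critiques the draft.
        reviews = [
            llm(f"Schema:\n{schema}\nQuestion: {question}\nSQL: {sql}\n"
                "Reply 'ACCEPT' if the SQL answers the question; otherwise "
                "explain what is wrong.")
            for _ in range(n_reviewers)
        ]
        objections = [r for r in reviews
                      if not r.strip().upper().startswith("ACCEPT")]

        # Consensus: stop once every reviewer accepts the current draft.
        if not objections:
            return sql

        # Rebuttal/Revision: the writer answers the critiques and revises.
        sql = llm(f"Schema:\n{schema}\nQuestion: {question}\n"
                  f"Current SQL: {sql}\nReviewer objections:\n"
                  + "\n".join(objections)
                  + "\nRebut any objections you disagree with, then output "
                    "a revised SQL query.")

    return sql  # best draft after the round limit
```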
