Decomposition for Enhancing Attention: Improving LLM-based Text-to-SQL through Workflow Paradigm (2402.10671v3)

Published 16 Feb 2024 in cs.CL

Abstract: In-context learning with large language models (LLMs) has achieved remarkable success in natural language processing, yet extensive case studies reveal that single-step chain-of-thought prompting suffers from attention diffusion and inadequate performance on complex tasks such as text-to-SQL. To improve the in-context learning capabilities of LLMs for text-to-SQL, a workflow paradigm is proposed that enhances the attention and problem-solving scope of LLMs through decomposition. Specifically, an information determination module that eliminates redundant information and a new prompt structure based on problem classification significantly sharpen the model's attention. In addition, self-correction and active learning modules expand the problem-solving scope of LLMs, raising the upper limit of LLM-based approaches. Extensive experiments on three datasets demonstrate that the approach outperforms other methods by a significant margin, yielding roughly 2-3 percentage point improvements over the existing baseline on the Spider Dev, Spider-Realistic, and Bird Dev datasets and new SOTA results on the Spider Test set. The code is available on GitHub: https://github.com/FlyingFeather/DEA-SQL
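
The abstract outlines a decomposed workflow: an information determination step that prunes redundant schema items, a classification-driven prompt structure, SQL generation, and self-correction (with active learning used to broaden the demonstrations). Below is a minimal sketch of such a pipeline; all function names, prompt wordings, and the ask_llm helper are illustrative assumptions rather than the authors' actual DEA-SQL implementation, which lives in the linked repository.

    # Hypothetical sketch of a workflow-paradigm text-to-SQL pipeline.
    # None of these names come from the DEA-SQL codebase; ask_llm stands
    # in for any chat-completion API the reader prefers.

    from typing import Callable, Dict, List

    AskLLM = Callable[[str], str]  # prompt -> model completion

    def filter_schema(ask_llm: AskLLM, question: str,
                      schema: Dict[str, List[str]]) -> str:
        """Information determination: keep only the tables and columns the
        question needs, so redundant schema items do not diffuse attention."""
        return ask_llm(
            f"Schema: {schema}\nQuestion: {question}\n"
            "Return only the tables and columns required to answer it."
        )

    def classify_question(ask_llm: AskLLM, question: str) -> str:
        """Problem classification: route the question to a class-specific
        prompt (the class labels here are assumptions for illustration)."""
        return ask_llm(
            f"Question: {question}\nClassify as one of: EASY, JOIN, NESTED."
        ).strip()

    # Class-specific few-shot demonstrations; in the paper, an active
    # learning module helps extend coverage of hard cases.
    FEW_SHOT: Dict[str, str] = {"EASY": "...", "JOIN": "...", "NESTED": "..."}

    def generate_sql(ask_llm: AskLLM, question: str,
                     pruned_schema: str, q_class: str) -> str:
        """Generate SQL using the prompt structure chosen by the class."""
        return ask_llm(
            f"{FEW_SHOT.get(q_class, '')}\nSchema: {pruned_schema}\n"
            f"Question: {question}\nWrite the SQL query:"
        )

    def self_correct(ask_llm: AskLLM, question: str,
                     pruned_schema: str, sql: str) -> str:
        """Self-correction: ask the model to review its own SQL for common
        errors (wrong columns, missing joins) and return a fixed version."""
        return ask_llm(
            f"Schema: {pruned_schema}\nQuestion: {question}\nSQL: {sql}\n"
            "Check this query for mistakes and return a corrected version."
        )

    def text_to_sql(ask_llm: AskLLM, question: str,
                    schema: Dict[str, List[str]]) -> str:
        """Run the full decomposed workflow for one question."""
        pruned = filter_schema(ask_llm, question, schema)
        q_class = classify_question(ask_llm, question)
        sql = generate_sql(ask_llm, question, pruned, q_class)
        return self_correct(ask_llm, question, pruned, sql)

The point of the decomposition is that each LLM call sees only the information relevant to its sub-task, which is the attention-sharpening effect the abstract attributes to the workflow paradigm.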

Authors (10)
  1. Yuanzhen Xie
  2. Xinzhou Jin
  3. Tao Xie
  4. MingXiong Lin
  5. Liang Chen
  6. Chenyun Yu
  7. Lei Cheng
  8. Bo Hu
  9. Zang Li
  10. Chengxiang Zhuo
Citations (11)