
Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries (2312.13671v1)

Published 21 Dec 2023 in cs.CL and cs.LG

Abstract: Tabular data analysis is crucial in various fields, and LLMs show promise in this area. However, current research mostly focuses on rudimentary tasks like Text2SQL and TableQA, neglecting advanced analysis like forecasting and chart generation. To address this gap, we developed the Text2Analysis benchmark, incorporating advanced analysis tasks that go beyond SQL-compatible operations and require more in-depth analysis. We also developed five innovative and effective annotation methods, harnessing the capabilities of LLMs to enhance data quality and quantity. Additionally, we include unclear queries that resemble real-world user questions to test how well models can understand and tackle such challenges. In total, we collected 2249 query-result pairs with 347 tables. We evaluate five state-of-the-art models using three different metrics, and the results show that our benchmark presents a considerable challenge in the field of tabular data analysis, paving the way for more advanced research opportunities.
