Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study (2404.17136v1)

Published 26 Apr 2024 in cs.DB, cs.AI, and cs.CL

Abstract: The Natural Language to Visualization (NL2Vis) task aims to transform natural-language descriptions into visual representations of a grounded table, enabling users to gain insights from vast amounts of data. Recently, many deep-learning-based approaches have been developed for NL2Vis. Despite these considerable efforts, challenges persist in visualizing data sourced from unseen databases or spanning multiple tables. Taking inspiration from the remarkable generation capabilities of large language models (LLMs), this paper conducts an empirical study to evaluate their potential for generating visualizations and explores the effectiveness of in-context learning prompts for enhancing this task. In particular, we first explore ways of transforming structured tabular data into sequential text prompts so as to feed them into LLMs, and analyze which table content contributes most to NL2Vis. Our findings suggest that transforming structured tabular data into programs is effective, and that it is essential to consider the table schema when formulating prompts. Furthermore, we evaluate two types of LLMs, fine-tuned models (e.g., T5-Small) and inference-only models (e.g., GPT-3.5), against state-of-the-art methods on the NL2Vis benchmark nvBench. The experimental results reveal that LLMs outperform the baselines, with inference-only models exhibiting consistent performance improvements and at times even surpassing fine-tuned models when provided with certain few-shot demonstrations through in-context learning. Finally, we analyze when LLMs fail in NL2Vis and propose iteratively updating the results using strategies such as chain-of-thought, role-playing, and code-interpreter. The experimental results confirm the efficacy of these iterative updates, which hold great potential for future study.
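The abstract's key prompt-formulation finding is that serializing a table as a program, with the schema included, works better than flat text linearization. The following Python sketch is a minimal, hypothetical illustration of that idea; the `Table(...)` pseudo-syntax, the sample table, and the "Visualize BAR SELECT ..." answer format (the VQL style used by nvBench-like benchmarks) are illustrative assumptions, not the authors' actual templates.

```python
# A minimal sketch (not the authors' code) of the "table-as-program"
# prompt formulation that the study found effective. The Table(...)
# pseudo-syntax, the sample data, and the VQL-style answer format are
# illustrative assumptions.

def table_to_program(table_name, columns, rows):
    """Serialize a table's schema and a few sample rows as program-like
    text, which the paper reports works better than flat linearization."""
    lines = [f"# Table: {table_name}"]
    col_defs = ", ".join(f"{name}: {dtype}" for name, dtype in columns.items())
    lines.append(f"{table_name} = Table({col_defs})")
    for row in rows:  # a handful of rows serve as content hints
        values = ", ".join(repr(v) for v in row)
        lines.append(f"{table_name}.insert({values})")
    return "\n".join(lines)

def build_nl2vis_prompt(nl_query, table_program):
    """Assemble a zero-shot NL2Vis prompt: schema-as-program plus the
    natural-language question, requesting a visualization query."""
    return (
        "Given the following table:\n"
        f"{table_program}\n\n"
        f"Question: {nl_query}\n"
        "Answer with a visualization query such as "
        "'Visualize BAR SELECT department, AVG(salary) ...'."
    )

if __name__ == "__main__":
    program = table_to_program(
        "employees",
        {"name": "text", "department": "text", "salary": "number"},
        [("Alice", "Sales", 50000), ("Bob", "Engineering", 70000)],
    )
    print(build_nl2vis_prompt(
        "Show the average salary per department as a bar chart.", program))
```

The resulting string would be sent to a fine-tuned or inference-only LLM; a few-shot variant would prepend several (table, question, query) demonstrations before the target question, which is the in-context learning setting the study evaluates.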

Authors (8)
  1. Yang Wu (175 papers)
  2. Yao Wan (70 papers)
  3. Hongyu Zhang (147 papers)
  4. Yulei Sui (29 papers)
  5. Wucai Wei (1 paper)
  6. Wei Zhao (309 papers)
  7. Guandong Xu (93 papers)
  8. Hai Jin (83 papers)