
Structsum Generation for Faster Text Comprehension (2401.06837v2)

Published 12 Jan 2024 in cs.CL and cs.AI

Abstract: We consider the task of generating structured representations of text using LLMs. We focus on tables and mind maps as representative modalities. Tables offer a more organized way of representing data, while mind maps provide a visually dynamic and flexible approach, particularly suitable for sparse content. Despite the effectiveness of LLMs on different tasks, we show that current models struggle with generating structured outputs. In response, we present effective prompting strategies for both of these tasks. We introduce a taxonomy of problems around factuality and global and local structure, common to both modalities, and propose a set of critiques to tackle these issues, resulting in an absolute improvement in accuracy of +37pp (79%) for mind maps and +15pp (78%) for tables. To evaluate the semantic coverage of the generated structured representations we propose Auto-QA, and we verify the adequacy of Auto-QA using the SQuAD dataset. We further evaluate the usefulness of structured representations via a text comprehension user study. The results show a significant reduction in comprehension time compared to plain text when using tables (42.9%) and mind maps (31.9%), without loss in accuracy.
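The Auto-QA evaluation described in the abstract can be pictured as a question-answering loop over the generated structure: generate questions grounded in the source text, answer them using only the table or mind map, and report the fraction answered correctly. The sketch below is an illustrative Python implementation under assumptions not stated in the paper: `llm` stands for any prompt-to-text callable, and the prompt wording, number of questions, and loose string-match scoring are placeholders rather than the authors' actual setup.

```python
# Illustrative Auto-QA-style coverage check (not the paper's exact method).
# Assumption: `llm` is any callable that maps a prompt string to a completion.

from typing import Callable, List, Tuple


def generate_qa_pairs(source_text: str, llm: Callable[[str], str], n: int = 10) -> List[Tuple[str, str]]:
    """Ask the LLM to write question-answer pairs grounded in the source text."""
    prompt = (
        f"Write {n} question-answer pairs that can be answered from the text below.\n"
        "Format each pair as 'Q: ...' followed by 'A: ...'.\n\n" + source_text
    )
    raw = llm(prompt)
    pairs = []
    for block in raw.split("Q:")[1:]:
        if "A:" in block:
            question, answer = block.split("A:", 1)
            pairs.append((question.strip(), answer.strip()))
    return pairs


def answer_from_structure(question: str, structure: str, llm: Callable[[str], str]) -> str:
    """Answer a question using only the structured representation (table or mind map)."""
    prompt = (
        "Answer the question using only the structured representation below. "
        "If the answer is not present, reply 'unanswerable'.\n\n"
        f"{structure}\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt).strip()


def auto_qa_coverage(source_text: str, structure: str, llm: Callable[[str], str]) -> float:
    """Fraction of source-grounded questions answerable from the structure."""
    pairs = generate_qa_pairs(source_text, llm)
    if not pairs:
        return 0.0
    hits = 0
    for question, gold in pairs:
        prediction = answer_from_structure(question, structure, llm)
        # Loose containment match stands in for the paper's answer-matching metric.
        if gold.lower() in prediction.lower() or prediction.lower() in gold.lower():
            hits += 1
    return hits / len(pairs)
```

Under this reading, a coverage score near 1.0 would suggest the table or mind map preserves most of the answerable content of the source text, which is the sense in which the paper uses Auto-QA to measure semantic coverage.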
