Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization (2408.15801v1)

Published 28 Aug 2024 in cs.CL

Abstract: In an era where digital text is proliferating at an unprecedented rate, efficient summarization tools are becoming indispensable. While LLMs have been successfully applied in various NLP tasks, their role in extractive text summarization remains underexplored. This paper introduces EYEGLAXS (Easy Yet Efficient LLM for eXtractive Summarization), a framework that leverages LLMs, specifically LLAMA2-7B and ChatGLM2-6B, for extractive summarization of lengthy text documents. Instead of abstractive methods, which often suffer from issues like factual inaccuracies and hallucinations, EYEGLAXS focuses on extractive summarization to ensure factual and grammatical integrity. Utilizing state-of-the-art techniques such as Flash Attention and Parameter-Efficient Fine-Tuning (PEFT), EYEGLAXS addresses the computational and resource challenges typically associated with LLMs. The system sets new performance benchmarks on well-known datasets like PubMed and ArXiv. Furthermore, we extend our research through additional analyses that explore the adaptability of LLMs in handling different sequence lengths and their efficiency in training on smaller datasets. These contributions not only set a new standard in the field but also open up promising avenues for future research in extractive text summarization.
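The abstract names the main ingredients of the framework: an LLM backbone (LLAMA2-7B or ChatGLM2-6B), Flash Attention to handle long inputs, and Parameter-Efficient Fine-Tuning (e.g., LoRA) so that only a small set of adapter weights is trained, with the model used to score and extract sentences rather than generate text. The sketch below illustrates how these pieces might be wired together with the Hugging Face transformers and peft libraries; the checkpoint name, LoRA settings, sentence-boundary scheme, and linear scoring head are illustrative assumptions, not the paper's exact configuration.

```python
# A minimal sketch, assuming the Hugging Face transformers/peft stack.
# The checkpoint, LoRA hyperparameters, sentence-marking scheme, and the
# linear scoring head are illustrative assumptions, not the paper's recipe.
import torch
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
backbone = AutoModel.from_pretrained(
    MODEL_NAME,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # needs a CUDA GPU with flash-attn installed
    device_map="auto",
)

# Parameter-efficient fine-tuning: only the small LoRA adapters are trainable.
lora_cfg = LoraConfig(
    task_type=TaskType.FEATURE_EXTRACTION,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # illustrative choice of projections
)
backbone = get_peft_model(backbone, lora_cfg)

# Illustrative extractive head: score the hidden state at each sentence's
# last token; the top-k sentences are copied verbatim into the summary.
score_head = torch.nn.Linear(
    backbone.config.hidden_size, 1, dtype=torch.bfloat16
).to(backbone.device)

def select_sentences(sentences, k=5):
    """Return indices of the k highest-scoring sentences (no training loop shown)."""
    ids, boundaries = [tokenizer.bos_token_id], []
    for s in sentences:
        ids.extend(tokenizer(s, add_special_tokens=False)["input_ids"])
        boundaries.append(len(ids) - 1)  # position of the sentence's last token
    input_ids = torch.tensor([ids], device=backbone.device)
    with torch.no_grad():
        hidden = backbone(input_ids=input_ids).last_hidden_state[0]
    scores = score_head(hidden[boundaries]).squeeze(-1)  # one score per sentence
    top_k = torch.topk(scores, k=min(k, len(sentences))).indices.tolist()
    return sorted(top_k)
```

Because the selected sentences are lifted verbatim from the source document, this setup preserves the factual and grammatical integrity the abstract emphasizes; training such a head on long PubMed or ArXiv documents is where Flash Attention and PEFT keep memory and compute manageable.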

Authors (2)
  1. Léo Hemamou (4 papers)
  2. Mehdi Debiane (1 paper)

