
Low Resource Summarization using Pre-trained Language Models (2310.02790v1)

Published 4 Oct 2023 in cs.CL

Abstract: With the advent of Deep Learning based Artificial Neural Network models, NLP has witnessed significant improvements in the efficiency and accuracy of textual data processing. However, research is mostly restricted to high-resource languages such as English, and low-resource languages still suffer from a lack of resources, both in training datasets and in models with even baseline evaluation results. Considering the limited availability of resources for low-resource languages, we propose a methodology for adapting self-attentive, transformer-based architectures (mBERT, mT5) to low-resource summarization, supplemented by the construction of a new baseline dataset of 76.5k article-summary pairs in a low-resource language, Urdu. Choosing news (a publicly available source) as the application domain should make the proposed methodology reproducible for other languages with limited resources. Our adapted summarization model urT5, with up to a 44.78% reduction in size compared to mT5, captures the contextual information of a low-resource language effectively, with evaluation scores (up to 46.35 ROUGE-1, 77 BERTScore) on par with state-of-the-art models for the high-resource language English (PEGASUS: 47.21, BART: 45.14 on the XSUM dataset). The proposed method provides a baseline approach to both extractive and abstractive summarization with competitive evaluation results in a limited-resource setup.
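The adaptation pipeline described in the abstract can be approximated with standard tooling. The sketch below is a minimal illustration, not the authors' released code: it fine-tunes a multilingual mT5 checkpoint on a hypothetical CSV of Urdu article-summary pairs (the file names, column names, model size variant, and hyperparameters are placeholders) and scores the output with ROUGE, the metric family quoted above. The reported 44.78% size reduction of urT5 relative to mT5, presumably obtained by trimming the multilingual vocabulary and embedding matrix to the languages actually needed, is not reproduced here.

```python
# Illustrative sketch only (assumptions: Hugging Face Transformers/Datasets/Evaluate,
# placeholder CSVs "urdu_train.csv"/"urdu_test.csv" with "article" and "summary" columns).
from transformers import (
    AutoTokenizer,
    MT5ForConditionalGeneration,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)
from datasets import load_dataset
import evaluate

model_name = "google/mt5-small"  # size variant is an assumption, not the paper's choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name)

# Hypothetical article-summary pairs in Urdu.
raw = load_dataset("csv", data_files={"train": "urdu_train.csv", "test": "urdu_test.csv"})

def preprocess(batch):
    # Truncate long news articles and short reference summaries to fixed lengths.
    inputs = tokenizer(batch["article"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=64, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = raw.map(preprocess, batched=True, remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="urt5-summarizer",
    per_device_train_batch_size=4,   # placeholder hyperparameters
    num_train_epochs=3,
    learning_rate=3e-4,
    predict_with_generate=True,
    generation_max_length=64,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

# Generate summaries for the test split and report ROUGE.
# Note: the default ROUGE tokenizer is English-oriented; the paper's exact
# evaluation setup for Urdu may differ.
rouge = evaluate.load("rouge")
preds = trainer.predict(tokenized["test"])
decoded = tokenizer.batch_decode(preds.predictions, skip_special_tokens=True)
print(rouge.compute(predictions=decoded, references=raw["test"]["summary"]))
```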

Authors (4)
  1. Mubashir Munaf (1 paper)
  2. Hammad Afzal (4 papers)
  3. Naima Iltaf (1 paper)
  4. Khawir Mahmood (2 papers)