Implementing Deep Learning-Based Approaches for Article Summarization in Indian Languages (2212.05702v1)

Published 12 Dec 2022 in cs.CL and cs.LG

Abstract: Research on text summarization for low-resource Indian languages has been limited by the scarcity of relevant datasets. This paper presents a summary of various deep-learning approaches applied to the ILSUM 2022 Indic language summarization datasets. The ILSUM 2022 dataset consists of news articles written in Indian English, Hindi, and Gujarati, together with their ground-truth summaries. In our work, we explore different pre-trained seq2seq models and fine-tune them on the ILSUM 2022 datasets. In our experiments, the fine-tuned SoTA PEGASUS model worked best for English, the fine-tuned IndicBART model with augmented data worked best for Hindi, and a fine-tuned PEGASUS model combined with a translation mapping-based approach worked best for Gujarati. The generated summaries were evaluated using ROUGE-1, ROUGE-2, and ROUGE-4 as the evaluation metrics.
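
For readers unfamiliar with the metrics named in the abstract, ROUGE-N measures n-gram overlap between a generated summary and a reference summary. The following is a minimal sketch, assuming plain whitespace tokenization and a single reference; it is not the official ROUGE toolkit and not the authors' evaluation script, and the function names `ngrams` and `rouge_n` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n):
    """Simplified ROUGE-N: clipped n-gram overlap between two strings.

    Assumption: tokens are split on whitespace, with no stemming or
    stopword handling, so scores may differ from standard toolkits.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)

    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    for n in (1, 2, 4):
        print(f"ROUGE-{n}:", rouge_n(generated, gold, n))
```

Higher-order variants such as ROUGE-4 require longer exact word sequences to match, so they tend to be much lower than ROUGE-1 for abstractive summaries.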


Authors (5)

Rahul Tangsali
Aabha Pingle
Aditya Vyawahare
Isha Joshi
Raviraj Joshi

