Indian Language Summarization using Pretrained Sequence-to-Sequence Models (2303.14461v1)

Published 25 Mar 2023 in cs.CL

Abstract: The ILSUM shared task focuses on text summarization for two major Indian languages, Hindi and Gujarati, along with English. In this task, we experiment with various pretrained sequence-to-sequence models to find the best model for each language. We present a detailed overview of the models and our approaches in this paper. We secure the first rank across all three sub-tasks (English, Hindi, and Gujarati). This paper also extensively analyzes the impact of k-fold cross-validation when experimenting with limited data, and we perform various experiments with a combination of the original and a filtered version of the data to determine the efficacy of the pretrained models.

Authors (4)

Ashok Urlana (13 papers)
Sahil Manoj Bhatt (1 paper)
Nirmal Surange (7 papers)
Manish Shrivastava (62 papers)

Citations (10)

