
Indian Language Summarization using Pretrained Sequence-to-Sequence Models (2303.14461v1)

Published 25 Mar 2023 in cs.CL

Abstract: The ILSUM shared task focuses on text summarization for two major Indian languages, Hindi and Gujarati, along with English. In this task, we experiment with various pretrained sequence-to-sequence models to identify the best model for each language. We present a detailed overview of the models and our approaches in this paper. We secure the first rank across all three sub-tasks (English, Hindi, and Gujarati). This paper also extensively analyzes the impact of k-fold cross-validation when experimenting with a limited data size, and we perform experiments with combinations of the original and a filtered version of the data to determine the efficacy of the pretrained models.
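The abstract's core recipe, fine-tuning pretrained sequence-to-sequence models with k-fold cross-validation over a small dataset, can be sketched in code. The snippet below is a minimal, hypothetical illustration using the Hugging Face transformers and datasets libraries, with google/mt5-small as a stand-in multilingual checkpoint; the field names ("article", "summary") and all hyperparameters are assumptions for illustration, not the paper's reported configuration.

```python
# Minimal sketch: k-fold fine-tuning of a pretrained seq2seq summarizer.
# Checkpoint, field names, and hyperparameters are illustrative assumptions.

from sklearn.model_selection import KFold
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          Seq2SeqTrainingArguments, Seq2SeqTrainer,
                          DataCollatorForSeq2Seq)

def preprocess(batch, tokenizer, max_in=512, max_out=64):
    # Tokenize source articles and target summaries.
    model_inputs = tokenizer(batch["article"], max_length=max_in, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=max_out, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

def run_kfold(records, model_name="google/mt5-small", n_splits=5):
    """Fine-tune one model per fold; k-fold CV gives a more stable
    performance estimate when labeled data is scarce."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    data = Dataset.from_list(records)  # records: list of {"article", "summary"} dicts
    tokenized = data.map(lambda b: preprocess(b, tokenizer),
                         batched=True, remove_columns=data.column_names)

    kf = KFold(n_splits=n_splits, shuffle=True, random_state=42)
    for fold, (train_idx, val_idx) in enumerate(kf.split(tokenized)):
        # Re-initialize from the pretrained checkpoint for each fold.
        model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
        trainer = Seq2SeqTrainer(
            model=model,
            args=Seq2SeqTrainingArguments(
                output_dir=f"fold-{fold}",
                per_device_train_batch_size=4,
                num_train_epochs=3,
                report_to="none",
            ),
            train_dataset=tokenized.select(train_idx),
            eval_dataset=tokenized.select(val_idx),
            data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
        )
        trainer.train()
        print(f"fold {fold}:", trainer.evaluate())
```

Averaging the per-fold evaluation scores gives a less noisy estimate of model quality than a single train/validation split, which is the usual motivation for k-fold cross-validation on small datasets like the ILSUM subsets.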

Authors (4)
  1. Ashok Urlana
  2. Sahil Manoj Bhatt
  3. Nirmal Surange
  4. Manish Shrivastava
Citations (10)
