
Decoding the Diversity: A Review of the Indic AI Research Landscape (2406.09559v1)

Published 13 Jun 2024 in cs.CL, cs.AI, and cs.LG

Abstract: This review paper provides a comprehensive overview of LLM research directions within Indic languages. Indic languages are those spoken in the Indian subcontinent, including India, Pakistan, Bangladesh, Sri Lanka, Nepal, and Bhutan, among others. These languages have a rich cultural and linguistic heritage and are spoken by over 1.5 billion people worldwide. With the tremendous market potential and growing demand for NLP-based applications in diverse languages, generative applications for Indic languages pose unique challenges and opportunities for research. Our paper deep dives into the recent advancements in Indic generative modeling, contributing a taxonomy of research directions and tabulating 84 recent publications. Research directions surveyed in this paper include LLM development, fine-tuning existing LLMs, development of corpora, benchmarking and evaluation, as well as publications around specific techniques, tools, and applications. We found that researchers across the publications emphasize the challenges associated with limited data availability, lack of standardization, and the peculiar linguistic complexities of Indic languages. This work aims to serve as a valuable resource for researchers and practitioners working in the field of NLP, particularly those focused on Indic languages, and contributes to the development of more accurate and efficient LLM applications for these languages.

Authors (5)
  1. Sankalp KJ (3 papers)
  2. Vinija Jain (42 papers)
  3. Sreyoshi Bhaduri (10 papers)
  4. Tamoghna Roy (10 papers)
  5. Aman Chadha (109 papers)
Citations (4)