PreCogIIITH at HinglishEval : Leveraging Code-Mixing Metrics & Language Model Embeddings To Estimate Code-Mix Quality (2206.07988v1)
Published 16 Jun 2022 in cs.AI
Abstract: Code-Mixing is a phenomenon of mixing two or more languages in a speech event and is prevalent in multilingual societies. Given the low-resource nature of Code-Mixing, machine generation of code-mixed text is a common approach for data augmentation. However, evaluating the quality of such machine-generated code-mixed text is an open problem. In our submission to HinglishEval, a shared task co-located with INLG 2022, we attempt to model the factors that impact the quality of synthetically generated code-mix text by predicting ratings for code-mix quality.
- Prashant Kodali (6 papers)
- Tanmay Sachan (1 paper)
- Akshay Goindani (4 papers)
- Anmol Goel (9 papers)
- Naman Ahuja (4 papers)
- Manish Shrivastava (62 papers)
- Ponnurangam Kumaraguru (129 papers)