Survey on Publicly Available Sinhala Natural Language Processing Tools and Research (1906.02358v23)

Published 5 Jun 2019 in cs.CL

Abstract: Sinhala is the native language of the Sinhalese people, who make up the largest ethnic group of Sri Lanka. The language belongs to the globe-spanning Indo-European language tree. However, due to poverty in both linguistic and economic capital, Sinhala remains, from the perspective of Natural Language Processing tools and research, a resource-poor language which has neither the economic drive of its cousin English nor the sheer push of the law of numbers that a language such as Chinese has. A number of research groups from Sri Lanka have noticed this dearth and the resultant dire need for proper tools and research for Sinhala natural language processing. However, for various reasons, these attempts seem to lack coordination and awareness of each other. The objective of this paper is to fill that gap with a comprehensive literature survey of the publicly available Sinhala natural language tools and research, so that researchers working in this field can better utilize the contributions of their peers. As such, we shall be uploading this paper to arXiv and updating it periodically to reflect the advances made in the field.

  162. C. D. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. J. Bethard, and D. McClosky, “The Stanford CoreNLP natural language processing toolkit,” in Association for Computational Linguistics (ACL) System Demonstrations, 2014, pp. 55–60. [Online]. Available: http://www.aclweb.org/anthology/P/P14/P14-5010
  163. J. Aissen, “Differential object marking: Iconicity vs. economy,” Natural Language & Linguistic Theory, vol. 21, no. 3, pp. 435–483, 2003.
  164. J. K. Dahanayaka and A. R. Weerasinghe, “Named entity recognition for sinhala language,” in Advances in ICT for Emerging Regions (ICTer), 2014 International Conference on.   IEEE, 2014, pp. 215–220.
  165. K. U. Senevirathne, N. S. Attanayake, A. W. M. H. Dhananjanie, W. A. S. U. Weragoda, A. Nugaliyadde, and S. Thelijjagoda, “Conditional random fields based named entity recognition for sinhala,” in 2015 IEEE 10th International Conference on Industrial and Information Systems (ICIIS).   IEEE, 2015, pp. 302–307.
  166. R. Azeez and S. Ranathunga, “Fine-grained named entity recognition for sinhala,” in 2020 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2020, pp. 295–300.
  167. H. M. S. Anuruddha, “Reinforcement learning for sinhala named entity recognition,” Ph.D. dissertation, 2021.
  168. W. M. S. K. Wijesinghe and M. Tissera, “Sinhala named entity recognition model: Domain-specific classes in sports,” in 2022 4th International Conference on Advancements in Computing (ICAC).   IEEE, 2022, pp. 138–143.
  169. P. S. Mallikarachchi, S. A. S. Lorensuhewa, and M. A. L. Kalyani, “Support vector machine based named entity recognition for sinhala,” 2021.
  170. J. C. S. Kadupitiya, S. Ranathunga, and G. Dias, “Sinhala short sentence similarity calculation using corpus-based and knowledge-based similarity measures,” in Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016), 2016, pp. 44–53.
  171. ——, “Sinhala short sentence similarity measures using corpus-based similarity for short answer grading,” in 6th Workshop on South and Southeast Asian Natural Language Processing, 2017, pp. 44–53.
  172. S. Nilaxan and S. Ranathunga, “Monolingual sentence similarity measurement using siamese neural networks for sinhala and tamil languages,” in 2021 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2021, pp. 567–572.
  173. U. Isuranga, J. Sandaruwan, U. Athukorala, and G. Dias, “Improved cross-lingual document similarity measurement,” 2020.
  174. S. Gallege, “Analysis of sinhala using natural language processing techniques,” 2010.
  175. K. B. N. Lakmali and P. S. Haddela, “Effectiveness of rule-based classifiers in sinhala text categorization,” in 2017 National Information Technology Conference (NITC).   IEEE, 2017, pp. 153–158.
  176. P. K. S. Kumari and P. S. Haddela, “Use of lime for human interpretability in sinhala document classification,” in 2019 International Research Conference on Smart Computing and Systems Engineering (SCSE).   IEEE, 2019, pp. 97–102.
  177. M. T. Ribeiro, S. Singh, and C. Guestrin, “Why should i trust you?: Explaining the predictions of any classifier,” in Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining.   ACM, 2016, pp. 1135–1144.
  178. P. Nanayakkara and S. Ranathunga, “Clustering sinhala news articles using corpus-based similarity measures,” in 2018 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2018, pp. 437–442.
  179. S. V. S. Gunasekara and P. S. Haddela, “Context aware stopwords for sinhala text classification,” in 2018 National Information Technology Conference (NITC).   IEEE, 2018, pp. 1–6.
  180. ——, “Effective domain specific stopwords generation for sinhala text.”   19th Conference on Postgraduate Research, International Postgraduate …, 2018.
  181. S. H. Jayasinghe and K. Sirts, “Deep learning textual entailment system for sinhala language,” 2019.
  182. P. Demotte and S. Ranathunga, “Dual-state capsule networks for text classification,” arXiv preprint arXiv:2109.04762, 2021.
  183. L. Senevirathne, P. Demotte, B. Karunanayake, U. Munasinghe, and S. Ranathunga, “Sentiment Analysis for Sinhala Language using Deep Learning Techniques,” arXiv preprint arXiv:2011.07280, 2020.
  184. A. Sameemdeen and N. Selvanthan, “Topic classification using active learning for sinhala language documents,” in 2021 Asian Conference on Innovation in Technology (ASIANCON).   IEEE, 2021, pp. 1–5.
  185. D. Buddhika, R. Liyadipita, S. Nadeeshan, H. Witharana, S. Jayasena, and U. Thayasivam, “Domain specific intent classification of sinhala speech data,” in 2018 International Conference on Asian Language Processing (IALP).   IEEE, 2018, pp. 197–202.
  186. B. Novak, D. Mladenič, and M. Grobelnik, “Text classification with active learning,” in From Data and Information Analysis to Knowledge Engineering.   Springer, 2006, pp. 398–405.
  187. B. Yang, J.-T. Sun, T. Wang, and Z. Chen, “Effective multi-label active learning for text classification,” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, 2009, pp. 917–926.
  188. O. Bandara, D. Jayarathne, D. Shashinika, and L. Ranathunga, “Ontology based fake news detection for sinhala language,” in 2021 6th International Conference on Information Technology Research (ICITR).   IEEE, 2021, pp. 1–6.
  189. Y. Kodithuwakku and S. Hettiarachchi, “Adapttext: A novel framework for domain-independent automated sinhala text classification,” in 2021 10th International Conference on Information and Automation for Sustainability (ICIAfS).   IEEE, 2021, pp. 240–245.
  190. A. D. Koralage, “Sinclassify-sinhala text classification system,” Ph.D. dissertation, 2019.
  191. P. Haddela, L. Hirsch, T. Brunsdon, and J. Gaudoin, “Use of interpretable evolved search query classifiers for sinhala documents,” in Proceedings of the Future Technologies Conference.   Springer, 2020, pp. 790–804.
  192. H. Rathnayake, J. Sumanapala, R. Rukshani, and S. Ranathunga, “Adapter based fine-tuning of pre-trained multilingual language models for code-mixed and code-switched text classification,” 2022.
  193. N. Houlsby, A. Giurgiu, S. Jastrzebski, B. Morrone, Q. De Laroussilhe, A. Gesmundo, M. Attariyan, and S. Gelly, “Parameter-efficient transfer learning for nlp,” in International Conference on Machine Learning.   PMLR, 2019, pp. 2790–2799.
  194. J. Pfeiffer, A. Rücklé, C. Poth, A. Kamath, I. Vulić, S. Ruder, K. Cho, and I. Gurevych, “Adapterhub: A framework for adapting transformers,” arXiv preprint arXiv:2007.07779, 2020.
  195. J. Pfeiffer, I. Vulić, I. Gurevych, and S. Ruder, “Mad-x: An adapter-based framework for multi-task cross-lingual transfer,” arXiv preprint arXiv:2005.00052, 2020.
  196. J. Pfeiffer, A. Kamath, A. Rücklé, K. Cho, and I. Gurevych, “Adapterfusion: Non-destructive task composition for transfer learning,” arXiv preprint arXiv:2005.00247, 2020.
  197. X. Wang, Y. Tsvetkov, S. Ruder, and G. Neubig, “Efficient test time adapter ensembling for low-resource language varieties,” arXiv preprint arXiv:2109.04877, 2021.
  198. D. Friedman, B. Dodge, and D. Chen, “Single-dataset experts for multi-dataset question answering,” arXiv preprint arXiv:2109.13880, 2021.
  199. A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, and V. Stoyanov, “Unsupervised cross-lingual representation learning at scale,” arXiv preprint arXiv:1911.02116, 2019.
  200. G. Kirindage and N. Godewithana, “Automatic sinhala news classification approach for news platforms,” in 2020 IEEE 7th International Conference on Engineering Technologies and Applied Sciences (ICETAS).   IEEE, 2020, pp. 1–6.
  201. D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet Allocation,” the Journal of machine Learning research, vol. 3, pp. 993–1022, 2003.
  202. C. O. Hettigoda, “An english-sinhala mixed-language comment analyzing system for facebook pages,” Ph.D. dissertation, 2019.
  203. W. M. S. N. P. Wijayarathna and S. Jayalal, “A hybrid feature-based approach for classification of fake news in sinhala on social media.”
  204. S. Chathuranga and S. Ranathunga, “Classification of code-mixed text using capsule networks,” in Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), 2021, pp. 256–263.
  205. R. I. Weerasiri, S. A. S. Lorensuhewa, and M. A. L. Kalyani, “Word embedding-based sinhala news documents classification,” 2022.
  206. Q. Le and T. Mikolov, “Distributed representations of sentences and documents,” in International conference on machine learning.   PMLR, 2014, pp. 1188–1196.
  207. H. M. M. Caldera, N. Meedin, S. Perera, and I. Perera, “Long-term trend analysis for social media content published during covid-19 pandemic,” in 2022 2nd International Conference on Advanced Research in Computing (ICARC).   IEEE, 2022, pp. 108–113.
  208. N. Medagoda, “Sentiment analysis on morphologically rich languages: An artificial neural network (ann) approach,” in Artificial Neural Network Modelling.   Springer, 2016, pp. 377–393.
  209. N. Medagoda, S. Shanmuganathan, and J. Whalley, “Sentiment lexicon construction using sentiwordnet 3.0,” in 2015 11th International Conference on Natural Computation (ICNC).   IEEE, 2015, pp. 802–807.
  210. P. D. T. Chathuranga, S. A. S. Lorensuhewa, and M. A. L. Kalyani, “Sinhala sentiment analysis using corpus based sentiment lexicon,” in International Conference on Advances in ICT for Emerging Regions (ICTer), vol. 1, 2019, p. 7.
  211. P. Demotte, L. Senevirathne, B. Karunanayake, U. Munasinghe, and S. Ranathunga, “Sentiment Analysis of Sinhala News Comments using Sentence-State LSTM Networks,” in 2020 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2020, pp. 283–288.
  212. B. Karunanayake, U. Munasinghe, P. Demotte, L. Senevirathne, and S. Ranathunga, “Sinhala Sentiment Lexicon Generation using Word Similarity,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 77–82.
  213. P. Jayasuriya, R. Munasinghe, and S. Thelijjagoda, “Sentiment classification of sinhala content in social media: A comparison between word n-grams and character n-grams.”
  214. S. Ranathunga and I. U. Liyanage, “Sentiment analysis of sinhala news comments,” Transactions on Asian and Low-Resource Language Information Processing, vol. 20, no. 4, pp. 1–23, 2021.
  215. P. Jayasuriya, R. Munasinghe, and S. Thelijjagoda, “Sentiment classification of sinhala content in social media: A comparison between stemmers and n-gram features,” in 2021 IEEE 16th International Conference on Industrial and Information Systems (ICIIS).   IEEE, pp. 134–139.
  216. ——, “Sentiment classification of sinhala content in social media: An ensemble approach,” in 2021 IEEE 16th International Conference on Industrial and Information Systems (ICIIS).   IEEE, pp. 140–145.
  217. W. I. Karunarathne, “Sentiment analysis of sinhala tweets,” Ph.D. dissertation, 2020.
  218. K. Abeyratne and K. Jayaratne, “Classification of sinhala songs based on emotions,” in 2019 19th International Conference on Advances in ICT for Emerging Regions (ICTer), vol. 250.   IEEE, 2019, pp. 1–10.
  219. V. Jayawickrama, G. Weeraprameshwara, N. de Silva, and Y. Wijeratne, “Seeking sinhala sentiment: Predicting facebook reactions of sinhala posts,” arXiv preprint arXiv:2112.00468, 2021.
  220. ——, “Facebook for sentiment analysis: Baseline models to predict facebook reactions of sinhala posts,” The International Journal on Advances in ICT for Emerging Regions, vol. 15, no. 2, 2022.
  221. G. Weeraprameshwara, V. Jayawickrama, N. de Silva, and Y. Wijeratne, “Sentiment analysis with deep learning models: A comparative study on a decade of sinhala language facebook data,” arXiv preprint arXiv:2201.03941, 2022.
  222. P. M. I. U. Aththanayaka and H. M. M. Naleer, “Sentimental analysis of comments in social media in sinhala-english code-mixed language using supervised learning techniques,” 2020.
  223. B. Mathew, P. Saha, S. M. Yimam, C. Biemann, P. Goyal, and A. Mukherjee, “HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection,” in Proceedings of the AAAI conference on artificial intelligence, vol. 35, no. 17, 2021, pp. 14 867–14 875.
  224. D. H. A. De Silva, “An approach to hate speech detection,” 2019.
  225. S. T. Sandaruwan, S. A. S. Lorensuhewa, and K. Munasinghe, “Identification of abusive sinhala comments in social media using text mining and machine learning techniques,” ICTer, vol. 13, no. 1, 2020.
  226. H. M. A. I. Amali and S. Jayalal, “Classification of cyberbullying sinhala language comments on social media,” in 2020 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2020, pp. 266–271.
  227. N. Hettiarachchi, R. Weerasinghe, and R. Pushpananda, “Detecting hate speech in social media articles in romanized sinhala,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 250–255.
  228. S. W. A. M. D. Samarasinghe, R. G. N. Meegama, and M. Punchimudiyanse, “Machine learning approach for the detection of hate speech in sinhala unicode text,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 65–70.
  229. S. Kariyawasam, “A machine learning approach in the identification of sinhala toxic language on social media,” Ph.D. dissertation, 2019.
  230. M. Guruge, S. Ahangama, and D. Amarasinghe, “Analyze hate contents on sinhala tweets using an ensemble method,” in 2022 2nd International Conference on Advanced Research in Computing (ICARC).   IEEE, 2022, pp. 183–187.
  231. H. M. S. T. Sandaruwan, S. A. S. Lorensuhewa, and M. A. L. Kalyani, “Sinhala hate speech detection in social media using text mining and machine learning,” in 2019 19th International Conference on Advances in ICT for Emerging Regions (ICTer), vol. 250.   IEEE, 2019, pp. 1–8.
  232. R. R. Sheran, “Detection of hate speech written in sinhala and singlish language posted on social media by users in sri lanka using text analytics,” Ph.D. dissertation, 2019.
  233. S. Munasinghe and U. Thayasivam, “A deep learning ensemble hate speech detection approach for sinhala tweets,” in 2022 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2022, pp. 1–6.
  234. J. A. D. U. Shalinda and L. Munasinghe, “Hate words detection among sri lankan social media text messages,” in 2022 International Research Conference on Smart Computing and Systems Engineering (SCSE), vol. 5.   IEEE, 2022, pp. 55–60.
  235. K. Gamage, V. Welgama, and R. Weerasinghe, “Improving sinhala hate speech detection using deep learning,” in 2022 22nd International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2022, pp. 045–050.
  236. W. S. S. Fernando, R. Weerasinghe, and E. R. A. D. Bandara, “Sinhala hate speech detection in social media using machine learning and deep learning,” in 2022 22nd International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2022, pp. 166–171.
  237. S. Perera, S. Ahangama, I. Perera, and S. Hathnapitiya, “Predicting twitter hate user behavior using big five personality traits and ensemble machine learning,” in International Conference on Human-Computer Interaction.   Springer, 2023, pp. 116–130.
  238. D. A. Rajapaksha, S. Ahangama, M. D. Dushanthi, and K. D. G. I. Madhurangi, “Analyzing trends and topics of sinhala hate speech on twitter: A time series approach,” in 2023 IEEE 17th International Conference on Industrial and Information Systems (ICIIS).   IEEE, 2023, pp. 67–72.
  239. D. S. U. Arachchi, R. P. N. M. Herath, M. B. P. T. H. Gunaratne, K. T. Hansana, E. Weerasinghe, and D. I. De Silva, “An Inappropriate Word Detector for The Sinhala to English and English to Sinhala Translator (SEES),” Tuijin Jishu/Journal of Propulsion Technology, vol. 44, no. 4, pp. 7657–7665, 2023.
  240. E. N. Fernando and J. D. Deng, “Enhancing Hate Speech Detection in Sinhala Language on Social Media using Machine Learning,” 2023.
  241. O. E. Ojo, O. O. Adebanji, H. Calvo, A. Gelbukh, A. Feldman, and G. Sidorov, “Hate and Offensive Content Identification in Indo-Aryan Languages using Transformer-based Models,” 2023.
  242. T. Ranasinghe and M. Zampieri, “A text-to-text model for multilingual offensive language identification,” arXiv preprint arXiv:2312.03379, 2023.
  243. Y. Bestgen, “Using Only Character Ngrams for Hate Speech and Offensive Content Identification in Five Low-Ressource Languages,” in Forum for Information Retrieval Evaluation, 2023.
  244. N. Narayan, M. Biswal, P. Goyal, and A. Panigrahi, “Hate Speech and Offensive Content Detection in Indo-Aryan Languages: A Battle of LSTM and Transformers,” arXiv preprint arXiv:2312.05671, 2023.
  245. W. M. S. N. P. Wijayarathna and S. Jayalal, “Text similarity-based approach to detect sinhala language fake news in social media: An approach using hybrid features,” 2021.
  246. ——, “Sinhala language-based social media analysis to detect fake news,” 2020.
  247. L. Udurawana, R. Weerasinghe, and R. Pushpananda, “A hybrid approach for detection of fake news in sinhala text,” in 2022 22nd International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2022, pp. 039–044.
  248. R. Adihetti and S. Jayalal, “Sinhala language fake news detection in social media using autoencoder-based method,” in 2023 International Research Conference on Smart Computing and Systems Engineering (SCSE), vol. 6.   IEEE, 2023, pp. 1–8.
  249. D. Yarowsky, “Word-sense disambiguation using statistical models of roget’s categories trained on large corpora,” in Proceedings of the 14th conference on Computational linguistics-Volume 2.   Association for Computational Linguistics, 1992, pp. 454–460.
  250. N. Ide and J. Véronis, “Introduction to the special issue on word sense disambiguation: the state of the art,” Computational linguistics, vol. 24, no. 1, pp. 2–40, 1998.
  251. D. Yarowsky, “Unsupervised word sense disambiguation rivaling supervised methods,” in 33rd annual meeting of the association for computational linguistics, 1995, pp. 189–196.
  252. S. Banerjee and T. Pedersen, “An adapted lesk algorithm for word sense disambiguation using wordnet,” in International conference on intelligent text processing and computational linguistics.   Springer, 2002, pp. 136–145.
  253. R. Navigli, “Word sense disambiguation: A survey,” ACM computing surveys (CSUR), vol. 41, no. 2, p. 10, 2009.
  254. M. Lesk, “Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone,” in Proceedings of the 5th annual international conference on Systems documentation.   Citeseer, 1986, pp. 24–26.
  255. C. Marasinghe, S. Herath, and A. Herath, “Word sense disambiguation of sinhala language with unsupervised learning,” in Proc. International Conference on Information Technology and Applications, 2002, pp. 25–29.
  256. S. Palihakkara, D. Sahabandu, A. Shamsudeen, C. Bandara, and S. Ranathunga, “Dialogue act recognition for text-based sinhala,” in Proceedings of the 12th International Conference on Natural Language Processing, 2015, pp. 367–375.
  257. T. Subasingha, “Sinsense-word sense disambiguation tool for sinhala language,” Ph.D. dissertation, 2020.
  258. W. V. Welgama, “Automatic text summarization for sinhala,” 2012.
  259. O. S. Wimalasuriya, “Automatic text summarization for sinhala,” Ph.D. dissertation, 2019.
  260. H. M. R. Y. Jayawardane, “Automatic sinhala text summarization for government gazettes using abstractive and extractive methods,” Ph.D. dissertation, 2022.
  261. B. R. M. S. R. B. Rathnayake, K. Manathunga, and D. Kasthurirathna, “‘Talking books’: A sinhala abstractive text summarization approach for sinhala textbooks,” in 2023 IEEE 8th International Conference for Convergence in Technology (I2CT).   IEEE, 2023, pp. 1–6.
  262. M. A. C. A. Jahan and K. K. C. Wijesekara, “Automated text summarization of sinhala online articles,” Journal of Science-FAS-SEUSL, vol. 4, no. 01, pp. 01–15, 2023.
  263. L. Xue, N. Constant, A. Roberts, M. Kale, R. Al-Rfou, A. Siddhant, A. Barua, and C. Raffel, “mt5: A massively multilingual pre-trained text-to-text transformer,” arXiv preprint arXiv:2010.11934, 2020.
  264. S. Herath, S. Ishizaki, T. Ikeda, Y. Anzai, and H. Aiso, “Syntactic and semantic analysis of sinhala: a step towards intelligence computing systems,” in Proceedings. 5th IEEE International Symposium on Intelligent Control 1990.   IEEE, 1990, pp. 316–324.
  265. A. Wasala and K. Gamage, “Research report on phonetics and phonology of sinhala,” Language Technology Research Laboratory, University of Colombo School of Computing, vol. 35, 2005.
  266. R. I. P. Wickramasinghe, K. H. Kumara, and N. G. J. Dias, “Practical issues in the development of tts and sr for the sinhala language,” 2007.
  267. R. Weerasinghe, A. Wasala, and K. Gamage, “A rule based syllabification algorithm for sinhala,” in International Conference on Natural Language Processing.   Springer, 2005, pp. 438–449.
  268. A. Wasala, R. Weerasinghe, and K. Gamage, “Sinhala grapheme-to-phoneme conversion and rules for schwa epenthesis,” in Proceedings of the COLING/ACL on Main conference poster sessions.   Association for Computational Linguistics, 2006, pp. 890–897.
  269. T. Nadungodage, C. Liyanage, A. Prerera, R. Pushpananda, and R. Weerasinghe, “Sinhala g2p conversion for speech processing,” in Proc. The 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, pp. 112–116.
  270. R. Weerasinghe, A. Wasala, V. Welgama, and K. Gamage, “Festival-si: A sinhala text-to-speech system,” in International Conference on Text, Speech and Dialogue.   Springer, 2007, pp. 472–479.
  271. M. S. Amarasekara, K. M. N. S. Bandara, B. V. A. I. Vithana, D. H. De Silva, and A. Jayakody, “Real-time interactive voice communication-for a mute person in sinhala (rtivc),” in 2013 8th International Conference on Computer Science & Education.   IEEE, 2013, pp. 671–675.
  272. K. H. Kumara, N. G. J. Dias, and H. Sirisena, “Automatic segmentation of given set of sinhala text into syllables for speech synthesis,” pp. 53–62, 2007.
  273. W. M. C. Bandara, W. M. S. Lakmal, T. D. Liyanagama, S. V. Bulathsinghala, G. Dias, and S. Jayasena, “A new prosodic phrasing method for sinhala language,” 2017.
  274. W. M. C. Bandara, S. V. Bulathsinghala, W. M. S. Lakmal, T. D. Liyanagama, G. Dias, and S. Jayasena, “Sinhala text to speech system,” 2009.
  275. W. M. C. Bandara, W. M. S. Lakmal, T. D. Liyanagama, S. V. Bulathsinghala, G. Dias, and S. Jayasena, “A new prosodic phrasing model for sinhala language,” 2013.
  276. K. Sodimana, P. De Silva, R. Sproat, A. Theeraphol, C. F. Li, A. Gutkin, S. Sarin, and K. Pipatsrisawat, “Text Normalization for Bangla, Khmer, Nepali, Javanese, Sinhala, and Sundanese Text-to-Speech Systems,” 2018.
  277. K. Sodimana, K. Pipatsrisawat, L. Ha, M. Jansche, O. Kjartansson, P. De Silva, and S. Sarin, “A step-by-step process for building tts voices using open source data and framework for bangla, javanese, khmer, nepali, sinhala, and sundanese,” 2018.
  278. D. S. Jayamanna, “Android based sinhala document reader for visually impaired persons,” 2014.
  279. A. K. P. D. Mishangi, “Android based sinhala document reader for visually impaired people,” 2018.
  280. D. S. S. De Zoysa, J. M. Sampath, E. M. P. De Seram, D. M. I. D. Dissanayake, L. Wijerathna, and S. Thelijjagoda, “Project bhashitha-mobile based optical character recognition and text-to-speech system,” in 2018 13th International Conference on Computer Science & Education (ICCSE).   IEEE, 2018, pp. 1–5.
  281. M. A. J. A. Lakmal, K. A. D. G. Methmini, D. M. H. M. Rupasinghe, D. I. Hettiarachchi, V. Piyawardana, M. Senarathna, S. Reyal, and K. Pulasinghe, “Adapting MaryTTS for Synthesizing Sinhalese Speech to Communicate with Children,” in 2021 6th International Conference on Information Technology Research (ICITR).   IEEE, 2021, pp. 1–6.
  282. M. Senarathna, K. Pulasinghe, and S. Reyal, “Step-by-Step Process of Building Voices for Under Resourced Languages using MARY TTS Platform,” in 2022 4th International Conference on Advancements in Computing (ICAC).   IEEE, 2022, pp. 18–23.
  283. M. Schröder and J. Trouvain, “The German text-to-speech synthesis system MARY: A tool for research, development and teaching,” International Journal of Speech Technology, vol. 6, no. 4, pp. 365–377, 2003.
  284. P. Jayawardhana, A. Aponso, N. Krishnarajah, and A. Rathnayake, “An intelligent approach of text-to-speech synthesizers for english and sinhala languages,” in 2019 IEEE 2nd International Conference on Information and Computer Technologies (ICICT).   IEEE, 2019, pp. 229–234.
  285. W. Ping, K. Peng, A. Gibiansky, S. O. Arik, A. Kannan, S. Narang, J. Raiman, and J. Miller, “Deep voice 3: Scaling text-to-speech with convolutional sequence learning,” arXiv preprint arXiv:1710.07654, 2017.
  286. C. Y. Gamage, J. R. M. Bogahawatte, U. K. T. Prasadika, and S. Sumathipala, “DNN based Currency Recognition System for Visually Impaired in Sinhala,” in 2020 2nd International Conference on Advancements in Computing (ICAC), vol. 1.   IEEE, 2020, pp. 422–427.
  287. K. S. Anuradha and S. Thelijjagoda, “Machine translation system to convert sinhala and english braille documents into voice,” in 2020 International Research Conference on Smart Computing and Systems Engineering (SCSE).   IEEE, 2020, pp. 7–16.
  288. L. Nanayakkara, C. Liyanage, P.-T. Viswakula, T. Nadungodage, R. Pushpananda, and R. Weerasinghe, “A human quality text to speech system for sinhala,” in Proc. The 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, pp. 157–161.
  289. T. Nadungodage, R. Weerasinghe, and M. Niranjan, “Speech recognition for low resourced languages: Efficient use of training data for sinhala speech recognition by active learning.”
  290. T. Nadungodage and R. Weerasinghe, “Continuous sinhala speech recognizer,” in Conference on Human Language Technology for Development, Alexandria, Egypt, 2011, pp. 2–5.
  291. T. Nadungodage, R. Weerasinghe, and M. Niranjan, “Efficient use of training data for Sinhala speech recognition using active learning,” in Advances in ICT for Emerging Regions (ICTer), 2013 International Conference on.   IEEE, 2013, pp. 149–153.
  292. ——, “Speaker Adaptation Applied to Sinhala Speech Recognition.” Int. J. Comput. Linguistics Appl., vol. 6, no. 1, pp. 117–129, 2015.
  293. W. G. T. N. Amarasingha and D. D. A. Gamini, “Speaker independent sinhala speech recognition for voice dialling,” in International Conference on Advances in ICT for Emerging Regions (ICTer2012).   IEEE, 2012, pp. 3–6.
  294. W. Manamperi, D. Karunathilake, T. Madhushani, N. Galagedara, and D. Dias, “Sinhala speech recognition for interactive voice response systems accessed through mobile phones,” in 2018 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2018, pp. 241–246.
  295. N. Prasangini and H. Nagahamulla, “Sinhala speech to sinhala unicode text conversion for disaster relief facilitation in sri lanka,” in 2018 IEEE International Conference on Information and Automation for Sustainability (ICIAfS), 2018, pp. 1–6.
  296. P. G. N. Priyadarshani, “Speaker dependent speech recognition on a selected set of sinhala words,” 2012.
  297. P. G. N. Priyadarshani, N. G. J. Dias, and A. Punchihewa, “Dynamic time warping based speech recognition for isolated sinhala words,” in 2012 IEEE 55th International Midwest Symposium on Circuits and Systems (MWSCAS).   IEEE, 2012, pp. 892–895.
  298. ——, “Genetic algorithm approach for sinhala speech recognition,” in 2012 IEEE 55th International Midwest Symposium on Circuits and Systems (MWSCAS).   IEEE, 2012, pp. 896–899.
  299. P. G. N. Priyadarshani and N. G. J. Dias, “Automatic segmentation of separately pronounced sinhala words into syllables,” 2011.
  300. M. K. H. Gunasekara and R. G. N. Meegama, “Real-time translation of discrete sinhala speech to unicode text,” in 2015 Fifteenth International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2015, pp. 140–145.
  301. M. Punchimudiyanse and R. G. N. Meegama, “Unicode sinhala and phonetic english bi-directional conversion for sinhala speech recognizer,” in 2015 IEEE 10th International Conference on Industrial and Information Systems (ICIIS).   IEEE, 2015, pp. 296–301.
  302. Y. Karunanayake, U. Thayasivam, and S. Ranathunga, “Transfer learning based free-form speech command classification for low-resource languages,” in Proceedings of the 57th Conference of the Association for Computational Linguistics: Student Research Workshop, 2019, pp. 288–294.
  303. K. A. D. C. Dilshan, “Transcribing number sequences in continuous sinhala speech,” 2018.
  304. B. Gamage, R. Pushpananda, R. Weerasinghe, and T. Nadungodage, “Usage of combinational acoustic models (dnn-hmm and sgmm) and identifying the impact of language models in sinhala speech recognition,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 17–22.
  305. D. Giuliani and B. BabaAli, “Large vocabulary children’s speech recognition with dnn-hmm and sgmm acoustic modeling,” in Sixteenth Annual Conference of the International Speech Communication Association, 2015.
  306. B. Gamage, R. Pushpananda, T. Nadungodage, and R. Weerasinghe, “Improve sinhala speech recognition through e2e lf-mmi model,” in Proceedings of the 18th International Conference on Natural Language Processing (ICON), 2021, pp. 213–219.
  307. V. Manohar, H. Hadian, D. Povey, and S. Khudanpur, “Semi-supervised training of acoustic models using lattice-free mmi,” in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).   IEEE, 2018, pp. 4844–4848.
  308. A. Carmantini, P. Bell, and S. Renals, “Untranscribed web audio for low resource speech recognition,” in INTERSPEECH, 2019, pp. 226–230.
  309. H. Karunathilaka, V. Welgama, T. Nadungodage, and R. Weerasinghe, “Low-resource sinhala speech recognition using deep learning,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 196–201.
  310. A. M. Arafath, “Polylingo-a short utterance based automatic sinhala language identification & translation tool,” Ph.D. dissertation, 2020.
  311. D. Warusawithana, N. Kulaweera, L. Weerasinghe, and B. Karunarathne, “Enhanced time delay neural network architectures for sinhala speech recognition,” in 2022 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2022, pp. 1–6.
  312. D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz et al., “The kaldi speech recognition toolkit,” in IEEE 2011 workshop on automatic speech recognition and understanding.   IEEE Signal Processing Society, 2011.
  313. A. Kahawanugoda, K. Gnanarathna, N. Meegoda, R. Monarawila, P. Samarasinghe, and A. G. Lindamulage, “Development of low resource machine learning models for child cognitive ability assessments,” in 2022 4th International Conference on Advancements in Computing (ICAC).   IEEE, 2022, pp. 72–77.
  314. M. Y. M. Azir, S. A. S. Lorensuhewa, and M. A. L. Kalyani, “Sinhala speech recognition using hidden markov based model and deep neural networks based model for number sequences,” 2021.
  315. T. K. Arachchige and R. Weerasinghe, “Tacosi: A sinhala text to speech system with neural networks,” in 2023 3rd International Conference on Advanced Research in Computing (ICARC).   IEEE, 2023, pp. 120–124.
  316. Y. Wang, R. Skerry-Ryan, D. Stanton, Y. Wu, R. J. Weiss, N. Jaitly, Z. Yang, Y. Xiao, Z. Chen, S. Bengio et al., “Tacotron: Towards end-to-end speech synthesis,” arXiv preprint arXiv:1703.10135, 2017.
  317. A. L. Nanayakkara, “Exploring model level transfer learning for improving sinhala speech recognition,” Ph.D. dissertation, 2023.
  318. W. T. V. L. Gunarathne, T. K. Ramasinghe, D. G. J. B. Wimalarathne, B. M. S. H. Balasuriya, and B. Hettige, “Sinhala speech to text library using sphinx,” 2017.
  319. R. V. P. S. Akesh and R. G. N. Meegama, “Real-Time Subtitle Generator for Sinhala Speech,” Vidyodaya Journal of Science, vol. 26, no. 02, 2023.
  320. J. S. Bridle and M. D. Brown, “An experimental automatic word recognition system,” JSRU report, vol. 1003, no. 5, p. 33, 1974.
  321. P. Mermelstein, “Distance measures for speech recognition, psychological and instrumental,” Pattern recognition and artificial intelligence, vol. 116, pp. 374–388, 1976.
  322. R. Layansan, S. Aravinth, S. Sarmilan, C. Banujan, and G. Fernando, “Android speech-to-speech translation system for sinhala,” International Journal of Scientific & Engineering Research, vol. 6, no. 10, pp. 1660–1664, 2015.
  323. D. D. S. Rajapakshe, K. N. B. Kudawithana, U. L. N. P. Uswatte, N. A. B. D. Nishshanka, A. V. S. Piyawardana, and K. N. Pulasinghe, “Sinhala conversational interface for appointment management and medical advice,” in 2020 2nd International Conference on Advancements in Computing (ICAC), vol. 1.   IEEE, 2020, pp. 85–90.
  324. Y. Karunanayake, U. Thayasivam, and S. Ranathunga, “Sinhala and tamil speech intent identification from english phoneme based asr,” in 2019 International Conference on Asian Language Processing (IALP).   IEEE, 2019, pp. 234–239.
  325. A. Ignatius and U. Thayasivam, “Speaker-invariant speech-to-intent classification for low-resource languages,” in International Conference on Speech and Computer.   Springer, 2021, pp. 279–290.
  326. H. Yadav, A. Gupta, S. K. Rallabandi, A. W. Black, and R. R. Shah, “Intent classification using pre-trained embeddings for low resource languages,” arXiv preprint arXiv:2110.09264, 2021.
  327. T. Dinushika, L. Kavmini, P. Abeyawardhana, U. Thayasivam, and S. Jayasena, “Speech command classification system for sinhala language based on automatic speech recognition,” in 2019 International Conference on Asian Language Processing (IALP).   IEEE, 2019, pp. 205–210.
  328. L. Kavmini, T. Dinushika, U. Thayasivam, and S. Jayasena, “Improved speech command classification system for sinhala language based on automatic speech recognition,” International Journal of Asian Language Processing, p. 2050009, 2020.
  329. K. T. Welarathna, V. Kulasekara, K. Pulasinghe, and V. Piyawardana, “Automated sinhala speech emotions analysis tool for autism children,” in 2021 10th International Conference on Information and Automation for Sustainability (ICIAfS).   IEEE, 2021, pp. 500–505.
  330. K. H. I. R. Senarathne, J. M. I. Nirash, H. M. C. P. Herath, V. D. Bandara, D. Wijendra, and J. Krishara, “Automated sinhala voice assistant to manage tasks using natural language processing - voice,” in 2022 3rd International Informatics and Software Engineering Conference (IISEC).   IEEE, 2022, pp. 1–5.
  331. C. Weerathunga, R. Weerasinghe, and D. Sandaruwan, “Lip synchronization modeling for sinhala speech,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 208–213.
  332. W. G. V. K. Wakkumbura, R. A. H. Madhubhashana, P. M. K. Alahakoon, W. G. C. W. Kumara, and M. N. A. Hinas, “Phoneme-viseme mapping for sinhala speaking robot for sri lankan healthcare applications,” in 2022 IEEE 4th Eurasia Conference on Biomedical Engineering, Healthcare and Sustainability (ECBIOS).   IEEE, 2022, pp. 258–262.
  333. R. K. Rajapakse, A. R. Weerasinghe, and E. K. Seneviratne, “A neural network based character recognition system for sinhala script,” Department of Statistics and Computer Science, University of Colombo, 1995.
  334. H. L. Premaratne and J. Bigun, “Recognition of printed sinhala characters using linear symmetry,” in The 5th Asian Conference on Computer Vision, 2002, pp. 23–25.
  335. ——, “A segmentation-free approach to recognise printed sinhala script using linear symmetry,” Pattern recognition, vol. 37, no. 10, pp. 2081–2089, 2004.
  336. H. L. Premaratne, E. Järpe, and J. Bigun, “Lexicon and hidden markov model-based optimisation of the recognised sinhala script,” Pattern recognition letters, vol. 27, no. 6, pp. 696–705, 2006.
  337. S. Hewavitharana, H. C. Fernando, and N. D. Kodikara, “Off-line sinhala handwriting recognition using hidden markov models,” in ICVGIP, 2002.
  338. S. Hewavitharana and N. D. Kodikara, “A statistical approach to sinhala handwriting recognition,” in Proc. of the International Information Technology Conference (IITC), Colombo, Sri Lanka, 2002.
  339. S. Ajward, N. Jayasundara, S. Madushika, and R. Ragel, “Converting printed sinhala documents to formatted editable text,” in 2010 Fifth International Conference on Information and Automation for Sustainability.   IEEE, 2010, pp. 138–143.
  340. P. T. C. Madushanka, R. Bandara, and L. Ranathunga, “Sinhala handwritten character recognition by using enhanced thinning and curvature histogram based method,” in 2017 IEEE 2nd International Conference on Signal and Image Processing (ICSIP).   IEEE, 2017, pp. 46–50.
  341. M. L. M. Karunanayaka, N. D. Kodikara, and G. D. S. P. Wimalaratne, “Off line sinhala handwriting recognition with an application for postal city name recognition,” IITC 2004, 2004.
  342. R. Weerasinghe, A. Wasala, D. Herath, and V. Welgama, “Nlp applications of sinhala: Tts & ocr,” in Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II, 2008.
  343. A. R. Weerasinghe, D. L. Herath, and N. P. K. Medagoda, “A nearest-neighbor based algorithm for printed sinhala character recognition,” Innovations for a Knowledge Economy, p. 11, 2006.
  344. ——, “A knn based algorithm for printed sinhala character recognition,” in Proceedings of 8th International Information Technology Conference, 2006.
  345. D. N. Ediriweera, “Improving the accuracy of the output of sinhala ocr by using a dictionary,” Ph.D. dissertation, University of Moratuwa Sri Lanka, 2012.
  346. G. Dias, T. N. P. Patikirikorala, C. I. Arambewela, R. P. M. Darshana, and N. D. Alahendra, “Sinhala optical character recognition for desktops,” 2013.
  347. G. Dias, T. N. P. Patikirikorala, C. I. Arambewela, R. P. M. Darshani, and N. D. Alahendra, “Online sinhala handwritten character recognition for desktops,” 2013.
  348. M. H. P. Ranmuthugala, G. D. N. C. Pathiragoda, S. H. C. Jayasundara, G. Dias, and A. S. Karunananda, “Online sinhala handwritten character recognition on handheld devices,” Innovations for a Knowledge Economy, p. 1, 2006.
  349. M. Rimas, R. P. Thilakumara, and P. Koswatta, “Optical character recognition for sinhala language,” in 2013 IEEE Global Humanitarian Technology Conference: South Asia Satellite (GHTC-SAS).   IEEE, 2013, pp. 149–153.
  350. G. I. Gunarathna, M. A. P. Chamikara, and R. G. Ragel, “A fuzzy based model to identify printed sinhala characters,” in 7th International Conference on Information and Automation for Sustainability.   IEEE, 2014, pp. 1–6.
  351. H. W. H. Premachandra, C. Premachandra, T. Kimura, and H. Kawanaka, “Artificial neural network based sinhala character recognition,” in International Conference on Computer Vision and Graphics.   Springer, 2016, pp. 594–603.
  352. J. M. H. M. Jayamaha and H. M. M. Naleer, “Feature extraction technique based character recognition using artificial neural network for sinhala characters,” 2016.
  353. T. N. Kumara and R. Ragel, “A systematic feature selection process for a sinhala character recognition system,” in 2016 IEEE International Conference on Information and Automation for Sustainability (ICIAfS).   IEEE, 2016, pp. 1–6.
  354. B. R. Jayawickrama, L. Ranathunga, K. L. Mahaliyanaarachchi, L. G. B. Subhagya, and W. H. A. Nimasha, “Letter segmentation and modifier detection in printed sinhala signage,” in 2018 18th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2018, pp. 203–208.
  355. S. Gunawardhana and L. Ranathunga, “Segmentation and identification of presence of sinhala characters in facebook images,” in 2018 IEEE 13th International Conference on Industrial and Information Systems (ICIIS).   IEEE, 2018, pp. 77–82.
  356. K. L. N. D. Liyanage, “Improving Sinhala OCR using Deep Learning,” 2018.
  357. I. Anuradha, C. Liyanage, and R. Weerasinghe, “Estimating the effects of text genre, image resolution and algorithmic complexity needed for sinhala optical character recognition,” International Journal on Advances in ICT for Emerging Regions (ICTer), vol. 14, no. 3, 2021.
  358. I. Anuradha, C. Liyanage, H. Wijayawardhana, and R. Weerasinghe, “Deep learning based sinhala optical character recognition (ocr),” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 298–299.
  359. R. Smith, “An overview of the tesseract ocr engine,” in Ninth international conference on document analysis and recognition (ICDAR 2007), vol. 2.   IEEE, 2007, pp. 629–633.
  360. B. P. K. Balasooriya, “Improving and Measuring OCR Accuracy for Sinhala with Tesseract OCR Engine,” Ph.D. dissertation, 2021.
  361. Y. V. A. N. T. Maduranga and S. Jayalal, “Multi-style printed sinhala character recognition and digitalization using artificial neural network,” in 2022 2nd International Conference on Advanced Research in Computing (ICARC).   IEEE, 2022, pp. 120–124.
  362. D. I. De Silva, E. Weerasinghe, A. M. Y. V. B. Abeykoon, P. Baddewithana, W. M. K. G. S. S. B. Wijekoon, and W. R. A. H. K. Kumara, “Ceylon translate: A multimodal translator for sinhala to english and english to sinhala translations,” Tuijin Jishu/Journal of Propulsion Technology, vol. 44, no. 5, pp. 339–345, 2023.
  363. S. Samarajeewa and L. Ranathunga, “An approach for resolving double character segmentation in sinhala social media text images,” in 2020 From Innovation to Impact (FITI), vol. 1.   IEEE, 2020, pp. 1–6.
  364. K. S. A. Walawage and L. Ranathunga, “Devising a distinguishable feature set for sinhala and english script separation on social media images,” in 2020 From Innovation to Impact (FITI), vol. 1.   IEEE, 2020, pp. 1–6.
  365. N. M. T. de Silva and S. R. Liyanage, “Sinhala braille character recognizer.”
  366. S. Chanda, S. Pal, and U. Pal, “Word-wise sinhala tamil and english script identification using gaussian kernel svm,” in 2008 19th International Conference on Pattern Recognition.   IEEE, 2008, pp. 1–4.
  367. H. C. Fernando, N. D. Kodikara, and S. Hewavitharana, “A database for handwriting recognition research in sinhala language,” in ICDAR, 2003, pp. 1262–1264.
  368. M. L. M. Karunanayaka, C. A. Marasinghe, and N. D. Kodikara, “Thresholding, noise reduction and skew correction of sinhala handwritten words,” in MVA, 2005, pp. 355–358.
  369. B. Jayasekara and L. Udawatta, “Non-cursive sinhala handwritten script recognition: A genetic algorithm based alphabet training approach,” in Proceedings of the International Conference on Information and Automation, 2005.
  370. N. P. T. I. Nilaweera, H. L. Premaratne, and D. U. J. Sonnadara, “Comparison of projection and wavelet based techniques in recognition of sinhala handwritten scripts,” in Proceedings of the 25th National IT Conference, 2007.
  371. C. Silva and C. Kariyawasam, “Segmenting sinhala handwritten characters,” International Journal of Conceptions on Computing and Information Technology, vol. 2, no. 4, pp. 22–26, 2014.
  372. C. M. Silva, N. D. Jayasundere, and C. Kariyawasam, “State of handwriting recognition of modern sinhala script,” 2014.
  373. ——, “Contour tracing for isolated sinhala handwritten character recognition,” in 2015 Fifteenth International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2015, pp. 25–31.
  374. K. A. K. N. D. Dharmapala, W. P. M. V. Wijesooriya, C. P. Chandrasekara, U. K. A. U. Rathnapriya, and L. Ranathunga, “Sinhala handwriting recognition mechanism using zone based feature extraction,” 2017.
  375. K. S. A. Walawage and L. Ranathunga, “Segmentation of overlapping and touching sinhala handwritten characters,” in 2018 3rd International Conference on Information Technology Research (ICITR).   IEEE, 2018, pp. 1–6.
  376. K. S. A. Walawage, “Segmentation of overlapping sinhala handwritten characters,” Ph.D. dissertation, 2019.
  377. C. M. Silva and N. D. Jayasundere, “Character modifier combinations recognition in sinhala handwriting.”
  378. J. Mariyathas, V. Shanmuganathan, and B. Kuhaneswaran, “Sinhala handwritten character recognition using convolutional neural network,” in 2020 5th International Conference on Information Technology Research (ICITR).   IEEE, 2020, pp. 1–6.
  379. W. Wasalthilake and T. Kartheeswaran, “Sinhala handwritten character recognition using convolution neural networks,” 2020.
  380. S. M. Weerasinghe, “Sinhala handwriting character recognition system via a deep convolutional neural network,” 2019.
  381. M. F. A. Ifhaam and S. Jayalal, “Sinhala handwritten postal address recognition for postal sorting,” in 2019 International Research Conference on Smart Computing and Systems Engineering (SCSE).   IEEE, 2019, pp. 134–141.
  382. H. Mahesh and C. Priyankara, “Segmentation based approach for off-line handwritten sinhala word recognition from touch screen gestures,” in 2022 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2022, pp. 1–6.
  383. M. M. K. Rowel, A. D. A. I. Gunasekara, G. A. I. Uwanthika, and D. B. Wijesinghe, “An e-learning platform for hearing impaired children,” 2021.
  384. B. T. Withana and S. Rupasinghe, “Detecting dyslexia and dysgraphia risks in sinhala-speaking children using neural networks,” 2023.
  385. K. A. M. P. Rathnasena, K. M. S. J. Kumarasinghe, D. T. P. Paranavitharana, D. V. A. U. Dayarathne, and L. Ranathunga, “Summarization based approach for old sinhala text archival search and preservation,” in 2018 18th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2018, pp. 182–188.
  386. T. M. T. H. Peiris, “Recognition of inscriptions in ancient sri lanka,” 2012.
  387. D. A. S. Ruwanmini, K. V. Liyanage, K. G. N. D. Karunarathne, G. K. A. Dias, and S. T. Nandasara, “An architecture for an inscription recognition system for sinhala epigraphy,” International Journal of Research-Granthaalayah, vol. 4, pp. 48–64, 2016.
  388. K. G. N. D. Karunarathne, K. V. Liyanage, D. A. S. Ruwanmini, K. Dias, and S. Nandasara, “Recognizing ancient sinhala inscription characters using neural network technologies,” International Journal of Scientific Engineering and Applied Sciences, vol. 3, no. 1, 2017.
  389. S. Wickramarathna and L. Ranathunga, “Data driven approach to brahmi ocr error correction and sinhala meaning generation from brahmi character array,” in 2019 19th International Conference on Advances in ICT for Emerging Regions (ICTer), vol. 250.   IEEE, 2019, pp. 1–6.
  390. H. M. S. C. R. Heenkenda and T. G. I. Fernando, “Computational archaeology hate inscriptions using deep learning approaches,” Journal of the National Science Foundation of Sri Lanka, vol. 51, no. 3, pp. 437–448, 2023.
  391. C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 2818–2826.
  392. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
  393. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
  394. T. Dilshani and C. Senevirathna, “A study on the impact of machine translation software towards technical translation: With special reference on english to sinhala category,” in Proceedings of the Undergraduate Research Symposium (HUG 2019), Department …, 2019.
  395. E.-S. A. Lee, S. Thillainathan, S. Nayak, S. Ranathunga, D. I. Adelani, R. Su, and A. D. McCarthy, “Pre-trained multilingual sequence-to-sequence models: A hope for low-resource language translation?” arXiv preprint arXiv:2203.08850, 2022.
  396. I. Ramadasa, L. Liyanage, D. Asanka, and T. Dilanka, “Analysis of the effectiveness of using google translations api for nlp of sinhalese,” 2022.
  397. S. B. Das, D. Panda, T. K. Mishra, and B. K. Patra, “Statistical machine translation for indic languages,” arXiv preprint arXiv:2301.00539, 2023.
  398. S. K. Sheshadri, D. Gupta, and M. R. Costa-Jussà, “A voyage on neural machine translation for indic languages,” Procedia Computer Science, vol. 218, pp. 2694–2712, 2023.
  399. A. Bapna, I. Caswell, J. Kreutzer, O. Firat, D. van Esch, A. Siddhant, M. Niu, P. Baljekar, X. Garcia, W. Macherey, T. Breiner, V. Axelrod, J. Riesa, Y. Cao, M. X. Chen, K. Macherey, M. Krikun, P. Wang, A. Gutkin, A. Shah, Y. Huang, Z. Chen, Y. Wu, and M. Hughes, “Building machine translation systems for the next thousand languages,” arXiv preprint arXiv:2205.03983, 2022.
  400. A. Jones, I. Caswell, I. Saxena, and O. Firat, “Bilex rx: Lexical data augmentation for massively multilingual machine translation,” arXiv preprint arXiv:2303.15265, 2023.
  401. S. Sen, A. Ekbal, and P. Bhattacharyya, “Parallel corpus filtering based on fuzzy string matching,” in Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), 2019, pp. 289–293.
  402. P. D. N. M. Ubhayawardhana and J. A. M. Hansani, “A study on the effectiveness of using google translate in legal translation: With special reference to selected legal documents of the registrar general’s department,” LOGOS, vol. 1, no. 1, 2023.
  403. A. Jones, I. Caswell, O. Firat, and I. Saxena, “Gatitos: Using a new multilingual lexicon for low-resource machine translation,” in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 371–405.
  404. D. Kamholz, J. Pool, and S. M. Colowick, “PanLex: Building a Resource for Panlingual Lexical Translation,” in LREC, 2014, pp. 3145–3150.
  405. J. Liyanapathirana and R. Weerasinghe, “English to sinhala machine translation: Towards better information access for sri lankans,” in Conference on Human Language Technology for Development, 2011, pp. 182–186.
  406. J. U. Liyanapathirana, “A statistical approach to english and sinhala translation,” 2013.
  407. L. Wijerathna, W. L. S. L. Somaweera, S. L. Kaduruwana, Y. V. Wijesinghe, D. I. De Silva, K. Pulasinghe, and S. Thelijjagoda, “A translator from sinhala to english and english to sinhala (sees),” in International Conference on Advances in ICT for Emerging Regions (ICTer2012).   IEEE, 2012, pp. 14–18.
  408. D. De Silva, A. Alahakoon, I. Udayangani, V. Kumara, D. Kolonnage, H. Perera, and S. Thelijjagoda, “Sinhala to english language translator,” in 2008 4th International Conference on Information and Automation for Sustainability.   IEEE, 2008, pp. 419–424.
  409. A. M. Silva and R. Weerasinghe, “Example based machine translation for english-sinhala translations,” in Proceedings of the 09th International IT Conference, 2008, pp. 27–28.
  410. A. J. Vidanaralage, A. U. Illangakoon, S. Y. Sumanaweera, C. Pavithra, and S. Thelijjagoda, “Sinhala language decoder,” in 2018 National Information Technology Conference (NITC).   IEEE, 2018, pp. 1–5.
  411. J. K. Joseph, W. M. T. Chathurika, A. Nugaliyadde, and Y. Mallawarachchi, “Evolutionary algorithm for sinhala to english translation,” arXiv preprint arXiv:1907.03202, 2019.
  412. R. Pushpananda, R. Weerasinghe, and M. Niranjan, “Statistical machine translation from and into morphologically rich and low resourced languages,” in International Conference on Intelligent Text Processing and Computational Linguistics.   Springer, 2015, pp. 545–556.
  413. A. Fernando, S. Ranathunga, and G. Dias, “Data augmentation and terminology integration for domain-specific sinhala-english-tamil statistical machine translation,” arXiv preprint arXiv:2011.02821, 2020.
  414. M. D. Rajitha, L. L. Piyarathna, M. M. D. Nayanajith, and S. Surangika, “Sinhala and english document alignment using statistical machine translation,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 29–34.
  415. T. Fonseka, R. Naranpanawa, R. Perera, and U. Thayasivam, “English to sinhala neural machine translation,” in IALP, 2020.
  416. R. Naranpanawa, R. Perera, T. Fonseka, and U. Thayasivam, “Analyzing subword techniques to improve english to sinhala neural machine translation,” International Journal of Asian Language Processing, vol. 30, no. 04, p. 2050017, 2020.
  417. A. Fernando, G. Dias, and S. Ranathunga, “Data augmentation and list integration for improving domain-specific sinhala-english-tamil statistical machine translation,” 2021.
  418. A. Fernando and S. Ranathunga, “Data augmentation to address out-of-vocabulary problem in low-resource sinhala-english neural machine translation,” arXiv preprint arXiv:2205.08722, 2022.
  419. K. Epaliyana, S. Ranathunga, and S. Jayasena, “Improving Back-Translation with Iterative Filtering and Data Selection for Sinhala-English NMT,” in 2021 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2021, pp. 438–443.
  420. R. Perera, T. Fonseka, R. Naranpanawa, and U. Thayasivam, “Improving english to sinhala neural machine translation using part-of-speech tag,” arXiv preprint arXiv:2202.08882, 2022.
  421. Z. Lin, Z. Zhou, and S. Guo, “Improvement on low resources machine translation: English-sinhala.”
  422. M. Ott, S. Edunov, A. Baevski, A. Fan, S. Gross, N. Ng, D. Grangier, and M. Auli, “fairseq: A fast, extensible toolkit for sequence modeling,” arXiv preprint arXiv:1904.01038, 2019.
  423. A. Kugathasan and S. Sumathipala, “Neural machine translation for sinhala-english code-mixed text,” The International Journal on Advances in ICT for Emerging Regions, vol. 15, no. 3, 2022.
  424. ——, “Standardizing sinhala code-mixed text using dictionary based approach,” in 2020 International Conference on Image Processing and Robotics (ICIP).   IEEE, 2020, pp. 1–6.
  425. X.-P. Nguyen, H. Gong, Y. Tang, C. Wang, P. Koehn, and S. Joty, “Contrastive clustering to mine pseudo parallel data for unsupervised translation,” in International Conference on Learning Representations, 2021.
  426. X. P. Nguyen, “Improving neural machine translation: data centric approaches,” 2023.
  427. V. Y. Attigala and R. Weerasinghe, “The effectiveness of chatgpt in literary translations and generating lyrics,” 2023.
  428. T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., “Language models are few-shot learners,” Advances in neural information processing systems, vol. 33, pp. 1877–1901, 2020.
  429. S. Goonetilleke, Y. Hayashi, Y. Itoh, and F. Kishino, “Srishell primo: A predictive sinhala text input system,” in Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages, 2008.
  430. H. S. Priyadarshani, M. D. W. Rajapaksha, M. M. S. P. Ranasinghe, K. Sarveswaran, and G. V. Dias, “Statistical machine learning for transliteration: Transliterating names between sinhala, tamil and english,” in 2019 International Conference on Asian Language Processing (IALP).   IEEE, 2019, pp. 244–249.
  431. W. M. P. Liwera and L. Ranathunga, “Combination of trigram and rule-based model for singlish to sinhala transliteration by focusing social media text,” in 2020 From Innovation to Impact (FITI), vol. 1.   IEEE, 2020, pp. 1–5.
  432. A. D. De Silva, “Singlish to sinhala converter using machine learning,” 2020.
  433. L. de Silva and S. Ahangama, “Singlish to sinhala transliteration using rule-based approach,” in 2021 IEEE 16th International Conference on Industrial and Information Systems (ICIIS).   IEEE, 2021, pp. 162–167.
  434. R. Nanayakkara, T. Nadungodage, and R. Pushpananda, “Context aware back-transliteration from english to sinhala,” in 2022 22nd International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2022, pp. 51–56.
  435. T. G. D. K. Sumanathilaka, R. Weerasinghe, and H. Y. P. P. Priyadarshana, “Sinhala word suggestion algorithm for ad hoc romanized sinhala transliterations using a trie,” 2023.
  436. T. G. D. K. Sumanathilaka, R. Weerasinghe, and Y. H. P. P. Priyadarshana, “Swa-bhasha: Romanized sinhala to sinhala reverse transliteration using a hybrid approach,” in 2023 3rd International Conference on Advanced Research in Computing (ICARC).   IEEE, 2023, pp. 136–141.
  437. F. Bodon and L. Rónyai, “Trie: an alternative data structure for data mining algorithms,” Mathematical and Computer Modelling, vol. 38, no. 7-9, pp. 739–751, 2003.
  438. T. G. D. K. Sumanathilaka, “Romanized sinhala to sinhala reverse transliteration using a hybrid approach,” Ph.D. dissertation, 2023.
  439. M. D. C. Amarasekara, R. A. D. P. Rajapaksha, H. M. D. T. Jayarathna, H. M. G. K. Karunarathna, I. T. S. Piyatilake, and C. P. Wijesiriwardana, “Developing a system to transliterate singlish twitter posts to sinhala,” 2023.
  440. S. Rajapaksha, S. J. Podige, S. L. Arachchige, D. I. De Silva, A. Manathunga, and E. Weerasinghe, “Sinhala to english language translation model,” 2023.
  441. D. B. Kumaravithana, P. P. D. M. D, S. A. W. S, L. U. S. P, D. I. De Silva, and W. E, “Sinhala–english bilingual translator,” Tuijin Jishu/Journal of Propulsion Technology, vol. 44, no. 5, pp. 184–191, 2023.
  442. E. K. Jayawardhana, T. R. Ranasinghe, S. N. Baalasooriya, D. De Silva, and E. Weerasinghe, “BridgeTalk: A Translator from Sinhala to English and English to Sinhala,” Tuijin Jishu/Journal of Propulsion Technology, vol. 44, no. 6, pp. 1703–1711, 2023.
  443. D. Sandaruwan, S. Fernando, and S. Sumathipala, “Neural machine translation approach for singlish to english translation,” The International Journal on Advances in ICT for Emerging Regions, vol. 14, no. 03, pp. 36–42, 2021.
  444. G. K. Nalinka, G. H. M. Iroshan, R. M. S. N. Rathnayake, G. M. N. Monali, D. I. De Silva, and E. Weerasinghe, “Shattering language barriers: Singlish to english translation with transformer neural network,” Tuijin Jishu/Journal of Propulsion Technology, vol. 44, no. 4, pp. 3019–3037, 2023.
  445. D. I. De Silva, E. Weerasinghe, M. S. Shiraz, H. G. M. K. K. L. Karunasena, C. H. Zimmendra, and O. A. Kumarasinghe, “The art and science of translating english to singlish,” Tuijin Jishu/Journal of Propulsion Technology, vol. 44, no. 5, pp. 710–718, 2023.
  446. P. Tennage, P. Sandaruwan, M. Thilakarathne, A. Herath, S. Ranathunga, S. Jayasena, and G. Dias, “Neural machine translation for sinhala and tamil languages,” in 2017 International Conference on Asian Language Processing (IALP).   IEEE, 2017, pp. 189–192.
  447. P. N. Tennage, M. W. D. P. Sandaruwan, J. K. M. M. Thilakarathne, A. N. Herath, S. Ranathunga, S. Jayasena, and G. Dias, “Neural machine translation for sinhala-tamil,” 2017.
  448. P. Tennage, A. Herath, M. Thilakarathne, P. Sandaruwan, and S. Ranathunga, “Transliteration and byte pair encoding to improve tamil to sinhala neural machine translation,” in 2018 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2018, pp. 390–395.
  449. P. Tennage, P. Sandaruwan, M. Thilakarathne, A. Herath, and S. Ranathunga, “Handling rare word problem using synthetic training data for sinhala and tamil neural machine translation,” in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018), 2018.
  450. S. Ranathunga, F. Farhath, U. Thayasivam, S. Jayasena, and G. Dias, “Si-ta: Machine translation of sinhala and tamil official documents,” in 2018 National Information Technology Conference (NITC).   IEEE, 2018, pp. 1–6.
  451. F. Farhath, S. Ranathunga, S. Jayasena, and G. Dias, “Integration of bilingual lists for domain-specific statistical machine translation for sinhala-tamil,” in 2018 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2018, pp. 538–543.
  452. R. Weerasinghe, “A statistical machine translation approach to sinhala-tamil language translation,” Towards an ICT enabled Society, p. 136, 2003.
  453. S. Sripirakas, A. R. Weerasinghe, and D. L. Herath, “Statistical machine translation of systems for sinhala-tamil,” in 2010 International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2010, pp. 62–68.
  454. M. Jeyakaran and R. Weerasinghe, “A novel kernel regression based machine translation system for sinhala-tamil translation,” in Proceedings of 4th Annual UCSC Research Symposium, 2013.
  455. R. Pushpananda, R. Weerasinghe, and M. Niranjan, “Towards sinhala tamil machine translation,” in 2013 International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2013, p. 288.
  456. ——, “Sinhala-tamil machine translation: Towards better translation quality,” in Proceedings of the Australasian Language Technology Association Workshop 2014, 2014, pp. 129–133.
  457. S. Rajpirathap, S. Sheeyam, K. Umasuthan, and A. Chelvarajah, “Real-time direct translation system for sinhala and tamil languages,” in 2015 Federated Conference on Computer Science and Information Systems (FedCSIS).   IEEE, 2015, pp. 1437–1443.
  458. W. S. N. Dilshani, S. Yashothara, R. T. Uthayasanker, and S. Jayasena, “Linguistic divergence of sinhala and tamil languages in machine translation,” in 2018 International Conference on Asian Language Processing (IALP).   IEEE, 2018, pp. 13–18.
  459. T. Mokanarangan, “Translation of named entities between sinhala and tamil for official government documents,” 2019.
  460. A. Arukgoda, A. R. Weerasinghe, and R. Pushpananda, “Improving sinhala-tamil translation through deep learning techniques,” 2019.
  461. A. S. Arukgoda, “Improving sinhala–tamil translation through deep learning techniques,” Ph.D. dissertation, 2021.
  462. A. Pramodya, R. Pushpananda, and R. Weerasinghe, “A comparison of transformer, recurrent neural networks and SMT in Tamil to Sinhala MT,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 155–160.
  463. L. N. A. S. H. Nissanka, B. H. R. Pushpananda, and A. R. Weerasinghe, “Exploring neural machine translation for sinhala-tamil languages pair,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 202–207.
  464. S. Thillainathan, S. Ranathunga, and S. Jayasena, “Fine-Tuning Self-Supervised Multilingual Sequence-To-Sequence Models for Extremely Low-Resource NMT,” in 2021 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2021, pp. 432–437.
  465. Y. Liu, J. Gu, N. Goyal, X. Li, S. Edunov, M. Ghazvininejad, M. Lewis, and L. Zettlemoyer, “Multilingual denoising pre-training for neural machine translation,” Transactions of the Association for Computational Linguistics, vol. 8, pp. 726–742, 2020.
  466. S. Yashothara and R. T. Uthayasanker, “The utility of hierarchical phrase-based model machine translation for low resource languages,” in Computational Linguistics and Intelligent Text Processing: 19th International Conference, CICLing 2018, Hanoi, Vietnam, March 18–24, 2018, Revised Selected Papers, Part I.   Springer, 2023, pp. 279–288.
  467. A. Pramodya, “Exploring low-resource neural machine translation for sinhala-tamil language pair,” in Proceedings of the 8th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing, 2023, pp. 87–97.
  468. S. Thelijjagoda, Y. Imai, and T. Ikeda, “Japanese-sinhalese machine translation system jaw/sinhalese,” Journal of the National Science Foundation of Sri Lanka, vol. 35, no. 2, 2007.
  469. S. S. Jayasinghe, “An analytical study of the background which had been urged in india for the necessity to translate sinhala commentaries into pali,” 2023.
  470. A. J. Liddicoat, “Choosing a liturgical language,” Australian Review of Applied Linguistics, vol. 16, no. 2, pp. 123–141, 1993.
  471. R. M. M. Shalini and B. Hettige, “Dictionary based machine translation system for pali to sinhala,” in SLAAI-International Conference on Artificial Intelligence, 2017, p. 23.
  472. A. Wasala, R. Weerasinghe, R. Pushpananda, C. Liyanage, and E. Jayalatharachchi, “A data-driven approach to checking and correcting spelling errors in sinhala,” Int. J. Adv. ICT Emerg. Reg, vol. 3, no. 01, 2010.
  473. R. A. Wasala, R. Weerasinghe, R. Pushpananda, C. Liyanage, and E. Jayalatharachchi, “An open-source data driven spell checker for sinhala,” ICTer, vol. 3, no. 1, 2011.
  474. E. Jayalatharachchi, A. Wasala, and R. Weerasinghe, “Data-driven spell checking: the synergy of two algorithms for spelling error detection and correction,” in International Conference on Advances in ICT for Emerging Regions (ICTer2012).   IEEE, 2012, pp. 7–13.
  475. L. G. B. Subhagya, L. Ranathunga, W. H. A. Nimasha, B. R. Jayawickrama, and K. L. Mahaliyanaarchchi, “Data driven approach to sinhala spellchecker and correction,” in 2018 18th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2018, pp. 01–06.
  476. U. Liyanapathirana, K. Gunasinghe, and G. Dias, “Sinspell: A comprehensive spelling checker for sinhala,” arXiv preprint arXiv:2107.02983, 2021.
  477. L. Sithamparanathan and T. Uthayasanker, “A sinhala and tamil extension to generic environment for context-aware correction,” in 2019 National Information Technology Conference (NITC).   IEEE, 2019, pp. 102–106.
  478. L. Samarawickrama, H. L. Premarathne, S. C. M. De Silva, and S. B. Hettige, “LaSi Spell: Language Agents for Sinhala Spellings.”   4th International Conference on Advances in Computing and Technology (ICACT …, 2019.
  479. Y. Udagedara, B. Elikewela, and U. Thayasivam, “Language model-based spell-checker for sri lankan names and addresses,” in 2022 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2022, pp. 1–6.
  480. P. Sudesh, D. Dashintha, R. Lakshan, and G. Dias, “Erroff: A Tool to Identify and Correct Real-word Errors in Sinhala Documents,” in 2022 Moratuwa Engineering Research Conference (MERCon).   IEEE, 2022, pp. 1–6.
  481. H. M. U. Pabasara and S. Jayalal, “Computational model for detecting grammatical mistakes in sinhala text,” in 9TH YSF SYMPOSIUM, 2020, p. 255.
  482. ——, “Grammatical error detection and correction model for sinhala language sentences,” in 2020 International Research Conference on Smart Computing and Systems Engineering (SCSE).   IEEE, 2020, pp. 17–24.
  483. S. Gunasekara, D. Chathura, C. Jeewantha, and G. Dias, “Using annotation projection for semantic role labeling of low-resourced language: Sinhala,” 2020.
  484. P. A. S. Fernando and T. Arudchelvam, “Sinhala grammar checker using parts of speech tagging,” 2020.
  485. K. N. Widyaratna, “Sinhala grammar evaluation through natural language processing approaches,” Ph.D. dissertation, 2019.
  486. P. Jayasuriya, M. Wijesundara, S. Thelijjagoda, and N. Kodagoda, “Grammar error correction for less resourceful languages: A case study of sinhala,” in 2023 IEEE 17th International Conference on Industrial and Information Systems (ICIIS).   IEEE, 2023, pp. 169–174.
  487. O. Ilukkumbura and S. Rupasinghe, “Sinhala active voice into passive voice converter using rule based approach with grammar error correction,” 2023.
  488. M. Goonawardena, A. Kulatunga, R. Wickramasinghe, T. Weerasekara, H. De Silva, and S. Thelijjagoda, “Automated spelling checker and grammatical error detection and correction model for sinhala language,” in 2022 International Research Conference on Smart Computing and Systems Engineering (SCSE), vol. 5.   IEEE, 2022, pp. 184–189.
  489. M. R. Navoda, O. W. R. Y. Weerasooriya, A. U. A. Siriwardhana, L. D. A. Sonali, J. Krishara, and P. Panduwawala, “Automated spelling and grammar checker tool for sinhala,” International Research Journal of Innovations in Engineering and Technology, vol. 7, no. 10, p. 131, 2023.
  490. B. Gamage, R. Pushpananda, and R. Weerasinghe, “The impact of using pre-trained word embeddings in sinhala chatbots,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 161–165.
  491. T. Bocklisch, J. Faulkner, N. Pawlowski, and A. Nichol, “Rasa: Open source language understanding and dialogue management,” arXiv preprint arXiv:1712.05181, 2017.
  492. S. Harshani, “Sinhala chatbot for train information,” Ph.D. dissertation, 2021.
  493. J. A. W. T. Chandrasena, A. D. A. I. Gunasekara, and G. A. I. Uwanthika, “Sinhala chatbot with recommendation system for sri lankan traditional dancers,” 2021.
  494. U. E. Kumanayake, “A sinhala chatbot for user inquiries regarding degree programs at university of ruhuna,” Ph.D. dissertation, 2021.
  495. W. A. P. Avishka, B. Kuhaneswaran, and H. N. Gunasinghe, “A novel conceptual chatbot architecture for the sinhala language–a case study on food ordering scenario,” in 2022 2nd International Conference on Advanced Research in Computing (ICARC).   IEEE, 2022, pp. 254–259.
  496. M. Biswas, “Microsoft bot framework,” in Beginning AI Bot Frameworks.   Springer, 2018, pp. 25–66.
  497. I. Dissanayake, D. Jayasinghe, S. Hameed, L. Abeywardhana, A. Sakalasooriya, and D. Wijendra, “Enhancing conversational ai model performance and explainability for sinhala-english bilingual speakers,” 2022.
  498. D. D. S. S. Dasanayaka and N. Warnajith, “Contextual assistant framework for the sinhala language,” in 2020 International Research Conference on Smart Computing and Systems Engineering (SCSE).   IEEE, 2020, pp. 45–50.
  499. L. Jayasekara and S. Ahangama, “Trend detection in sinhala tweets using clustering and ranking algorithms,” in 2020 From Innovation to Impact (FITI), vol. 1.   IEEE, 2020, pp. 1–6.
  500. U. Sandamini, K. Rathnakumara, P. Pramuditha, M. Dissanayake, D. Sriyaratna, H. De Silva, and D. Kasthurirathna, “A singlish supported post recommendation approach for social media,” 2022.
  501. T. M. S. A. Tennakoon and G. R. N. A. Gamlath, “Hybrid recommender system for categorized sinhala news articles,” 2020.
  502. A. Tennakoon, N. Gamlath, G. Kirindage, J. Ranatunga, P. Haddela, and D. Kaveendri, “Hybrid recommender for condensed sinhala news with grey sheep user identification,” in 2020 2nd International Conference on Advancements in Computing (ICAC), vol. 1.   IEEE, 2020, pp. 228–233.
  503. N. P. G. A. Malsha, K. D. Heshani, R. K. Ransara, D. M. D. D. A. Bandara, P. K. S. Kumari, and T. A. Kuruppu, “Automated sinhala news platform based on machine learning and deep learning,” in 2021 3rd International Conference on Advancements in Computing (ICAC).   IEEE, 2021, pp. 134–139.
  504. M. D. Madhushika, S. Ahangama, and D. A. Rajapaksha, “Analyzing the impact of social media on sinhala news dissemination in mass media,” in 2022 2nd International Conference on Advanced Research in Computing (ICARC).   IEEE, 2022, pp. 177–182.
  505. M. Meyler, “Learning sri lankan sign language – groundviews,” https://groundviews.org/2021/09/02/learning-sri-lankan-sign-language/, 2 2021, (Accessed on 11/16/2023).
  506. R. M. Rishan, S. Jayalal, and T. K. Wijayasiriwardhane, “Translation of sri lankan sign language to sinhala text: A leap motion technology-based approach,” in 2022 2nd International Conference on Advanced Research in Computing (ICARC).   IEEE, 2022, pp. 218–223.
  507. K. L. P. Liyanaarachchi, D. Shakya, T. Herath, N. Vithanage, and L. S. K. Udugama, “Signing dataset for the sinhala sign language,” 2020.
  508. M. B. Dissanayake, H. C. M. Herath, W. A. L. V. Kumari, and W. A. P. B. Senevirathne, “Image processing based sinhala sign language recognition system,” Sign, vol. 3, no. 5, p. 2.
  509. M. Priyankara, A. Gunasekara, and K. Ilmini, “Sign Language Translation Techniques Using Artificial Intelligence for the Hearing Impaired Community in Sri Lanka: A Review,” in 2023 7th SLAAI International Conference on Artificial Intelligence (SLAAI-ICAI).   IEEE, 2023, pp. 1–6.
  510. S. K. Wijegoonaratna, “Realtime sinhala sign language interpreter using hand gesture recognition,” Ph.D. dissertation, 2020.
  511. S. D. Hettiarachchi and R. G. N. Meegama, “Machine learning approach for real time translation of sinhala sign language into text.”
  512. S. Dilakshan and Y. H. P. P. Priyadarshana, “Convolutional neural networks: A novel approach for sinhala sign recognition system,” in 2020 11th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON).   IEEE, 2020, pp. 0141–0146.
  513. W. D. T. Peiris, “Sinhala sign language to text interpreter based on machine learning,” Ph.D. dissertation, 2021.
  514. L. L. D. K. Perera and S. G. V. S. Jayalal, “Sri lankan sign language to sinhala text using convolutional neural network combined with scale invariant feature transform (sift),” 2021.
  515. D. G. Lowe, “Object recognition from local scale-invariant features,” in Proceedings of the seventh IEEE international conference on computer vision, vol. 2.   IEEE, 1999, pp. 1150–1157.
  516. P. Fernando and P. Wimalaratne, “Sign Language Translation Approach to Sinhalese Language,” GSTF Journal on Computing (JoC), vol. 5, no. 1, pp. 1–9, 2016.
  517. J. Snell, K. Swersky, and R. Zemel, “Prototypical networks for few-shot learning,” Advances in neural information processing systems, vol. 30, 2017.
  518. H. H. S. N. Haputhanthri, H. M. N. Tennakoon, M. A. S. M. Wijesekara, B. H. R. Pushpananda, and H. N. D. Thilini, “Multi-modal deep learning approach to improve sentence level sinhala sign language recognition,” The International Journal on Advances in ICT for Emerging Regions, vol. 16, pp. 21–30, 2023.
  519. M. Punchimudiyanse and R. G. N. Meegama, “Computer interpreter for translating written sinhala to sinhala sign,” OUSL Journal, vol. 12, no. 1, pp. 70–90, 2017.
  520. ——, “Animation of fingerspelled words and number signs of the sinhala sign language,” ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), vol. 16, no. 4, p. 24, 2017.
  521. D. M. Kumar, K. Bavanraj, S. Thavananthan, G. M. A. S. Bastiansz, S. M. B. Harshanath, and J. Alosious, “EasyTalk: A Translator for Sri Lankan Sign Language using Machine Learning and Artificial Intelligence,” in 2020 2nd International Conference on Advancements in Computing (ICAC), vol. 1.   IEEE, 2020, pp. 506–511.
  522. J. D. K. N. Perera and B. M. T. Kumarika, “Real-time system for place recognition by interpreting Sri Lankan sign language into text using machine learning approach.” 2023.
  523. R. J. Herath and P. Ishanka, “An approach to sri lankan sign language recognition using deep learning with mediapipe,” in International Conference on Digital Technologies and Applications.   Springer, 2022, pp. 449–459.
  524. C. Lugaresi, J. Tang, H. Nash, C. McClanahan, E. Uboweja, M. Hays, F. Zhang, C.-L. Chang, M. G. Yong, J. Lee et al., “Mediapipe: A framework for building perception pipelines,” arXiv preprint arXiv:1906.08172, 2019.
  525. I. S. M. Dissanayake, P. J. Wickramanayake, M. A. S. Mudunkotuwa, and P. W. N. Fernando, “Utalk: Sri Lankan sign language converter mobile app using image processing and machine learning,” in 2020 2nd International Conference on Advancements in Computing (ICAC), vol. 1.   IEEE, 2020, pp. 31–36.
  526. K. V. S. D. Vithanage, “Braille to text convertor for sinhala,” Ph.D. dissertation, 2021.
  527. G. Madubashana, “Automated braille-sinhala recognition system,” Ph.D. dissertation, 2020.
  528. W. R. Ariyarathna, L. R. Kahandagamage, and W. M. P. Kumara, “Projection profiling based sinhala braille character recognition and conversion,” 2020.
  529. Y. Weerasinghe, “A system to normalize sinhala characters in sinhala braille translator,” Ph.D. dissertation, 2020.
  530. S. Basnayake, H. Wijekoon, and T. K. Wijayasiriwardhane, “Plagiarism detection in sinhala language: A software approach.”
  531. L. P. Rajamanthri and S. Thelijjagoda, “Sinhala language plagiarism tool with internet resources using natural language processing.”
  532. L. Rajamanthri and S. Thelijjagoda, “Plagiarism detection tool for sinhala language with internet resources using natural language processing,” in 2021 10th International Conference on Information and Automation for Sustainability (ICIAfS).   IEEE, 2021, pp. 156–160.
  533. T. KasthuriArachchi and E. Y. A. Charles, “Deep learning approach to detect plagiarism in sinhala text,” in 2019 14th Conference on Industrial and Information Systems (ICIIS).   IEEE, 2019, pp. 314–319.
  534. A. Y. Piyarathna, “Sinhala multi document similarity detection tool,” Ph.D. dissertation, 2019.
  535. M. Punchihewa, C. Rajapaksha, and D. Asanka, “A language modelling approach to authorship identification for online examinations in sinhala,” 2021.
  536. I. Smith and U. Thayasivam, “Language detection in sinhala-english code-mixed data,” in 2019 International Conference on Asian Language Processing (IALP).   IEEE, 2019, pp. 228–233.
  537. J. R. I. Smith, “Sinhala-english language detection in code-mixed data,” Ph.D. dissertation, 2020.
  538. I. Smith and U. Thayasivam, “Sinhala-english code-mixed data analysis: A review on data collection process,” in 2019 19th International Conference on Advances in ICT for Emerging Regions (ICTer), vol. 250.   IEEE, 2019, pp. 1–6.
  539. K. Shanmugalingam and S. Sumathipala, “Language identification at word level in sinhala-english code-mixed social media text,” in 2019 International Research Conference on Smart Computing and Systems Engineering (SCSE).   IEEE, 2019, pp. 113–118.
  540. F. Fazal and C. Farook, “Depression detection in sinhala-english code-mixed language using social media data,” 2023.
  541. T. Dissanayake and B. Hettige, “Thematic relations based qa generator for sinhala,” 13th International Research Conference General Sir John Kotelawala Defence University, 2020.
  542. J. A. T. K. Jayakody, T. S. K. Gamlath, W. A. N. Lasantha, K. M. K. P. Premachandra, A. Nugaliyadde, and Y. Mallawarachchi, “‘Mahoshadha’, the sinhala tagged corpus based question answering system,” in Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems: Volume 1.   Springer, 2016, pp. 313–322.
  543. G. K. S. M. Amarasinghe and L. Ranathunga, “Evolutionary ontology approach for sinhala essay question generation,” in 2019 14th Conference on Industrial and Information Systems (ICIIS).   IEEE, 2019, pp. 452–457.
  544. V. Liyanage and S. Ranathunga, “A multi-language platform for generating algebraic mathematical word problems,” in 2019 14th Conference on Industrial and Information Systems (ICIIS).   IEEE, 2019, pp. 332–337.
  545. ——, “Multi-lingual mathematical word problem generation using long short term memory networks with enhanced input features,” in Proceedings of The 12th Language Resources and Evaluation Conference, 2020, pp. 4709–4716.
  546. K. Niyarepola, D. Athapaththu, S. Ekanayake, and S. Ranathunga, “Math word problem generation with multilingual language models,” in Proceedings of the 15th International Conference on Natural Language Generation, 2022, pp. 144–155.
  547. Y. Tang, C. Tran, X. Li, P.-J. Chen, N. Goyal, V. Chaudhary, J. Gu, and A. Fan, “Multilingual translation with extensible multilingual pretraining and finetuning,” arXiv preprint arXiv:2008.00401, 2020.
  548. S. Kao and K. Ilmini, “Automated generation of sinhala lyrics using recurrent neural networks,” 2020.
  549. R. M. V. D. Bandara, H. A. A. Sanja, and B. Hettige, “Sibil AI: Children Story Generator in Sinhala Using Transformers,” 2022.
  550. S. C. Fernando, “Inexact matching of proper names in sinhala,” 2011.
  551. A. B. P. Kanduboda and K. Tamaoka, “Priority information in determining canonical word order of colloquial sinhalese sentences,” in Proceedings of the 139th Conference of the Linguistic Society of Japan, vol. 1, 2009, pp. 32–37.
  552. ——, “Priority information for canonical word order of written sinhala sentences,” in Proceedings of the 140th Conference of the Linguistic Society of Japan, 2010, pp. 358–363.
  553. K. Tamaoka, P. B. A. Kanduboda, and H. Sakai, “Effects of word order alternation on the sentence processing of sinhalese written and spoken forms,” Open Journal of Modern Linguistics, vol. 1, no. 02, pp. 24–32, 2011.
  554. A. B. P. Kanduboda and K. Tamaoka, “Priority information determining the canonical word order of written sinhalese sentences,” Open Journal of Modern Linguistics, vol. 2, no. 01, p. 26, 2012.
  555. M. H. M. Hisan, A. R. Weerasinghe, and B. H. R. Pushpananda, “Cross language information retrieval for accessing the english web in sinhala,” in 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer).   IEEE, 2020, pp. 244–249.
  556. L. Sandathara, S. Tissera, R. Sathsarani, H. Hapuarachchi, and S. Thelijjagoda, “Arunalu: Learning ecosystem to overcome sinhala reading weakness due to dyslexia,” in 2020 2nd International Conference on Advancements in Computing (ICAC), vol. 1.   IEEE, 2020, pp. 416–421.
  557. K. C. D. Vithana, D. N. N. Weerarathne, H. A. S. Krishan, M. R. M. Wijesiri, S. Thelijjagoda, J. A. D. T. Jayawickrama, and N. T. Weerawarna, “Mimi: Sinhala language speech assistive learning bot to support children with stuttering,” in 2022 International Conference on Automation, Computing and Renewable Systems (ICACRS).   IEEE, 2022, pp. 662–668.
  558. D. Nethmi, R. Navarathna, and A. Senanayake, “Narrataa: Learning Tool for Generating Kid-Friendly Sinhala Names for Objects,” in 2023 IEEE 17th International Conference on Industrial and Information Systems (ICIIS).   IEEE, 2023, pp. 323–328.
  559. D. Ranasinghe, R. Pushpananda, and R. Weerasinghe, “Image Caption Generator for Sinhala Using Deep Learning,” The International Journal on Advances in ICT for Emerging Regions, vol. 16, pp. 40–46, 2023.
  560. T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft coco: Common objects in context,” in Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13.   Springer, 2014, pp. 740–755.
  561. C. Rashtchian, P. Young, M. Hodosh, and J. Hockenmaier, “Collecting image annotations using amazon’s mechanical turk,” in Proceedings of the NAACL HLT 2010 workshop on creating speech and language data with Amazon’s Mechanical Turk, 2010, pp. 139–147.
  562. C. Rajitha, L. Piyarathne, D. Sachintha, and S. Ranathunga, “Metric learning in multilingual sentence similarity measurement for document alignment,” arXiv preprint arXiv:2108.09495, 2021.
  563. R. M. D. R. Kumari and S. Hettiarachchi, “Sintm-lda and rake based topic modelling for sinhala language,” in 2021 Asian Conference on Innovation in Technology (ASIANCON).   IEEE, 2021, pp. 1–5.
  564. S. Rose, D. Engel, N. Cramer, and W. Cowley, “Automatic Keyword Extraction from Individual Documents,” Text mining: applications and theory, vol. 1, pp. 1–20, 2010.
  565. T. H. Batawalaarachchi, “Automated title generation in sinhala language,” Ph.D. dissertation, 2021.
  566. A. L. D. S. Arambewela, S. Ahangama, and D. M. A. K. Dissanayake, “Real-time sinhala writing assistant for kids,” in 2021 IEEE 16th International Conference on Industrial and Information Systems (ICIIS).   IEEE, 2021, pp. 152–156.
  567. A. A. V. A. Jayaweera, Y. N. Senanayake, and P. S. Haddela, “Dynamic stopword removal for sinhala language,” in 2019 National Information Technology Conference (NITC).   IEEE, 2019, pp. 1–6.
  568. G. Weeraprameshwara, V. Jayawickrama, N. de Silva, and Y. Wijeratne, “Sinhala sentence embedding: A two-tiered structure for low-resource languages,” arXiv preprint arXiv:2210.14472, 2022.
  569. Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “RoBERTa: A Robustly Optimized BERT Pretraining Approach,” arXiv preprint arXiv:1907.11692, 2019.
  570. C. Liyanage, K. Sarveswaran, T. Nadungodage, and R. Pushpananda, “Sinhala dependency treebank (stb),” in Proceedings of the Sixth Workshop on Universal Dependencies (UDW, GURT/SyntaxFest 2023), 2023, pp. 17–26.
  571. B. Minixhofer, J. Pfeiffer, and I. Vulić, “Where’s the point? self-supervised multilingual punctuation-agnostic sentence segmentation,” arXiv preprint arXiv:2305.18893, 2023.
  572. A. Petrov, E. La Malfa, P. H. Torr, and A. Bibi, “Language model tokenizers introduce unfairness between languages,” arXiv preprint arXiv:2305.15425, 2023.
  573. J. H. Clark, D. Garrette, I. Turc, and J. Wieting, “Canine: Pre-training an efficient tokenization-free encoder for language representation,” Transactions of the Association for Computational Linguistics, vol. 10, pp. 73–91, 2022.
  574. I. U. Hewapathirana, “A Review on Current Trends and Applications of Social Media Research in Sri Lanka,” Cloud Computing and Data Science, pp. 223–242, 2023.
  575. R. Yasasri and D. Karunarathna, “Helaa: A Sinhala Language-Based Programming,” 2023.
  576. D. K. Henadeerage, “Topics in sinhala syntax,” Ph.D. dissertation, The Australian National University, 2002.
  577. K. C. Perera, Prayogika Sinhala Viyakaranaya.
  578. S. Ranathunga and N. de Silva, “Some languages are more equal than others: Probing deeper into the linguistic disparity in the nlp world,” arXiv preprint arXiv:2210.08523, 2022.
  579. J. Lankage, “Sinhala warna malawe vikashanaya,” Ph.D. dissertation, 1988.
  580. Wiktionary, “anusvara,” https://en.wiktionary.org/wiki/anusvara, (Accessed on 02/05/2023).
  581. ——, “visarga,” https://en.wiktionary.org/wiki/visarga, (Accessed on 02/05/2023).
  582. S. T. Nandasara, “Development and standardization of sinhala script code for digital inclusion of native computer users,” Ph.D. dissertation, 2019.
  583. Microsoft, “Font list windows 10 - typography - microsoft learn,” https://learn.microsoft.com/en-us/typography/fonts/windows_10_font_list, 1998, (Accessed on 02/05/2023).
Authors (1)
  1. Nisansa de Silva (37 papers)
Citations (39)
