Carnatic Raga Identification System using Rigorous Time-Delay Neural Network (2405.16000v1)
Abstract: Large-scale, machine-learning-based raga identification remains a nontrivial problem in the computational analysis of Carnatic music. Each raga consists of unique, intrinsic melodic patterns that distinguish it from all others; these patterns can also be used to cluster songs within the same raga and to identify songs in closely related ragas. Here, the input audio is analyzed in several steps: a Discrete Fourier Transform is applied, and triangular filtering creates custom bins of candidate notes, from which features are extracted based on the presence or absence of particular notes. The backbone of the classification strategy combines 1D Convolutional Neural Networks (conventionally known as Time-Delay Neural Networks) with Long Short-Term Memory (LSTM) networks, a form of Recurrent Neural Network. In addition, to accommodate variations in shruti, a long-time attention-based mechanism determines relative rather than absolute changes in frequency, providing a much more meaningful data point when training on audio clips in different shrutis. To evaluate the accuracy of the classifier, a dataset of 676 recordings distributed across the list of ragas is used. The goal of this system is to effectively and efficiently label a much wider range of audio clips spanning more shrutis, more ragas, and more background noise.
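The front end described above (DFT followed by triangular filtering into per-note bins) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the tonic of 220 Hz, the three-octave span, and the function names are assumptions chosen for the example.

```python
import numpy as np

def note_filterbank(sr, n_fft, tonic_hz=220.0, n_notes=36):
    """Triangular filters centered on successive equal-tempered semitones
    above an assumed tonic (220 Hz and a 3-octave span are illustrative)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    # Semitone centers, padded one semitone below and above for filter edges.
    centers = tonic_hz * 2.0 ** (np.arange(-1, n_notes + 1) / 12.0)
    fb = np.zeros((n_notes, len(freqs)))
    for i in range(n_notes):
        lo, c, hi = centers[i], centers[i + 1], centers[i + 2]
        rising = (freqs - lo) / (c - lo)    # ramp up to the note center
        falling = (hi - freqs) / (hi - c)   # ramp back down to zero
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def note_features(signal, sr, n_fft=4096):
    """Magnitude spectrum (via the DFT) pooled into per-note bins,
    yielding one energy value per candidate note."""
    spectrum = np.abs(np.fft.rfft(signal, n=n_fft))
    return note_filterbank(sr, n_fft) @ spectrum
```

For example, a pure 440 Hz tone produces its strongest activation in the bin twelve semitones above a 220 Hz tonic; sequences of such per-frame note vectors would then feed the Conv1D/LSTM classifier described in the abstract.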