Emotion and Theme Recognition in Music with Frequency-Aware RF-Regularized CNNs (1911.05833v1)
Published 28 Oct 2019 in cs.SD, cs.LG, cs.MM, and eess.AS
Abstract: We present the CP-JKU submission to MediaEval 2019: a Receptive Field (RF)-regularized and Frequency-Aware CNN approach for tagging music with emotion/mood labels. We investigate how the RF of the CNNs affects their performance on this dataset. We observe that ResNets with smaller receptive fields, originally adapted for acoustic scene classification, also perform well in the emotion tagging task. We improve the performance of such architectures using techniques such as Frequency Awareness and Shake-Shake regularization, which were used in previous work on general acoustic recognition tasks.
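Since the abstract turns on the size of a CNN's receptive field, a minimal sketch of how the RF along one axis (time or frequency) follows from a stack of convolution and pooling layers may help. It uses the standard recurrence: each layer widens the RF by (kernel - 1) times the accumulated stride. The function name and both layer stacks below are hypothetical illustrations, not the architectures from the paper; swapping 3-wide kernels for 1-wide ones is shown only as one common way to shrink the RF.

```python
def receptive_field(layers):
    """Receptive field along one axis for a stack of conv/pool layers.

    `layers` is a list of (kernel_size, stride) pairs, ordered from input
    to output. Dilation and padding are ignored in this sketch.
    """
    rf = 1    # a single input bin sees only itself
    jump = 1  # distance, in input bins, between adjacent output positions
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the RF by (k - 1) * jump
        jump *= stride             # striding spreads later layers further apart
    return rf


# Hypothetical stacks (NOT the paper's networks): the second replaces most
# 3-wide kernels with 1-wide ones, which keeps the depth but shrinks the RF.
larger_rf_stack = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1)]
smaller_rf_stack = [(3, 1), (1, 1), (2, 2), (1, 1), (1, 1)]

print(receptive_field(larger_rf_stack))
print(receptive_field(smaller_rf_stack))
```

Under these assumptions, two networks of equal depth can have very different receptive fields, which is the quantity the abstract describes varying.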