
Audio-Based Music Classification with DenseNet And Data Augmentation (1906.11620v1)

Published 15 Jun 2019 in eess.AS, cs.MM, and cs.SD

Abstract: In recent years, deep learning techniques have received intense attention owing to their great success in image recognition. Deep learning is increasingly being adopted in various information processing fields, including music information retrieval (MIR). In this paper, we conduct a comprehensive study on music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this is the first work to apply Densely Connected Convolutional Networks (DenseNet), which has been demonstrated to perform better than the Residual Neural Network (ResNet), to music audio tagging. Additionally, two data augmentation approaches, time overlapping and pitch shifting, are proposed to address the shortage of labelled data in MIR. Moreover, stacking-based ensemble learning with an SVM is employed. We believe that the proposed combination of DenseNet's strong representation and data augmentation can be adapted to other audio processing tasks.
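As an illustration of the time-overlapping augmentation mentioned in the abstract, here is a minimal sketch that cuts one clip into fixed-length windows whose start points are closer together than the window length, so neighbouring training segments share samples. The abstract does not give the authors' procedure or parameters: the function name `overlapping_segments` and the `window` and `hop` values are assumptions chosen for illustration, not the paper's settings.

```python
from typing import List, Sequence


def overlapping_segments(
    samples: Sequence[float], window: int, hop: int
) -> List[List[float]]:
    """Cut a signal into fixed-length windows; they overlap when hop < window.

    Hypothetical helper: `window` and `hop` are illustrative parameters,
    not values taken from the paper.
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    segments = []
    # Stop at the last start index that still yields a full window.
    for start in range(0, len(samples) - window + 1, hop):
        segments.append(list(samples[start:start + window]))
    return segments


# Toy "audio clip" of 10 samples; real clips would hold thousands of samples.
clip = list(range(10))
segments = overlapping_segments(clip, window=4, hop=2)
no_overlap = overlapping_segments(clip, window=4, hop=4)
```

With a hop of half the window, the toy clip yields twice as many segments as non-overlapping slicing does, which is how overlapping could enlarge a small labelled set. Each segment would inherit the label of its source clip.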

Authors (6)
  1. Wenhao Bian (2 papers)
  2. Jie Wang (481 papers)
  3. Bojin Zhuang (10 papers)
  4. Jiankui Yang (1 paper)
  5. Shaojun Wang (29 papers)
  6. Jing Xiao (268 papers)
Citations (21)
