Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification (2305.14032v4)

Published 23 May 2023 in eess.AS, cs.LG, and cs.SD

Abstract: Respiratory sound contains crucial information for the early diagnosis of fatal lung diseases. Since the COVID-19 pandemic, there has been growing interest in contact-free medical care based on electronic stethoscopes. To this end, cutting-edge deep learning models have been developed to diagnose lung diseases; however, diagnosis remains challenging because medical data is scarce. In this study, we demonstrate that models pretrained on large-scale visual and audio datasets can be generalized to the respiratory sound classification task. In addition, we introduce a straightforward Patch-Mix augmentation, which randomly mixes patches between different samples, with the Audio Spectrogram Transformer (AST). We further propose a novel and effective Patch-Mix Contrastive Learning to distinguish the mixed representations in the latent space. Our method achieves state-of-the-art performance on the ICBHI dataset, improving on the previous best score by 4.08%.
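
The abstract describes Patch-Mix only as randomly mixing patches between different samples. As a rough illustration of that idea, and not the authors' implementation, the sketch below swaps a random subset of each sample's patches with patches at the same positions from another sample in the batch. The function name `patch_mix`, the `mix_ratio` parameter, the pairing of samples by shuffling, and the string-labelled patches are all assumptions made for this example; the contrastive learning objective is not shown.

```python
import random


def patch_mix(batch, mix_ratio=0.5, seed=None):
    """Mix patches between samples in a batch (illustrative sketch only).

    batch: list of samples, each a list of patches (any objects).
    mix_ratio: assumed fraction of patch positions taken from a partner sample.
    Returns (mixed_batch, partner_indices, masks), where masks[i][p] is True
    if patch p of sample i was taken from its partner.
    """
    rng = random.Random(seed)
    batch_size = len(batch)
    num_patches = len(batch[0])

    # Assumption: each sample is paired with another via a random shuffle.
    partner = list(range(batch_size))
    rng.shuffle(partner)

    num_mixed = round(mix_ratio * num_patches)
    mixed, masks = [], []
    for i, sample in enumerate(batch):
        # Pick which patch positions to replace for this sample.
        chosen = set(rng.sample(range(num_patches), num_mixed))
        donor = batch[partner[i]]
        mixed.append(
            [donor[p] if p in chosen else sample[p] for p in range(num_patches)]
        )
        masks.append([p in chosen for p in range(num_patches)])
    return mixed, partner, masks


# Toy batch: 4 samples with 8 labelled patches each, so swaps are easy to see.
batch = [[f"s{i}p{p}" for p in range(8)] for i in range(4)]
mixed, partner, masks = patch_mix(batch, mix_ratio=0.25, seed=0)
```

In a real pipeline the patches would be spectrogram patch embeddings rather than strings, and the returned mask and partner indices are the kind of bookkeeping a mixed-label or contrastive objective would need.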

Authors (10)
  1. Sangmin Bae (22 papers)
  2. June-Woo Kim (12 papers)
  3. Won-Yang Cho (2 papers)
  4. Hyerim Baek (1 paper)
  5. Soyoun Son (2 papers)
  6. Byungjo Lee (2 papers)
  7. Changwan Ha (1 paper)
  8. Kyongpil Tae (1 paper)
  9. Sungnyun Kim (19 papers)
  10. Se-Young Yun (114 papers)
Citations (19)
