Coswara -- A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis (2005.10548v2)

Published 21 May 2020 in eess.AS and cs.SD

Abstract: The COVID-19 pandemic presents global challenges transcending boundaries of country, race, religion, and economy. The current gold standard method for COVID-19 detection is the reverse transcription polymerase chain reaction (RT-PCR) testing. However, this method is expensive, time-consuming, and violates social distancing. Also, as the pandemic is expected to stay for a while, there is a need for an alternate diagnosis tool which overcomes these limitations, and is deployable at a large scale. The prominent symptoms of COVID-19 include cough and breathing difficulties. We foresee that respiratory sounds, when analyzed using machine learning techniques, can provide useful insights, enabling the design of a diagnostic tool. Towards this, the paper presents an early effort in creating (and analyzing) a database, called Coswara, of respiratory sounds, namely, cough, breath, and voice. The sound samples are collected via worldwide crowdsourcing using a website application. The curated dataset is released as open access. As the pandemic is evolving, the data collection and analysis is a work in progress. We believe that insights from analysis of Coswara can be effective in enabling sound based technology solutions for point-of-care diagnosis of respiratory infection, and in the near future this can help to diagnose COVID-19.

Summary

  • The paper introduces a novel database of respiratory sounds to diagnose COVID-19 using machine learning techniques.
  • It describes a data collection protocol and reports audio from 941 participants, with recordings manually annotated for quality.
  • A preliminary Random Forest classifier reached an average accuracy of 66.74% in distinguishing the nine sound categories, a sound-type classification result rather than a COVID-19 detection result.

Coswara: A Database for COVID-19 Sound-Based Diagnosis

The paper, "Coswara: A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis," presents an early effort toward diagnosing COVID-19 through respiratory sound analysis. It focuses on building a database, called Coswara, of respiratory sounds: cough, breath, and voice. The database is intended to support the development of non-invasive, low-cost, scalable COVID-19 diagnostic tools based on machine learning.

Approach and Dataset

The central premise of the research is that respiratory sounds can serve as biomarkers for respiratory infections, notably COVID-19. The Coswara project gathers sound data from participants worldwide through a web application. The sound categories include shallow and deep breathing, heavy and shallow coughing, sustained vowels, and digit counting at different paces. Metadata such as age, gender, health status, and demographic information are collected alongside the audio, while personal identifiers are omitted to maintain privacy.
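One way to picture a single contribution is as a record that pairs anonymous metadata with one audio file per sound category. The sketch below is illustrative only and is not the project's actual schema: the class `ParticipantRecord`, its field names, and the category identifiers are all hypothetical, and the nine-way split (two breathing, two cough, three vowels, two counting paces) is an assumption consistent with the nine categories mentioned in the analysis.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical category identifiers (assumed, not taken from the dataset):
# two breathing, two cough, three sustained vowels, two counting paces.
SOUND_CATEGORIES = (
    "breathing_shallow", "breathing_deep",
    "cough_shallow", "cough_heavy",
    "vowel_1", "vowel_2", "vowel_3",
    "counting_normal", "counting_fast",
)


@dataclass
class ParticipantRecord:
    """Anonymous metadata plus one audio path per sound category."""
    participant_id: str                    # random token, not a personal identifier
    age: Optional[int] = None
    gender: Optional[str] = None
    health_status: Optional[str] = None
    recordings: Dict[str, str] = field(default_factory=dict)  # category -> file path

    def add_recording(self, category: str, path: str) -> None:
        # Reject anything outside the fixed category set.
        if category not in SOUND_CATEGORIES:
            raise ValueError(f"unknown sound category: {category}")
        self.recordings[category] = path

    def missing_categories(self) -> List[str]:
        # Categories this participant has not yet recorded.
        return [c for c in SOUND_CATEGORIES if c not in self.recordings]
```

Keeping the category set fixed makes it easy to spot incomplete submissions, which matters for a crowdsourced collection where participants may skip some recordings.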

The paper describes a data collection protocol that includes instructions on sanitization and microphone distance, intended to ensure the quality of the sound samples. The dataset contains audio from 941 participants, with recordings categorized as clean, noisy, or degraded through subsequent manual annotation. The database is openly accessible, which supports collaborative research on diagnostic tools.

Analytical Insights

The initial exploration of the dataset applied signal processing and machine learning techniques to classify the recordings by sound category. A Random Forest classifier achieved an average classification accuracy of 66.74% in distinguishing between the nine sound categories. The result indicates that each sound type has distinct acoustic properties, and it offers preliminary insight into how the categories might be used to assess different respiratory states. It does not measure COVID-19 detection accuracy.
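To put the accuracy figure in context, it helps to compare it with a uniform-guess baseline. With nine categories, and assuming roughly balanced classes (an assumption, since the class distribution is not stated here), that baseline is 1/9, or about 11.1%. The following sketch shows how overall and per-class accuracy can be computed from true and predicted labels; the function names are hypothetical and this is not the authors' evaluation code.

```python
from collections import Counter
from typing import Dict, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy computed separately for each true label."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def chance_level(n_classes: int) -> float:
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / n_classes
```

Per-class accuracy is worth inspecting alongside the average, because acoustically similar categories (for example, two cough types) may account for most of the errors.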

Implications and Future Prospects

The research addresses limitations of current COVID-19 diagnostic methods such as RT-PCR, which pose logistical, financial, and operational challenges. The Coswara project envisages a diagnostic tool deployable as a mobile or web application. Such a tool could support preliminary COVID-19 screening by offering a rapid, cost-effective, and accessible alternative that complements chemical testing.

From a research perspective, the paper lays a foundation for future studies that use machine learning models to classify and quantify biomarkers in respiratory sounds. The breadth of the collected dataset can help in training more robust models, and acoustic-based diagnostic technologies built on it could extend beyond COVID-19 to other respiratory conditions.

Conclusion

Coswara is a step toward sound-based diagnosis that uses respiratory sound data as a surrogate for assessing COVID-19 infection. The project highlights the potential of machine learning in healthcare and opens the way for further research into the acoustic signatures of respiratory pathologies and their use in scalable diagnostic frameworks. Datasets like Coswara are likely to play an important role in developing efficient and accessible healthcare solutions.