Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
156 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Rene: A Pre-trained Multi-modal Architecture for Auscultation of Respiratory Diseases (2405.07442v2)

Published 13 May 2024 in cs.SD, cs.AI, eess.AS, and q-bio.QM

Abstract: Compared with invasive examinations that require tissue sampling, respiratory sound testing is a non-invasive examination method that is safer and easier for patients to accept. In this study, we introduce Rene, a pioneering large-scale model tailored for respiratory sound recognition. Rene has been rigorously fine-tuned with an extensive dataset featuring a broad array of respiratory audio samples, targeting disease detection, sound pattern classification, and event identification. Our innovative approach applies a pre-trained speech recognition model to process respiratory sounds, augmented with patient medical records. The resulting multi-modal deep-learning framework addresses interpretability and real-time diagnostic challenges that have hindered previous respiratory-focused models. Benchmark comparisons reveal that Rene significantly outperforms existing models, achieving improvements of 10.27%, 16.15%, 15.29%, and 18.90% in respiratory event detection and audio classification on the SPRSound database. Disease prediction accuracy on the ICBHI database improved by 23% over the baseline in both mean average and harmonic scores. Moreover, we have developed a real-time respiratory sound discrimination system utilizing the Rene architecture. Employing state-of-the-art Edge AI technology, this system enables rapid and accurate responses for respiratory sound auscultation(https://github.com/zpforlove/Rene).

Summary

We haven't generated a summary for this paper yet.