Automated Diagnosis of Pediatric Pulmonary Auscultation Using Deep Neural Networks
DOI: https://doi.org/10.24215/15146774e072

Keywords: deep neural networks, respiratory sounds, VGG-16 architecture, Mel-frequency cepstral coefficients (MFCCs), diagnosis of respiratory diseases

Abstract
This study investigates the use of deep neural networks for classifying respiratory sounds, a crucial task in diagnosing pulmonary diseases. The VGG-16 architecture, renowned for its effectiveness in image classification, was adapted to process audio data. The respiratory sound dataset was collected and preprocessed using Mel-frequency cepstral coefficients (MFCCs) as input to the network. The model achieved 79% accuracy in classifying respiratory sounds, highlighting the potential of pre-trained convolutional neural networks in the medical field. However, challenges remain, such as the need for larger datasets and a deeper understanding of the results before effective clinical implementation.
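The pipeline the abstract describes — extracting MFCCs from audio and feeding them to a CNN — can be sketched minimally. The block below is an illustrative NumPy implementation of MFCC extraction only; the sample rate, frame size, hop length, and filter counts are assumptions for the example, not the authors' settings, and in the actual study the resulting coefficient matrix would be reshaped into an image-like input for VGG-16.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters evenly spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising slope
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling slope
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(signal, sr=4000, n_fft=512, hop=128, n_filters=26, n_mfcc=13):
    # 1. Frame the signal and apply a Hann window.
    frames = [signal[s:s + n_fft] * np.hanning(n_fft)
              for s in range(0, len(signal) - n_fft + 1, hop)]
    # 2. Power spectrum of each frame.
    spec = np.abs(np.fft.rfft(np.array(frames), n_fft)) ** 2
    # 3. Mel filterbank energies, then log compression.
    log_mel = np.log(spec @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # 4. DCT-II over the filterbank axis; keep the first n_mfcc coefficients.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), 2 * n + 1) / (2 * n_filters))
    return log_mel @ dct.T                     # shape: (n_frames, n_mfcc)

# Example: one second of a synthetic 400 Hz tone standing in for a lung sound.
sr = 4000
t = np.arange(sr) / sr
feats = mfcc(np.sin(2 * np.pi * 400 * t), sr=sr)
print(feats.shape)  # (28, 13)
```

The (frames × coefficients) matrix produced here is what would be rendered as a 2-D feature map, replicated across three channels, and resized to the 224×224 input expected by a pre-trained VGG-16.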
License
Copyright (c) 2025 Jorge Lopez Perez, Damián Taire, Claudio Delrieux

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
a. Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license, which allows third parties to share the work provided the author and the first publication in this journal are acknowledged.
b. Authors may enter into separate, additional non-exclusive agreements for distribution of the published work (for example, depositing it in an institutional repository or publishing it in a monographic volume), provided the first publication in this journal is acknowledged.
c. Authors are permitted and encouraged to share their work online (for example, in institutional repositories or on their own website) before and during the submission process, as this can lead to productive exchanges and increase citation of the published work (see The Effect of Open Access).