
COPD Scope: Smart Clinical Pathways for Respiratory Illness Diagnosis and Management

  • Open Access
  • 03-10-2025
  • Original Article


Abstract

This article delves into the realm of smart healthcare technologies, specifically focusing on the diagnosis and management of Chronic Obstructive Pulmonary Disease (COPD). The study explores the use of advanced sensor systems, such as digital stethoscopes and wearable devices, for capturing respiratory signals. It also examines the application of artificial intelligence (AI) and machine learning (ML) techniques in analyzing these signals to improve diagnostic accuracy. The article presents a comprehensive review of various ML and deep learning (DL) models, including 1D and 2D Convolutional Neural Networks (CNNs), and their effectiveness in classifying respiratory sounds. Additionally, the study discusses the integration of time-frequency representations, such as spectrograms and scalograms, with deep learning architectures to enhance the classification performance. The results indicate that the combination of melspectrogram representations and the SGDM optimizer yields the highest accuracy in diagnosing COPD. This research provides valuable insights into the potential of smart healthcare technologies in revolutionizing the diagnosis and management of respiratory illnesses.


1 Introduction

Smart healthcare plays a vital role in improving patient care and healthcare accessibility through digital technologies such as artificial intelligence (AI), virtual reality (VR) and the Internet of Things (IoT) [1]. Building smart healthcare systems helps make healthcare more efficient, patient-centric and affordable while ensuring data security. Smart healthcare essentially uses advanced technologies to integrate data from multiple modalities and derive inferences that improve patient outcomes in diagnosis, treatment and other healthcare services. Technologies such as AI [2], machine learning (ML), deep learning (DL), the Internet of Medical Things (IoMT) [3], VR and augmented reality (AR), integrated with electronic health records (EHR) [4] and blockchain on cloud platforms, are instrumental in elevating healthcare toward individualized patient care and management. Applications under the umbrella of smart healthcare include AI-driven diagnostics and treatment planning, predictive analytics for disease prevention and early diagnosis, continuous health monitoring with automated anomaly detection from vital signs, smart hospital infrastructure, chatbots for patient management, and automation of patient tracking and digital records. AI-driven diagnosis and predictive analytics are performed on the digital records of patients, which include clinical details from EHRs, scan-based imaging modalities and physiological data such as biosignals and measurements from wearable devices. AI-based decision support systems typically involve data collection, data preprocessing and pattern recognition, followed by prediction or forecasting. As a step toward building such smart healthcare systems, this study presents a case study on AI-driven diagnosis of respiratory illness from breathing sounds using ML and DL algorithms.
Chronic Obstructive Pulmonary Disease (COPD) is a persistent respiratory illness in which airway obstruction progressively impairs pulmonary function and is not completely reversible [5]. Primary risk factors for COPD include smoking, tobacco consumption, and occupational and environmental hazards involving prolonged exposure to chemicals, dust, smoke and other lung irritants [6]. The ailment is diagnosed with different modalities at different stages of progression, including physical examination with a stethoscope, pulmonary function testing using spirometry, and imaging-based screening using computed tomography (CT) or chest X-rays [7].
Although several modalities exist for assessing respiratory disorders, a major concern in managing this long-term illness is late identification and false diagnosis. Common indicators such as bronchial congestion, whooping cough and shortness of breath are misinterpreted as signs of aging or attributed to industrial pollutants, leading to delays in prognosis. According to the 2023 Global Initiative for Chronic Obstructive Lung Disease (GOLD) report, approximately 50% of chronic disorders remain undiscovered, particularly in the early phase of diagnosis [8]. The motivation behind the proposed idea is that several studies in the literature highlight clinical challenges such as the importance of thorough and active monitoring of respiration in patients; such monitoring helps in the timely identification of COPD, reducing exacerbations and improving reliability [9, 10]. Another important factor is patients' inconsistent adherence to treatment plans, which varies between 7% and 78% owing to lack of awareness of the disorder, incorrect use of medicines and so on [11, 12].
In its initial stage, COPD presents as difficulty in breathing accompanied by recurrent cough; it later progresses to shortness of breath, wheezing and frequent colds, worsening to a persistent wheeze, crackling sounds during breathing, irregular heartbeat and so on [13]. Since the symptoms of COPD at the early and intermediate stages can be assessed with parameters such as oxygen saturation, heart rate, respiratory rate, heart rate variability, electrocardiogram (ECG), sleep pattern, physical activity level and body temperature, telemedicine can be effectively deployed in the management of COPD patients. These parameters can be measured at the patient's side without healthcare professionals being present and sent to them for review and assistance using the digital technologies adopted for telemedicine.
Expert systems and the integration of models with smart sensors have been found promising in handling COPD and in addressing the above-mentioned clinical and operational challenges. Recent research from 2025 [14] demonstrates the potential of a machine learning model for the timely identification of the disorder by analysing data from digital inhalers. A handheld pulse oximeter or a smart watch with a pulse oximeter sensor can measure blood oxygen saturation and heart rate. A standalone respiration rate sensor, or one incorporated into a wrist-worn smart watch, can monitor respiration continuously, whereas gyroscopes and accelerometers can track physical activity patterns on a regular basis. However, the parameters assessed with wrist-worn smart watches or body-worn sensors are helpful only at the initial stage and are not reliable when immediate clinical attention is required. In such conditions, patients visiting a nearby clinic undergo a further clinical assessment, which includes lung auscultation, a spirometry test, arterial blood gas analysis and sometimes chest X-ray or computed tomography (CT) based assessment.
Auscultation provides a means of listening to the rhythm of the lungs with the help of a stethoscope. It enables clinicians to identify complications in respiration and to discriminate lung sounds as wheeze, crackle or rhonchi [15]. The spirometry test is a common diagnostic procedure in respiratory medicine for assessing lung function by measuring the volume of inhaled and exhaled air and the speed of exhalation [16]. Arterial blood gas analysis measures the levels of oxygen and carbon dioxide in the blood and determines the need for ventilation in a patient with respiratory discomfort or failure [17]. In highly worsened cases, chest X-rays and CT may be referred to for information about lung structure, the severity of lung inflammation and disease progression over time. These assessments can be performed by trained professionals such as nurses, respiratory therapists or radiographers and then reviewed by an expert clinician or pulmonologist. Setting up this infrastructure with trained professionals at remote sites such as primary health centres can greatly assist the patient community even in the absence of doctors, by connecting to them virtually. Transferring the insights provided by one or more of the above assessments from a remote health centre to a pulmonologist through digital technologies such as e-mail, messenger services or mobile applications enables the patient to be treated with appropriate therapeutic procedures and medicines. Two further challenges that strongly affect COPD management are the limited number of studies on end-to-end integration covering clinical interpretation and decision making, and the under-utilization of sensor data with AI algorithms. Farida Mohsen et al. (2022) investigated and developed an AI-based model combining two input modalities, medical imaging and electronic health records (EHR); their findings demonstrate the potential of such models for personalized healthcare while also highlighting the difficulty of integrating them end-to-end in clinical practice [18]. These two factors are identified as research gaps and are addressed in the proposed work for better COPD management and smart healthcare.
The main objectives and contributions of the proposed work are as follows:
1. Implementation of a multi-domain framework that represents the respiratory data in the temporal, spectral and cepstral domains and performs classification with machine learning models.
2. Implementation of 1D and 2D CNNs, the former analysing the original time-series respiratory signal to capture local time-varying features and the latter taking time-frequency representations as input.
3. Experimentation of the prediction model with three optimizers, viz. stochastic gradient descent with momentum (sgdm), adaptive moment estimation (Adam) and root mean square propagation (rmsprop).
The remainder of this paper is structured as follows: the literature on the categorization of normal and abnormal breathing sounds using diverse methods is reviewed in Sect. 2; Sect. 3 focuses on the methodology, discussing data acquisition methods along with feature extraction techniques and the ML- and DL-based models; Sect. 4 presents the experimental results and discussion; and Sect. 5 provides concluding remarks and highlights future work.

2 Literature Survey

In the literature, numerous attempts have been made to differentiate normal (vesicular) from adventitious respiratory sounds. These are generally delineated based on various transforms in the time-frequency plane, dissimilar sets of features and diverse categorization techniques.
In general, the breathing sounds captured from the human respiratory system provide a significant amount of meaningful information distributed over a broad range of frequencies. The bibliographic review revealed that analysing the temporal and spectral components of breathing signals contributes significantly to the prognostic process. Conventional handcrafted features extracted from raw respiratory signals are measurable attributes that play an important role in categorizing normal and abnormal sounds. Broadly, these are divided into four subclasses: temporal features such as mean, standard deviation and variance; spectral features such as spectral centroid and spectral roll-off; cepstral features, in which the energy distribution of the signal across sub-bands is captured with Mel-frequency cepstral coefficients [19–24]; and time-frequency features, captured with time-frequency transforms, which help identify the energy content of the various bands at every instant of time [25–27].
A few investigations concentrated on assessing the signal using the short-time Fourier transform (STFT) [28, 29], but this method has limitations because the signals are segmented with fixed-size windows, resulting in fixed temporal resolution. To overcome this, the wavelet transform has been explored, as it offers localization in both the time and frequency domains with varying resolution. Several authors suggested and experimented with techniques in which diverse sets of statistical features and coefficients were extracted by decomposing the original sound signals with the wavelet transform [30, 31]. A few researchers focussed on the Discrete Wavelet Transform (DWT) for feature extraction [32–34]. From this perspective, Gogus et al. [35] explored the DWT decomposition rule, computed the power spectral density over various frequency intervals and extracted statistical features from it. Apart from these commonly used features, one further dimension of feature extraction has been exploited by several authors: transforming the signal into an image (2D) using time-frequency representations (TFRs). Detailed investigations have been carried out in several studies with various TFRs fed as input to deep learning algorithms.
In 2021, Pham et al. [36] converted the respiratory audio files of the ICBHI dataset into spectrogram-type representations, which were further subdivided into short uniform segments. Variations of the spectrogram, such as log-mel spectrograms using mel filter banks and constant-Q spectrograms using Gabor filters, were also investigated, among which the log-mel spectrograms were found to be effective in classifying respiratory disorders [37]. In the same year, S. B. Shuvo et al. proposed a scalogram-type TFR based on the wavelet transform and achieved good classification accuracy using deep CNN models [38].
In the initial phase of investigations, ML models were explored for classifying the several classes of signals. Demirci et al. [39] utilized the most widely used ML algorithms, viz. k-nearest neighbours (KNN), artificial neural networks (ANN) and the support vector machine (SVM), for analysing respiratory sounds. In that work, the authors extracted MFCC cepstral features from the signal using decomposition algorithms such as empirical mode decomposition and the wavelet transform, and the best accuracy of 98.8% was achieved with the KNN algorithm compared with the other two classifiers. Several authors used SVM and ANN [40–42] to classify various classes of respiratory sounds. Zhang et al. [43] carried out a clinical investigation using abnormal sound signals, viz. crackles and high-pitched wheezes, and built an SVM classifier with an accuracy of 77.7%. Ullah et al. [44] proposed decision tree and random forest classifiers for respiratory disorder prediction and found that they perform comparatively well against SVM on certain datasets. More recently, deep neural networks (DNN) have been suggested by several authors to identify the trends and patterns of lung signals and predict various types of respiratory disorders [45, 46]. Compared with shallow ML models, DNNs offer an end-to-end strategy that naturally learns respiratory sound representations from the original .wav files without handcrafted feature extraction. DNN models [47, 48] also utilize pre-trained architectures through transfer learning to enhance flexibility on new datasets, thereby reducing training time. In this context, the most commonly used DL model is the CNN, which has shown significant results in computer vision, imaging and signal processing, and has proved promising specifically in lung sound categorization [49–52]. Additionally, with pre-trained architectures such as VGG [53], ResNet-50 [50, 54], AlexNet [55] and GoogLeNet [56–58], performance was found to improve by increasing the number of layers and thereby learning deeper representations. Taking these factors into account, namely the diverse sets of features and the several classification methods, the proposed work is assessed with statistical, spectral and cepstral features extracted from pre-processed lung sound signals and experimented with various ML-based classifiers. From another perspective, 1D CNN and 2D CNN are also exploited, with various TFRs derived from the pre-processed respiratory signals as input.

3 Methodology

The flow of the prediction model presented in this study is shown in Fig. 1. In the proposed framework, named COPDScope, two streams, machine learning and deep learning, are followed in parallel for categorizing the respiratory sounds. In the ML stream, statistical, spectral and cepstral features are extracted: the statistical features include mean, standard deviation, kurtosis and skewness; in the frequency domain, spectral centroid and spectral roll-off are extracted; and 13 MFCC features are extracted from every signal in the cepstral domain. The DL stream comprises two sub-sections, namely 1D CNN and 2D CNN. In the 1D CNN, the features are learnt directly from the original raw respiratory signals. In the 2D CNN, two time-frequency transforms, STFT and CWT, are employed to convert the time-domain signals into time-frequency representations; three TFRs, spectrograms on linear and mel scales and scalograms, are given as input to a pre-trained VGG architecture and trained with different optimizers. The stages involved in the proposed framework are signal acquisition, noise reduction, normalization, feature extraction and classification; these modules are discussed in the following sections.
Fig. 1
Proposed Framework: COPDScope

3.1 Signal Acquisition

Signal acquisition from the lungs plays a significant role in evaluating the respiratory system. It is generally carried out in two ways: through contact-based sensor systems and through auscultation. To capture and assess lung signals, contact-based sensor systems use modern, highly accurate tools, some of which are outlined below.

(i) Sensor-Based Signal Acquisition

(a) Sensors for Capturing Vibrations

Microphone sensors record vesicular sounds when placed directly on the surface of the pharynx or chest; they convert mechanical sound waves into electrical signals. These sensors are very sensitive, so even minute variations in abnormal lung sounds such as rhonchi, crackles and wheezes are detected easily. A second type of sensor, the accelerometer-based sensor, measures the vibration patterns of the chest wall during airflow; its greatest advantages are its very low weight and portability. Piezoelectric sensors, the next widely used type in real-time applications, rely on the piezoelectric effect to convert the captured mechanical stress into an electrical signal. Compared with microphone and accelerometer sensors, piezoelectric sensors are highly effective in capturing chest wall movements during breathing. These sensors can be integrated into wearable devices and help monitor the respiratory system continuously [59, 60].

(b) Differential Flow Meters for Airflow Measurement

Differential flowmeters (DFs) [49] are extensively used for measuring inhaled and exhaled air during breathing over time. Furthermore, DFs have gained wide acceptance as sensors for monitoring gases delivered by mechanical breathing apparatus. DFs are used to track drifts in the respiratory flows of both adults and infants, as they are designed to balance sensitivity against the additional resistance imposed on the respiratory system. All exhaled and inhaled flow is conveyed through the differential flowmeter, and the sensors allow the breathing lung volume and its changes to be recorded. During breathing, the sound caused by the air flowing through the patient's throat and respiratory tract is collected to examine the inspiration and expiration phases and to estimate the respiratory frequency.

(c) Sensors for Nasal Airflow

Transitions in the composition of gases during inhalation and exhalation facilitate the monitoring of important respiratory parameters [61–63]. The airflow in the respiratory system differs between inhaled and exhaled air: inhaled air is cooler, contains less moisture and has a lower carbon dioxide level than exhaled air. These changes in gas exchange can be used to measure the respiratory rate. Methods used for measuring airflow include the nasal thermistor and the spirometer. The nasal thermistor detects changes in temperature during respiration; its potency is limited and the estimated airflow is only an empirical approximation, as thermistor displacement is large [64]. Spirometry sensors [65] measure the most vital lung parameters, namely forced expiratory volume in one second (FEV1), forced vital capacity (FVC) and peak expiratory flow rate.

(d) Sensors for Sensing Environmental and Cardiac Activity

Inflammation in the respiratory tract can be detected using temperature sensors, which measure variations in temperature during exhalation. Moisture levels in the exhaled air and other organic compounds, which are common indicators of respiratory disorders, are measured using humidity sensors. Additionally, ECG sensors measure cardiac activity in order to detect abnormalities associated with chronic obstructive pulmonary disease and arrhythmia [66, 67].

(ii) Auscultation

Compared with sensor-based systems, the internal breathing sounds of the respiratory system can be listened to using the most traditional method, auscultation [9]. This method uses acoustic stethoscopes to capture respiratory sounds and is easy, cost-effective and very simple to use. Despite its simplicity, the diagnostic outcome is less significant because the approach depends heavily on the expertise of the medical practitioner [56, 57]. Both contact-based sensor systems and auscultation techniques require proper placement of the tool on the pharynx and chest for precise signal capture. In the proposed method, respiratory signals captured with the help of a digital stethoscope are used for the analysis [68–70].

3.2 Pre-Processing

The two major pre-processing steps employed in the analysis of respiratory signals are noise reduction and normalization. These two steps improve the quality and consistency of the data, thereby ensuring precise feature extraction and signal categorization.

(i) Noise Reduction and Normalization

In general, lung signals contain ambient noise arising from several factors, including the surrounding environment, equipment failure or malfunction, and physiological artifacts such as blood flow and heart sounds. The first stage of noise reduction aims to suppress these effects while preserving the principal components of the acquired respiratory signal. Various techniques are used to reduce noise, viz. filtering, wavelet denoising and adaptive filtering. In the proposed work, band-pass filtering with a pass band of 45 Hz–2500 Hz is employed; this filter suppresses noise below 45 Hz and above 2.5 kHz, including power-line interference. The second stage normalizes the noise-reduced signals: Z-score normalization is employed to scale the input signal data into a uniform range, thereby enhancing the effectiveness of feature extraction and categorization.
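A minimal Python sketch of these two pre-processing steps is given below, assuming SciPy/soundfile and an illustrative file name; the Butterworth filter order and the Nyquist guard are assumptions, since the text specifies only the 45 Hz–2.5 kHz pass band.

```python
# Hedged sketch of the pre-processing described above (not the authors' code):
# band-pass filtering between 45 Hz and 2.5 kHz followed by z-score normalization.
import numpy as np
import soundfile as sf
from scipy.signal import butter, filtfilt

signal, fs = sf.read("sample_lung_sound.wav")      # hypothetical input recording

low = 45.0
high = min(2500.0, 0.45 * fs)                      # guard for low sampling rates (assumption)
b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")  # 4th order is an assumption
filtered = filtfilt(b, a, signal)                  # zero-phase band-pass filtering

# Z-score normalization: scale to zero mean and unit variance
normalized = (filtered - filtered.mean()) / (filtered.std() + 1e-12)
```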

3.3 Feature Extraction for Machine Learning Based Classification

This section outlines the digital signal processing (DSP) approaches utilized for extracting features from the pre-processed respiratory signals. The approaches are broadly categorized into three kinds: statistical parameters computed from the time-domain audio signals, spectral features derived by transforming the time-domain signals into the Fourier domain, and cepstral features extracted through further, non-linear signal encoding.

(a) Statistical Features

The overall behaviour of the time-domain signal is generally characterized by statistical features, which include amplitude variation, shape descriptors such as kurtosis and skewness, dispersion metrics such as standard deviation and variance, central-tendency metrics such as the first moment about the origin, mode and median, energy-based features and percentile-based features. Among these, the most commonly used features, namely the first moment about the origin, the standard deviation (\(\sigma\)), kurtosis and skewness, are extracted in the proposed work for ML-based classification.
The first two metrics, the first moment and \(\sigma\), define the central value and the extent of deviation of the samples from it. Kurtosis describes whether the distribution of the data is heavier-tailed or lighter-tailed than the normal distribution, while skewness describes the symmetry or asymmetry of the data distribution. The equations used for calculating these features from the pre-processed respiratory signals are as follows.
$$\text{First Moment (About Origin)}:\quad \mu =\frac{1}{N}\sum_{i=1}^{N} z_{i}$$
(1)
where \(\mu\) is the mean, \(z_{i}\) is the ith sample of the respiratory signal and \(N\) is the total number of samples in the signal.
$$\text{Sigma}:\quad \sigma =\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(z_{i}-\mu \right)^{2}}$$
(2)
where \(\sigma\) is the standard deviation (dispersion) of the signal, \(\mu\) is the mean and \(z_{i}\) is the ith sample of the respiratory signal.
$$\text{Skewness} = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(z_{i}-\mu \right)^{3}}{\sigma ^{3}}$$
(3)
$$\text{Kurtosis} = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(z_{i}-\mu \right)^{4}}{\sigma ^{4}}$$
(4)
Using the above equations, the features are extracted from all classes of the respiratory dataset; Table 1 lists the extracted features for a sample signal from each class.
Table 1
Statistical features extracted for a sample signal in every class

Class   | Mean          | Std. deviation | Kurtosis | Skewness
Crackle | −0.648 × 10⁻⁵ | 0.136366       | 2.505    | 0.046
Normal  | 9.5 × 10⁻⁵    | 0.129446       | 0.903    | −0.030
Rhonchi | 9.76 × 10⁻³   | 0.091296       | 1.246    | −0.061
Wheeze  | 7.361 × 10⁻³  | 0.772938       | −1.496   | −0.024
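As an illustration only (not the authors' code), the four statistical features of Eqs. (1)–(4) can be computed with NumPy/SciPy as follows.

```python
# Sketch: statistical features of Eqs. (1)-(4) for one pre-processed signal z (NumPy array).
import numpy as np
from scipy.stats import kurtosis, skew

def statistical_features(z):
    mu = z.mean()                   # Eq. (1): first moment about the origin
    sigma = z.std()                 # Eq. (2): standard deviation (population form)
    sk = skew(z)                    # Eq. (3): third standardized moment
    kt = kurtosis(z, fisher=False)  # Eq. (4): fourth standardized moment (subtract 3 for excess kurtosis)
    return np.array([mu, sigma, sk, kt])
```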

(b) Spectral Domain Features

In addition to the statistical features, the frequency characteristics of the signal can be examined from its frequency-domain representation, obtained by converting the signal from the temporal domain into the spectral domain with the Fourier transform:
$$Z\left(f\right)={\int }_{-\infty }^{+\infty }z(t){e}^{-j2\pi ft}dt$$
(5)
where \(z(t)\) is the time-domain representation of the signal and \(Z\left(f\right)\) is its frequency-domain representation. From the spectral-domain representation, metrics such as frequency distribution, energy concentration, power content and spectral shape can be captured. The spectral centroid and spectral roll-off are calculated in the proposed work.

(i) Spectral Centroid

This parameter is the centre of mass of the power spectrum, i.e. the frequency around which most of the signal energy is concentrated. A high spectral centroid indicates that high frequencies dominate the signal, whereas a low value indicates dominance of lower frequencies. Mathematically, it is defined by
$$\text{Spectral Centroid}=\frac{\sum_{k=1}^{N} f_{k}\left|Z(f_{k})\right|}{\sum_{k=1}^{N}\left|Z(f_{k})\right|}$$
(6)
where \(\left|Z(f_{k})\right|\) is the magnitude of the Fourier transform of the signal at frequency \(f_{k}\), \(N\) is the total number of frequency bins and \(f_{k}\) is the frequency of the kth bin.

(ii) Spectral Roll-Off

Spectral roll-off is the frequency below which a given proportion, typically 85%, of the total spectral power of the signal is contained. This parameter helps discriminate between the harmonic and noise content of the signal. Mathematically, it is defined as
$$\sum_{k=1}^{k_{\text{rolloff}}}\left|Z(f_{k})\right|= \alpha \sum_{k=1}^{N}\left|Z(f_{k})\right|$$
(7)
where \(\alpha \) defines the portion of spectral power (0.85).
Using these formulas, the features are extracted from all classes of respiratory signals; Table 2 lists them for a sample signal from each class.
Table 2
Spectral features extracted for a sample signal in every class

Class   | Spectral centroid | Spectral roll-off
Crackle | 0.67814           | 0.076375
Normal  | 0.224703          | 0.231221
Rhonchi | 0.304834          | 0.29393
Wheeze  | 0.170322          | 0.166696
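As an illustration (not the authors' code), the two metrics of Eqs. (6) and (7) can be computed per frame with librosa and then averaged over the signal; the roll-off fraction of 0.85 follows the text, while the file name and framing parameters are assumptions.

```python
# Sketch: spectral centroid (Eq. 6) and spectral roll-off (Eq. 7) of a lung sound recording.
import numpy as np
import librosa

y, fs = librosa.load("sample_lung_sound.wav", sr=None)   # hypothetical pre-processed recording
S = np.abs(librosa.stft(y))                               # frame-wise magnitude spectrum |Z(f_k)|

centroid = librosa.feature.spectral_centroid(S=S, sr=fs)                   # Eq. (6), per frame
rolloff = librosa.feature.spectral_rolloff(S=S, sr=fs, roll_percent=0.85)  # Eq. (7), per frame

# One scalar per signal (averaging per-frame values is an assumption)
spectral_features = [float(centroid.mean()), float(rolloff.mean())]
```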

(c) Cepstral Domain Features

In general, statistical features cannot represent frequency and phase information, which limits their usefulness for complex representations. Similarly, spectral-domain features cannot capture the nonlinear correlations between the various frequencies present in the spectrum. Therefore, to extract features beyond linear temporal and spectral analysis, a non-linear processing technique is exploited: the inverse Fourier transform of the logarithm of the spectral power of the signal. This analysis captures both the linear and non-linear characteristics of the signal, providing a more compact representation in the form of cepstral features. Cepstral features are efficient for signals with rapid variations in both frequency and amplitude, and they also help to separate the vocal tract (filter) from the air pressure in the alveoli (source). The most common cepstral features are the Mel-frequency cepstral coefficients (MFCC), in which the features are represented on the mel scale, which resembles the human auditory system. As respiratory signals lie in the frequency range perceived by human hearing, this feature is extracted in the proposed work for ML-based classification.

Algorithm Used for Extracting the MFCC Features

Algorithm: Generation of MFCC features from all classes of the respiratory dataset
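The MFCC extraction algorithm is given as an image in the original article; the short librosa-based sketch below (an assumption, not the authors' implementation) shows an equivalent computation of 13 coefficients per signal.

```python
# Sketch: 13 MFCCs per pre-processed respiratory signal, as used in the ML stream.
import librosa

y, fs = librosa.load("sample_lung_sound.wav", sr=None)   # hypothetical input recording
mfcc = librosa.feature.mfcc(y=y, sr=fs, n_mfcc=13)        # shape: (13, n_frames)
mfcc_features = mfcc.mean(axis=1)                         # one 13-dimensional vector per signal (assumption)
```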
The 13 MFCC features include the five lower-order coefficients (MFCC 1–5), which capture global spectral information such as energy concentration at low frequencies for normal breathing and at mid and high frequencies for rhonchi- and wheeze-type signals. The middle-order coefficients (MFCC 6–10) capture the dominant mid-range frequency information, and the higher-order coefficients (MFCC 11–13) represent the high-frequency, lower-energy portion of the spectrum, which helps in detecting abnormalities in the signal. The 13 MFCC coefficients extracted for a sample signal from each class are tabulated in Table 3.
Table 3
MFCC features extracted for a sample signal in every class

        | Crackle  | Normal   | Rhonchi  | Wheeze
MFCC 1  | −281.623 | −333.735 | −264.603 | −397.28
MFCC 2  | 140.5415 | 117.1365 | 154.8182 | 94.94102
MFCC 3  | 73.61044 | 51.91729 | 74.9086  | 80.68465
MFCC 4  | 30.24455 | 34.56506 | 21.32998 | 63.03274
MFCC 5  | 28.40871 | 35.34722 | 24.83302 | 45.39642
MFCC 6  | 19.17703 | 27.59229 | 16.73577 | 30.74352
MFCC 7  | 17.32287 | 19.69374 | 15.7567  | 20.34848
MFCC 8  | 1.536078 | 16.55967 | −3.41034 | 13.99352
MFCC 9  | 5.757586 | 13.19998 | 3.774941 | 10.39762
MFCC 10 | 1.275351 | 11.43244 | −1.89146 | 8.212521
MFCC 11 | 3.681112 | 12.1633  | 3.215543 | 6.421958
MFCC 12 | −1.48687 | 10.00433 | −3.5423  | 4.711829
MFCC 13 | 2.53483  | 6.062804 | 2.680279 | 3.154856
After extracting the statistical, spectral and cepstral features from all classes of respiratory signals, the features are aggregated to form a row vector for every signal in the dataset. This yields a 515 × 19 feature array for the 515 signals, which is fed as input to various ML algorithms for classification. The ML algorithms used in this study are SVM, random forest, decision tree and KNN.
In SVM, the dataset in the high-dimensional feature space is separated into classes by determining the optimal hyperplane that segregates them. Non-linear datasets can be handled by exploring various kernels, such as the radial basis function, Gaussian and polynomial kernels. The main advantage of the SVM classifier is that it maximizes the margin between the optimal hyperplane and the data points. For classifying the aggregated respiratory features, the proposed work uses a linear kernel with a box-constraint value of 1.
In contrast, the second algorithm, the decision tree, follows a hierarchical structure in which every node makes a decision based on feature values, yielding the different classes at the terminal nodes. This approach is simple but tends to overfit; pruning unwanted branches can reduce the overfitting. In the proposed work, Gini's index is chosen as the split criterion, the terminal-node parameter is set to 4, and the tree depth ranges between 2 and 10 levels.
The third ML algorithm used in this study to enhance accuracy is the random forest, which constructs an ensemble of decision trees; for classification, 100 trees are used with the default number of predictors sampled at every split. The last classifier is KNN, which assigns classes based on the majority vote of the k nearest neighbours in the high-dimensional feature space. Among the available nearest-neighbour search methods ('kdtree', 'exhaustive' and 'auto'), the exhaustive search is used, as it compares every point in the feature space, and the Euclidean distance metric is chosen because it is best suited to normalized numeric features.
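A hedged scikit-learn sketch of this ML stream is given below. The hyperparameter values follow the text and Table 6 where stated (linear SVM with box constraint 1, Gini split criterion, 100 trees, k = 7 with exhaustive search and Euclidean distance); the feature files and everything else are assumptions, and the article itself reports MATLAB-style settings.

```python
# Sketch: 515 x 19 aggregated feature matrix split 70:30 and fed to the four classifiers.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X = np.load("respiratory_features.npy")  # hypothetical (515, 19) array: 4 statistical + 2 spectral + 13 MFCC
y = np.load("respiratory_labels.npy")    # hypothetical labels: crackle / normal / rhonchi / wheeze

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, stratify=y, random_state=0)

classifiers = {
    "SVM": SVC(kernel="linear", C=1.0),                        # linear kernel, box constraint 1
    "Decision tree": DecisionTreeClassifier(criterion="gini"),
    "Random forest": RandomForestClassifier(n_estimators=100),
    "KNN": KNeighborsClassifier(n_neighbors=7, algorithm="brute", metric="euclidean"),  # exhaustive search
}
for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```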

3.4 Deep Learning Based Classification

(i) Feature Extraction Using 1D CNN

A 1D convolutional neural network (CNN) is a neural network architecture designed specifically for processing time-series and sequential data. Variations in the signal, such as crests and troughs, are captured by sliding kernels in the 1D convolution operation, extracting both the temporal variations and local representations of the signal. The segmented respiratory signal, 63,000 samples long with a single channel, is fed into the 1D CNN input layer. Following the input layer, three convolutional layers with 32, 64 and 128 filters extract the feature maps, producing an output feature map of size (20,000, 128). This feature map is reduced with a max-pooling layer of size 2, giving an output of size (10,000, 128). The result is flattened into a vector of length 1,280,000, which is fed to two dense layers with 256 and 512 nodes and rectified linear unit (ReLU) activation. Finally, the 1D CNN outputs four nodes, corresponding to the four categories of respiratory sounds, with softmax activation. The 1D CNN architecture together with the output feature maps is illustrated in Fig. 2.
Fig. 2
1D CNN Architecture for Extracting Features from respiratory signals
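A Keras sketch of this 1D CNN is given below; it is not the authors' exact implementation, and the kernel sizes, strides and dropout rate are assumptions chosen only to approximate the feature-map sizes reported later in Table 9.

```python
# Sketch: 1D CNN for 63,000-sample respiratory signals with a 4-way softmax output.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(63000, 1)),
    layers.Conv1D(32, kernel_size=5, strides=3, padding="same", activation="relu"),  # stride is an assumption
    layers.Conv1D(64, kernel_size=5, padding="same", activation="relu"),
    layers.Conv1D(128, kernel_size=5, padding="same", activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Dropout(0.5),                                                             # rate is an assumption
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(512, activation="relu"),
    layers.Dense(4, activation="softmax"),                                           # crackle/normal/rhonchi/wheeze
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```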

(ii) Feature Extraction Using 2D CNN

Traditionally, a non-stationary, time-varying signal is viewed with good temporal resolution in the time domain. If the same signal is examined with the Fourier transform, its magnitude can be reviewed with good spectral resolution, but the time information is lost: the Fourier transform does not expose variations occurring at different intervals of time. To circumvent these limitations, time-frequency transforms such as the short-time Fourier transform and the wavelet transform are employed to produce time-frequency representations (TFRs), which offer both temporal and spectral resolution of the respiratory signal.
A common time-frequency transform, suggested by Gabor, is the short-time Fourier transform (STFT), generally termed the windowed Fourier transform [68]. This technique computes the phase and periodicity of the harmonic components present in localized segments of the time-domain respiratory signal. The plot showing the spectrum of the signal at every instant of time is termed a spectrogram. The major drawback of the STFT is its constant window size; this can be overcome by varying the window size according to the frequency content of the signal using basis functions called wavelets, which are scaled and shifted in time and thereby adapt to the nature of the input. The plot showing the spectrum of the signal at every instant of time using the absolute value of the continuous wavelet transform is termed a scalogram. Similarly, the spectrograms obtained with the STFT are mapped to the mel scale to produce a third type of TFR, the melspectrogram.
Algorithm: Steps to generate spectrograms, scalograms and melspectrograms
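The generation steps above appear as an image in the original article; the following sketch (librosa for the STFT/mel representations and PyWavelets for the CWT) is one possible realization, with the window sizes, mel bands, wavelet and scales all being assumptions.

```python
# Sketch: producing the three TFRs (spectrogram, melspectrogram, scalogram) for one signal.
import numpy as np
import librosa
import pywt

y, fs = librosa.load("sample_lung_sound.wav", sr=None)   # hypothetical pre-processed recording

# (1) Linear-scale spectrogram from the STFT
S = np.abs(librosa.stft(y, n_fft=1024, hop_length=256)) ** 2
spectrogram_db = librosa.power_to_db(S)

# (2) Melspectrogram: the linear spectrogram mapped through mel filter banks
mel = librosa.feature.melspectrogram(S=S, sr=fs, n_mels=128)
melspectrogram_db = librosa.power_to_db(mel)

# (3) Scalogram: magnitude of the continuous wavelet transform
scales = np.arange(1, 128)
coeffs, _ = pywt.cwt(y, scales, "morl", sampling_period=1.0 / fs)
scalogram = np.abs(coeffs)

# Each 2D array is then rendered/resized as a 224 x 224 x 3 image before being fed to the 2D CNN.
```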
The 2D CNN comprises several convolutional and max-pooling layers that extract spatial features from the time-frequency representations of the signals; classification is then performed with dense layers. A sample CNN architecture is presented in Fig. 3.
Fig. 3
2D-CNN Architecture for Extracting Features from respiratory signals

4 Results and Discussion

This section presents the analysis of the respiratory data using various ML and DL algorithms. As a case study, respiratory signals acquired by auscultation with a digital stethoscope are considered for building the ML- and DL-based prediction models. The respiratory signals used in this study are taken from the respiratory anomaly labelled events (RALE) dataset [69] and the International Conference on Biomedical and Health Informatics (ICBHI) dataset [70], giving a total of 515 respiratory signals belonging to four classes: crackle, normal, rhonchi and wheeze. The distribution of signals across these classes is presented in Table 4, and the time-domain representations of three sample signals from each class are presented in Figs. 4 and 5.
Table 4
Dataset distribution

Class   | No. of signals
Crackle | 281
Normal  | 75
Rhonchi | 37
Wheeze  | 122
Total   | 515
Fig. 4
Time domain representation of three sample signals belonging to crackle and normal
Fig. 5
Time domain representation of three sample signals belonging to rhinchi and wheeze
Three predictive models, ML-based, 1D CNN and 2D CNN, were investigated in this study. For the ML-based predictive model, the 19 features listed in Table 5 were extracted from the signals and an extensive experiment was carried out using four classifiers. The hyperparameter settings used when training the ML classifiers are presented in Table 6, and the classification accuracy of the four classifiers is presented in Table 7. Among the four classifiers, decision tree, SVM, KNN and random forest, the random forest classifier recorded the best accuracy of 77.42%; the KNN and SVM classifiers performed close to the random forest with accuracies of 76.77% and 75.48%, respectively.
Table 5
Features extracted for ML based classification

Feature type         | Feature
Statistical features | Mean, standard deviation, kurtosis, skewness
Spectral features    | Spectral centroid, spectral roll-off
Cepstral features    | MFCC (13 coefficients)
Total number of features extracted: 19
Table 6
Hyperparameter settings for ML based classification

Parameter                                           | Value
Train-test split                                    | 70:30
Criterion for DT classifier                         | Entropy
SVM kernel                                          | Linear
No. of neighbours in KNN classifier                 | 7
Number of decision trees in random forest classifier | 25
Table 7
Comparison of classifier performance

Classifier    | Accuracy (%)
Decision tree | 66.45
SVM           | 75.48
KNN           | 76.77
Random forest | 77.42
Following the ML classifiers, respiratory signal classification using a 1D CNN was carried out with the training settings presented in Table 8. A 1D CNN processes the time-series data sequentially with convolution filters and learns the time-domain patterns. The architectural details of the 1D CNN used for classifying the respiratory signals are presented in Table 9.
Table 8
Hyperparameter settings for the 1D CNN based classifier

Hyperparameter   | Value
Optimizer        | Adam
Loss function    | Categorical cross-entropy
No. of epochs    | 50
Train-test split | 70:30
Table 9
Architectural details of the 1D CNN

Layer        | Output dimensions | No. of learnable parameters
Input layer  | 1 × 63,000        | 0
Conv1D       | 20,000 × 32       | 128
Conv1D       | 20,000 × 64       | 6,208
Conv1D       | 20,000 × 128      | 24,704
MaxPooling1D | 10,000 × 128      | 0
Dropout      | 10,000 × 128      | 0
Flatten      | 1 × 1,280,000     | 0
Dense        | 1 × 256           | 327,680,256
Dense        | 1 × 512           | 131,584
Dense        | 1 × 4             | 2,052
Total parameters: 327,844,932
When trained for 50 epochs, the 1D CNN architecture described above achieved a classification accuracy of 67.74% on the respiratory signals. A 1D CNN captures the time-domain dependencies and local patterns of the signals, which are then exploited for classification. Because of its local receptive fields, a 1D CNN learns the short-term patterns of the signal and does not effectively capture long-term dependencies. Moreover, 1D CNNs accept only fixed-length input; accordingly, the signal length considered for classification is 63,000 samples even though most signals in the dataset contain more than 200,000 samples, which results in a loss of information. From the architecture in Table 9, it is seen that the 1D CNN is computationally intensive, exceeding 327 million learnable parameters; with such a large number of parameters, the network is susceptible to overfitting. Owing to these limitations, the 1D CNN recorded a poor classification performance of 67.74%, lower than that of the ML models. The training progress curve for the 1D CNN based classification is presented in Fig. 6.
Fig. 6
Training Progress Curve for 1D CNN based Classification
Fig. 7
Spectrogram Representations for 3 Sample Signals in Every Class
The third predictive model for respiratory disease classification explored in this framework is based on a 2D CNN classifier. A 2D CNN accepts a 2D or three-dimensional matrix as input and learns spatial features for the classification task. The VGG-16 architecture is used as the classifier, and the time-frequency representations derived from the STFT and the wavelet transform are given as its input. Accordingly, classification is performed with the VGG-16 architecture on the spectrogram, melspectrogram and scalogram representations of the respiratory signals [71, 72]. The spectrogram, melspectrogram and scalogram representations of three signals from each class are presented in Figs. 7, 8 and 9, respectively. These representations characterize the signal as an image and are therefore used in the same way as images in classification. The time-frequency representations help to extract information that is not noticeable in the time domain and provide information about patterns and structures in the frequency spectrum, which is why they are preferred for signal classification. Although spectrograms characterize the signals on time and frequency scales, they do not mimic the human hearing system, whereas melspectrograms align better with human auditory perception. Scalograms do not model the human hearing system but are robust for modelling non-stationary audio signals with sharp transients.
Fig. 8
Melspectrogram Representations for 3 Sample Signals in Every Class
Fig. 9
Scalogram Representations for 3 Sample Signals in Every Class
VGG16, a deep CNN architecture pre-trained on the large ImageNet database, is exploited in this framework to classify the respiratory signals from the three above-mentioned time-frequency representations. The initial layers extract low-level features, namely edges and textures, while the deeper layers extract high-level features; the initial layers can therefore be made non-trainable, as they extract only low-level edges and textures and do not affect the classification performance. The architectural details of the VGG16 network are presented in Table 10; the network has over 16 million parameters. The VGG16 network was trained individually on the scalogram, spectrogram and melspectrogram representations with the learning environment presented in Table 11. The training progress plots in terms of accuracy and loss for the scalogram, spectrogram and melspectrogram representations are presented in Figs. 10, 11 and 12, respectively.
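A hedged Keras sketch of this transfer-learning setup follows; the number of frozen layers and the learning rate are assumptions, the dense head of 64, 128 and 4 units follows Table 10, and the sgdm/adam/rmsprop optimizers named in the text are mapped onto their Keras counterparts.

```python
# Sketch: ImageNet-pretrained VGG16 base with early layers frozen and a small dense head,
# trained on 224 x 224 x 3 TFR images of the respiratory signals.
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
for layer in base.layers[:10]:          # freezing the first blocks is an assumption
    layer.trainable = False

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(4, activation="softmax"),
])
model.compile(optimizer=optimizers.SGD(learning_rate=1e-3, momentum=0.9),  # "sgdm"; swap for Adam or RMSprop
              loss="categorical_crossentropy", metrics=["accuracy"])
```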
Table 10
Architectural details of VGG16

Layer        | Output dimensions | No. of learnable parameters
Input layer  | 224 × 224 × 3     | 0
Conv2D       | 224 × 224 × 64    | 1,792
Conv2D       | 224 × 224 × 64    | 36,928
MaxPooling2D | 112 × 112 × 64    | 0
Conv2D       | 112 × 112 × 128   | 73,856
Conv2D       | 112 × 112 × 128   | 147,584
MaxPooling2D | 56 × 56 × 128     | 0
Conv2D       | 56 × 56 × 256     | 295,168
Conv2D       | 56 × 56 × 256     | 590,080
Conv2D       | 56 × 56 × 256     | 590,080
MaxPooling2D | 28 × 28 × 256     | 0
Conv2D       | 28 × 28 × 512     | 1,180,160
Conv2D       | 28 × 28 × 512     | 2,359,808
Conv2D       | 28 × 28 × 512     | 2,359,808
MaxPooling2D | 14 × 14 × 512     | 0
Conv2D       | 14 × 14 × 512     | 2,359,808
Conv2D       | 14 × 14 × 512     | 2,359,808
Conv2D       | 14 × 14 × 512     | 2,359,808
MaxPooling2D | 7 × 7 × 512       | 0
Flatten      | 1 × 25,088        | 0
Dense        | 1 × 64            | 1,605,696
Dense        | 1 × 128           | 8,320
Dense        | 1 × 4             | 516
Total parameters: 16,329,220
Table 11
Hyperparameter settings for the 2D CNN based classifier

Hyperparameter   | Value
Optimizer        | Sgdm, Adam, Rmsprop
Loss function    | Categorical cross-entropy
No. of epochs    | 50
Train-test split | 70:30
Image size       | 224 × 224 × 3
Fig. 10
Training progress curves for CNN based classification of scalogram representations: (a) sgdm optimizer, (b) adam optimizer, (c) rmsprop optimizer
Fig. 11
Training progress curves for CNN based classification of spectrogram representations: (a) sgdm optimizer, (b) adam optimizer, (c) rmsprop optimizer
Fig. 12
Training progress curves for CNN based classification of melspectrogram representations: (a) sgdm optimizer, (b) adam optimizer, (c) rmsprop optimizer
The classification performance of the VGG16 network on the scalogram, spectrogram and melspectrogram representations is presented in Table 12. Experiments were carried out with three popular weight-update techniques, viz. rmsprop, adam and sgdm. Table 12 shows that, among the three representations, the melspectrogram demonstrated the best performance for all three optimizers considered in this experiment. Additionally, among the three optimizers, the classifier trained with sgdm outperformed the other two for all three representations. Accordingly, the melspectrogram representation in combination with the sgdm optimizer recorded the highest classification accuracy of 93.02%, and the spectrogram representation achieved a comparable accuracy of 91.86% with sgdm. The scalograms derived from the time-domain respiratory signals did not play a significant role in the classification. Overall, however, the 2D CNN based classifier recorded a better performance than the 1D CNN classifier. The training progress plots depicting the variation of accuracy and loss with the number of epochs for all three representations are presented in Figs. 10, 11 and 12.
Table 12
CNN based classifier performance

Classifier | Representation | Accuracy (Rmsprop) | Accuracy (Adam) | Accuracy (SGDM)
1D CNN     | 1D signals     | 71.68%             | 70.02%          | 67.74%
2D CNN     | Scalogram      | 72.26%             | 74.19%          | 77.91%
2D CNN     | Spectrogram    | 84.96%             | 86.45%          | 91.86%
2D CNN     | Melspectrogram | 85.16%             | 89.03%          | 93.02%

5 Conclusion

The proposed framework, COPDScope, provides a comprehensive review of the many sensors utilized in acquiring respiratory signals for diagnosing COPD. As a case study, a complete investigation of respiratory signal classification was carried out using a wide set of features and classifiers. Both machine learning and deep learning techniques were used to classify the respiratory signals after analysing them in several dimensions. The results make it clear that the signals, when represented as TFRs, contribute more to the classification task than the raw signals themselves. Among the three TFRs considered in this study, the melspectrogram representation resulted in comparatively better performance. Furthermore, in experiments with various optimizers, sgdm yielded a significant improvement, reaching an accuracy of 93.02%, compared with adam and rmsprop.

Declarations

Conflicts of Interest

The authors declare that they have no conflicts of interest concerning the publication of this paper.

Ethics approval

Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Title
COPD Scope: Smart Clinical Pathways for Respiratory Illness Diagnosis and Management
Authors
B. Lakshmipriya
S. Jayalakshmy
Sunday Adeola Ajagbe
Precious Ikpemhinogena Ogie
Joseph Bamidele Awotunde
Matthew O. Adigun
Publication date
03-10-2025
Publisher
Springer Berlin Heidelberg
Published in
Annals of Data Science
Print ISSN: 2198-5804
Electronic ISSN: 2198-5812
DOI
https://doi.org/10.1007/s40745-025-00628-6
1.
go back to reference Muhammad G, Alshehri F, Karray F, El Saddik A, Alsulaiman M, Falk TH (2021) A comprehensive survey on multimodal medical signals fusion for smart healthcare systems. Information Fusion 76:355–375. https://​doi.​org/​10.​1016/​j.​inffus.​2021.​06.​007CrossRef
2.
go back to reference Bajwa J, Munir U, Nori A, Williams B (2021) Artificial intelligence in healthcare: transforming the practice of medicine. Future healthcare journal 8(2):e188–e194. https://​doi.​org/​10.​7861/​fhj.​2021-0095CrossRef
3.
go back to reference Mathkor DM, Mathkor N, Bassfar Z, Bantun F, Slama P, Ahmad F et al (2024) Multirole of the internet of medical things (IoMT) in biomedical systems for managing smart healthcare systems: An overview of current and future innovative trends. J Infect Public Health 17(4):559–572. https://​doi.​org/​10.​1016/​j.​jiph.​2024.​01.​013CrossRef
4.
go back to reference Kanschik D, Bruno RR, Wolff G, Kelm M, Jung C (2023) Virtual and augmented reality in intensive care medicine: a systematic review. Ann Intensive Care 13(1):81. https://​doi.​org/​10.​1186/​s13613-023-01176-zCrossRef
5.
go back to reference Weiss, S. T., DeMeo, D. L., & Postma, D. S. (2003). COPD: problems in diagnosis and measurement. European Respiratory Journal21(41 suppl), 4s-12s. Devine, J. F. (2008). Chronic obstructive pulmonary disease: an overview. American health & drug benefits1(7), 34. https://​doi.​org/​10.​1183/​09031936.​03.​00077702
6.
go back to reference Devine JF (2008) Chronic obstructive pulmonary disease: an overview. Am Health Drug Benefits 1(7):34
7.
Şerifoğlu İ, Ulubay G (2019) The methods other than spirometry in the early diagnosis of COPD. Tuberk Toraks 67(1):63–70. https://doi.org/10.5578/tt.68162
8.
Global Initiative for Chronic Obstructive Lung Disease (GOLD) (2023) Global strategy for the diagnosis, management, and prevention of COPD. www.goldcopd.org
9.
Alves Pegoraro J, Guerder A, Similowski T, Salamitou P, Gonzalez-Bermejo J, Birmelé E (2025) Detection of COPD exacerbations with continuous monitoring of breathing rate and inspiratory amplitude under oxygen therapy. BMC Med Inform Decis Mak 25(1):101. https://doi.org/10.1186/s12911-025-02939-3
10.
Teresi RK, Hendricks AC, Moraveji N, Murray RK, Polsky M, Maselli DJ (2024) Clinical interventions following escalations from a continuous respiratory monitoring service in patients with chronic obstructive pulmonary disease. Chronic Obstr Pulm Dis: J COPD Foundation 11(6):558
11.
Mäkelä MJ, Backer V, Hedegaard M, Larsson K (2013) Adherence to inhaled therapies, health outcomes and costs in patients with asthma and COPD. Respir Med 107(10):1481–1490. https://doi.org/10.1016/j.rmed.2013.04.005
12.
Bhattarai B, Walpola R, Mey A, Anoopkumar-Dukie S, Khan S (2020) Barriers and strategies for improving medication adherence among people living with COPD: a systematic review. Respir Care 65(11):1738–1750. https://doi.org/10.4187/respcare.07355
13.
Xiang X, Huang L, Fang Y, Cai S, Zhang M (2022) Physical activity and chronic obstructive pulmonary disease: a scoping review. BMC Pulm Med 22(1):301. https://doi.org/10.1186/s12890-022-02099-4
14.
Snyder LD, DePietro M, Reich M, Neely ML, Lugogo N, Pleasants R, Li T, Granovsky L, Brown R, Safioti G (2025) Predictive machine learning algorithm for COPD exacerbations using a digital inhaler with integrated sensors. BMJ Open Respir Res 12(1):e002577. https://doi.org/10.1136/bmjresp-2024-002577
15.
Bohadana A, Izbicki G, Kraman SS (2014) Fundamentals of lung auscultation. N Engl J Med 370(8):744–751. https://doi.org/10.1056/nejmra1302901
16.
Liou TG, Kanner RE (2009) Spirometry. Clin Rev Allergy Immunol 37:137–152. https://doi.org/10.1007/s12016-009-8128-z
17.
Cukic V (2014) The changes of arterial blood gases in COPD during four-year period. Med Arch 68(1):14
18.
Mohsen F, Ali H, El Hajj N, Shah Z (2022) Artificial intelligence-based methods for fusion of electronic health records and imaging data. Sci Rep 12(1):17981. https://doi.org/10.1038/s41598-022-22514-4
19.
Wooten FT, Waring WW, Wegmann MJ, Anderson WF, Conley JD (1978) Method for respiratory sound analysis. PubMed 12(4):254–257
20.
Chowdhury SK, Majumder AK (1982) Frequency analysis of adventitious lung sounds. J Biomed Eng 4(4):305–312
21.
Kandaswamy A, Rajkumar S, Kumar AS, Jayaraman S (1999) Respiratory system diagnosis through lung sound processing. J Systems Sci Eng 4(1):32–36
22.
Oud M, Dooijes EH, van der Zee JS (2000) Asthmatic airways obstruction assessment based on detailed analysis of respiratory sound spectra. IEEE Trans Biomed Eng 47(11):1450–1455. https://doi.org/10.1109/10.880096
23.
Bahoura M, Pelletier C (2004) Respiratory sounds classification using cepstral analysis and Gaussian mixture models. In: The 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, vol 1, pp 9–12. IEEE. https://doi.org/10.1109/IEMBS.2004.1403077
24.
Haider NS, Singh BK, Periyasamy R, Behera AK (2019) Respiratory sound based classification of chronic obstructive pulmonary disease: a risk stratification approach in machine learning paradigm. J Med Syst 43(8):255. https://doi.org/10.1007/s10916-019-1388-0
25.
Tocchetto MA, Bazanella AS, Guimaraes L, Fragoso JL, Parraga A (2014) An embedded classifier of lung sounds based on the wavelet packet transform and ANN. IFAC Proceedings Volumes 47(3):2975–2980. https://doi.org/10.3182/20140824-6-ZA-1003.01638
26.
Charleston-Villalobos S, Martinez-Hernandez G, Gonzalez-Camarena R, Chi-Lem G, Carrillo JG, Aljama-Corrales T (2011) Assessment of multichannel lung sounds parameterization for two-class classification in interstitial lung disease patients. Comput Biol Med 41(7):473–482
27.
Lozano M, Fiz JA, Jané R (2015) Automatic differentiation of normal and continuous adventitious respiratory sounds using ensemble empirical mode decomposition and instantaneous frequency. IEEE J Biomed Health Inform 20(2):486–497
28.
Rizal A, Hidayat R, Nugroho HA (2015) Signal domain in respiratory sound analysis: methods, application and future development. J Comput Sci 11(10):1005
29.
Rizal A, Hidayat R, Nugroho HA (2016) Lung sounds classification using spectrogram's first order statistics features. In: 2016 6th International Annual Engineering Seminar (InAES). https://doi.org/10.1109/INAES.2016.7821914
30.
Kandaswamy A, Kumar CS, Ramanathan RP, Jayaraman S, Malmurugan N (2004) Neural classification of lung sounds using wavelet coefficients. Comput Biol Med 34(6):523–537. https://doi.org/10.1016/S0010-4825(03)00092-1
31.
Kahya YP, Yeginer M, Bilgic B (2006) Classifying respiratory sounds with different feature sets. In: 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, pp 2856–2859. IEEE. https://doi.org/10.1109/IEMBS.2006.259946
32.
Abera Tessema B, Nemomssa H, Lamesgin Simegn G (2022) Acquisition and classification of lung sounds for improving the efficacy of auscultation diagnosis of pulmonary diseases. Med Devices Evid Res 15:89–102. https://doi.org/10.2147/mder.s362407
33.
Levy J, Naitsat A, Zeevi YY (2022) Classification of audio signals using spectrogram surfaces and extrinsic distortion measures. EURASIP J Adv Signal Process. https://doi.org/10.1186/s13634-022-00933-9
34.
Göğüş FZ, Karlık B, Harman G (2016) Identification of pulmonary disorders by using different spectral analysis methods. Int J Comput Intell Syst 9(4):595–611. https://doi.org/10.1080/18756891.2016.1204110
35.
Pham L, Phan H, Palaniappan R, Mertins A, McLoughlin I (2021) CNN-MoE based framework for classification of respiratory anomalies and lung disease detection. IEEE J Biomed Health Inform 25(8):2938–2947. https://doi.org/10.1109/jbhi.2021.3064237
36.
McFee B, Raffel C, Liang D, Ellis D, McVicar M, Battenberg E, Nieto O (2015) librosa: audio and music signal analysis in Python. In: Proceedings of the 14th Python in Science Conference. https://doi.org/10.25080/majora-7b98e3ed-003
37.
Shuvo SB, Ali SN, Swapnil SI, Hasan T, Bhuiyan MIH (2021) A lightweight CNN model for detecting respiratory diseases from lung auscultation sounds using EMD-CWT-based hybrid scalogram. IEEE J Biomed Health Inform 25(7):2595–2603. https://doi.org/10.1109/jbhi.2020.3048006
38.
Demirci BA, Koçyiğit Y, Kızılırmak D, Havlucu Y (2022) Adventitious and normal respiratory sound analysis with machine learning methods. Celal Bayar Üniversitesi Fen Bilimleri Dergisi. https://doi.org/10.18466/cbayarfbe.1002917
39.
Meng F, Shi Y, Wang N, Cai M, Luo Z (2020) Detection of respiratory sounds based on wavelet coefficients and machine learning. IEEE Access 8:155710–155720. https://doi.org/10.1109/ACCESS.2020.3016748
40.
Abdullah S, Demosthenous A, Yasin I (2020) Comparison of auditory-inspired models using machine-learning for noise classification. Int J Simul Syst Sci Technol. https://doi.org/10.5013/ijssst.a.21.02.20
41.
Taiwo GA, Vadera S, Alameer A (2025) Vision transformers for automated detection of pig interactions in groups. Smart Agricultural Technol 10:100774. https://doi.org/10.1079/cabireviews.2024.0038
42.
Zhang J, Wang HS, Zhou HY, Dong B, Zhang L, Zhang F, Yin Y (2021) Real-world verification of artificial intelligence algorithm-assisted auscultation of breath sounds in children. Front Pediatr 9:627337
43.
Ullah A, Khan MS, Khan MU, Mujahid F (2021) Automatic classification of lung sounds using machine learning algorithms. https://doi.org/10.1109/fit53504.2021.00033
44.
Song W, Han J, Song H (2021) Contrastive embeddind learning method for respiratory sound classification. In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 1275–1279. IEEE. https://doi.org/10.1109/icassp39728.2021.9414385
45.
Cinyol F, Baysal U, Köksal D, Babaoğlu E, Ulaşlı SS (2023) Incorporating support vector machine to the classification of respiratory sounds by convolutional neural network. Biomed Signal Process Control 79:104093. https://doi.org/10.1016/j.bspc.2022.104093
46.
Acharya J, Basu A (2020) Deep neural network for respiratory sound classification in wearable devices enabled by patient specific model tuning. IEEE Trans Biomed Circuits Syst. https://doi.org/10.1109/tbcas.2020.2981172
47.
Nguyen T, Pernkopf F (2022) Lung sound classification using co-tuning and stochastic normalization. IEEE Trans Biomed Eng 69(9):2872–2882. https://doi.org/10.1109/tbme.2022.3156293
48.
Kwon AM, Kang K (2022) A temporal dependency feature in lower dimension for lung sound signal classification. Sci Rep. https://doi.org/10.1038/s41598-022-11726-3
49.
Petmezas G, Cheimariotis G-A, Stefanopoulos L, Rocha B, Paiva RP, Katsaggelos AK, Maglaveras N (2022) Automated lung sound classification using a hybrid CNN-LSTM network and focal loss function. Sensors 22(3):1232. https://doi.org/10.3390/s22031232
50.
Lal N (2023) A lung sound recognition model to diagnoses the respiratory diseases by using transfer learning. Multimed Tools Appl 82(23):36615–36631. https://doi.org/10.1007/s11042-023-14727-0
51.
Chen H, Yuan X, Pei Z, Li M, Li J (2019) Triple-classification of respiratory sounds using optimized S-transform and deep residual networks. IEEE Access 7:32845–32852. https://doi.org/10.1109/access.2019.2903859
52.
Jayalakshmy S, Sudha GF (2020) Scalogram based prediction model for respiratory disorders using optimized convolutional neural networks. Artif Intell Med 103:101809. https://doi.org/10.1016/j.artmed.2020.101809
53.
Jayalakshmy S, Lakshmipriya B, Sudha GF (2023) Bayesian optimized GoogLeNet based respiratory signal prediction model from empirically decomposed gammatone visualization. Biomed Signal Process Control 86:105239. https://doi.org/10.1016/j.bspc.2023.105239
54.
Chu M, Nguyen T, Pandey V, Zhou Y, Pham HN, Bar-Yoseph R, Radom-Aizik S, Jain R, Cooper DM, Khine M (2019) Respiration rate and volume measurements using wearable strain sensors. npj Digit Med. https://doi.org/10.1038/s41746-019-0083-3
55.
Wu D, Wang L, Zhang YT, Huang BY, Wang B, Lin SJ, Xu XW (2009) A wearable respiration monitoring system based on digital respiratory inductive plethysmography. In: 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp 4844–4847. IEEE
56.
Fan D, Yang J, Zhang J, Lv Z, Huang H, Qi J, Yang P (2018) Effectively measuring respiratory flow with portable pressure data using back propagation neural network. IEEE J Transl Eng Health Med 6:1–12
57.
Ajagbe SA, Oladosu JB, Olayiwola AA, Falohun AS (2025) Design and development of automatic speech recognition (ASR) system for low-resource language using convolutional neural network model. J Comput Sci Its Appl 31(2):10–18
58.
Ukah DO, Ehizojie L, Ajayi SA, Nnakwuzie D, Shokenu ES, Sojobi A (2024) An online knowledge-based support system. Int J Papier Public Rev 5(4):79–92
59.
Dinh T, Nguyen T, Phan HP, Nguyen NT, Dao DV, Bell J (2020) Stretchable respiration sensors: advanced designs and multifunctional platforms for wearable physiological monitoring. Biosens Bioelectron 166:112460
60.
Hussain T, Ullah S, Fernández-García R, Gil I (2023) Wearable sensors for respiration monitoring: a review. Sensors 23(17):7518. https://doi.org/10.3390/s23177518
61.
Storck K, Karlsson M, Ask P, Loyd D (1996) Heat transfer evaluation of the nasal thermistor technique. IEEE Trans Biomed Eng 43(12):1187–1191. https://doi.org/10.1109/10.544342
62.
Kim Y, Ajayi AS, Yun R (2023) Development of condensation heat transfer coefficient and pressure drop model applicable to a full range of reduced pressures. Korean J Air-Cond Refrig Eng 35(11):557–565. https://doi.org/10.6110/kjacr.2023.35.11.557
63.
Kozia C, Herzallah R, Lowe D (2018) ECG-derived respiration using a real-time QRS detector based on empirical mode decomposition. https://doi.org/10.1109/icspcs.2018.8631760
64.
Wang G, Zhang Y, Yang H, Wang W, Dai Y-Z, Niu L-G, Lv C, Xia H, Liu T (2020) Fast-response humidity sensor based on laser printing for respiration monitoring. RSC Adv 10(15):8910–8916. https://doi.org/10.1039/c9ra10409g
65.
Fan J, Yang S, Liu J, Zhu Z, Xiao J, Chang L, Lin S, Zhou J (2022) A high accuracy & ultra-low power ECG-derived respiration estimation processor for wearable respiration monitoring sensor. Biosensors 12(8):665. https://doi.org/10.3390/bios12080665
66.
Gross V, Dittmar A, Penzel T, Schuttler F, Von Wichert P (2000) The relationship between normal lung sounds, age, and gender. Am J Respir Crit Care Med 162(3):905–909. https://doi.org/10.1164/ajrccm.162.3.9905104
67.
Sarkar M, Madabhavi I, Niranjan N, Dogra M (2015) Auscultation of the respiratory system. Ann Thorac Med 10(3):158–168. https://pmc.ncbi.nlm.nih.gov/articles/PMC4518345/
68.
The R.A.L.E. Repository (2017) Rale.ca. Accessed 28 Feb 2017
70.
Gabor D (1946) Theory of communication, Part 3: Frequency compression and expansion. J Inst Electr Eng 93(26):445–457
71.
Sadiku MN, Ajayi SA, Sadiku JO (2025) Artificial intelligence in legal practice: opportunities, challenges, and future directions. J Eng Res Rep 27(4):68–80. https://doi.org/10.9734/jerr/2025/v27i41456
72.
Akinlade O, Vakaj E, Dridi A, Tiwari S, Ortiz-Rodriguez F (2023) Semantic segmentation of the lung to examine the effect of COVID-19 using UNET model. In: Jabbar MA, Ortiz-Rodríguez F, Tiwari S, Siarry P (eds) Applied Machine Learning and Data Analytics. AMLDA 2022. Communications in Computer and Information Science, vol 1818, pp 52–63. Springer, Cham. https://doi.org/10.1007/978-3-031-34222-6_5