Abstract

R-peak detection is crucial in electrocardiogram (ECG) signal analysis. This study proposed an adaptive and time-efficient R-peak detection algorithm for ECG processing. First, wavelet multiresolution analysis was applied to enhance the ECG signal representation. Then, ECG was mirrored to convert large negative R-peaks to positive ones. After that, local maximums were calculated by the first-order forward differential approach and were truncated by the amplitude and time interval thresholds to locate the R-peaks. The algorithm performances, including detection accuracy and time consumption, were tested on the MIT-BIH arrhythmia database and the QT database. Experimental results showed that the proposed algorithm achieved mean sensitivity of 99.39%, positive predictivity of 99.49%, and accuracy of 98.89% on the MIT-BIH arrhythmia database and 99.83%, 99.90%, and 99.73%, respectively, on the QT database. By processing one ECG record, the mean time consumptions were 0.872 s and 0.763 s for the MIT-BIH arrhythmia database and QT database, respectively, yielding 30.6% and 32.9% of time reduction compared to the traditional Pan-Tompkins method.

1. Introduction

Electrocardiogram (ECG) can describe the electrical activity of the heart and is an essential tool for the diagnosis of cardiovascular diseases (CAD). With the rapid development of wearable and wireless ECG techniques, real-time and routine ECG monitoring is attracting more and more attention due to the increasing popularization of medical health, especially for the elderly people [1]. Recent years, lots of publications about ambulatory ECG monitoring devices have been reported [24], aiming to automatically monitor the heart activities and give the feedback of any CAD early warning in real time. However, this application still needs significant development due to the challenge of unexpected noise effects in ECG signal, such as baseline drift, electrode motion and stretching, motion artifacts, and muscle noise [5, 6], which impedes the automatic ECG processing technology to perform effectively. The primary sources of noises are electrical activities of muscles and baseline drift caused by respiration, poor contact of electrodes, and equipment or electronic devices [7, 8]. Electrode movement alters the signal baseline and brings the signal fluctuate perpendicularly to the baseline. If the electrode moves drastically enough to drop from the skin, baseline drift will overwhelm the signal and waveform distortion occurs [9]. Motion artifact is generally attributed to the variation of electrode-skin impedance during a subject’s motion. Changed impedance will be treated by the ECG amplifier as a different input, resulting in impedance mismatching and difficult identification of irregular fluctuation on small amplitude waveforms, such as P wave and T wave [10, 11]. Consequently, noise removal is the preliminary issue to consider for in ECG signal processing.

ECG features are essential characteristics for CAD diagnosis. R-peak detection is the datum since all other features are extracted after the R-peak location [12]. Accurate R-peak detection is critical for arrhythmia diagnosis such as atrial premature contraction, tachycardia, and bradycardia [13]. Nevertheless, efficient R-peak extraction is still difficult in the dynamic and noisy environment due to the time-varying waveform morphology. This would be more difficult when ECG signal is overwhelmed by noises with similar frequency in energy distribution [9].

Over the last decades, numerous techniques have been proposed for R-peak detection. In [14], a thorough review on R-peak detection methodologies for portable, wearable, battery-operated, and wireless ECG devices was elaborated. The authors claimed that the thresholding methods were regarded as the most computationally efficient. However, the suitable parameter settings for thresholds were difficult. The most widely used R-peak detection method, proposed by Pan and Tompkins [15], is the Pan-Tomkins method. It is a threshold-based method with low complexity. Other algorithms of R-peak detection can be classified as pattern recognition [16, 17], wavelet transform [18], mathematical morphology [19], and digital filter [20]. In [10], a real-time R-peak detector using adaptive thresholding was proposed. This algorithm consisted of preprocessing to initialize R-peak threshold and thresholding to adaptively modify the threshold. It achieved sensitivity and positive predictivity higher than 99.3%. In [21], a different interference-based method was developed. This method could effectively distinguish R-peaks from high amplitude noises but failed to detect R-peaks when abrupt jump of baseline appeared. Some researchers also conducted ECG feature extraction without predenoising [22, 23]. The detection accuracy could reach up to 94.8%, although much lower than that acquired from the denoised signals.

Time cost is important due to the fast-responding requirement in CAD early warning applications [4], especially in the real-time monitoring. Many ambulatory ECG devices are generally limited in power supply and computation [1]. The conventional feature extraction algorithms are, from a computational perspective, very intensive tasks, which are typically executed in mainframe-type computational facilities. A significant power expenditure component in such systems is the energy required by the radio front-end for supporting continuous data transmission, which may not allow a long-term sustainable operation. To this end, some researchers have attempted to develop algorithms of low computational load. Apart from the aforementioned methods, in [24], the authors presented a low-complexity ECG feature extraction approach for mobile healthcare applications. This technique was based on the combination of wavelet analysis and time-domain morphology principles. Except for high accuracy and precision, low computation and fast response are also needed in ECG feature extraction.

In this study, an adaptive and time-efficient ECG R-peak detection algorithm is proposed. The method takes advantage of wavelet-based multiresolution analysis (WMRA) and adaptive thresholding. WMRA is applied to strengthen ECG signal representation by extracting ECG frequency interval of interest from wide-range frequencies, which contain interference such as baseline drift and motion artifacts. All the noises produce considerable influence on the following thresholding operation. The adaptive thresholding is designed to exclude false R-peaks in the reconstructed signal by WMRA. The proposed algorithm was tested by the MIT-BIH arrhythmia database (MITDB) and the QT database (QTDB) [25]. Both accuracy and time consumption of the algorithm were evaluated. By exploring the time-frequency property of ECG, this study aims to conduct preliminary and tentative research on adaptive and time-efficient R-peak detection for low-quality ECG signals, promoting automatic ECG processing technology for clinical and daily use.

The remainder of the paper is organized as follows. Section 2 elaborated the detailed procedures of the proposed R-peak detection algorithm. In Section 3, experiment setups were introduced, including the datasets and the evaluation indices. Section 4 demonstrated the experimental results over R-peak detection accuracy, time consumption and time complexity, and the selection of optimal threshold coefficients. Section 5 discussed the advantages and the potential limitation of our algorithm. The summarization of this study was presented in Section 6.

2. Proposed R-Peak Detection Algorithm

The R-peak detection system is described in Figure 1. The purpose of this study is to develop an algorithm which can effectively identify R-peaks mixed in different noises.

2.1. Step 1: WMRA Enhancement

WMRA enhances signals using wavelet transform to extract both time and frequency domain information. This method is very suitable for ECG processing since ECG is essentially nonstationary with small amplitude (0.01~5 mV) and low frequency (0.05~100 Hz) [26]. This method also provides low computational cost [27]. By WMRA, signal below 0.05 Hz and above 100 Hz can be filtered from the raw signal. These intervals are not the ECG frequency bands and contain most types of noises [28]. In addition, according to the Nyquist criterion, subfrequency band presented by each decomposition level is directly related to the sampling frequency [26]. Consequently, the ECG signals, sampled at 360 Hz in MITDB and 250 Hz in QTDB as illustrated in [25], are all decomposed up to 8 levels in this study.

Figure 2 shows the decomposition procedure of eight-level WMRA by using bior6.8 wavelet. For MITDB, consist of frequency components in a range of 0.70–90 Hz, which is the ECG frequency band of interest. with frequency band 90~180 Hz and with frequency band 0~0.70 Hz are beyond the ECG frequency; they are not the considered coefficients containing baseline drift and other interference. Consequently, and are set to zeroes; are kept for reconstruction. Similarly, for QTDB, with frequency band 0~0.49 Hz is set to zero; with frequency components in a range of 0.49–125 Hz are kept. All the retained coefficients are then filtered by the wavelet shrinking threshold algorithm [29]. In this study, soft thresholding is adopted due to its good continuity and no Gibbs phenomenon on step points [30].

2.2. Step 2: Signal Mirroring

For some ECG patterns, such as premature ventricular contraction (PVC) beat, R-peaks are presented with amplitude below the baseline but other features are above the baseline. To avoid the potential missing detection, signal mirroring is designed. The mirroring procedure for a PVC segment is described in Figure 3. Large negative amplitudes are mirrored by taking the baseline as their symmetrical axis. However, not all the negative amplitudes are mirrored, they should be significantly distinctive from adjacent negative values. This assumption is based on the fact that R-peaks have steep slopes while other waves such as P wave and T wave have gentle ones [10]. Steep slope means drastic increment and decrement on both sides of local maximum, and the slope is finished within several sampling points. If the absolute amplitude of a negative point is 1.5 times larger than that of the adjacent points within 0.278 seconds (0.278 points) before and after it, then the negative point will be mirrored. where is the signal length, is the amplitude of large negative point with position number in signal, , and is the amplitude of point within 0.278 s before and after the large negative point.

In some literatures [15, 21, 3133], authors recommend that signal with baseline drift removed could be squared to highlight the difference between true R-peaks and false ones, such as high-amplitude noise and high-amplitude P waves. However, this operation may not be suitable in our method due to the differences among R-peak amplitudes. If the signal is squared, amplitude values below 1 will be smaller than the original, and values above 1 will be enlarged. This increases the difference among true R-peaks and is adverse for the amplitude threshold to detect potential R-peaks, especially when a signal segment is mixed with large and small amplitude R-peaks.

2.3. Step 3: Local Maximum Location and Adaptive Threshold Selection

Local maximums are located by implementing first-order forward differential in the mirrored signal. The procedure is illustrated as follows. (1)First-order forward difference is implemented on ECG signal with .(2)For all the elements in , values less than, equal, and more than 0 are replaced by −1, 0, and 1, respectively, (3)First-order forward difference is implemented on the updated , and the value of is symbolized by −2, 0, and 2. Local maximums in original ECG signal are positions shifted by 1 sample to the right of −2.

The following threshold procedure depends significantly on the amplitude threshold and time interval threshold , which are adaptively determined by the location of local maximum instead of adopting fixed thresholds, since fixed thresholds do not copy with large or small amplitude R-peak and slow or fast beat. The selection of the two thresholds is displayed in Figure 4. During an ECG segment, is selected to be a multiple of the amplitude maximum ; is selected to be a multiple of the average horizontal distance of each adjacent local maximum . If the positions and amplitudes of these local maximums change, the two thresholds will change correspondingly; hence, and will adjust adaptively according to the maximum variety.

However, the threshold selection is strongly dependent on the noise; and are coefficients designed to correct the noise influence. The detailed selection method for them is discussed in Section 4.3. In a segment, the positions of local maximums are fixed, correspondingly; and are deterministic. Hence, only and need to be decided. The thresholds are automatically updated by the shift of new coming segment. The superiority of automatic threshold substitution embodies in the corresponding adjustment on recognition for small amplitude and slow or fast cardiac beat, as fixed thresholds may fail to detect R-peaks in these cases.

2.4. Step 4: Threshold Recognition

Actually, most of the local maximums are not true R-peaks, such as burst points caused by high-frequency interference. The difficulty of R-peak detection lies in the recognition of false R-peaks with amplitudes approximate to or even larger than true R-peaks. To this end, is designed to filter the local maximums with small amplitudes. In general, there should be no extra R-peaks between two adjacent R-peaks; otherwise, the extra R-peaks are definitely false. Assisted by this knowledge, is designed to further remove false R-peaks omitted by . The thresholding algorithm is plotted in Figure 5. The example ECG is from the Record 200 in MITDB with PVC beats. It comprises of large negative R-peaks, and consequently, the signal needs to be mirrored. The marks M in Figure 5 signify the mirrored R-peaks, where they should originally be large negative amplitudes. After the amplitude filtration, the time interval threshold algorithm is summarized as follows: (1)Step A. A local maximum and its following maximum are chosen as true reference (Tref) and comparative reference (Cref) respectively, turn to Step B. If there is no Cref, Tref is considered as a true R-peak and the algorithm ends.(2)Step B. The time period (t_p) between Tref and Cref is calculated. If t_p<ti_t, it indicates that one of the two maximums is a false R-peak, then turn to Step C. Otherwise, Tref is considered as a true R-peak, Cref replaces Tref as true reference, then turn to Step A for next thresholding.(3)Step C. Widths along the baseline of Tref (Wr) and Cref (Wf) are compared, if Wr<Wf, Tref is considered to be a true R-peak, if Wr>Wf, Cref is treated as a true R-peak [34]. Then turn to Step E. If Wr=Wf, turn to Step D.(4)Step D. Amplitude of Tref (Ar) and Cref (Af) is compared, if Ar>Af, Tref is considered to be a true R-peak, otherwise, Cref is treated as a true R-peak [34]. Then turn to Step E.(5)Step E. The false maximum is replaced by the third local maximum (Rref) just behind the two maximums, which is treated as a new Cref, and then return to Step B. If there is no Rref in the time period, turn to Step A.

Tref = localmax1; Cref = localmax2; Rref=localmax3; t_p = time period (abs (Tref - Cref));
WTref = width (Tref); WCref = width (Cref); ATref = amplitude (Tref); ACref = amplitude (Cref);
If t_p < ti_t
  If WTref < WCref
    localmax1 is a true R-peak;
    Else if WTref > WCref
      localmax2 is a true R-peak;
      Else if ATref > ACref
        localmax1 is a true R-peak;
        Else
          localmax2 is a true R-peak;
      End if
    End if
  Else
    localmax1 is a true R-peak;
  End if
End if
If Rref is not NULL
  False R-peak is replaced by the Rref, and repeat the algorithm with new Tref and Cref;
End if
Pseudocode 1: The pseudocode of the threshold procedure.

3. Experiment Designs

3.1. Datasets

The MITDB comprises 48 ECG records, and each contains 30-minute ECG signal [35, 36], resulting in a total of 109966 beats that were all used. The ECG records have acceptable quality, sharp and tall P and T waves, negative R waves, small R-peak amplitudes, wider R waves, muscle noise, baseline drift, sudden changes in heartbeat morphology, multiform PVC, long pauses, and irregular heart rhythms [25].

The QTDB contains a total of 105 15-minute ECGs. ECGs in this database were chosen to represent a wide variety of QRS and ST-T morphologies with real-world variability to challenge the detection algorithms [35, 37]. A total of 86995 beats from 82 records were used, and the rest 23 records of sel30-sel52 were excluded since the QRS annotations were not given.

It should be noted that both databases provide two channels of ECG signals. In this study, only the first channel was used for algorithm development and test.

3.2. Evaluation Indices

Experimental results are evaluated in terms of sensitivity , positive predictivity , and accuracy . The definitions of the indices are expressed in where (true positive) is the number of R-peaks correctly recognized, (false negative) is the number of R-peaks missed, and (false positive) is the number of false R-peaks recognized as true R-peaks. The , , and , verified by the annotations announced in [25], are calculated based on a tolerance window of 50 ms.

Time complexity is also tested, which quantifies the amount of time taken by an algorithm to run as a function of the length of string representing the input. It reflects the increment of time consumption when the input data increase. Time complexity of an algorithm is commonly expressed using O notation. If the number of input data n multiplies, the time consumption multiplies with an increment of , the algorithm is called to have an m-order time complexity symbolized as . In this study, all the time cost experiments were carried out on a desktop (CPU i7-2600 3.40GHz, 8GB RAM, 64-bit Windows 7 Enterprise) installed with Matlab 2016b.

4. Results

First, for both databases, and were initially set as 0.25 and 0.45, respectively. The length of shifting signal was set as 10 s for each thresholding operation. Then, we tested the influences of the parameters and .

4.1. R-Peak Detection Results

The testing results on MITDB are summarized in Table 1. The results demonstrate a satisfactory performance on the records. The algorithm has a total detection failure of 1229 beats (668 beats and 561 beats) out of 109966 beats; the average , , and are 99.39%, 99.49%, and 98.89% respectively.

The testing results on QTDB are shown in Table 2. The algorithm has a total detection failure of 238 beats (147 beats and 91 beats); out of 86995 beats, the average , , and are 99.83%, 99.90%, and 99.73% respectively. Compared with MITDB, the ECG signals from QTDB have much better waveforms with higher quality; distractors such as motion artifacts, burst noise, large P, and T waves are much less. Consequently, the algorithm achieves a more satisfactory performance over QTDB.

Our algorithm is also compared with several existing methods, including the most widely used Pan-Tompkins method, as shown in Table 3. The comparison indicates that our algorithm achieves comparable high performance.

4.2. Time Consumption and Time Complexity

The time consumption for each record from MITDB is described in Figure 6(a). In general, the Pan-Tompkins method consumes more time than our method for most records. The mean time of this method is 1.256 s to process one record, while our algorithm consumes 0.872 s, achieving about 30.6% of time reduction. Figure 6(b) shows the time consumption ratio of the proposed method over the Pan-Tompkins method. It is obvious that our method consumes less time for most records except for records 107, 109, 113, and 116, which contain large T waves that cause more frequent thresholding manipulations. In some cases, our algorithm economizes nearly 50% of time than the Pan-Tompkins method.

The time consumption for each record from QTDB is described in Figure 7(a) and Figure 8(a). It is obvious that our method consumes less time than the Pan-Tompkins method for all the records. The mean time of the Pan-Tompkins method is 1.137 s to process one record, while our algorithm consumes 0.763 s, achieving about 32.9% of time reduction. Figure 7(b) and Figure 8(b) show the time consumption ratio of the proposed method over the Pan-Tompkins method. It can be seen that all the ratios are less than 1. The outstanding performance can be attributed to the high-quality ECG signals in QTDB.

The time consumption reveals an important characteristic of the two methods. The number of sampling points of each QTDB ECG is 225000, and the number is 650000 of each MITDB ECG. Although the number has increased about two times from QTDB to MITDB, the time consumed increases only 12.5% using our method and 9.2% using the Pan-Tomkins method. It indicates that when data multiplies, the time consumption increases slightly instead of multiplying correspondingly. Both our method and the Pan-Tomkins method are not so sensitive to data increase.

However, for records 107, 109, 113, and 116 in MITDB, our method consumes the same and even more time than the Pan-Tompkins method. The disadvantage of our method is plotted in Figure 9 versus the Pan-Tompkins method in terms of time complexity. In each subfigure, the abscissa represents the quantitative increment of samples. The basic number of samples is 720; it is multiplied by the number shown in the abscissa. In the top row, the ordinate is the multiple increments of time consumed by processing the multiplied samples shown in the abscissa. In the bottom row, the ordinates are the increment of time multiples calculated from the ordinates in the top row. Intuitively, the four records have high increments of time consumption as the amount of data increases, especially when the data quantity is less than 500 times of the basic number 720. One important reason is that the four records contain numerous large T waves or P waves to be compared by time interval threshold or large negative amplitudes to be estimated for mirroring. Time interval comparison is more frequent for these waveforms, leading to higher time complexity than that of the Pan-Tompkins method. But on the other hand, the curves prove that our algorithm is not so sensitive to data increase. The increments of time multiple fluctuate in small ranges, basically remaining unchanged. The multiples of time consumption have linear relationships with the increments of sample amount.

4.3. Kamp and Ktime

Amplitude threshold takes a significant role in truncating burst points along the baseline; time interval threshold is a critical measure to further distinguish false R-peaks. According to (3), the thresholds depend significantly on and . To determine optimal coefficients and validate the feasibility of adaptive thresholding, different and values are tested using the 48 ECG records in MITDB. RR intervals of the records range from 0.54 s to 1.19 s; an average of 0.82 s is adopted to cope with different heart rates. The validation results are summarized in Table 4.

It can be seen that the optimal is generally times the , with reaching the maximum , and as illustrated in Table 1. On one hand, large is beneficial for the algorithm to cut off points located near the baseline, but small true R-peaks may be missed if is too large. As shown in the table, when is larger than 0.40, most of the false R-peaks are filtered as well as true ones, resulting in high but low and correspondingly low but high . On the other, smaller generates fewer omitted R-peaks, but this will increase the computational cost in time interval thresholding. The worst situation is that all the potential R-peaks are compared, as revealed in the table when is smaller than 0.20. In this case, most of the true R-peaks are included in local maximums as well as false ones, resulting in low but high and correspondingly high but low . To coordinate with different heart rates, is recommended to be selected from interval [0.2, 0.3] and from [0.42, 0.48]. After and are determined, and can be automatically updated by the shift of new coming signal.

5. Discussion

The proposed method has two advantages. One is from the time efficiency as indicated in Section 4.2. The main difference between our method and the Pan-Tompkins method is that the latter calculates more measures for R-peak recognition. It includes a search back operation after a complete detection circulation, thus resulting in a high computational complexity. Our method exclusively uses the thresholding method, and it does not require any search back operation. Besides, the amplitude threshold can also contribute to the calculation efficiency since it excludes most distractors and significantly reduces the amount of threshold comparisons.

Another advantage is from that there is no time length limitation for thresholding. As described in Section 2.4, the length of new coming segment is flexible, and the thresholding procedure can operate not only for a single heartbeat but also for a complete ECG record. With adaptive and , our method is suitable for different lengths of ECG signal and it requires no prelearning procedure.

From Table 1, we can see that about half of the failed beats (314 and 328 ) are from record 207. This record consists of numerous distorted heartbeats that are extremely difficult to be recognized even by a cardiologist. However, few literatures reported the results on this specific record. It is also unclear if record 207 is excluded in the evaluations to achieve a high score. This record is also the main interference that significantly reduces the detection accuracy of our method.

Apart from record 207, there are still some missing and false recognitions on some other records. The main methodological defect of the algorithm is that amplitude threshold may fail to detect small R-peaks mixed in large ones (records 105, 106, 108, and 228). The is selected based on the local maximums; if a segment contains numerous large R-peaks, will be larger than small R-peaks. This weights against the identification for small R-peaks because they are prone to be partitioned below the , as illustrated in Figure 10. The signal contains baseline drift, large T waves, and large negative amplitude R-peaks. Although most of the R-peaks are recognized, there are two FN detections for small R-peaks. If an R-peak is missed, would probably take the following maximum as the candidate R-peak, which actually is a distractor.

6. Conclusions

In this study, an adaptive and time-efficient methodology has been developed for automatic ECG R-peak detection. It is an adaptive method integrating WMRA, signal mirroring, local maximum detection, and amplitude and time interval thresholding. The accuracy performances were tested by using ECG records from MITDB and QTDB. Experimental results indicate that the proposed algorithm achieves average , , and of 99.39%, 99.49%, and 98.89% for MITDB, and 99.83%, 99.90%, and 99.73% for QTDB, respectively. In addition, time consumption and time complexity of the algorithm are computed to prove its time efficiency. By processing one ECG record, the average time cost is 0.872 s for MITDB and 0.763 s for QTDB, achieving 30.6% and 32.9%, respectively, of time reduction compared to the Pan-Tompkins method. Experiments on time complexity demonstrate that the proposed method is provided with linear time complexity; both our method and the Pan-Tompkins method are less sensitive to data increase.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study is supported by the National Natural Science Foundation of China (61571113). This study is also supported by the Natural Science Foundation of Jiangsu Province of China (BK20160697), the International S&T Cooperation Program of China (2015DFA10490), and the China Scholarship Council (CSC).