Skip to main content
Top
Published in: Medical & Biological Engineering & Computing 1/2022

Open Access 22-10-2021 | Original Article

A community effort to assess and improve computerized interpretation of 12-lead resting electrocardiogram

Authors: Zijian Ding, Guijin Wang, Huazhong Yang, Ping Zhang, Dapeng Fu, Zhen Yang, Xinkang Wang, Xia Wang, Zhourui Xia, Chiming Zhang, Wenjie Cai, Binhang Yuan, Dongya Jia, Bo Chen, Chengbin Huang, Jing Zhang, Yi Li, Shan Yang, Runnan He

Published in: Medical & Biological Engineering & Computing | Issue 1/2022

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Computerized interpretation of electrocardiogram plays an important role in daily cardiovascular healthcare. However, inaccurate interpretations lead to misdiagnoses and delay proper treatments. In this work, we built a high-quality Chinese 12-lead resting electrocardiogram dataset with 15,357 records, and called for a community effort to improve the performances of CIE through the China ECG AI Contest 2019. This dataset covers most types of ECG interpretations, including the normal type, 8 common abnormal types, and the other type which includes both uncommon abnormal and noise signals. Based on the Contest, we systematically assessed and analyzed a set of top-performing methods, most of which are deep neural networks, with both their commonalities and characteristics. This study establishes the benchmarks for computerized interpretation of 12-lead resting electrocardiogram and provides insights for the development of new methods.
Extras
Notes

Supplementary information

The online version contains supplementary material available at https://​doi.​org/​10.​1007/​s11517-021-02420-z.

Supplementary information

The online version contains supplementary material available at https://​doi.​org/​10.​1007/​s11517-021-02420-z.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Cardiovascular disease is the leading cause of death around the globe [25] and becomes a heavy burden in the world’s largest population—China [20]. Electrocardiogram (ECG) is essential to diagnose and screen cardiovascular diseases (CADs) including arrhythmia, myocardial infarction, and hypertrophy. It is one of the most common procedures in daily cardiovascular healthcare, with 3 million ECGs estimated to be performed worldwide every day [27]. However, about 20 percent of CIE is incorrect based on a rough estimation [22], and unrecognized mistakes are more likely to result in misdiagnoses and delay the proper treatments [26]. Therefore, improving CIE help lay the foundation for the precision diagnosis of CADs, leading to better cardiovascular healthcare.
High-quality ECG data helps promote the development of CIE. Most previous studies are based on the MIT-BIH Arrhythmia Database, which consists of 2-lead Holter data monitored from 48 patients [24]. Though these 48 ECG records were carefully annotated, standard 12-lead ECGs have become the mainstream in clinical practice. The Common Standards for Electrocardiography database, containing 1,000 standard 12-lead resting ECG records, is applied to assess for wave delineation since the late 1980s [31]. More standard 12-lead ECG datasets are published later, such as the Physionet 2011 challenge dataset for signal quality evaluation and the STAFF III dataset for coronary artery identification [23, 32]. The CPSC2018 dataset provides about 9 thousand 12-lead resting ECGs with nine types of interpretations, which however take only a small fraction among various clinical interpretations [19]. Though various methods showed their efficiencies on these datasets with limited patient samples [3, 30, 33], lacking disease patterns in a larger population hinders algorithm developing and performance assessment.
Deep neural networks are promising to play an important role in the daily clinical practice of ECG monitoring and interpretation [16]. For example, a 34-layer convolutional neural network was reported to outperform ECG technicians on single-lead Holter data [9]. Physionet Challenge 2017 offers a chance for the research community to compete on atrial fibrillation prediction based on short-duration single-lead data [4]. Three of the four winning teams utilized deep neural networks combined with handcrafted expert features [5, 12, 28]. However, these single-lead ECG records were recorded by wearables and cannot provide as much information as standard 12-lead ECGs.
A large volume of 12-lead ECG data with high-quality interpretations is crucial to assess the deep learning based CIE. Though CPSC2018 made the first attempt in China [19], its criteria for assessment ignores the fact that a record may contain more than one abnormality. Therefore, there is urgent need for a better understanding that how much machine learning methods, especially deep neural networks, can improve the predictive performance for standard 12-lead ECG data. However, to our knowledge, there are no previous studies that systematically assess and analyze a set of algorithms based on a common dataset.
In this paper, we report a novel dataset consisting of 15 thousand 12-lead resting ECG records, as well as a systematic assessment and analysis of benchmark algorithms from the China ECG AI Contest (CEAC) 2019 [1]. This dataset covers most types of clinical interpretations revised by four doctors and reflects the multi-label characteristics in clinical practice. Based on this novel dataset, CEAC 2019 calls for a community effort to assess and improve the computerized interpretation of 12-lead ECG.
We analyzed the top-performing methods, most of which are deep neural networks, aiming to identify successful cases. Our findings mainly include four aspects: (1) the network structure composed of CNN, RNN and attention can achieve excellent predictive performances; (2) incorporating external information, such as learning from other data or expert knowledge, can alleviate the overfitting problem; (3) data augmentation, focal loss and weighted cross-entropy are effective for imbalanced data; (4) multi-task learning and post-processing are utilized to deal with the multi-label classification problem. This systematic analysis may provide insights for future researches.

2 The CEAC dataset and evaluation tasks

An ECG records the electrical activities in the heart, and a 12-lead resting ECG is a common examination in the clinic to diagnose arrhythmia, myocardial infarction, and hypertrophy. We built a novel dataset consisting of about 15 thousand 12-lead resting ECG records, to train, validate and test different algorithms from both academia and industry. Since CEAC 2019 calls for a community effort to improve CIE, this dataset is defined as CEAC Dataset V1.0, which will be added with more data and more careful annotations in the future. To our knowledge, this is currently the largest dataset with a 60 percent increase compared to the state-of-the-art dataset [19].
There are mainly three points that distinguish the CEAC dataset from others:
(1)
it is currently the largest standard 12-lead ECG dataset in China to our knowledge;
 
(2)
it covers most types of clinical interpretations revised by doctors and technicians;
 
(3)
it reflects the fact that one ECG record may contain more than one abnormality.
 
This dataset provides the training, validation, and test set with the same statistical characteristics for assessing different algorithms. Researchers are welcome to have access to the CEAC dataset by contact with the corresponding author through the website [2].

2.1 Dataset building

All ECG records were collected from four hospitals in China. Four experts focused their time and efforts to annotate and review all ECG samples. To make the interpretations as correct as possible, two doctors and two technicians made up two teams, with each team consisting of one doctor and one technician. The workflow of annotating and reviewing is the same as in clinical practice, with one technician annotating an ECG record, and one doctor reviewing this record. The experts utilized a web-based tool for distributed ECG annotation in a local area network [7]. The dataset building has been approved by the ethics committees of the four hospitals.

2.2 Basic statistics

The complete dataset consists of 15,357 records. We select labels with as many samples as possible to represent the interpretations, resulting in 10 labels including normal ECGs (Normal), atrial fibrillation (AF), first-degree atrioventricular node block (FDAVB), right bundle branch block (RBBB), left anterior fascicular block (LAFB), premature ventricular contractions (PVC), premature atrial contraction (PAC), early repolarization (ER), T wave change (TWC) and other ECGs (Others). The clinical definitions of each label are summarized in the Supplementary Materials. In short, the first 9 labels refer to normal ECGs and those with abnormalities, and ‘Others’ refers to those records which cannot be exactly descript by any of the 9 types. Since many abnormalities are relatively rare according to daily practices in the clinic, we gathered these types in one type ‘others’ such as atrial flutter and pre-excitation. As a result, compared to the latest 12-lead resting ECG dataset [19], the CEAC dataset covers most interpretation types.
The interpretations of ECG records are shown in Fig. 1. In Fig. 1(a), the darker green boxes represent the larger numbers of samples. The samples labeled as Normal, TWC, and PVC are the top three, while LAFB, Others, and ER are the bottom three. Since one ECG record may contain more than one abnormality, Fig. 1(a) also shows the co-existence for every pair of labels. The darker blue boxes represent the more frequent pairs. For example, AF is more often to co-exist with TWC and RBBB. The lighter blue boxes represent the less frequent pairs. For example, normal never co-exist with other labels; Others never co-exist with any other nine labels; AF never co-exists with FDAVB. The proportions of multi-label records are shown in Fig. 1(b). The number of multi-label samples takes up no more than 15 percent in total. Among these samples, the majority have two labels. The samples with more than four labels take up less than 1 percent.
The clinical variables including age and gender are shown in Fig. 2. Figure 2(a) shows the age distribution under each label, with gender as the covariate. Since some records have no gender information, Fig. 2(a) represents them as missing data. Patients labeled as Normal is relatively younger than most of the other 9 labels since the elderly are more likely to have cardiovascular diseases. Male patients with ER are the second youngest than the others except for Normal. This suggests that both age and gender can be a feature to predict ER. As shown in Fig. 2(b), though most samples are recorded for 10 seconds, the time length varies across all records.
To assess different algorithms, the complete dataset is divided into the training set with 6,689 records, validation set with 559 records, and test set with 8110 records with similar statistical characteristics. All algorithms can be trained and validated on the training and validation set. The test set remains private to assess or evaluate the generalization ability.

2.3 Evaluation tasks

CEAC 2019 aims to call for a community effort to evaluate the current state of computerized interpretation of 12-lead resting ECGs, to set up the benchmark predictive performances, and to provide insights for further research. Three rounds of the contest, including a preliminary, a rematch and a final, were set to gradually screen competitive participating teams.
During the three rounds of the contest, we set up three evaluation tasks respectively: (1) how well do algorithms distinct abnormal ECGs from normal ones? (2) how well do algorithms predict the eight abnormalities or Normal for one ECG record? (3) how well do algorithms predict a record that falls into none of the nine pre-defined categories, namely the Others? In the preliminary, we screened the top 100 among the 354 participating teams; in the rematch, we screened the top 23 teams among the 68 valid submissions; in the final, we received 21 valid submissions.
In this paper, we discuss the third task set up for the final of the contest. Because this is the most complete task that is closely related to clinical practices, and also requires the complete dataset to develop and assess algorithms.
There are several challenges in the final evaluation task as follows:
  • how to efficiently extract features from data with variable time lengths;
  • how to overcome the overfitting problem, which is quite usual to deep neural networks;
  • the number of samples varies among different labels. Imbalanced data often leads to overfitting on labels with more data [10];
  • one record may contain more than one abnormality, thus a multi-label classification problem needs to be solved.
In addition, all participating teams faced a common difficulty in that there was no glance at the hidden test set. All developing and training procedures should be accomplished based on the training and validation set.

2.4 The scoring metrics

To assess the predictive performances, we use the measurements based on multi-label classification [37]. For each of the category 1 ≤ j ≤ 10 and each of the ECG record 1 ≤ iN, there are four quantities to measure predictive results.
$$ TP_{j} = |{x_{i}|y_{j} \in Y_{i}, y_{j} \in f(x_{i}), 1\leq i \leq N}| $$
(1)
$$ FP_{j} = |{x_{i}|y_{j} \not\in Y_{i}, y_{j} \in f(x_{i}), 1\leq i \leq N}| $$
(2)
$$ TN_{j} = |{x_{i}|y_{j} \not\in Y_{i}, y_{j} \not\in f(x_{i}), 1\leq i \leq N}| $$
(3)
$$ FN_{j} = |{x_{i}|y_{j} \in Y_{i}, y_{j} \not\in f(x_{i}), 1\leq i \leq N}| $$
(4)
Based on the four above quantities, we can define precision, recall and F1 score for each category,
$$ Precision_{j} = \frac{TP_{j}}{TP_{j} + FP_{j}} $$
(5)
$$ Recall_{j} = \frac{TP_{j}}{TP_{j} + FN_{j}} $$
(6)
$$ F_{1j} = \frac{2 \cdot Precision_{j} \cdot Recall_{j}}{Precision_{j} + Recall_{j}} $$
(7)
The final F1 score for each team is the average of each category.

3 Benchmark methods and performances

Aiming to identify success cases and to provide insights for further research, we analyzed the top 11 out of the 21 methods in the final of CEAC 2019, most of which are deep neural networks. We summarized their properties in the view of supervised learning. Table 1 lists the methods with some of their key properties and the final F1 scores. Figure 3 shows the F1 scores of each method on each label. We also calculated the accuracies of each method as in Table 1 in the Supplementary File.
Table 1
Summary of the Top-Performing 11 Benchmark Methods. All methods are ranked according to their F1 scores. The network structures are summarized and their characteristics are shown as in data augmentataion and transfer learning, etc
https://static-content.springer.com/image/art%3A10.1007%2Fs11517-021-02420-z/MediaObjects/11517_2021_2420_Figu_HTML.png
1FL refers to focal loss. 2CE refers to cross entropy
To summarize how the top methods deal with the four challenges mentioned above, key properties are grouped into data preprocessing, feature engineering, and classifiers. In data processing, it is shown that data alignment is necessary to cope with various time lengths (Challenge 1). In feature engineering, the first part summarizes how to design network structures to efficiently extract features; the second part summarizes how to apply external information to overcome the overfitting problem (Challenge 2). In the design of classifiers, focal loss and weighted cross-entropy are found to perform excellent among the top methods (Challenge 3); multi-task learning and postprocessing are utilized for multi-label classification (Challenge 4).

3.1 Data preprocessing

The main purpose of data preprocessing is to provide samples that are suitable for feature engineering. Considering the characteristics of 12-lead resting ECG data, researchers need to design strategies to cope with signals of various time lengths and improve signal qualities.
Signal processing
is utilized to improve signal qualities. Since the unit of ECG signals is millivolts, and ECG is often contaminated with noises such as baseline wander, muscle artifact and electrode motion artifact, etc., denoising is a key step to improve signal-to-noise ratios.
Data alignment
The ECG records of the CEAC dataset vary in time lengths, as shown in Fig. 2(b). Deep neural networks such as CNNs usually require a fixed input size of data for feature learning. Therefore, appropriate processing strategies are essential to align all ECG records to an equal length.
Padding and cropping are applied for data alignment. For short signals, padding to either side helps to fix time lengths. One strategy is to pad with zeros, which adds no information and can be handled by convolutions. Another strategy is to pad with self-repeated signals, which adds repetitive information.
Long signals are cropped into multiple segments with or without overlapping windows. These segments are labeled after the original long signals. However, for isolated abnormalities such as PVC and PAC, some segments without them are also labeled as PAC or PVC, which result in incorrectly labeled samples. To deal with this disadvantage, Method 3 manually labels all PAC and PVC segments; Method 4 applies a heuristic strategy to filter segments unlikely to be PVC or PAC. As a result, both methods achieved high F1 scores in PVC and PAC.
Data augmentation
Cropping one signal to several segments can also be seen as a data augmentation strategy. Long signals belonging to labels with fewer samples can be augmented, which may help deal with the imbalanced data problem. Up-sampling with replacement is also applied to overcome the ignorance of those labels with fewer data. Instead of directly up-sampling, some methods multiply the signals with a random coefficient closed to 1. Some methods also down-sample normal and TWC samples, which have the largest data sizes according to Fig. 1(a). Both up-sampling and down-sampling help alleviate overfitting on labels with more data.
Method 2 converts the 1D time-series data to 2D images by plotting the signals as curves on a fixed background, therefore transforming the original task into a computer vision task. However, several methods also utilize this strategy but never achieved as high F1 scores. An important trick is to color the signal curves on the images, and different color combinations affect predictive performances on both the training and validation set. As for data augmentation, Method 2 finds that affine transformation can improve predictive performances, while other traditional image processing procedures like flipping, lighting or rotating decrease the F1 scores.

3.2 Feature engineering

Feature engineering here refers to extracting useful features that can represent the key ECG characteristics of different labels. In this section, the first part mainly summarizes efficient strategies to design deep neural networks; the second part mainly summarizes how to incorporate external information and overcome the overfitting problem.

3.2.1 Design of deep neural networks

It is found that a common network structure is utilized and proved its efficiency based on the high F1 scores. As shown in Fig. 4, this common structure is composed of convolutional layers, recurrent layers, and attention modules. This network structure is reasonable to analyze time-series data such as ECG signals [34]. Firstly, the convolutional layers extract features and reduce dimensions. Deeper features with fewer dimensions generally represent data at a more abstract level [17]. Secondly, the recurrent layers, usually bidirectional RNN or LSTM, learn the correlations among the deep features. This is suitable for ECG signals in that time dependencies are represented in the P-QRS-T waves. Thirdly, the attention modules, which allow modeling of dependencies without regard to the distance in the input sequences [29], can give higher weights to features that are correlated to a specific label.
Other methods among the top 11 also follow this structure, but lacking either the recurrent layers or the attention modules. For example, Method 2 transforms to an image classification problem and therefore ignores the recurrent layers; Method 4 applies a 1D ResNet [35] and ignores both recurrent layers and attention modules; Method 5 utilizes a 1D DenseNet and self-attention. These methods ignore the time dependencies and focus more on the shapes of ECG signals.
The backbone CNNs
are essential to extract features from ECG signals. More powerful feature extraction is more likely to achieve higher F1 scores. In Table 1, both method 6 and method 8 apply a relatively shallow backbone consisting of 15 convolutional layers and achieve F1 scores close to other more complex structures. Their high performances suggest that most ECG features can be captured by relatively shallow backbone CNNs.
According to Table 1, 6 out of 11 methods apply residual blocks, including Res2Net [8], SE-Resnet [13] and different versions of ResNet [11], as shown in Fig. 4. Stacking more residual blocks to form deeper CNNs enhances feature representation abilities and increases predictive performances [11]. Method 1 applies Res2Net to promote multi-scale representation ability [8]; Method 2 applies SE-ResNet to capture the channel-wise relationships [13]. Besides ResNet, DenseNet is applied by 2 methods [14]. With the increase of the CEAC data size in the future, these deeper networks may develop stronger capacities and capabilities.
The recurrent layers
are essential to explore the dependencies among features representing ECG signals. Simple RNN, LSTM, GRU and their bidirectional versions are applied by 7 methods as shown in Table 1. These applications of recurrent layers can be roughly divided into two types. One type is to learn the correlations among features of one segment; the other is to learn the correlations among features from several segments, as shown in Fig. 4.
The attention modules
give different weights to different features. According to Table 1, 8 methods utilize various types of attention modules, which can be grouped into three types of strategies, as shown in Fig. 4. The first one is to weigh different features output by the recurrent layers. The second one is to apply one attention module to each label [21]. The last one is to combine with the backbone CNNs, such as the squeeze-and-excitation layers combined with ResNet in Method 2 [13] and the self-attention combined with DenseNet in Method 5. Attention modules are supposed to be effective for predicting labels with isolated events, including PVC and PAC. From Fig. 3(a) and (b), methods with attention modules often achieve high F1 scores on these two labels.

3.2.2 Incorporating external information

Incorporating more information is effective to overcome the overfitting problem. For example, learning from other data or transfer learning can pretrain networks by external datasets and therefore incorporate information from these datasets; learning from expert knowledge can also improve predictive performances by introducing inductive bias. According to Fig. 5, several top methods are summarized and grouped into either learning from other data or from expert knowledge.
Learning from other data
or transfer learning refers to modeling a neural network on a different but somehow similar problem and therefore partially reuse the network parameters to accelerate training and improving performance. Since Method 2 transforms into an image classification problem, it pretrains the SE-ResNet on the ImageNet dataset [6]. Method 11 pretrains its network on the CPSC2018 dataset, whose labels are different from the CEAC dataset.
Learning from expert knowlege
to extract handcrafted features can assist neural networks to improve performances. Three methods apply handcrafted features in different ways. Method 1 identifies fiducial points of P-QRS-T waves on each of the leads, then inputs this information to a deep neural network for automatic feature extraction [38]. Method 4 first identifies the R peaks and then calculates statistical features such as RR intervals and QRS wave widths. Instead of inputting these features to a neural network, Method 4 combines the handcrafted features with deep features extracted by the backbone ResNet [35]. Method 9 extracts various types of handcrafted features related to LAFB and ER and inputs them to an XGBoost.

3.3 Classifiers

All methods need to predict multi-labels for each ECG record. According to Fig. 1(b), about 15 percent of all records are labeled with more than one abnormality. In this section, it is shown that all top 11 methods apply multi-task learning to make multi-label predictions. Due to the imbalanced data problem, these methods also need to find proper loss functions. Also, almost all methods apply ensemble learning to improve accuracy and postprocessing to make more reasonable predictions due to some known relationships among different labels.
Multi-task learning
treats each label prediction as a separate task and solves all tasks simultaneously. One of its benefits is to exploit commonalities across different tasks, which leads to smaller model sizes and better performances. According to Table 1, all top 11 methods use multi-task learning to design their networks, in which the decision layers are composed of multiple sigmoid functions. This strategy defines each label prediction as a binary classification task [15]. A positive prediction means the record belongs to one label, while a negative prediction means the opposite. In comparison, some of the methods that are not among the top 11 transform multi-label prediction to several binary classification tasks. This strategy does not share model parameters across different tasks. Modeling correlations among different labels may help improve performances. Method 10 outputs the predicted probabilities of each neural network and input them to an ML-KNN [36].
Loss function
plays an important role to deal with imbalanced data according to Table 1. Most methods utilize the weighted binary cross-entropy. It sets a weight coefficient for each label and therefore alleviates overfitting on labels with more data. Both Method 1 and Method 2 apply the focal loss to deal with the class imbalance problem. The standard cross-entropy loss is reshaped such that it down-weights the loss assigned to well-classified samples [18]. In the case of 12-lead resting ECG data, the focal loss results in better performances.
Ensemble learning
reduces variances and increases robustness. Since many methods crop long signals into several segments, the summation of corresponding predictions can be either averaging probabilities or majority voting. Some methods also apply bagging to train models on re-sampled datasets.
Postprocessing
focuses on the correlations among different labels. The idea is to post-process the results and output more reasonable predictions according to some known relationships. For example, normal ECGs do not co-exist with either abnormalities or Others; AF does not co-exist with FDAVB, etc. The postprocessing strategy can be a good choice when the number of labels is modest. It may become too complex to handle when the number gets too large. Therefore, modeling the correlations among labels can be future directions.

4 Conclusion

The building of the largest Chinese 12-lead resting ECG data makes it possible to comprehensively assess different algorithms for CIE. Based on CEAC 2019 [1], we called for a community effort to improve the computerized interpretation of 12-lead resting ECGs. The systematic assessment and analysis of the top-performing deep neural networks establish benchmarks and provide insights for developing new methods. To our knowledge, no previous studies have analyzed a comprehensive set of algorithms based on a common 12-lead resting ECG dataset. We hope these findings might eventually lead to improvements in daily cardiovascular healthcare.

Acknowledgements

ZJ Ding wrote the manuscript. ZJ Ding, GJ Wang and HZ Yang assessed and analyzed all methods. ZJ Ding, GJ Wang, P Zhang, DP Fu, Z Yang and XK Wang contributed most to build the CEAC dataset. ZJ Ding, GJ Wang and X Wang contributed most to organize the CEAC 2019. R Xia, C Zhang, W Cai, B Yuan, D Jia, B Chen, C Huang, J Zhang, Y Li, S Yang, and R He participated the CEAC 2019 and contributed the 11 top-performing methods. All authors participated in polishing the manuscript.
Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://​creativecommons.​org/​licenses/​by/​4.​0/​.

Supplementary information

The online version contains supplementary material available at https://​doi.​org/​10.​1007/​s11517-021-02420-z.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix

Electronic supplementary material

Below is the link to the electronic supplementary material.
Literature
3.
go back to reference Chen M, Wang G, Xie P, Sang Z, Lv T, Zhang P, Yang H (2018) Region aggregation network: Improving convolutional neural network for ecg characteristic detection. In: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, pp 2559–2562 Chen M, Wang G, Xie P, Sang Z, Lv T, Zhang P, Yang H (2018) Region aggregation network: Improving convolutional neural network for ecg characteristic detection. In: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, pp 2559–2562
4.
go back to reference Clifford GD, Liu C, Moody B, Li-wei HL, Silva I, Li Q, Johnson AE, Mark RG (2017) Af classification from a short single lead ecg recording: the physionet/computing in cardiology challenge 2017. In: 2017 Computing in Cardiology (CinC). IEEE, pp 1–4 Clifford GD, Liu C, Moody B, Li-wei HL, Silva I, Li Q, Johnson AE, Mark RG (2017) Af classification from a short single lead ecg recording: the physionet/computing in cardiology challenge 2017. In: 2017 Computing in Cardiology (CinC). IEEE, pp 1–4
5.
go back to reference Datta S, Puri C, Mukherjee A, Banerjee R, Choudhury AD, Singh R, Ukil A, Bandyopadhyay S, Pal A, Khandelwal S (2017) Identifying normal, af and other abnormal ecg rhythms using a cascaded binary classifier. In: 2017 Computing in Cardiology (CinC). IEEE, pp 1–4 Datta S, Puri C, Mukherjee A, Banerjee R, Choudhury AD, Singh R, Ukil A, Bandyopadhyay S, Pal A, Khandelwal S (2017) Identifying normal, af and other abnormal ecg rhythms using a cascaded binary classifier. In: 2017 Computing in Cardiology (CinC). IEEE, pp 1–4
6.
go back to reference Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition, pages 248–255. Ieee Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition, pages 248–255. Ieee
7.
go back to reference Ding Z, Qiu S, Guo Y, Lin J, Sun L, Fu D, Yang Z, Li C, Yu Y, Meng L, et al. (2019) Labelecg: A web-based tool for distributed electrocardiogram annotation. arXiv:1908.06553. Accepted by MLMECH-MICCAI, 2019 Ding Z, Qiu S, Guo Y, Lin J, Sun L, Fu D, Yang Z, Li C, Yu Y, Meng L, et al. (2019) Labelecg: A web-based tool for distributed electrocardiogram annotation. arXiv:1908.​06553. Accepted by MLMECH-MICCAI, 2019
8.
go back to reference Gao S-H, Cheng M-M, Zhao K, Zhang X-Y, Yang M-H, Torr P (2019) Res2net: A new multi-scale backbone architecture. arXiv:1904.01169 Gao S-H, Cheng M-M, Zhao K, Zhang X-Y, Yang M-H, Torr P (2019) Res2net: A new multi-scale backbone architecture. arXiv:1904.​01169
9.
go back to reference Hannun AY, Rajpurkar P, Haghpanahi M, Tison GH, Bourn C, Turakhia MP, Ng AY (2019) Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat Med 25(1):65CrossRef Hannun AY, Rajpurkar P, Haghpanahi M, Tison GH, Bourn C, Turakhia MP, Ng AY (2019) Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat Med 25(1):65CrossRef
10.
go back to reference He H, Garcia EA (2008) Learning from imbalanced data. IEEE Trans Knowl Data Eng (9)1263–1284 He H, Garcia EA (2008) Learning from imbalanced data. IEEE Trans Knowl Data Eng (9)1263–1284
11.
go back to reference He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778 He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
12.
go back to reference Hong S, u M, Zhou Y, Wang Q, Shang J, Li H, Xie J (2017) Encase: An ensemble classifier for ecg classification using expert features and deep neural networks. In: 2017 Computing in Cardiology (CinC), pages 1–4. IEEE Hong S, u M, Zhou Y, Wang Q, Shang J, Li H, Xie J (2017) Encase: An ensemble classifier for ecg classification using expert features and deep neural networks. In: 2017 Computing in Cardiology (CinC), pages 1–4. IEEE
13.
go back to reference Hu J, Li S, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132–7141 Hu J, Li S, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132–7141
14.
go back to reference Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708 Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708
15.
go back to reference Huang Y, Wang W, Wang L, Tan T (2013) Multi-task deep neural network for multi-label learning. In: 2013 IEEE International conference on image processing. IEEE, pp 2897–2900 Huang Y, Wang W, Wang L, Tan T (2013) Multi-task deep neural network for multi-label learning. In: 2013 IEEE International conference on image processing. IEEE, pp 2897–2900
16.
go back to reference Johnson KW, Soto JT, Glicksberg BS, Shameer K, Miotto R, Ali M, Ashley E, Dudley JT (2018) Artificial intelligence in cardiology. J Am Coll Cardiol 71(23):2668–2679CrossRef Johnson KW, Soto JT, Glicksberg BS, Shameer K, Miotto R, Ali M, Ashley E, Dudley JT (2018) Artificial intelligence in cardiology. J Am Coll Cardiol 71(23):2668–2679CrossRef
17.
18.
go back to reference Lin T-Y, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision, pp 2980–2988 Lin T-Y, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision, pp 2980–2988
19.
go back to reference Liu F, Liu C, Zhao L, Zhang X, Wu X, Xu X, Liu Y, Ma C, Wei S, He Z, et al. (2018) An open access database for evaluating the algorithms of electrocardiogram rhythm and morphology abnormality detection. J Med Imaging Health Infor 8(7):1368–1373CrossRef Liu F, Liu C, Zhao L, Zhang X, Wu X, Xu X, Liu Y, Ma C, Wei S, He Z, et al. (2018) An open access database for evaluating the algorithms of electrocardiogram rhythm and morphology abnormality detection. J Med Imaging Health Infor 8(7):1368–1373CrossRef
20.
go back to reference Liu S, Li Y, Zeng X, Wang H, Yin P, Wang L, Liu Y, Liu J, Qi J, Ran S et al (2019) Burden of cardiovascular diseases in China, 1990-2016: Findings from the 2016 global burden of disease study. JAMA cardiology 4(4):342–352CrossRef Liu S, Li Y, Zeng X, Wang H, Yin P, Wang L, Liu Y, Liu J, Qi J, Ran S et al (2019) Burden of cardiovascular diseases in China, 1990-2016: Findings from the 2016 global burden of disease study. JAMA cardiology 4(4):342–352CrossRef
21.
go back to reference Liu Y, He R, Wang K, Li Q, Sun Q, Zhao N, Zhang H (2019) Automatic detection of ecg abnormalities by using an ensemble of deep residual networks with attention. arXiv:1908.10088. Accepted by MLMECH-MICCAI 2019 Liu Y, He R, Wang K, Li Q, Sun Q, Zhao N, Zhang H (2019) Automatic detection of ecg abnormalities by using an ensemble of deep residual networks with attention. arXiv:1908.​10088. Accepted by MLMECH-MICCAI 2019
22.
go back to reference Madias JE (2018) Computerized interpretation of electrocardiograms: Taking stock and implementing new knowledge. Journal of electrocardiology 51(3):413CrossRef Madias JE (2018) Computerized interpretation of electrocardiograms: Taking stock and implementing new knowledge. Journal of electrocardiology 51(3):413CrossRef
23.
go back to reference Martínez JP, Pahlm O, Ringborn M, Warren S, Laguna P, Sörnmo L (2017) The staff iii database: Ecgs recorded during acutely induced myocardial ischemia. In: 2017 Computing in Cardiology (CinC). IEEE, pp 1–4 Martínez JP, Pahlm O, Ringborn M, Warren S, Laguna P, Sörnmo L (2017) The staff iii database: Ecgs recorded during acutely induced myocardial ischemia. In: 2017 Computing in Cardiology (CinC). IEEE, pp 1–4
24.
go back to reference Moody GB, Mark RG (2001) The impact of the mit-bih arrhythmia database. IEEE Eng Med Biol Mag 20(3):45–50CrossRef Moody GB, Mark RG (2001) The impact of the mit-bih arrhythmia database. IEEE Eng Med Biol Mag 20(3):45–50CrossRef
25.
go back to reference Sacco RL, Roth GA, Srinath Reddy K, Arnett DK, Bonita R, Gaziano TA, Heidenreich PA, Huffman MD, Mayosi BM, Mendis S, et al. (2016) The heart of 25 by 25: achieving the goal of reducing global and regional premature deaths from cardiovascular diseases and stroke: a modeling study from the american heart association and world heart federation. Circulation 133(23):e674–e690CrossRef Sacco RL, Roth GA, Srinath Reddy K, Arnett DK, Bonita R, Gaziano TA, Heidenreich PA, Huffman MD, Mayosi BM, Mendis S, et al. (2016) The heart of 25 by 25: achieving the goal of reducing global and regional premature deaths from cardiovascular diseases and stroke: a modeling study from the american heart association and world heart federation. Circulation 133(23):e674–e690CrossRef
26.
go back to reference Schläpfer J, Wellens HJ (2017) Computer-interpreted electrocardiograms: benefits and limitations. J Am Coll Cardiol 70(9):1183–1192CrossRef Schläpfer J, Wellens HJ (2017) Computer-interpreted electrocardiograms: benefits and limitations. J Am Coll Cardiol 70(9):1183–1192CrossRef
27.
go back to reference Steijlen ASM, Jansen KMB, Albayrak A, Verschure DO, Van Wijk DF (2018) A novel 12-lead electrocardiographic system for home use: Development and usability testing. JMIR mHealth and uHealth 6(7):e10126CrossRef Steijlen ASM, Jansen KMB, Albayrak A, Verschure DO, Van Wijk DF (2018) A novel 12-lead electrocardiographic system for home use: Development and usability testing. JMIR mHealth and uHealth 6(7):e10126CrossRef
28.
go back to reference Teijeiro T, García CA, Castro D, Félix P (2017) Arrhythmia classification from the abductive interpretation of short single-lead ecg records. In: 2017 Computing in Cardiology (CinC). IEEE, pp 1–4 Teijeiro T, García CA, Castro D, Félix P (2017) Arrhythmia classification from the abductive interpretation of short single-lead ecg records. In: 2017 Computing in Cardiology (CinC). IEEE, pp 1–4
29.
go back to reference Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008 Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008
30.
go back to reference Wang G, Zhang C, Liu Y, Yang H, Fu D, Wang H, Zhang P (2019) A global and updatable ecg beat classification system based on recurrent neural networks and active learning. Inform Sci 501:523–542CrossRef Wang G, Zhang C, Liu Y, Yang H, Fu D, Wang H, Zhang P (2019) A global and updatable ecg beat classification system based on recurrent neural networks and active learning. Inform Sci 501:523–542CrossRef
31.
go back to reference Willems JL, Arnaud P, Van Bemmel JH, Bourdillon PJ, Degani R, Denis B, Graham I, Harms FMA, Macfarlane PW, Mazzocca G, et al. (1987) A reference data base for multilead electrocardiographic computer measurement programs. J Am Coll Cardiol 10(6):1313–1321CrossRef Willems JL, Arnaud P, Van Bemmel JH, Bourdillon PJ, Degani R, Denis B, Graham I, Harms FMA, Macfarlane PW, Mazzocca G, et al. (1987) A reference data base for multilead electrocardiographic computer measurement programs. J Am Coll Cardiol 10(6):1313–1321CrossRef
32.
go back to reference Xia H, Garcia GA, McBride JC, Sullivan A, Bock TD, Bains J, Wortham DC, Zhao X (2011) Computer algorithms for evaluating the quality of ecgs in real time. In: 2011 Computing in cardiology. IEEE, pp 369–372 Xia H, Garcia GA, McBride JC, Sullivan A, Bock TD, Bains J, Wortham DC, Zhao X (2011) Computer algorithms for evaluating the quality of ecgs in real time. In: 2011 Computing in cardiology. IEEE, pp 369–372
33.
go back to reference Xie P, Wang G, Zhang C, Chen M, Yang H, Lv T, Sang Z, Zhang P (2018) Bidirectional recurrent neural network and convolutional neural network (bircnn) for ecg beat classification. In: 2018 40th annual international conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, pp 2555–2558 Xie P, Wang G, Zhang C, Chen M, Yang H, Lv T, Sang Z, Zhang P (2018) Bidirectional recurrent neural network and convolutional neural network (bircnn) for ecg beat classification. In: 2018 40th annual international conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, pp 2555–2558
34.
go back to reference Yao Q, Wang R, Fan X, Liu J, Li Y (2020) Multi-class arrhythmia detection from 12-lead varied-length ecg using attention-based time-incremental convolutional neural network. Information Fusion 53:174–182CrossRef Yao Q, Wang R, Fan X, Liu J, Li Y (2020) Multi-class arrhythmia detection from 12-lead varied-length ecg using attention-based time-incremental convolutional neural network. Information Fusion 53:174–182CrossRef
35.
go back to reference Yuan B, Xing W (2019) Diagnosing cardiac abnormalities from 12-lead electrocardiograms using enhanced deep convolutional neural networks. arXiv:1908.06802. Accepted by MLMECH-MICCAI 2019 Yuan B, Xing W (2019) Diagnosing cardiac abnormalities from 12-lead electrocardiograms using enhanced deep convolutional neural networks. arXiv:1908.​06802. Accepted by MLMECH-MICCAI 2019
36.
go back to reference Zhang M-L, Zhou Z-H (2007) Ml-knn: A lazy learning approach to multi-label learning. Pattern Recognition 40(7):2038–2048CrossRef Zhang M-L, Zhou Z-H (2007) Ml-knn: A lazy learning approach to multi-label learning. Pattern Recognition 40(7):2038–2048CrossRef
37.
go back to reference Zhang M-L, Zhou Z-H (2013) A review on multi-label learning algorithms. IEEE Trans Knowl Data Eng 26(8):1819–1837CrossRef Zhang M-L, Zhou Z-H (2013) A review on multi-label learning algorithms. IEEE Trans Knowl Data Eng 26(8):1819–1837CrossRef
38.
go back to reference Zhourui X, Zhenhua S, Yutong G, Ji W, Chenguang H, Yanlin C, Sifan Y, Long M (2019) Automatic multi-label classification in 12-lead ecgs using neural networks and characteristic points. Accepted by MLMECH-MICCAI 2019 Zhourui X, Zhenhua S, Yutong G, Ji W, Chenguang H, Yanlin C, Sifan Y, Long M (2019) Automatic multi-label classification in 12-lead ecgs using neural networks and characteristic points. Accepted by MLMECH-MICCAI 2019
Metadata
Title
A community effort to assess and improve computerized interpretation of 12-lead resting electrocardiogram
Authors
Zijian Ding
Guijin Wang
Huazhong Yang
Ping Zhang
Dapeng Fu
Zhen Yang
Xinkang Wang
Xia Wang
Zhourui Xia
Chiming Zhang
Wenjie Cai
Binhang Yuan
Dongya Jia
Bo Chen
Chengbin Huang
Jing Zhang
Yi Li
Shan Yang
Runnan He
Publication date
22-10-2021
Publisher
Springer Berlin Heidelberg
Published in
Medical & Biological Engineering & Computing / Issue 1/2022
Print ISSN: 0140-0118
Electronic ISSN: 1741-0444
DOI
https://doi.org/10.1007/s11517-021-02420-z

Other articles of this Issue 1/2022

Medical & Biological Engineering & Computing 1/2022 Go to the issue

Premium Partner