Ensemble of subset online sequential extreme learning machine for class imbalance and concept drift
Introduction
The class imbalance problem has been studied extensively [1], [2], [3], [4], [5] in the past decade or so. Imbalanced datasets are common in real-world applications such as medical diagnosis, fraud detection, spam filtering, bioinformatics and text classification [1]. Recently, the class imbalance problem in sequential learning has also attracted the attention of researchers from various application domains [6], [7], [8], [9].
Sequential or incremental learners are computationally efficient compared with batch learners [10], [11], [12], [13], since batch learners require retraining with the complete dataset whenever new samples arrive. Sequential learners store the previously learnt information and update themselves only with newly arrived data samples. However, since the statistical characteristics of training data streams may change over time, class concepts tend to drift in non-stationary environments. Concept drift learning raises the so-called stability-plasticity dilemma: a sequential learning framework should achieve a meaningful balance between retaining previously acquired information (stability) and learning new information (plasticity) [12]. The class imbalance problem further complicates sequential learning from drifting data streams.
Class imbalance and concept drift are two challenging problems that can occur in the same data stream. Recently, the class imbalance problem in drifting data streams has received attention for chunk-by-chunk learning [14], [15], [16], [17], [18], [19]. Learn++.NIE [16] is a state-of-the-art method in this area. The Learn++ family of methods is based on an ensemble learning framework in which a new classifier is trained with each arriving chunk of data and added to the ensemble. Learn++.NIE works better than most of the competing methods in recurring environments since it does not remove old classifiers from the ensemble. For imbalanced datasets, prior knowledge of recurring concepts can be particularly useful since minority class samples are usually rare. In general, incremental learning methods should meet the single-pass requirement, i.e., once samples are learnt, they are discarded [12]. However, some methods store previously learnt minority class samples, thus violating the single-pass requirement [17], [18], [19]. Gao [17] proposed to collect the minority class samples from all previous chunks, while SERA [18] and REA [19] select those minority class samples from previous chunks that are similar to samples in the current chunk. Moreover, most of the existing methods assume that a full chunk of data is always available for training [14]. If samples arrive continuously or one-by-one, updating of the classification model is delayed until a full chunk is completed. Hence, there is a timely need for a class imbalance learning (CIL) method for non-stationary environments that can learn in both chunk-by-chunk and one-by-one modes.
Extreme learning machine (ELM) [11], [20] is becoming popular in large-dataset and online learning applications due to its fast learning speed. ELM provides a single-step least square estimation (LSE) method for training a single hidden layer feedforward network (SLFN) instead of using iterative gradient descent methods, such as back-propagation algorithms. Very recently, a weighted online sequential extreme learning machine (WOS-ELM) was proposed for class imbalance learning [9]. WOS-ELM has been shown to effectively tackle the class imbalance problem in both chunk-by-chunk and one-by-one learning. However, WOS-ELM was proposed only for stationary environments and may not be appropriate for concept drift learning. Moreover, OS-ELM [11] related methods, with random hidden node parameters, may not always adapt well to new data [21]. Thus, ensemble methods [21], [22], [23] are generally preferred over single OS-ELM methods [6], [9], [11].
In this paper, a computationally efficient framework, referred to as ensemble of subset online sequential extreme learning machine (ESOS-ELM), is proposed for class imbalance learning from a concept-drifting data stream. In ESOS-ELM, a minority class sample is processed by ‘m’ classifiers (‘m’ is the imbalance ratio) while a majority class sample is processed by a single classifier. The majority class samples are processed in a round robin fashion, i.e., the first majority class sample is processed by the first classifier, the second sample by the second classifier and so on. In this way, classifiers in the ensemble are trained with balanced subsets from the original imbalanced dataset. Note that the proposed framework tackles class imbalance and concept drift problems in both the one-by-one and chunk-by-chunk modes.
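This routing rule can be sketched in a few lines of Python. The code below is a minimal illustration under our own naming (the `SubsetDispatcher` class is hypothetical, not from the paper), assuming binary labels with the minority class labeled 1:

```python
class SubsetDispatcher:
    """Routes each incoming sample to ensemble members so that every
    classifier ends up with an approximately balanced training subset."""

    def __init__(self, n_classifiers):
        self.m = n_classifiers   # 'm' classifiers, where m = imbalance ratio
        self.next_majority = 0   # round-robin pointer for majority samples

    def route(self, label, minority_label=1):
        """Return the indices of the classifiers that learn this sample."""
        if label == minority_label:
            # A minority class sample is processed by all 'm' classifiers.
            return list(range(self.m))
        # A majority class sample is processed by one classifier, in turn.
        idx = self.next_majority
        self.next_majority = (self.next_majority + 1) % self.m
        return [idx]
```

With m = 3, three consecutive majority samples go to classifiers 0, 1 and 2 in turn, while every minority sample is replicated across all three, so each ensemble member sees roughly one minority sample per majority sample.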
Ensemble learning methods are widely used in concept drift learning. Compared to single classifier methods, ensemble methods tend to better cope with concept drift problem, particularly with gradual drifts [14]. Dynamic weighted majority (DWM) [24] is a state-of-the-art ensemble method for concept drift learning with balanced datasets. In DWM, voting weights are decreased proportional to the error rate of the classifier. ESOS-ELM also uses dynamic weighted majority voting for concept drift learning. For tackling the class imbalance problem, ESOS-ELM processes incoming samples in a way that each OS-ELM network is trained with approximately equal number of minority and majority class samples. Unlike DWM, voting weights are updated proportional to an appropriate performance measure for CIL. In recurring environments [12], [16], [25], DWM may not be able to leverage the prior knowledge since old concepts are usually forgotten. To avoid this problem, we propose a novel information storage mechanism, using ELM theory, which is efficient both in terms of memory and computation. A change detection mechanism is also employed in the learning framework to promptly react to both the sudden and gradual drifts.
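The weighting idea can be illustrated with a small sketch. The paper's snippet does not give code, so the helpers below are our own hedged illustration, taking the G-mean (geometric mean of class-wise accuracies) as the CIL performance measure that drives the voting weights:

```python
import math

def gmean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity, a common CIL measure."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

def weighted_vote(predictions, weights):
    """Weighted majority vote over binary predictions in {0, 1}."""
    score = sum(w * p for w, p in zip(weights, predictions)) / sum(weights)
    return int(score >= 0.5)
```

For example, a classifier with confusion counts tp=9, fn=1, tn=8, fp=2 gets a G-mean of sqrt(0.9 x 0.8), about 0.85; under a CIL-aware scheme its voting weight tracks that value rather than one minus its overall error rate, which can look deceptively good on imbalanced data.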
The new framework achieves better performance than Learn++.NIE(gm) on most of the datasets used in this paper, and does so with fewer hypotheses. The new method also achieves better performance than DWM in recurring environments; this superiority is due to the ELM-Store module, which helps leverage prior knowledge of old concepts. ESOS-ELM is also applied to benchmark imbalanced datasets without concept drift, where it outperformed WOS-ELM, OTER and SMOTE on all 15 imbalanced datasets used in [9].
This paper is organized as follows: Section 2 discusses the preliminaries. Section 3 presents the details of the ESOS-ELM method. This is followed by experiments for validating the performance of the proposed framework in Section 4. Finally, Section 5 concludes the paper.
Section snippets
ELM and OS-ELM
Extreme learning machine (ELM) [20] is a single-step least square error estimate solution originally proposed for single hidden layer feedforward networks and later extended to non-neuron-like networks. The input weights and biases connecting the input layer to the hidden layer (hidden node parameters) are assigned randomly, and the weights connecting the hidden layer and the output layer (output weights) are determined analytically. Compared to the traditional iterative gradient descent methods
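The single-step training described above can be sketched in a few lines. This is a minimal NumPy illustration of a generic ELM, not the authors' implementation (the tanh activation, the seeding and the function names are our assumptions):

```python
import numpy as np

def elm_train(X, T, n_hidden, seed=0):
    """Fit an SLFN the ELM way: random hidden-node parameters,
    then one least-squares solve for the output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                     # output weights via LSE
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forward pass: hidden activations times the learned output weights."""
    return np.tanh(X @ W + b) @ beta
```

Because only beta is solved for, and analytically, training reduces to a single pseudo-inverse computation; OS-ELM extends this by updating beta recursively as new samples arrive instead of recomputing the solution from scratch.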
Method
In this section, an ensemble of subset online sequential extreme learning machine (ESOS-ELM) is proposed for class imbalance learning from drifting data streams. As shown in Fig. 1, the proposed ESOS-ELM method consists of three blocks: the main ensemble block, the ELM-Store block and the change detector block. These blocks are discussed in detail as follows.
Experiments
The performance of ESOS-ELM is first evaluated for class imbalance learning with concept drift in Section 4.1. It is then evaluated for class imbalance learning without concept drift in Section 4.2.
Conclusions
In this paper, we have proposed an ensemble of subset online sequential extreme learning machine (ESOS-ELM) for class imbalance learning from drifting data streams. ESOS-ELM consists of a main ensemble for classification in the current imbalanced environment, an ELM-Store module for storing information of old concepts and a change detector for promptly detecting concept drifts. In ESOS-ELM, the main ensemble is trained with balanced subsets of the data stream. Similar to WOS-ELM, the new method
Acknowledgments
The authors would like to thank the anonymous reviewers whose insightful and helpful comments greatly improved this paper.
Bilal Mirza received the M.Sc. (signal processing) degree from Nanyang Technological University (NTU) in 2010. He is currently working towards the Ph.D. degree from NTU. His research interests include machine learning and its application in bio-signal processing, class imbalance and online sequential learning.
References (36)
- et al., Weighted extreme learning machine for imbalance learning, Neurocomputing (2013)
- et al., An online learning network for biometric scores fusion, Neurocomputing (2013)
- et al., Extreme learning machine: theory and applications, Neurocomputing (2006)
- et al., Ensemble of online sequential extreme learning machine, Neurocomputing (2009)
- et al., Voting based extreme learning machine, Inf. Sci. (2012)
- et al., Learning from imbalanced data, IEEE Trans. Knowl. Data Eng. (2009)
- M. Kubat, S. Matwin, Addressing the curse of imbalanced training sets: one-sided selection, in: Proceedings of the...
- et al., SMOTE: synthetic minority over-sampling technique, J. Artif. Intell. Res. (2002)
- et al., Cost-sensitive learning and the class imbalance problem, Encycl. Mach. Learn. (2008)
- et al., Mining data streams with skewed distribution by static classifier ensemble, Stud. Comput. Intell. (2009)
- Weighted online sequential extreme learning machine for class imbalance learning, Neural Process. Lett.
- Incremental learning of chunk data for online pattern classification, IEEE Trans. Neural Netw.
- A fast and accurate online sequential learning algorithm for feedforward networks, IEEE Trans. Neural Netw.
- Incremental learning of concept drift in nonstationary environments, IEEE Trans. Neural Netw.
- Incremental training of support vector machines, IEEE Trans. Neural Netw.
- Learning from streaming data with concept drift and imbalance: an overview, Prog. Artif. Intell.
Zhiping Lin received the B.Eng. degree in control engineering from South China Institute of Technology, Canton, China in 1982 and the Ph.D. degree in information engineering from the University of Cambridge, England in 1987. He was with the University of Calgary, Canada for 1987–1988, with Shantou University, China for 1988–1993, and with DSO National Laboratories, Singapore for 1993–1999. Since February 1999, he has been an Associate Professor at Nanyang Technological University (NTU), Singapore. He is also the Program Director of Bio-Signal Processing, Valens Centre of Excellence, NTU. Dr. Lin is currently serving as the Editor-in-Chief of Multidimensional Systems and Signal Processing after serving as an editorial board member for 1993–2004 and a Co-Editor for 2005–2010. He was an Associate Editor of Circuits, Systems and Signal Processing for 2000–2007 and an Associate Editor of IEEE Transactions on Circuits and Systems, Part II, for 2010–2011. He also serves as a reviewer for Mathematical Reviews. He is General Chair of the 9th International Conference on Information, Communications and Signal Processing (ICICS), 2013. His research interests include multidimensional systems and signal processing, statistical and biomedical signal processing, and more recently machine learning. He is a co-recipient of the 2007 Young Author Best Paper Award from the IEEE Signal Processing Society, was a Distinguished Lecturer of the IEEE Circuits and Systems Society for 2007–2008, and was the Chair of the IEEE Circuits and Systems Singapore Chapter for 2007–2008.
Nan Liu received the B.Eng. degree in electrical engineering from University of Science and Technology Beijing, China, and the Ph.D. degree in electrical engineering from Nanyang Technological University, Singapore. He is currently a Senior Research Scientist at the Department of Emergency Medicine, Singapore General Hospital. His research interests include pattern recognition, machine learning, and biomedical signal processing.