
Neurocomputing

Volume 149, Part A, 3 February 2015, Pages 316-329

Ensemble of subset online sequential extreme learning machine for class imbalance and concept drift

https://doi.org/10.1016/j.neucom.2014.03.075

Abstract

In this paper, a computationally efficient framework, referred to as ensemble of subset online sequential extreme learning machine (ESOS-ELM), is proposed for class imbalance learning from a concept-drifting data stream. The proposed framework comprises a main ensemble representing short-term memory, an information storage module representing long-term memory and a change detection mechanism to promptly detect concept drifts. In the main ensemble of ESOS-ELM, each OS-ELM network is trained with a balanced subset of the data stream. Using ELM theory, a computationally efficient storage scheme is proposed to leverage the prior knowledge of recurring concepts. A distinctive feature of ESOS-ELM is that it can learn from new samples sequentially in both the chunk-by-chunk and one-by-one modes. ESOS-ELM can also be effectively applied to imbalanced data without concept drift. On most of the datasets used in our experiments, ESOS-ELM performs better than the state-of-the-art methods for both stationary and non-stationary environments.

Introduction

The class imbalance problem has been studied extensively [1], [2], [3], [4], [5] in the past decade or so. Imbalanced datasets are common in real-world applications such as medical diagnosis, fraud detection, spam filtering, bioinformatics, text classification, etc. [1]. Recently, the class imbalance problem in sequential learning has also attracted the attention of researchers from various application domains [6], [7], [8], [9].

Sequential or incremental learners are computationally efficient compared to batch learners [10], [11], [12], [13], since batch learners require retraining with the complete dataset whenever new samples arrive. Sequential learners store the previously learnt information and update themselves only with the newly arrived data samples. However, since the statistical characteristics of training data streams may change over time, class concepts tend to drift in non-stationary environments. Concept drift learning raises the so-called stability-plasticity dilemma: a sequential learning framework should achieve a meaningful balance between retaining previously acquired information (stability) and learning new information (plasticity) [12]. The class imbalance problem further complicates sequential learning from drifting data streams.

Class imbalance and concept drift are two challenging problems which can occur in the same data stream. Recently, the class imbalance problem in drifting data streams has received attention for chunk-by-chunk learning [14], [15], [16], [17], [18], [19]. Learn++.NIE [16] is a state-of-the-art method in this area. The Learn++ family of methods is based on an ensemble learning framework in which a new classifier is trained with each arriving chunk of data and added to the ensemble. Learn++.NIE works better than most of the competing methods in recurring environments since it does not remove old classifiers from the ensemble. For imbalanced datasets, the prior knowledge of recurring concepts can be particularly useful since minority class samples are usually rare. In general, incremental learning methods should meet the single-pass requirement, i.e., once samples are learnt, they are discarded [12]. However, some methods store previously learnt minority class samples, thus violating the single-pass requirement [17], [18], [19]. Gao [17] proposed to collect the minority class samples from all previous chunks, while SERA [18] and REA [19] select the minority class samples from previous chunks which are similar to samples in the current chunk. Moreover, most of the existing methods assume that a full chunk of data is always available for training [14]. If the samples are arriving continuously or one-by-one, updating of the classification model is delayed until a full chunk is completed. Hence, the need for a class imbalance learning (CIL) method for non-stationary environments, which can learn in both chunk-by-chunk and one-by-one modes, is timely.

Extreme learning machine (ELM) [11], [20] is becoming popular in large dataset and online learning applications due to its fast learning speed. ELM provides a single-step least square estimation (LSE) method for training a single hidden layer feedforward network (SLFN), instead of using iterative gradient descent methods such as backpropagation algorithms. Very recently, a weighted online sequential extreme learning machine (WOS-ELM) was proposed for class imbalance learning [9]. WOS-ELM has been shown to effectively tackle the class imbalance problem in both chunk-by-chunk and one-by-one learning. However, WOS-ELM was proposed only for stationary environments and may not be appropriate for concept drift learning. Moreover, OS-ELM [11] related methods, with random hidden node parameters, may not always adapt well to new data [21]. Thus, ensemble methods [21], [22], [23] are generally preferred over single OS-ELM methods [6], [9], [11].
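The single-step LSE training described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the activation function, weight ranges and all variable names are assumptions.

```python
import numpy as np

def elm_train(X, T, n_hidden, seed=None):
    """Train an ELM: random hidden-node parameters, then a one-step
    least-squares solution for the output weights beta."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Input weights and biases are assigned randomly and never tuned
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)          # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T    # Moore-Penrose least-squares solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

Because only `beta` is solved for, training cost is dominated by one pseudoinverse, which is what makes ELM attractive for large datasets.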

In this paper, a computationally efficient framework, referred to as ensemble of subset online sequential extreme learning machine (ESOS-ELM), is proposed for class imbalance learning from a concept-drifting data stream. In ESOS-ELM, a minority class sample is processed by ‘m’ classifiers (‘m’ is the imbalance ratio) while a majority class sample is processed by a single classifier. The majority class samples are processed in a round-robin fashion, i.e., the first majority class sample is processed by the first classifier, the second sample by the second classifier, and so on. In this way, classifiers in the ensemble are trained with balanced subsets of the original imbalanced dataset. Note that the proposed framework tackles class imbalance and concept drift problems in both the one-by-one and chunk-by-chunk modes.
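The routing rule above (every minority sample to all ‘m’ classifiers, each majority sample to one classifier in turn) can be sketched as below. This is an illustrative sketch only; the function and parameter names are not from the paper.

```python
def route_sample(y, minority_label, m, counter):
    """Decide which ensemble members learn the incoming sample.

    y              -- class label of the incoming sample
    minority_label -- label of the minority class
    m              -- (rounded) imbalance ratio = ensemble size
    counter        -- running count of majority samples seen so far

    Returns (classifier indices to update, updated counter).
    """
    if y == minority_label:
        # A minority sample is learnt by all m classifiers
        return list(range(m)), counter
    # A majority sample goes to exactly one classifier, round robin
    return [counter % m], counter + 1
```

Over a stream with imbalance ratio m, each classifier thus sees roughly one majority sample for every minority sample, i.e., a balanced subset.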

Ensemble learning methods are widely used in concept drift learning. Compared to single classifier methods, ensemble methods tend to cope better with the concept drift problem, particularly with gradual drifts [14]. Dynamic weighted majority (DWM) [24] is a state-of-the-art ensemble method for concept drift learning with balanced datasets. In DWM, voting weights are decreased in proportion to the error rate of the classifier. ESOS-ELM also uses dynamic weighted majority voting for concept drift learning. For tackling the class imbalance problem, ESOS-ELM processes incoming samples in such a way that each OS-ELM network is trained with an approximately equal number of minority and majority class samples. Unlike DWM, voting weights are updated in proportion to an appropriate performance measure for CIL. In recurring environments [12], [16], [25], DWM may not be able to leverage prior knowledge since old concepts are usually forgotten. To avoid this problem, we propose a novel information storage mechanism, using ELM theory, which is efficient both in terms of memory and computation. A change detection mechanism is also employed in the learning framework to promptly react to both sudden and gradual drifts.
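A dynamic weighted majority scheme of the kind described above can be sketched as follows. Note this sketch penalizes members that misclassify the latest sample, in the spirit of DWM [24]; the paper's actual update uses a CIL-appropriate performance measure rather than raw error, and all names and constants here are assumptions.

```python
def update_weights(weights, correct, penalty=0.5, floor=0.01):
    """Multiplicatively down-weight ensemble members that erred on
    the latest sample, then renormalize. 'correct' is a list of
    booleans, one per member."""
    w = [wi if ok else max(wi * penalty, floor)
         for wi, ok in zip(weights, correct)]
    total = sum(w)
    return [wi / total for wi in w]

def weighted_vote(preds, weights, labels=(0, 1)):
    """Weighted majority vote over the members' predicted labels."""
    score = {c: 0.0 for c in labels}
    for p, w in zip(preds, weights):
        score[p] += w
    return max(score, key=score.get)
```

Members whose accuracy degrades after a drift quickly lose voting influence, while newly accurate members dominate the vote.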

The new framework achieves better performance than Learn++.NIE(gm) on most of the datasets used in this paper, and does so with fewer hypotheses. The new method also performs better than DWM in recurring environments; this superiority is due to the ELM-Store module, which helps leverage the prior knowledge of old concepts. ESOS-ELM is also applied to benchmark imbalanced datasets without concept drift, where it outperformed WOS-ELM, OTER and SMOTE on all 15 imbalanced datasets used in [9].

This paper is organized as follows: Section 2 discusses the preliminaries. Section 3 presents the details of the ESOS-ELM method. This is followed by experiments for validating the performance of the proposed framework in Section 4. Finally, Section 5 concludes the paper.


ELM and OS-ELM

Extreme learning machine (ELM) [20] is a single-step least-squares solution originally proposed for single hidden layer feedforward networks and later extended to non-neuron-like networks. The input weights and biases connecting the input layer to the hidden layer (hidden node parameters) are assigned randomly, and the weights connecting the hidden layer to the output layer (output weights) are determined analytically. Compared to the traditional iterative gradient descent methods
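OS-ELM [11] extends this batch solution to sequential data via a recursive least-squares update of the output weights. A minimal sketch of one such update step is below, assuming the standard RLS form of OS-ELM; the variable names are illustrative.

```python
import numpy as np

def oselm_update(P, beta, H, T):
    """One OS-ELM recursive least-squares step.

    P    -- running inverse of the accumulated H^T H (L x L)
    beta -- current output weights (L x n_outputs)
    H    -- hidden-layer outputs for the new chunk (or a single row)
    T    -- targets for the new chunk

    Returns the updated (P, beta); works for both one-by-one and
    chunk-by-chunk arrival, since H may have any number of rows.
    """
    K = np.eye(H.shape[0]) + H @ P @ H.T
    P = P - P @ H.T @ np.linalg.solve(K, H @ P)
    beta = beta + P @ H.T @ (T - H @ beta)
    return P, beta
```

Because only the small L x L matrix P is carried between updates, old samples can be discarded after they are learnt, satisfying the single-pass requirement.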

Method

In this section, an ensemble of subset online sequential extreme learning machine (ESOS-ELM) is proposed for class imbalance learning from a drifting data stream. As shown in Fig. 1, the proposed ESOS-ELM method consists of three blocks: the main ensemble block, the ELM-Store block and the change detector block. These blocks are discussed in detail as follows.

Experiments

The performance of ESOS-ELM is first evaluated for class imbalance learning with concept drift in Section 4.1. It is then evaluated for class imbalance learning without concept drift in Section 4.2.

Conclusions

In this paper, we have proposed an ensemble of subset online sequential extreme learning machine (ESOS-ELM) for class imbalance learning from drifting data streams. ESOS-ELM consists of a main ensemble for classification in the current imbalanced environment, an ELM-Store module for storing information of old concepts and a change detector for promptly detecting concept drifts. In ESOS-ELM, the main ensemble is trained with balanced subsets of the data stream. Similar to WOS-ELM, the new method

Acknowledgments

The authors would like to thank the anonymous reviewers whose insightful and helpful comments greatly improved this paper.

Bilal Mirza received the M.Sc. (signal processing) degree from Nanyang Technological University (NTU) in 2010. He is currently working towards the Ph.D. degree from NTU. His research interests include machine learning and its application in bio-signal processing, class imbalance and online sequential learning.

References (36)

  • H. Nguyen, E. Cooper, K. Kamei, Online learning from imbalanced data streams, in: Proceedings of the International...
  • B. Mirza et al., Weighted online sequential extreme learning machine for class imbalance learning, Neural Process. Lett. (2013)
  • S. Ozawa et al., Incremental learning of chunk data for online pattern classification, IEEE Trans. Neural Netw. (2008)
  • N.Y. Liang et al., A fast and accurate online sequential learning algorithm for feedforward networks, IEEE Trans. Neural Netw. (2006)
  • R. Elwell et al., Incremental learning of concept drift in nonstationary environments, IEEE Trans. Neural Netw. (2011)
  • A. Shilton et al., Incremental training of support vector machines, IEEE Trans. Neural Netw. (2005)
  • T.R. Hoens et al., Learning from streaming data with concept drift and imbalance: an overview, Prog. Artif. Intell. (2012)
  • G. Ditzler, R. Polikar, N.V. Chawla, An incremental learning algorithm for non-stationary environments and class...


    Zhiping Lin received the B.Eng. degree in control engineering from South China Institute of Technology, Canton, China in 1982 and the Ph.D. degree in information engineering from the University of Cambridge, England in 1987. He was with the University of Calgary, Canada for 1987–1988, with Shantou University, China for 1988–1993, and with DSO National Laboratories, Singapore for 1993–1999. Since February, 1999, he has been an Associate Professor at Nanyang Technological University (NTU), Singapore. He is also the Program Director of Bio-Signal Processing, Valens Centre of Excellence, NTU. Dr. Lin is currently serving as the Editor-in-Chief of Multidimensional Systems and Signal Processing after serving as an editorial board member for 1993–2004 and a CO-Editor for 2005–2010. He was an Associate Editor of Circuits, Systems and Signal Processing for 2000–2007 and an Associate Editor of IEEE Transactions on Circuits and Systems, Part II, for 2010–2011. He also serves as a reviewer for Mathematical Reviews. He is General Chair of the 9th International Conference on Information, Communications and Signal Processing (ICICS), 2013. His research interests include multidimensional systems and signal processing, statistical and biomedical signal processing, and more recently machine learning. He is the co-author of the 2007 Young Author Best Paper Award from the IEEE Signal Processing Society, Distinguished Lecturer of the IEEE Circuits and Systems Society for 2007–2008, and the Chair of the IEEE Circuits and Systems Singapore Chapter for 2007–2008.

    Nan Liu received the B.Eng. degree in electrical engineering from University of Science and Technology Beijing, China, and the Ph.D. degree in electrical engineering from Nanyang Technological University, Singapore. He is currently a Senior Research Scientist at the Department of Emergency Medicine, Singapore General Hospital. His research interests include pattern recognition, machine learning, and biomedical signal processing.
