A non-canonical hybrid metaheuristic approach to adaptive data stream classification

doi:10.1016/j.future.2019.07.067

Future Generation Computer Systems

Volume 102, January 2020, Pages 127-139

https://doi.org/10.1016/j.future.2019.07.067 Get rights and content

Highlights

•
A three-layered ensemble technique for non-stationary data stream classification task is proposed.
•
The framework is based on an evolutionary algorithm (RD) and a bio-inspired optimisation technique (PSO).
•
Five different variations of the proposed framework is provided.
•
The proposed framework is compared to other state-of-the-art methods using standard criteria.

Abstract

Data stream classification techniques have been playing an important role in big data analytics recently due to their diverse applications (e.g. fraud and intrusion detection, forecasting and healthcare monitoring systems) and the growing number of real-world data stream generators (e.g. IoT devices and sensors, websites and social network feeds). Streaming data is often prone to evolution over time. In this context, the main challenge for computational models is to adapt to changes, known as concept drifts, using data mining and optimisation techniques. We present a novel ensemble technique called RED-PSO that seamlessly adapts to different concept drifts in non-stationary data stream classification tasks. RED-PSO is based on a three-layer architecture to produce classification types of different size, each created by randomly selecting a certain percentage of features from a pool of features of the target data stream. An evolutionary algorithm, namely, Replicator Dynamics (RD), is used to seamlessly adapt to different concept drifts; it allows good performing types to grow and poor performing ones to shrink in size. In addition, the selected feature combinations in all classification types are optimised using a non-canonical version of the Particle Swarm Optimisation (PSO) technique for each layer individually. PSO allows the types in each layer to go towards local (within the same type) and global (in all types) optimums with a specified velocity. A set of experiments are conducted to compare the performance of the proposed method to state-of-the-art algorithms using real-world and synthetic data streams in immediate and delayed prequential evaluation settings. The results show a favourable performance of our method in different environments.

Introduction

As the digital world advances, the number of data streams produced by various sources such as IoT devices and sensors and social media networks grows rapidly. Such streams are usually characterised by high velocity and changes in data distributions over time. Therefore, a significant number of recent research are concentrated on data stream classification challenges specially in non-stationary data streams [1]. The main challenge in this context is adaptation to different concept drifts; that is, when the data distribution evolve over time in unforeseen ways.

Various forms of concept drifts can be categorised into four generic groups of sudden (abrupt), incremental, gradual and recurrent. In sudden concept drifts, the distribution of data is suddenly replaced with a new distribution. An incremental concept drift is a drift to the distribution of data when it goes through various, unstable distributions before being stable at a specific data distribution. In the gradual concept drifts, the ratio of a new distribution of incoming data raises, and the ratio of the data from the former distribution drops over time. In a recurrent concept drift, the same old distribution of data reappears after passing some time with different distribution/s.

Ensemble learning technique is a machine learning approach, where multiple classifiers are created and combined using a voting mechanism to establish a single class as the output of a data instance. Ensemble techniques have demonstrated superiority over other classification techniques for stream classification tasks in non-stationary environments [2], [3]. This is due to their flexibility in training and updating classifiers. Furthermore, ensemble techniques offer a solution to keep the effects of old and new instances using multiple classifiers since each instance in a data stream is processed only once on the arrival in most cases.

An ideal approach to non-stationary data stream classification should satisfy the following objective with a multifold features: having the least possible misclassification rate while minimising the computational complexity and quickly adapting to possible concept drifts. However, there is a lack of comprehensive approaches offering these features in a single framework; the majority of the existing ensemble methods are focused on either one or two of them, or a specific type of data streams.

The aim of this study is to propose a novel method using a modified bio-inspired algorithm, namely, Particle Swarm Optimisation (PSO), to comply with the aforementioned characteristics of an ideal approach to cope with evolving data streams in classification challenges. The proposed technique comprises a three-layer architecture. Each layer is initially assigned some predefined classification types that randomly created from a pool of features of the target data stream. We used an evolutionary algorithm called Replicator Dynamics (RD) to seamlessly cope with smooth (i.e. gradual or incremental) concept drifts; it allows the classification types with good performance to grow and those with poor performance to shrink in size. The combination of features in all types is then optimised using a modified version of PSO for each layer individually. This helps the method to cope with more sudden (i.e. recurring or abrupt) concept drifts. PSO allows the types in each layer to go towards local (within the same type) and global (in all types) optimums with a specified velocity.

The proposed method in this paper is evaluated over 5 real-world data streams and 4 synthetic (artificial) data stream generators. Different types of concept drifts are added to synthetic datasets in order to examine how the proposed method adapts to different concept drifts compared to state-of-the-art methods. As the process of labelling instances in the real-world datasets is different depending on the application, real-world data streams can be categorised into two different labelling mechanisms of complete labelling and partial labelling. Complete labelling is done where the true labels of respective instances are completely accessible either instantly or after a delay with no/small extra overhead to the system. This is the case in most of the forecasting tasks such as weather forecasting, stock market analysis, forest monitoring, airline predictions, and bill estimation. Partial labelling is done where the real labels are accessible with an extra overhead to the system via a third party (usually human). This is the case where the real labels are retrieved after an analysis of related data in the stream (e.g. medical diagnosis in health data, anomaly/fraud detection in credit card transactions, etc.). The main goal of the proposed framework is to deal with complete labelling data streams. Hence in the experiments part of this paper, it is assumed that the true labels of instances are completely accessible either instantly or after a specified delay. In this regards, each data-set in the experiments are processed twice; once in the delayed setting (where the actual labels of instances are accessible after a specified time) as well as once in the immediate setting (where the actual labels are accessible instantly).

The rest of this paper is organised as follows: Section 2 overviews related research; Section 3 details our proposed method; Section 4 outlines the experimental setup and results, and compares the presented RED-PSO method to other state-of-the-art methods; Section 5 overviews conclusion and outlines the possible future work.

Section snippets

Related work

A majority of the existing data stream classification algorithms for evolving environments use ensemble learning techniques [1], [2], [3], [4] due to their flexibility in updating the classification model (i.e. adding, removing and retraining their constituent classifiers), and consequently the fact that such methods are more trustworthy than single-classifier methods, especially in non-stationary environments, thanks to their experimentally validated higher accuracy.

In general, many of the

RED-PSO framework

In this section, we present RED-PSO, a novel ensemble learning framework for non-stationary classification using the RD and PSO techniques. We train an ensemble of classifiers comprising three categories (layers) of different classification types generated by randomly selecting features (subspaces) from the set of attributes of the target dataset. For each layer of the proposed ensemble, the number of classification types and features in each one of them is chosen so that the higher the number

Experimental study

A collection of experiments is conducted using five real-world data streams and four synthetic data stream generators to evaluate the proposed RED-PSO framework and compare it with the existing state-of-the-art methods that have shown good performance and liable results [13], [20], including Dynamic Weighted Majority (DWM) [9], Online Accuracy Updated Ensemble (OAUE)[13], OSBoost [8], Leveraging Bag (LevBag) [19], Adaptive Random Forest (ARF) [20], Learn++.NSE [27] and Adaptable Diversity-based

Conclusion and future work

We proposed a novel ensemble learning framework called RED-PSO to seamlessly adapt to different concept drifts in non-stationary data stream classification tasks. RED-PSO framework is based on a three-layer architecture to produce classification types of different size that are created by randomly selecting features from a pool of features of the target data stream. An evolutionary algorithm, Replicator Dynamics (RD), is used to seamlessly adapt to different concept drifts. Furthermore, a

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Hossein Ghomeshi is a Ph.D. student and a Visiting Lecturer in the Cloud Computing department at Birmingham City University. His research interests are in the area of data analytics, big data and machine learning. Hossein holds a M.Sc. degree in computer engineering from the University of Tehran, Iran, where his thesis looked into developing a framework for tag-based social image search based on textual and visual features. He also holds a B.Sc. computer engineering degree from Shahid Chamran

References (38)

KrawczykB. et al.
Ensemble learning for data stream analysis: A survey
Inf. Fusion
(2017)
FreundY. et al.
A decision-theoretic generalization of on-line learning and an application to boosting
J. Comput. Syst. Sci.
(1997)
BrzezinskiD. et al.
Combining block-based and online methods in learning ensembles from concept drifting data streams
Inform. Sci.
(2014)
BlackardJ.A. et al.
Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover types from cartographic variables
Comput. Electron. Agric.
(1999)
GamaJ. et al.
A survey on concept drift adaptation
ACM Comput. Surv.
(2014)
GomesH.M. et al.
A survey on ensemble learning for data stream classification
ACM Comput. Surv.
(2017)
ChuF. et al.
Fast and light boosting for adaptive mining of data streams
BreimanL.
Bagging predictors
Mach. Learn.
(1996)
OzaN.C.
Online bagging and boosting
ChenS.-T. et al.
An online boosting algorithm with theoretical justifications
(2012)

KolterJ.Z. et al.

Dynamic weighted majority: An ensemble method for drifting concepts

J. Mach. Learn. Res.

(2007)

BrzezinskiD. et al.

Reacting to different types of concept drift: The accuracy updated ensemble algorithm

IEEE Trans. Neural Netw. Learn. Syst.

(2014)

WangH. et al.

Mining concept-drifting data streams using ensemble classifiers

DomingosP. et al.

Mining high-speed data streams

JaberG.

An approach for online learning in the presence of concept change

(2013)

de Carvalho SantosS.G.T. et al.

Speeding up recovery from concept drifts

BifetA. et al.

Learning from time-changing data with adaptive windowing

ChuF. et al.

Fast and light boosting for adaptive mining of data streams

BifetA. et al.

New ensemble methods for evolving data streams

Cited by (20)

A novel ensemble framework driven by diversity and cooperativity for non-stationary data stream classification
2023, Data and Knowledge Engineering
Data stream classification is of great significance to numerous real-world scenarios. Nevertheless, the prevalent data stream classification techniques are influenced by concept drift and demonstrate unreliability in non-stationary environments. Ensemble models are typically successful when they increase diversity among their members. Several ensembles that enhance diversity have been proposed in literatures. Regrettably, there is no established method to verify that cooperativity indeed improves performance. In response to this knowledge gap, we have developed an innovative ensemble learning framework driven by diversity and cooperativity, termed EDDC, to address the issue. EDDC first dynamically maintains multiple groups of classifiers, with primary classifier in each group chosen to enhance diversity. Next, cooperativity is employed to update groups and replace outdated members. Finally, when environment changes, EDDC adaptively selects either diversity or cooperativity as the strategy for predicting labeling of new instances, while also establishing an excellent performance guarantee. Through simulation experiments, we assessed the performance of EDDC and the benefits of cooperativity for enhancing prediction. The results demonstrated that EDDC is efficient and robust in most scenarios, particularly when dealing with gradual drift. Furthermore, EDDC maintains a competitive edge in terms of classification accuracy and other metrics.
Data Stream Classification Based on Extreme Learning Machine: A Review
2022, Big Data Research
Many daily applications are generating massive amount of data in the form of stream at an ever higher speed, such as medical data, clicking stream, internet record and banking transaction, etc. In contrast to the traditional static data, data streams are of some inherent properties, to name a few, infinite length, concept drift, multiple labels and concept evolution. Among all the data mining tasks, classification is one of the basic topics in data stream mining and has gained more and more attentions among different research communities. Extreme Learning Machine (ELM) has drawn much interests in data classification due to its high efficiency, universal approximation capability, generalization ability, and simplicity, which have greatly inspired the development of many ELM-based algorithms and their applications during the past decades. In this paper, we mainly provide a comprehensive review on ELM theoretical research and its variants in data stream classification, and categorize these algorithms from different perspectives. Firstly, we briefly introduce the basic principles of ELM and its characteristics. Secondly, we give an overview of different ELM variants to address the particular issues of data stream classification. Thirdly, we present an overview of different strategies to optimize the ELM, which have further improved the stability, accuracy and generalization ability of ELM, and briefly introduce some practical applications of ELM in data stream classification. Finally, we conduct several groups of experiments to compare the performance of ELM based models addressing the focused issues. Also, the open issues and prospects of ELM models used for stream classification are discussed, which are worthwhile to be further studied in the future.
Semi-supervised classification on data streams with recurring concept drift and concept evolution
2021, Knowledge-Based Systems
Citation Excerpt :
In the real world, many applications such as online shopping, stock market and medical health generate a huge amount of data in the form of stream from a variety of sources at an ever higher speed, called data streams. Although these streams are integral part of our daily lives and have been extensively studied in literatures [1,2], it is still challenging for us to deal with them in an effective and efficient method. This is because the streaming data pose a lot of inherent properties, including continuous orderly arrival, infinite length, concept drift and evolution, limited labeled data, and so on.
Mining non-stationary stream is a challenging task due to its unique property of infinite length and dynamic characteristics let alone the issues of concept drift, concept evolution and limited labeled data. Although more attention has been attracted on the issues of concept drift and evolution in data streams, however, most of existing methods are supervised in nature, which probably result in a worse classification performance and lower efficiency in the case with scarcity of labeled data. Thus, in this paper, we proposed a semi-supervised framework with recurring concept drift and novel class detection called ESCR, which aims to detect recurring concept drift and concept evolution in data streams with partially labeled data. It is firstly built on an ensemble model consisted of several clustering-based classifiers. In terms of this framework, we adopt Jensen–Shannon divergence based change detection technique on classifier confidence score instead of classification error rate to detect recurring concept drifts. Meanwhile, we take concept evolution into consideration by monitoring the outliers with strong cohesion. Moreover, we further improve the execution efficiency of our framework by exploiting the recursive function and dynamic programming. Finally, extensive experiments conducted on both benchmark and synthetic data sets demonstrate the effectiveness and efficiency of our proposed semi-supervised framework in the handling of data streams with recurring concept drifts and concept evolution, as compared to several well-known semi-supervised data stream classification methods.
Data stream classification using a deep transfer learning method based on extreme learning machine and recurrent neural network
2024, Multimedia Tools and Applications
Artificial Intelligence in Prognosis Seasonal Forecast: Network Application Artificial Neural
2023, SSRN
Concept Drift Detection of Industrial Data Flow Based on Multivariate Region Set Partition
2023, Tien Tzu Hsueh Pao/Acta Electronica Sinica

View all citing articles on Scopus

Mohamed Gaber is a Professor in Data Analytics at the School of Computing and Digital Technology, Birmingham City University. Mohamed received his Ph.D. from Monash University, Australia. He has published over 200 papers, co-authored 3 monograph-style books, and edited/co-edited 6 books on data mining and knowledge discovery. His work has attracted well over four thousand citations, with an h-index of 32. Mohamed has served in the program committees of major conferences related to data mining, including ICDM, PAKDD, ECML/PKDD and ICML. He has also co-chaired numerous scientific events on various data mining topics. Professor Gaber is recognised as a Fellow of the British Higher Education Academy (HEA). He is also a member of the International Panel of Expert Advisers for the Australasian Data Mining Conferences. In 2007, he was awarded the CSIRO teamwork award.

Yevgeniya Kovalchuk is a Senior Lecturer in Computer Science at the School of Computing and Digital Technology and its Cyber Security Research Group, Birmingham City University. She received an M.Sc. in Economic Cybernetics from the National Technical University of Ukraine and Ph.D. in Computer Science from the University of Essex. During her professional career, she worked in both pure industrial and academic environments, as well as on projects connecting the two. She has a strong track record of applying computer and data science to solve problems across a wide range of business areas, including banking, insurance, logistics, healthcare, sport and entertainment, among many others.

View full text

A non-canonical hybrid metaheuristic approach to adaptive data stream classification

Highlights

Abstract

Introduction

Section snippets

Related work

RED-PSO framework

Experimental study

Conclusion and future work

Declaration of Competing Interest

Inf. Fusion

J. Comput. Syst. Sci.

Inform. Sci.

Comput. Electron. Agric.

A survey on concept drift adaptation

ACM Comput. Surv.

A survey on ensemble learning for data stream classification

ACM Comput. Surv.

Fast and light boosting for adaptive mining of data streams

Bagging predictors

Mach. Learn.

Online bagging and boosting

An online boosting algorithm with theoretical justifications

Dynamic weighted majority: An ensemble method for drifting concepts

J. Mach. Learn. Res.

Reacting to different types of concept drift: The accuracy updated ensemble algorithm

IEEE Trans. Neural Netw. Learn. Syst.

Mining concept-drifting data streams using ensemble classifiers

Mining high-speed data streams

An approach for online learning in the presence of concept change

Speeding up recovery from concept drifts

Learning from time-changing data with adaptive windowing

Fast and light boosting for adaptive mining of data streams

New ensemble methods for evolving data streams