01.12.2015 | Original Article
An online ensembles approach for handling concept drift in data streams: diversified online ensembles detection
Published in: International Journal of Machine Learning and Cybernetics | Issue 6/2015
Abstract
Data streams are continuous sequences of data instances that arrive at very high speed and whose underlying concept distribution changes over time. We present a novel online ensemble approach, Diversified Online Ensembles Detection (DOED), for handling these drifting concepts in data streams. Our approach maintains two ensembles of weighted experts, one with low diversity and one with high diversity, which are updated according to their accuracy in classifying new data instances. Each ensemble detects drift by comparing two accuracies: its accuracy on the recent examples and its accuracy since the beginning of learning. The final prediction for an instance is the class predicted by whichever ensemble is more accurate on the recent examples. When an ensemble detects a drift, it is reinitialized while retaining its diversity level. Experimental evaluation on various artificial and real-world datasets shows that DOED achieves very high accuracy in classifying new data instances, irrespective of the size of the dataset, the type of drift, or the presence of noise. We compare DOED with other learners in terms of new performance metrics such as the kappa statistic and model cost, as well as evaluation time and memory requirements. Our approach proved to be highly resource-efficient, achieving very high accuracy even in a resource-constrained environment.
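The accuracy comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the two accuracies are compared, so the sliding window, the `margin` threshold, and the names `AccuracyTracker` and `pick_ensemble` are assumptions made here for illustration only.

```python
from collections import deque


class AccuracyTracker:
    """Tracks one ensemble's accuracy overall and over a sliding window.

    Hypothetical helper; window_size and margin are illustrative choices.
    """

    def __init__(self, window_size=50):
        self.recent = deque(maxlen=window_size)  # 1 = correct, 0 = wrong
        self.correct = 0  # correct predictions since (re)initialization
        self.seen = 0     # instances seen since (re)initialization

    def update(self, was_correct):
        hit = 1 if was_correct else 0
        self.recent.append(hit)
        self.correct += hit
        self.seen += 1

    def overall_accuracy(self):
        return self.correct / self.seen if self.seen else 0.0

    def recent_accuracy(self):
        return sum(self.recent) / len(self.recent) if self.recent else 0.0

    def drift_detected(self, margin=0.2):
        # Assumed rule: once the window is full, flag drift when recent
        # accuracy falls more than `margin` below the overall accuracy.
        if len(self.recent) < self.recent.maxlen:
            return False
        return self.overall_accuracy() - self.recent_accuracy() > margin

    def reset(self):
        # Called when the ensemble is reinitialized after a drift.
        self.recent.clear()
        self.correct = 0
        self.seen = 0


def pick_ensemble(low_tracker, high_tracker):
    """Name the ensemble with the better recent accuracy (ties go to 'low')."""
    if high_tracker.recent_accuracy() > low_tracker.recent_accuracy():
        return "high"
    return "low"
```

In use, one tracker would be kept per ensemble: after each labeled instance, `update` records whether that ensemble was correct, `drift_detected` decides whether to reinitialize it, and `pick_ensemble` selects which ensemble's vote becomes the final prediction.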