2012 | OriginalPaper | Book chapter
An Efficient Method of Building an Ensemble of Classifiers in Streaming Data
Authors: Joung Woo Ryu, Mehmed M. Kantardzic, Myung-Won Kim, A. Ra Khil
Published in: Big Data Analytics
Publisher: Springer Berlin Heidelberg
To refine a classifier efficiently over streaming data, such as sensor data and web log data, we have to decide for each unlabeled streaming datum whether or not to select it for labeling. Existing methods refine a classifier at regular time intervals. They therefore refine a classifier even when its classification accuracy is high, and they keep using a classifier even when its classification accuracy is low. In this paper, we propose an ensemble method that selects, in an online process, the data that should be labeled. The selected data are used to build new classifiers for the ensemble. Our selection methodology uses the training data that were applied to generate the ensemble of classifiers over streaming data. We compared our ensemble approach with a conventional ensemble approach in which new classifiers are generated periodically. In experiments with ten benchmark data sets, including three real streaming data sets, our ensemble approach generated only 12.9% as many new classifiers as the chunk-based ensemble approach using partially labeled samples, and used an average of 10% labeled samples over the ten data sets. In all the experiments, our ensemble approach produced comparable classification accuracy. These results show that our approach can efficiently maintain the performance of an ensemble over streaming data.
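The abstract does not state the selection criterion, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes a stand-in rule: an unlabeled datum is selected for labeling when it falls outside the region covered by the training data of every current ensemble member. The names `CentroidClassifier`, `SelectiveEnsemble`, `slack`, `buffer_size`, `max_members`, and `oracle` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


class CentroidClassifier:
    """Nearest-centroid classifier that remembers the spread of its training data."""

    def __init__(self, samples):
        # samples: list of (feature_tuple, label) pairs
        groups = defaultdict(list)
        for x, y in samples:
            groups[y].append(x)
        self.centroids = {
            y: tuple(sum(col) / len(xs) for col in zip(*xs))
            for y, xs in groups.items()
        }
        # Largest distance from any training point to its own class centroid.
        self.radius = max(math.dist(x, self.centroids[y]) for x, y in samples)

    def nearest(self, x):
        label = min(self.centroids, key=lambda y: math.dist(x, self.centroids[y]))
        return label, math.dist(x, self.centroids[label])

    def predict(self, x):
        return self.nearest(x)[0]

    def covers(self, x, slack=1.5):
        # Assumed criterion: x is "familiar" if it lies near the training data.
        return self.nearest(x)[1] <= slack * self.radius


class SelectiveEnsemble:
    """Ensemble that asks for labels only for data its members do not cover."""

    def __init__(self, initial_samples, buffer_size=4, max_members=5):
        self.members = [CentroidClassifier(initial_samples)]
        self.buffer = []              # selected data, now labeled
        self.buffer_size = buffer_size
        self.max_members = max_members
        self.labels_requested = 0

    def predict(self, x):
        # Vote among members whose training data cover x; fall back to all.
        voters = [m for m in self.members if m.covers(x)] or self.members
        votes = Counter(m.predict(x) for m in voters)
        return votes.most_common(1)[0][0]

    def process(self, x, oracle):
        """Classify x; request a label from `oracle` only if x is selected."""
        prediction = self.predict(x)
        if not any(m.covers(x) for m in self.members):
            self.buffer.append((x, oracle(x)))
            self.labels_requested += 1
            if len(self.buffer) >= self.buffer_size:
                # Build a new member from the selected data instead of on a timer.
                self.members.append(CentroidClassifier(self.buffer))
                self.members = self.members[-self.max_members:]
                self.buffer = []
        return prediction
```

In this sketch a new classifier is created only after enough uncovered data have accumulated, rather than at every fixed interval, which mirrors the contrast with periodic, chunk-based refinement drawn in the abstract. Calling `process(x, oracle)` for each incoming datum returns the ensemble's prediction, and `labels_requested` tracks how many labels were actually requested.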