Published in: International Journal of Machine Learning and Cybernetics 1/2018

31-01-2015 | Original Article

A novel online ensemble approach to handle concept drifting data streams: diversified dynamic weighted majority

Authors: Parneeta Sidhu, M. P. S. Bhatia


Abstract

We present an online ensemble approach, diversified dynamic weighted majority (DDWM), for classifying new data instances drawn from changing concept distributions. Our approach maintains two weighted ensembles that differ in their level of diversity. An expert in either ensemble is updated or removed according to its classification accuracy, and a new expert is added based on the final global prediction of the algorithm and the global prediction of the ensemble for a given data instance. Experimental evaluation on various artificial and real-world datasets shows that DDWM classifies new data instances with very high accuracy, irrespective of dataset size, type of drift, or presence of noise. We compare DDWM with other learners in terms of new performance metrics such as the kappa statistic, model cost, evaluation time, and memory requirements. Our approach proved highly resource-efficient, achieving very high accuracy even in a resource-constrained environment.
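
The update rule described in the abstract (experts are down-weighted or removed according to their accuracy, and a new expert is added depending on the global prediction) can be illustrated with a minimal Python sketch of a single weighted-majority ensemble in the style of dynamic weighted majority. This is not the authors' DDWM: the abstract does not specify how the two ensembles differ in diversity or how their predictions are combined, so those parts are omitted. The names `MajorityClassExpert`, `WeightedEnsemble`, `beta`, and `theta` are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


class MajorityClassExpert:
    """Toy base learner (assumed): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return None  # no training data yet
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


class WeightedEnsemble:
    """Single weighted-majority ensemble with expert removal and addition."""

    def __init__(self, beta=0.5, theta=0.01):
        self.beta = beta    # assumed factor applied to the weight of a wrong expert
        self.theta = theta  # assumed weight threshold below which an expert is removed
        self.experts = [[MajorityClassExpert(), 1.0]]  # [expert, weight] pairs

    def predict(self, x):
        # Global prediction: label with the largest total weight.
        votes = defaultdict(float)
        for expert, weight in self.experts:
            votes[expert.predict(x)] += weight
        return max(votes, key=votes.get)

    def update(self, x, y):
        global_pred = self.predict(x)
        # Penalise every expert that misclassified this instance.
        for pair in self.experts:
            if pair[0].predict(x) != y:
                pair[1] *= self.beta
        # Rescale so the best expert has weight 1, then drop weak experts.
        top = max(weight for _, weight in self.experts)
        for pair in self.experts:
            pair[1] /= top
        self.experts = [p for p in self.experts if p[1] >= self.theta]
        # Add a fresh expert when the global prediction was wrong.
        if global_pred != y:
            self.experts.append([MajorityClassExpert(), 1.0])
        # Train all remaining experts on the labelled instance.
        for expert, _ in self.experts:
            expert.learn(x, y)
        return global_pred
```

Feeding such an ensemble a stream whose label switches abruptly (for example, twenty instances labelled "a" followed by twenty labelled "b") shows the intended behaviour: experts trained on the old concept lose weight and are eventually removed, while newly added experts take over the vote.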

Metadata
Title
A novel online ensemble approach to handle concept drifting data streams: diversified dynamic weighted majority
Authors
Parneeta Sidhu
M. P. S. Bhatia
Publication date
31-01-2015
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 1/2018
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-015-0333-x
