Published in: Neural Computing and Applications 11/2021

06-10-2020 | Original Article

Constructing accuracy and diversity ensemble using Pareto-based multi-objective learning for evolving data streams

Authors: Yange Sun, Honghua Dai


Abstract

Ensemble learning is one of the most frequently used techniques for handling concept drift, which is the greatest challenge in learning high-performance models from big evolving data streams. In this paper, a Pareto-based multi-objective optimization technique is introduced to learn high-performance base classifiers. Based on this technique, a multi-objective evolutionary ensemble learning scheme, named Pareto-optimal ensemble for a better accuracy and diversity (PAD), is proposed. The approach aims to enhance the generalization ability of the ensemble in evolving data stream environments by balancing the accuracy and diversity of the ensemble members. In addition, an adaptive window change detection mechanism is designed to continuously track different kinds of drift. Extensive experiments show that PAD adapts to dynamically changing environments effectively and efficiently, and achieves better performance.
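
To make the idea of balancing accuracy against diversity concrete, the following is a minimal sketch of Pareto (non-dominated) selection over candidate classifiers. It is not the PAD algorithm itself, whose details the abstract does not give. The names `Candidate`, `dominates`, and `pareto_front`, as well as the scores in `pool`, are assumptions invented for illustration, and both objectives are assumed to be maximized.

```python
from typing import List, Tuple

# (name, accuracy, diversity); both objectives are assumed to be maximized.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[1] >= b[1] and a[2] >= b[2] and (a[1] > b[1] or a[2] > b[2])


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates, in input order."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical scores, invented purely for illustration.
pool = [
    ("c1", 0.90, 0.10),
    ("c2", 0.85, 0.30),
    ("c3", 0.80, 0.25),  # worse than c2 on both objectives, so it is dominated
    ("c4", 0.70, 0.50),
]
front = pareto_front(pool)
print([name for name, _, _ in front])
```

In this sketch, no member of the resulting front can be improved on one objective without losing on the other, which is the trade-off a Pareto-based scheme works with instead of collapsing both objectives into a single weighted score.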


Metadata
Title
Constructing accuracy and diversity ensemble using Pareto-based multi-objective learning for evolving data streams
Authors
Yange Sun
Honghua Dai
Publication date
06-10-2020
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 11/2021
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-020-05386-5
