Published in: International Journal of Machine Learning and Cybernetics 1/2014

1 February 2014 | Original Article

Online neural network model for non-stationary and imbalanced data stream classification

Authors: Adel Ghazikhani, Reza Monsefi, Hadi Sadoghi Yazdi

Abstract

Concept drift and class imbalance are two challenges for supervised classifiers. Concept drift (or non-stationarity) refers to changes in the underlying function being learnt, and class imbalance is a large difference between the numbers of instances in the different classes of the data. Class imbalance is an obstacle to the performance of most classifiers. Previous methods for classifying non-stationary and imbalanced data streams mainly focus on batch solutions, in which the classification model is trained on a chunk of data. Here, we propose an online neural network (NN) model. The NN model is composed of two parts, one for handling concept drift and one for handling class imbalance. Concept drift is handled with a forgetting function, and class imbalance is handled with a specific error function that assigns different importance to errors in separate classes. The proposed method is evaluated on 3 synthetic and 8 real-world datasets. The results show a statistically significant improvement over previous online NN methods.
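The abstract does not give the forgetting function or the error function, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a single sigmoid unit trained one sample at a time, a class-weighted log-loss gradient, and an exponentially decayed count of recent class labels as a stand-in for forgetting. The names `OnlineWeightedUnit`, `lr`, and `forget` are hypothetical.

```python
import math


class OnlineWeightedUnit:
    """Single sigmoid unit updated per sample (illustrative assumption,
    not the model from the paper)."""

    def __init__(self, n_features, lr=0.1, forget=0.99):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr            # hypothetical learning rate
        self.forget = forget    # hypothetical decay factor in (0, 1]
        # Decayed counts of class 0 and class 1; start at 1 to avoid
        # division by zero before a class has been seen.
        self.counts = [1.0, 1.0]

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-z))

    def class_weight(self, y):
        # Inverse of the (decayed) class frequency: the rarer class
        # gets the larger error weight.
        total = self.counts[0] + self.counts[1]
        return total / (2.0 * self.counts[y])

    def update(self, x, y):
        # Forgetting: old labels fade, so the imbalance estimate
        # follows the recent part of the stream.
        self.counts = [self.forget * c for c in self.counts]
        self.counts[y] += 1.0
        # Class-weighted gradient of the log-loss for one sample.
        g = self.class_weight(y) * (self.predict_proba(x) - y)
        self.w = [wi - self.lr * g * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * g


# Usage on a toy imbalanced stream: nine class-0 samples per class-1 sample.
model = OnlineWeightedUnit(n_features=1)
for _ in range(200):
    for _ in range(9):
        model.update([-1.0], 0)
    model.update([1.0], 1)
print(model.predict_proba([1.0]), model.predict_proba([-1.0]))
```

Setting `forget` closer to 1 gives a longer memory of past class proportions, while smaller values react faster to a change in the stream at the cost of a noisier estimate.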

Metadata
Title
Online neural network model for non-stationary and imbalanced data stream classification
Authors
Adel Ghazikhani
Reza Monsefi
Hadi Sadoghi Yazdi
Publication date
1 February 2014
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 1/2014
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-013-0180-6
