Published in: Neural Computing and Applications 5/2013

1 October 2013 | Original Article

Online cost-sensitive neural network classifiers for non-stationary and imbalanced data streams

Authors: Adel Ghazikhani, Reza Monsefi, Hadi Sadoghi Yazdi

Abstract

Classifying non-stationary and imbalanced data streams involves two major challenges: concept drift and class imbalance. Concept drift is a change over time in the underlying function being learnt; class imbalance is a large difference between the numbers of instances in the different classes of the data. Class imbalance is an obstacle to the performance of most classifiers. Previous methods for classifying non-stationary and imbalanced data streams mainly focus on batch solutions, in which the classification model is trained on a chunk of data. Here, we propose two online classifiers, both one-layer neural networks (NNs). Each handles class imbalance with a different cost-sensitive strategy: the first incorporates a fixed misclassification cost matrix and the second an adaptive one. The proposed classifiers are evaluated on 3 synthetic and 8 real-world datasets. The results show statistically significant improvements in imbalanced data metrics.
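
The abstract does not give the update rules, so the following is only a minimal sketch of the general idea: a single logistic unit updated one instance at a time, with each gradient step scaled by the misclassification cost of the instance's true class. Everything here is an assumption made for illustration, including the class name `OnlineCostSensitiveUnit`, the logistic loss, the learning rate, and in particular the adaptive rule (costs set from a decayed estimate of recent class frequencies), which is a stand-in and not necessarily the authors' method.

```python
import math


class OnlineCostSensitiveUnit:
    """Hypothetical one-layer (logistic) classifier trained instance by instance.

    costs[y] is the cost of misclassifying an instance whose true class is y.
    With adaptive=False the costs stay fixed; with adaptive=True they are
    re-estimated from a decayed class-frequency estimate (an assumed rule).
    """

    def __init__(self, n_features, costs=(1.0, 1.0), lr=0.1,
                 adaptive=False, decay=0.99):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.costs = list(costs)
        self.lr = lr
        self.adaptive = adaptive
        self.decay = decay
        self.freq = [0.5, 0.5]  # decayed estimate of P(class) in the recent stream

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def update(self, x, y):
        """One online step on a single labelled instance (y is 0 or 1)."""
        if self.adaptive:
            # Exponentially forget old labels so the estimate can follow drift.
            for c in (0, 1):
                hit = 1.0 if c == y else 0.0
                self.freq[c] = self.decay * self.freq[c] + (1.0 - self.decay) * hit
            # Rarer class gets the higher cost; the commoner class has cost 1.
            top = max(self.freq)
            self.costs = [top / max(f, 1e-6) for f in self.freq]
        # Cost-weighted gradient step on the logistic loss.
        step = self.lr * self.costs[y] * (y - self.predict_proba(x))
        self.w = [wi + step * xi for wi, xi in zip(self.w, x)]
        self.b += step
```

In this sketch the fixed variant would be constructed with, for example, `costs=(1.0, 5.0)` when class 1 is the minority, while `adaptive=True` lets the cost ratio track the recent imbalance ratio. Because updates use one instance at a time, no chunk of data needs to be buffered, which is the contrast with batch solutions drawn in the abstract.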

Metadata
Title
Online cost-sensitive neural network classifiers for non-stationary and imbalanced data streams
Authors
Adel Ghazikhani
Reza Monsefi
Hadi Sadoghi Yazdi
Publication date
1 October 2013
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 5/2013
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-012-1071-6
