Published in: Knowledge and Information Systems 3/2019

16-10-2018 | Regular Paper

Concept-evolution detection in non-stationary data streams: a fuzzy clustering approach

Authors: Poorya ZareMoodi, Sajjad Kamali Siahroudi, Hamid Beigy

Abstract

We have entered an era of networked communications in which concepts such as big data and social networks are emerging. The explosion and profusion of available data across a broad range of application domains has made data streams an inevitable part of most real-world applications. Classifying data streams poses four major challenges: infinite length, concept drift, recurring concepts, and evolving concepts. This paper proposes a novel method that addresses these challenges, with a focus on the last one. Unlike existing methods for detecting evolving concepts, we cast the joint classification and detection of evolving concepts as the optimization of an objective function, by extending a fuzzy agglomerative clustering method. Moreover, rather than keeping instances or hyper-sphere summaries of previously seen classes, we maintain only boundaries in the kernel space and generate instances of each class on demand. This approach improves the accuracy and reduces the memory usage of the proposed method. Experimental results on several synthetic and real datasets show the effectiveness of the proposed approach and its superiority over related state-of-the-art methods in this area.
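
As a rough illustration of the fuzzy clustering idea the abstract builds on, the sketch below computes standard fuzzy c-means memberships for a single instance. It is not the authors' objective function or their kernel-space boundary model; the function names, the fuzzifier default `m=2.0`, and the distance-threshold check for instances that fit no known cluster are all assumptions made for this example.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means memberships of one point to each center.

    Assumption: Euclidean distance and fuzzifier m > 1 (m=2.0 is a common
    default, not a value taken from the paper).
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to it
    # (shared evenly if several centers coincide with the point).
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0
                for i in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)); memberships sum to 1.
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
            for d_i in dists]


def far_from_all_centers(point, centers, radius):
    """Hypothetical check: is the point farther than `radius` from every center?

    Memberships always sum to 1, so they cannot flag an instance that fits
    no known cluster; a separate distance test is used here as a stand-in.
    """
    return min(math.dist(point, c) for c in centers) > radius


if __name__ == "__main__":
    centers = [(1.0, 0.0), (3.0, 0.0)]
    print(fuzzy_memberships((0.0, 0.0), centers))
    print(far_from_all_centers((10.0, 0.0), centers, radius=2.0))
```

For the point (0, 0) and centers at (1, 0) and (3, 0), the memberships come out at about 0.9 and 0.1: the nearer center dominates, but the farther one keeps a nonzero share, which is what distinguishes fuzzy from hard assignment.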

Metadata
Title
Concept-evolution detection in non-stationary data streams: a fuzzy clustering approach
Authors
Poorya ZareMoodi
Sajjad Kamali Siahroudi
Hamid Beigy
Publication date
16-10-2018
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 3/2019
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-018-1266-y
