Skip to main content
Erschienen in: Soft Computing 12/2020

14.10.2019 | Methodologies and Application

Clustering data stream with uncertainty using belief function theory and fading function

Erschienen in: Soft Computing | Ausgabe 12/2020

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Data stream clustering faces major challenges such as lack of memory and time. Therefore, traditional clustering methods are not suitable for this kind of data. On the other hand, most data stream clustering methods do not consider the problems of uncertainty and ambiguity in the data. So, in this case, where an object is close to a set of clusters, this object cannot be correctly and simply categorized. The aim of this study is to provide a new method for clustering data stream, called clustering data stream using belief function, with regard to the problem of uncertain and ambiguous data. In the proposed method, the belief function theory is used to cluster objects into single clusters or a set of clusters and determines the structure of data. In addition, using window, weighted centers, and the fading function overcomes the restrictions of data stream. The results of the experiments have been compared with state-of-the-art methods, which show the superiority of the proposed method in terms of purity, error rate, and ambiguity rate measures.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Fußnoten
1
Fixed granularity grid.
 
Literatur
Zurück zum Zitat Ackermann MR, Märtens M, Raupach C, Swierkot K, Lammersen C, Sohler C (2012) StreamKM ++: a clustering algorithm for data streams. J Exp Algorithm (JEA) 17:2–4MathSciNetMATH Ackermann MR, Märtens M, Raupach C, Swierkot K, Lammersen C, Sohler C (2012) StreamKM ++: a clustering algorithm for data streams. J Exp Algorithm (JEA) 17:2–4MathSciNetMATH
Zurück zum Zitat Aggarwal C (2013) A survey of stream clustering algorithms. In: Aggarwal CC, Reddy CK (eds) Data clustering: algorithms and applications. Chapman and Hall/CRC, Boca Raton, pp 229–256 Aggarwal C (2013) A survey of stream clustering algorithms. In: Aggarwal CC, Reddy CK (eds) Data clustering: algorithms and applications. Chapman and Hall/CRC, Boca Raton, pp 229–256
Zurück zum Zitat Aggarwal C, Yu P (2008) A framework for clustering uncertain data streams. In: IEEE international conference on data engineering, pp 150–159 Aggarwal C, Yu P (2008) A framework for clustering uncertain data streams. In: IEEE international conference on data engineering, pp 150–159
Zurück zum Zitat Aggarwal C, Han J, Wang J, Yu P, Watson T (2003) A framework for clustering evolving data streams. In: Proceedings of VLDB 2003, pp 81–92 Aggarwal C, Han J, Wang J, Yu P, Watson T (2003) A framework for clustering evolving data streams. In: Proceedings of VLDB 2003, pp 81–92
Zurück zum Zitat Aggarwal C, Han J, Wang J, Yu P (2004) A framework for projected clustering of high dimensional data streams. In: Proceedings of VLDB, pp 852–863 Aggarwal C, Han J, Wang J, Yu P (2004) A framework for projected clustering of high dimensional data streams. In: Proceedings of VLDB, pp 852–863
Zurück zum Zitat Ahmad S, Lavin A, Purdy S, Agha Z (2017) Unsupervised real-time anomaly detection for streaming data. Neurocomputing 262:134–147 Ahmad S, Lavin A, Purdy S, Agha Z (2017) Unsupervised real-time anomaly detection for streaming data. Neurocomputing 262:134–147
Zurück zum Zitat Ahmouda A, Hochmair HH, Cvetojevic S (2018) Analyzing the effect of earthquakes on OpenStreetMap contribution patterns and tweeting activities. Geospat Inf Sci 21(3):195–212 Ahmouda A, Hochmair HH, Cvetojevic S (2018) Analyzing the effect of earthquakes on OpenStreetMap contribution patterns and tweeting activities. Geospat Inf Sci 21(3):195–212
Zurück zum Zitat Amini A, Saboohi H, Herawan T, Wah T (2016) MuDi-Stream: a multi density clustering algorithm for evolving data stream. Netw Comput Appl 59:370–385 Amini A, Saboohi H, Herawan T, Wah T (2016) MuDi-Stream: a multi density clustering algorithm for evolving data stream. Netw Comput Appl 59:370–385
Zurück zum Zitat Antoine V, Quost B, Masson MH, Denoeux T (2014) CEVCLUS: evidential clustering with instance-level constraints for relational data. Soft Comput 18:1321–1335 Antoine V, Quost B, Masson MH, Denoeux T (2014) CEVCLUS: evidential clustering with instance-level constraints for relational data. Soft Comput 18:1321–1335
Zurück zum Zitat Bahri M, Elouedi Z (2017) Clustering data stream under a belief function framework. In: IEEE/ACS 13th international conference of computer systems and applications (AICCSA), pp 1–8 Bahri M, Elouedi Z (2017) Clustering data stream under a belief function framework. In: IEEE/ACS 13th international conference of computer systems and applications (AICCSA), pp 1–8
Zurück zum Zitat Bhatnagar V, Kaur S, Chakravarthy S (2014) Clustering data streams using grid-based synopsis. Knowl Inf Syst 41:127–152 Bhatnagar V, Kaur S, Chakravarthy S (2014) Clustering data streams using grid-based synopsis. Knowl Inf Syst 41:127–152
Zurück zum Zitat Calderwood S, McAreavey K, Liu W, Hong J (2017) Context-dependent combination of sensor information in Dempster–Shafer theory for BDI. Knowl Inf Syst 51:259–285 Calderwood S, McAreavey K, Liu W, Hong J (2017) Context-dependent combination of sensor information in Dempster–Shafer theory for BDI. Knowl Inf Syst 51:259–285
Zurück zum Zitat Chakeri A, Nekooimehr I, Hall LO (2013) Dempster–Shafer theory of evidence in Single Pass Fuzzy C Means. In: 2013 IEEE international conference on fuzzy systems, Hyderabad, pp 1–5 Chakeri A, Nekooimehr I, Hall LO (2013) Dempster–Shafer theory of evidence in Single Pass Fuzzy C Means. In: 2013 IEEE international conference on fuzzy systems, Hyderabad, pp 1–5
Zurück zum Zitat Chen Y, Tu L (2007) Density-based clustering for real-time stream data. In: Proceedings KDD’07 proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, pp 133–142 Chen Y, Tu L (2007) Density-based clustering for real-time stream data. In: Proceedings KDD’07 proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, pp 133–142
Zurück zum Zitat Croisard N, Vasile M, Kemble S, Radice G (2010) Preliminary space mission design under uncertainty. Acta Astronaut 66:654–664 Croisard N, Vasile M, Kemble S, Radice G (2010) Preliminary space mission design under uncertainty. Acta Astronaut 66:654–664
Zurück zum Zitat da Silva A, Chiky R, Hébrail G (2012) A clustering approach for sampling data streams in sensor networks. Knowl Inf Syst 32:1–23 da Silva A, Chiky R, Hébrail G (2012) A clustering approach for sampling data streams in sensor networks. Knowl Inf Syst 32:1–23
Zurück zum Zitat Ding S, Zhang J, Jia H, Qian J (2016) An adaptive density data stream clustering algorithm. Cognit Comput 8:30–38 Ding S, Zhang J, Jia H, Qian J (2016) An adaptive density data stream clustering algorithm. Cognit Comput 8:30–38
Zurück zum Zitat Frey B, Dueck D (2007) Clustering by passing messages between data points. Science 315:972–976MathSciNetMATH Frey B, Dueck D (2007) Clustering by passing messages between data points. Science 315:972–976MathSciNetMATH
Zurück zum Zitat Ghosh S, Mitra S (2013) Clustering large data with uncertainty. Appl Soft Comput 13:1639–1645 Ghosh S, Mitra S (2013) Clustering large data with uncertainty. Appl Soft Comput 13:1639–1645
Zurück zum Zitat Hamidzadeh J, Ghomanjani MH (2018) An unequal cluster-radius approach based on node density in clustering for wireless sensor networks. Wireless Pers Commun 101:1619–1637 Hamidzadeh J, Ghomanjani MH (2018) An unequal cluster-radius approach based on node density in clustering for wireless sensor networks. Wireless Pers Commun 101:1619–1637
Zurück zum Zitat Hamidzadeh J, Namaei N (2019) Belief-based chaotic algorithm for support vector data description. Soft Comput 23:4289–4314MATH Hamidzadeh J, Namaei N (2019) Belief-based chaotic algorithm for support vector data description. Soft Comput 23:4289–4314MATH
Zurück zum Zitat Hamidzadeh J, Monsefi R, Sadoghi Yazdi H (2015) IRAHC: instance reduction algorithm using hyper rectangle clustering. Pattern Recogn 48:1878–1889MATH Hamidzadeh J, Monsefi R, Sadoghi Yazdi H (2015) IRAHC: instance reduction algorithm using hyper rectangle clustering. Pattern Recogn 48:1878–1889MATH
Zurück zum Zitat Hamidzadeh J, Zabihimayvan M, Sadeghi R (2018) Detection of Web site visitors based on fuzzy rough sets. Soft Comput 22(7):2175–2188 Hamidzadeh J, Zabihimayvan M, Sadeghi R (2018) Detection of Web site visitors based on fuzzy rough sets. Soft Comput 22(7):2175–2188
Zurück zum Zitat Helton JC (2011) Quantification of margins and uncertainties: conceptual and computational basis. Reliab Eng Syst Saf 96:976–1013 Helton JC (2011) Quantification of margins and uncertainties: conceptual and computational basis. Reliab Eng Syst Saf 96:976–1013
Zurück zum Zitat Hofmeyr DP, Pavlidis NG, Eckley IA (2016) Divisive clustering of high dimensional data streams. Stat Comput 26:1101–1120MathSciNetMATH Hofmeyr DP, Pavlidis NG, Eckley IA (2016) Divisive clustering of high dimensional data streams. Stat Comput 26:1101–1120MathSciNetMATH
Zurück zum Zitat Jin C, Yu JX, Zhou A, Cao F (2014) Efficient clustering of uncertain data streams. Knowl Inf Syst 40:509–539 Jin C, Yu JX, Zhou A, Cao F (2014) Efficient clustering of uncertain data streams. Knowl Inf Syst 40:509–539
Zurück zum Zitat Khan I, Huang JZ, Ivanov K (2016) Incremental density-based ensemble clustering over evolving data streams. Neurocomputing 191:34–43 Khan I, Huang JZ, Ivanov K (2016) Incremental density-based ensemble clustering over evolving data streams. Neurocomputing 191:34–43
Zurück zum Zitat Kranen P, Assent I, Baldauf C, Seidl T (2011) The ClusTree: indexing micro-clusters for anytime stream mining. Knowl Inf Syst 29(2):249–272 Kranen P, Assent I, Baldauf C, Seidl T (2011) The ClusTree: indexing micro-clusters for anytime stream mining. Knowl Inf Syst 29(2):249–272
Zurück zum Zitat Li Y, Chen J, Feng L (2013) Dealing with uncertainty: a survey of theories and practices. IEEE Trans Knowl Data Eng 25(11):2463–2482 Li Y, Chen J, Feng L (2013) Dealing with uncertainty: a survey of theories and practices. IEEE Trans Knowl Data Eng 25(11):2463–2482
Zurück zum Zitat Liu Z, Pan Q, Dezert J, Martin A (2016) Adaptive imputation of missing values for incomplete pattern classification. Pattern Recogn 52:85–95 Liu Z, Pan Q, Dezert J, Martin A (2016) Adaptive imputation of missing values for incomplete pattern classification. Pattern Recogn 52:85–95
Zurück zum Zitat Masson M, Denœux T (2008) ECM: an evidential version of the fuzzy c-means algorithm. Pattern Recogn 41:1384–1397MATH Masson M, Denœux T (2008) ECM: an evidential version of the fuzzy c-means algorithm. Pattern Recogn 41:1384–1397MATH
Zurück zum Zitat Meesuksabai W, Kangkachit T, Waiyamai K (2011) HUE-stream: evolution-based clustering technique for heterogeneous data streams with uncertainty. In: Tang J, King I, Chen L, Wang J (eds) Advanced data mining and applications. ADMA 2011. Lecture notes in computer science. Springer, Berlin, pp 27–40 Meesuksabai W, Kangkachit T, Waiyamai K (2011) HUE-stream: evolution-based clustering technique for heterogeneous data streams with uncertainty. In: Tang J, King I, Chen L, Wang J (eds) Advanced data mining and applications. ADMA 2011. Lecture notes in computer science. Springer, Berlin, pp 27–40
Zurück zum Zitat Mousavi M, Abu Bakar A, Vakilian M (2015) Data stream clustering algorithms: a review. Int J Adv Soft Comput Appl 7:1–15 Mousavi M, Abu Bakar A, Vakilian M (2015) Data stream clustering algorithms: a review. Int J Adv Soft Comput Appl 7:1–15
Zurück zum Zitat Nguyen HL, Woon YK, Ng WK (2014) A survey on data stream clustering and classification. Knowl Inf Syst 45:535–569 Nguyen HL, Woon YK, Ng WK (2014) A survey on data stream clustering and classification. Knowl Inf Syst 45:535–569
Zurück zum Zitat Patra BK, Nandi S (2015) Effective data summarization for hierarchical clustering. Knowl Inf Syst 42:1–20 Patra BK, Nandi S (2015) Effective data summarization for hierarchical clustering. Knowl Inf Syst 42:1–20
Zurück zum Zitat Pereira C, Mello R (2015) PTS: projected topological stream clustering algorithm. Neurocomputing 180:16–26 Pereira C, Mello R (2015) PTS: projected topological stream clustering algorithm. Neurocomputing 180:16–26
Zurück zum Zitat Ramírez-Gallego S, Krawczyk B, García S, Woźniak M, Herrera F (2017) A survey on data preprocessing for data stream mining: current status and future directions. Neurocomputing 239:39–57 Ramírez-Gallego S, Krawczyk B, García S, Woźniak M, Herrera F (2017) A survey on data preprocessing for data stream mining: current status and future directions. Neurocomputing 239:39–57
Zurück zum Zitat Saxena A, Prasad M, Gupta A, Bharill N, Patel OP, Tiwari A, Er M, Ding W, Lin C (2017) A review of clustering techniques and developments. Neurocomputing 267:664–681 Saxena A, Prasad M, Gupta A, Bharill N, Patel OP, Tiwari A, Er M, Ding W, Lin C (2017) A review of clustering techniques and developments. Neurocomputing 267:664–681
Zurück zum Zitat Serir L, Ramasso E, Zerhouni N (2012) Evidential evolving Gustafson–Kessel algorithm for online data streams partitioning using belief function theory. Int J Approx Reason 53:747–768MathSciNet Serir L, Ramasso E, Zerhouni N (2012) Evidential evolving Gustafson–Kessel algorithm for online data streams partitioning using belief function theory. Int J Approx Reason 53:747–768MathSciNet
Zurück zum Zitat Shafer G (1976) A mathematical theory of evidence. Princeton University Press, PrincetonMATH Shafer G (1976) A mathematical theory of evidence. Princeton University Press, PrincetonMATH
Zurück zum Zitat Shang G, Zhu J, Gao T, Zheng X, Zhang J (2018) Using multi-source remote sensing data to classify larch plantations in Northeast China and support the development of multi-purpose silviculture. J For Res 29(4):889–904 Shang G, Zhu J, Gao T, Zheng X, Zhang J (2018) Using multi-source remote sensing data to classify larch plantations in Northeast China and support the development of multi-purpose silviculture. J For Res 29(4):889–904
Zurück zum Zitat Sheskin D (2011) Handbook of parametric and nonparametric statistical procedures. Chapman and Hall/CRC, Boca RatonMATH Sheskin D (2011) Handbook of parametric and nonparametric statistical procedures. Chapman and Hall/CRC, Boca RatonMATH
Zurück zum Zitat Silva J, Hruschka E, Gama J (2017) An evolutionary algorithm for clustering data streams with a variable number of clusters. Expert Syst Appl 67:228–238 Silva J, Hruschka E, Gama J (2017) An evolutionary algorithm for clustering data streams with a variable number of clusters. Expert Syst Appl 67:228–238
Zurück zum Zitat Smets P (2000) Data fusion in the transferable belief model. In: Proceedings of the third international conference on information fusion, pp 21–33 Smets P (2000) Data fusion in the transferable belief model. In: Proceedings of the third international conference on information fusion, pp 21–33
Zurück zum Zitat Yang Y, Liu Z, Xing Z (2015) A review of uncertain data stream clustering algorithms. In: Eighth international conference on internet computing for science and engineering (ICICSE), Harbin, pp 111–116 Yang Y, Liu Z, Xing Z (2015) A review of uncertain data stream clustering algorithms. In: Eighth international conference on internet computing for science and engineering (ICICSE), Harbin, pp 111–116
Zurück zum Zitat Yin C, Xia L, Zhang S, Sun R, Wang J (2018) Improved clustering algorithm based on high-speed network data stream. Soft Comput 22:4185–4195 Yin C, Xia L, Zhang S, Sun R, Wang J (2018) Improved clustering algorithm based on high-speed network data stream. Soft Comput 22:4185–4195
Zurück zum Zitat Yin C, Zhang S, Yin Z, Wang J (2019) Anomaly detection model based on data stream clustering. Cluster Comput 22:1729–1738 Yin C, Zhang S, Yin Z, Wang J (2019) Anomaly detection model based on data stream clustering. Cluster Comput 22:1729–1738
Zurück zum Zitat Yu X, Xu X, Lin L (2015) A data stream subspace clustering algorithm. In: Wang H et al (eds) Intelligent computation in big data era. ICYCSEE 2015. Communications in computer and information science. Springer, Berlin, pp 334–343 Yu X, Xu X, Lin L (2015) A data stream subspace clustering algorithm. In: Wang H et al (eds) Intelligent computation in big data era. ICYCSEE 2015. Communications in computer and information science. Springer, Berlin, pp 334–343
Zurück zum Zitat Zaman K, Rangavajhala S, McDonald MP, Mahadevan S (2011) A probabilistic approach for representation of interval uncertainty. Reliab Eng Syst Saf 96:117–130 Zaman K, Rangavajhala S, McDonald MP, Mahadevan S (2011) A probabilistic approach for representation of interval uncertainty. Reliab Eng Syst Saf 96:117–130
Zurück zum Zitat Zhang B, Qin S, Wang W, Wang D, Xue L (2016) Data stream clustering based on Fuzzy C-Mean algorithm and entropy theory. Sig Process 126:111–116 Zhang B, Qin S, Wang W, Wang D, Xue L (2016) Data stream clustering based on Fuzzy C-Mean algorithm and entropy theory. Sig Process 126:111–116
Zurück zum Zitat Zhou A, Cao F, Qian W, Jin C (2008) Tracking clusters in evolving data streams over sliding windows. Knowl Inf Syst 15:181–214 Zhou A, Cao F, Qian W, Jin C (2008) Tracking clusters in evolving data streams over sliding windows. Knowl Inf Syst 15:181–214
Metadaten
Titel
Clustering data stream with uncertainty using belief function theory and fading function
Publikationsdatum
14.10.2019
Erschienen in
Soft Computing / Ausgabe 12/2020
Print ISSN: 1432-7643
Elektronische ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-019-04422-4

Weitere Artikel der Ausgabe 12/2020

Soft Computing 12/2020 Zur Ausgabe