Published in: Artificial Intelligence Review 2/2021

21.07.2020

Data stream clustering: a review

Written by: Alaettin Zubaroğlu, Volkan Atalay


Abstract

The number of connected devices is steadily increasing, and these devices continuously generate data streams. Real-time processing of data streams is attracting growing interest despite its many challenges. Clustering is one of the most suitable methods for real-time data stream processing because it requires little prior information about the data and does not need labeled instances. However, data stream clustering differs from traditional clustering in many respects and poses several challenges of its own. Here, we describe the concepts and common characteristics of data streams, such as concept drift, data structures for data streams, time window models, and outlier detection. We comprehensively review recent data stream clustering algorithms and analyze them in terms of base clustering technique, computational complexity, and clustering accuracy. We compare these algorithms, list popular data stream repositories and datasets as well as stream processing tools and platforms, and discuss open problems in data stream clustering.


Metadata
Title: Data stream clustering: a review
Written by: Alaettin Zubaroğlu, Volkan Atalay
Publication date: 21.07.2020
Publisher: Springer Netherlands
Published in: Artificial Intelligence Review, Issue 2/2021
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI: https://doi.org/10.1007/s10462-020-09874-x
