2018 | OriginalPaper | Book chapter

An Improved CURE Algorithm

Authors: Mingjuan Cai, Yongquan Liang

Published in: Intelligence Science II

Publisher: Springer International Publishing


Abstract

CURE is an efficient hierarchical clustering algorithm for large data sets. This paper presents an improved CURE algorithm named ISE-RS-CURE. The algorithm adopts a sample extraction method based on statistical ideas, which selects sample points according to the local data density and so makes the sample set more representative of the data. The data set is partitioned while the sample set is extracted, which reduces the time needed to assign the non-sample points to clusters. For choosing representative points, a selection strategy based on a partition influence factor is proposed; it takes into account the overall correlation among the data in the region where a representative point lies, so that the representative points are chosen more reasonably. Experiments show that the improved algorithm preserves the accuracy of the clustering results while improving running efficiency.
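
The abstract does not define the partition influence factor, so it is not reproduced here. As background only, the following minimal sketch shows the representative-point step of the original CURE algorithm that ISE-RS-CURE modifies: pick up to `c` well-scattered points from a cluster, then shrink them toward the centroid by a factor `alpha`. The function names and the default values of `c` and `alpha` are illustrative assumptions, not taken from the paper.

```python
import math


def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))


def representative_points(points, c=4, alpha=0.5):
    """Baseline CURE step (not the paper's improved strategy).

    1. Choose up to c well-scattered points: the first is the point farthest
       from the centroid, each later one is the point farthest from the
       points already chosen.
    2. Shrink every chosen point toward the centroid by the fraction alpha.
    """
    center = centroid(points)
    scattered = []
    for _ in range(min(c, len(points))):
        def score(p):
            if not scattered:
                return math.dist(p, center)
            return min(math.dist(p, q) for q in scattered)

        scattered.append(max(points, key=score))
    return [
        tuple(x + alpha * (m - x) for x, m in zip(p, center))
        for p in scattered
    ]


# Hypothetical toy cluster: four corners of a square plus its center.
cluster = [(0, 0), (4, 0), (0, 4), (4, 4), (2, 2)]
reps = representative_points(cluster, c=4, alpha=0.5)
```

With this toy input the four corners are selected and each moves halfway toward the centroid (2, 2). Shrinking dampens the effect of outliers, while keeping several scattered points per cluster lets CURE follow non-spherical cluster shapes.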

Metadata
Title
An Improved CURE Algorithm
Authors
Mingjuan Cai
Yongquan Liang
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01313-4_11
