2018 | OriginalPaper | Chapter

An Improved CURE Algorithm

Authors: Mingjuan Cai, Yongquan Liang

Published in: Intelligence Science II

Publisher: Springer International Publishing

Abstract

The CURE algorithm is an efficient hierarchical clustering algorithm for large data sets. This paper presents an improved CURE algorithm, named ISE-RS-CURE. The algorithm adopts a sample extraction method grounded in statistical ideas, which selects sample points according to local data density and so makes the sample set more representative. The data set is partitioned while the sample set is extracted, which reduces the time spent assigning non-sample points to clusters. For choosing representative points, a selection strategy based on a partition influence factor is proposed; it takes into account the overall correlation among the data in the region where a representative point lies, so that the representative points are chosen more reasonably. Experiments show that the improved algorithm preserves the accuracy of the clustering results while improving running efficiency.
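
For readers unfamiliar with the role of representative points, the following is a minimal Python sketch of the representative-point step of the original CURE algorithm (well-scattered points shrunk toward the cluster centroid), not of the partition-influence-factor strategy proposed in this chapter, whose definition the abstract does not give. The function names `select_representatives` and `cluster_distance`, and the parameter names `c` (number of representatives) and `alpha` (shrink factor), are assumptions chosen for illustration.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of equal-length coordinate tuples."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def select_representatives(points, c=4, alpha=0.5):
    """Pick up to c well-scattered points, then shrink them toward the centroid.

    Illustrative sketch of standard CURE; names and defaults are assumptions.
    """
    center = centroid(points)
    scattered = []
    for _ in range(min(c, len(points))):
        best, best_dist = None, -1.0
        for p in points:
            if p in scattered:
                continue
            if not scattered:
                # First pick: the point farthest from the centroid.
                d = math.dist(p, center)
            else:
                # Later picks: the point farthest from those already chosen.
                d = min(math.dist(p, q) for q in scattered)
            if d > best_dist:
                best, best_dist = p, d
        if best is None:  # only duplicates of chosen points remain
            break
        scattered.append(best)
    # Move each scattered point a fraction alpha of the way to the centroid.
    return [
        tuple(x + alpha * (m - x) for x, m in zip(p, center))
        for p in scattered
    ]


def cluster_distance(reps_a, reps_b):
    """Distance between two clusters: closest pair of representatives."""
    return min(math.dist(a, b) for a in reps_a for b in reps_b)
```

In standard CURE, the two clusters with the smallest `cluster_distance` are merged at each step, and the representatives of the merged cluster are recomputed. Shrinking by `alpha` dampens the effect of outliers, since points far from the centroid move the most.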

Metadata

Title: An Improved CURE Algorithm
Authors: Mingjuan Cai, Yongquan Liang
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-030-01313-4_11
