Published in: International Journal of Machine Learning and Cybernetics 2/2017

18-11-2015 | Original Article

Representative points clustering algorithm based on density factor and relevant degree

Authors: Di Wu, Jiadong Ren, Long Sheng

Abstract

Most existing clustering algorithms are seriously affected by noisy data and incur a high time cost. In this paper, building on the CURE algorithm, a representative-points clustering algorithm based on density factor and relevant degree, called RPCDR, is proposed. Definitions of density factor and relevant degree are presented. A primary representative point whose density factor is less than the prescribed threshold is deleted directly, and new representative points can be reselected from the non-representative points in the corresponding cluster. Moreover, the representative points of each cluster are modeled using the K-nearest neighbor method. The relevant degree is computed by comprehensively considering the correlations of objects within a cluster and between different clusters, and it is then used to judge whether two clusters should be merged. Theoretical analysis and experimental results show that RPCDR achieves better clustering accuracy and execution efficiency.
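
The abstract does not give the formal definition of the density factor, so the following is only a minimal sketch of the filtering-and-reselection step under an assumed definition: the density factor of a point is taken here to be the fraction of cluster members lying within a fixed radius of it. The names `density_factor`, `filter_representatives`, `radius` and `threshold` are hypothetical and are not taken from the paper.

```python
import math


def density_factor(point, cluster, radius):
    """Assumed definition (not from the paper): fraction of cluster
    members, the point itself included, within `radius` of `point`."""
    near = sum(1 for q in cluster if math.dist(point, q) <= radius)
    return near / len(cluster)


def filter_representatives(reps, cluster, radius, threshold):
    """Delete representatives whose density factor is below `threshold`,
    then reselect replacements from the non-representative points."""
    kept = [p for p in reps if density_factor(p, cluster, radius) >= threshold]

    # Candidate replacements: non-representative points, densest first.
    candidates = sorted(
        (q for q in cluster if q not in reps),
        key=lambda q: density_factor(q, cluster, radius),
        reverse=True,
    )
    for q in candidates:
        if len(kept) == len(reps):
            break  # the original number of representatives is restored
        if density_factor(q, cluster, radius) >= threshold:
            kept.append(q)
    return kept


# Toy cluster: four nearby points and one isolated, noise-like point.
cluster = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
reps = [(0, 0), (10, 10)]
new_reps = filter_representatives(reps, cluster, radius=1.5, threshold=0.5)
```

In this toy example the isolated point `(10, 10)` has only itself within the radius, so it falls below the threshold and is replaced by a point from the dense region. The K-nearest-neighbor modeling and the relevant-degree merge test are not sketched, since the abstract does not specify them.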

Metadata
Title
Representative points clustering algorithm based on density factor and relevant degree
Authors
Di Wu
Jiadong Ren
Long Sheng
Publication date
18-11-2015
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 2/2017
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-015-0451-5