Published in: International Journal of Machine Learning and Cybernetics 3/2017

3 June 2015 | Original Article

Sequence clustering algorithm based on weighted vector identification

Authors: Di Wu, Jiadong Ren

Abstract

Sequence clustering is an active research topic in data mining. However, clustering quality is typically strongly affected by both the selection of initial centers and the mean sequences. In this study, the sequence clustering algorithm based on weighted vector identification (SCAWVI) is developed from the composite similarity of sequence elements and the weight of a sequence in its corresponding cluster. Based on the weighted sequence elements, all sequences in the sequence database are preprocessed into M-dimensional weighted vector identifications. The initial clustering centers are then optimized using a Huffman-based initial clustering centers optimization algorithm. In addition, the weighted vector identification and the weight of a sequence in its corresponding cluster are used to update the clustering centers. The theoretical analysis and the experimental results in this study show that the SCAWVI algorithm achieves higher clustering accuracy and higher execution efficiency.
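
The abstract does not state the formulas behind these steps, so the following Python sketch is only an illustration under explicit assumptions, not the authors' SCAWVI implementation. It assumes that each sequence has already been mapped to an M-dimensional numeric vector, that the Huffman-based step can be approximated by repeatedly merging the two closest nodes into their count-weighted mean until k nodes remain, and that a center is updated as the weighted mean of its members. The function names, the Euclidean distance, and the per-sequence `weights` input are all hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def huffman_initial_centers(vectors, k):
    """Assumed Huffman-style step: merge the two closest nodes until k remain."""
    # Each node is (vector, count); count is the number of points merged into it.
    nodes = [(tuple(v), 1) for v in vectors]
    while len(nodes) > k:
        # Pick the pair of nodes with the smallest Euclidean distance.
        i, j = min(
            ((a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes))),
            key=lambda p: dist(nodes[p[0]][0], nodes[p[1]][0]),
        )
        (va, ca), (vb, cb) = nodes[i], nodes[j]
        merged = tuple((x * ca + y * cb) / (ca + cb) for x, y in zip(va, vb))
        # Remove j first (j > i) so that index i stays valid.
        nodes.pop(j)
        nodes.pop(i)
        nodes.append((merged, ca + cb))
    return [v for v, _ in nodes]


def assign(vectors, centers):
    """Label each vector with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: dist(v, centers[c]))
            for v in vectors]


def weighted_centers(vectors, weights, labels, old_centers):
    """Assumed update: each center is the weighted mean of its members."""
    new_centers = []
    for c, old in enumerate(old_centers):
        members = [(v, w) for v, w, l in zip(vectors, weights, labels) if l == c]
        total = sum(w for _, w in members)
        if total == 0:
            new_centers.append(old)  # keep an empty cluster's center unchanged
            continue
        new_centers.append(tuple(
            sum(w * v[d] for v, w in members) / total for d in range(len(old))
        ))
    return new_centers


def cluster(vectors, weights, k, max_iter=100):
    """Iterate assignment and weighted update until the labels stop changing."""
    centers = huffman_initial_centers(vectors, k)
    labels = assign(vectors, centers)
    for _ in range(max_iter):
        centers = weighted_centers(vectors, weights, labels, centers)
        new_labels = assign(vectors, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers
```

Calling `cluster([(0, 0), (0, 1), (10, 10), (10, 11)], [1, 1, 1, 3], k=2)` groups the first two and the last two vectors, and the larger weight on the last vector pulls its cluster's center toward it. The pairwise search is quadratic per merge, which is acceptable for a sketch but says nothing about the efficiency claims in the abstract.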

Metadata
Title
Sequence clustering algorithm based on weighted vector identification
Authors
Di Wu
Jiadong Ren
Publication date
3 June 2015
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 3/2017
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-015-0381-2