
2018 | OriginalPaper | Chapter

Hybrid Hierarchical Clustering Algorithm Used for Large Datasets: A Pilot Study on Long-Term Sleep Data

Authors: V. Gerla, M. Murgas, A. Mladek, E. Saifutdinova, M. Macas, L. Lhotska

Published in: Precision Medicine Powered by pHealth and Connected Health

Publisher: Springer Singapore

Abstract

Clustering is a widely used analysis technique in modern science, where datasets are often unlabeled and contain hidden dependencies and relations between their elements. This study proposes a new hybrid hierarchical clustering method suitable for large datasets. It is based on a combination of simple, effective methods. The proposed method was tested and compared with a widely used agglomerative clustering method on two groups of datasets. The first group contains data derived from real biomedical data and relates to the practical problem of sleep stage identification. The second group consists of large, artificially generated datasets. The comparison covered computation time, memory consumption, and mutual information.
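
The abstract names mutual information as one of the comparison criteria but does not say how it was computed. As a minimal sketch, assuming it is measured between two labelings of the same items (for example, a clustering result and reference labels), it could be computed as below. The function name and the toy labelings are hypothetical and are not taken from the chapter.

```python
from collections import Counter
from math import log


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n == 0:
        raise ValueError("labelings must not be empty")

    count_a = Counter(labels_a)                 # cluster sizes in labeling A
    count_b = Counter(labels_b)                 # cluster sizes in labeling B
    count_ab = Counter(zip(labels_a, labels_b))  # joint co-occurrence counts

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Hypothetical toy labelings, for illustration only.
reference = [0, 0, 1, 1, 2, 2]
relabelled = [2, 2, 0, 0, 1, 1]   # same partition, different cluster names
merged = [0, 0, 0, 0, 1, 1]       # two reference clusters merged into one

print(mutual_information(reference, relabelled))
print(mutual_information(reference, merged))
```

The measure depends only on how items are grouped, not on the label values, so renaming clusters leaves it unchanged, while merging clusters that the reference keeps apart lowers it.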

Metadata

Title: Hybrid Hierarchical Clustering Algorithm Used for Large Datasets: A Pilot Study on Long-Term Sleep Data
Authors: V. Gerla, M. Murgas, A. Mladek, E. Saifutdinova, M. Macas, L. Lhotska
Copyright Year: 2018
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-10-7419-6_1