Skip to main content
Erschienen in: The Journal of Supercomputing 1/2019

08.05.2018

AA-DBSCAN: an approximate adaptive DBSCAN for finding clusters with varying densities

verfasst von: Jeong-Hun Kim, Jong-Hyeok Choi, Kwan-Hee Yoo, Aziz Nasridinov

Erschienen in: The Journal of Supercomputing | Ausgabe 1/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Clustering is a typical data mining technique that partitions a dataset into multiple subsets of similar objects according to similarity metrics. In particular, density-based algorithms can find clusters of different shapes and sizes while remaining robust to noise objects. DBSCAN, a representative density-based algorithm, finds clusters by defining the density criterion with global parameters, \( \varepsilon \)-distance and \( MinPts \). However, most density-based algorithms, including DBSCAN, find clusters incorrectly because the density criterion is fixed to the global parameters and misapplied to clusters of varying densities. Although studies have been conducted to determine optimal parameters or to improve clustering performance using additional parameters and computations, running time for clustering has been significantly increased, particularly when the dataset is large. In this study, we focus on minimizing the additional computation required to determine the parameters by using the approximate adaptive \( \varepsilon \)-distance for each density while finding the clusters with varying densities that DBSCAN cannot find. Specifically, we propose a new tree structure based on a quadtree to define a dataset density layer. In addition, we propose approximate adaptive DBSCAN (AA-DBSCAN) and kAA-DBSCAN that have clustering performance similar to those of existing algorithms for finding clusters with varying densities while significantly reducing the running time required to perform clustering. We evaluate the proposed algorithms, AA-DBSCAN and kAA-DBSCAN, via extensive experiments using the state-of-the-art algorithms. Experimental results demonstrate an improvement in clustering performance and reduction in running time of the proposed algorithms.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literatur
2.
Zurück zum Zitat Han J, Pei J, Kamber M (2011) Data mining: concepts and techniques. Morgan Kaufmann, WalthamMATH Han J, Pei J, Kamber M (2011) Data mining: concepts and techniques. Morgan Kaufmann, WalthamMATH
4.
Zurück zum Zitat Ester M, Kriegel H-P, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. Kdd 96(34):226–231 Ester M, Kriegel H-P, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. Kdd 96(34):226–231
8.
Zurück zum Zitat Xiong Z, Chen R, Zhang Y, Zhang X (2012) Multi-density DBSCAN algorithm based on density levels partitioning. J Inform Comput Sci 9(10):2739–2749 Xiong Z, Chen R, Zhang Y, Zhang X (2012) Multi-density DBSCAN algorithm based on density levels partitioning. J Inform Comput Sci 9(10):2739–2749
19.
Zurück zum Zitat Lumer ED, Faieta B (1994) Diversity and adaptation in populations of clustering ants. Proc Third Int Conf Simul Adapt Behav 3:501–508 Lumer ED, Faieta B (1994) Diversity and adaptation in populations of clustering ants. Proc Third Int Conf Simul Adapt Behav 3:501–508
20.
Zurück zum Zitat Hartigan JA, Wong MA (1979) Algorithm AS 136: a k-means clustering algorithm. J Roy Stat Soc Ser C (Appl Stat) 28(1):100–108MATH Hartigan JA, Wong MA (1979) Algorithm AS 136: a k-means clustering algorithm. J Roy Stat Soc Ser C (Appl Stat) 28(1):100–108MATH
28.
Zurück zum Zitat Dalli A (2003) Adaptation of the F-measure to cluster based lexicon quality evaluation. In: Proceedings of the EACL 2003 Workshop on Evaluation Initiatives in Natural Language Processing: Are Evaluation Methods, Metrics and Resources Reusable? pp 51–56 Dalli A (2003) Adaptation of the F-measure to cluster based lexicon quality evaluation. In: Proceedings of the EACL 2003 Workshop on Evaluation Initiatives in Natural Language Processing: Are Evaluation Methods, Metrics and Resources Reusable? pp 51–56
Metadaten
Titel
AA-DBSCAN: an approximate adaptive DBSCAN for finding clusters with varying densities
verfasst von
Jeong-Hun Kim
Jong-Hyeok Choi
Kwan-Hee Yoo
Aziz Nasridinov
Publikationsdatum
08.05.2018
Verlag
Springer US
Erschienen in
The Journal of Supercomputing / Ausgabe 1/2019
Print ISSN: 0920-8542
Elektronische ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-018-2380-z

Weitere Artikel der Ausgabe 1/2019

The Journal of Supercomputing 1/2019 Zur Ausgabe

Premium Partner