2002 | OriginalPaper | Book chapter

A Fast Parallel Clustering Algorithm for Large Spatial Databases

Written by: Xiaowei Xu, Jochen Jäger, Hans-Peter Kriegel

Published in: High Performance Data Mining

Publisher: Springer US

The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper, we present PDBSCAN, a parallel version of this algorithm. We use the ‘shared-nothing’ architecture with multiple computers interconnected through a network. A fundamental component of a shared-nothing system is its distributed data structure. We introduce the dR*-tree, a distributed spatial index structure in which the data is spread among multiple computers and the indexes of the data are replicated on every computer. We implemented our method using a number of workstations connected via Ethernet (10 Mbit). A performance evaluation shows that PDBSCAN offers nearly linear speedup and has excellent scaleup and sizeup behavior.
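
To make the density-based notion of clusters concrete, here is a minimal sketch of sequential DBSCAN in Python. It is not the PDBSCAN method of this chapter: it runs on a single machine and replaces the spatial index with a linear scan, so each neighborhood query costs O(n). The names `region_query`, `dbscan`, `eps`, and `min_pts` are illustrative assumptions, not identifiers taken from the chapter.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet processed


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i.

    Assumption: a linear scan stands in for an index-based range query.
    """
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not a core point; may later be relabeled as a border point.
            labels[i] = NOISE
            continue
        # points[i] is a core point: start a new cluster and expand it.
        labels[i] = cluster_id
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is a core point: keep expanding
        cluster_id += 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

In this example the two groups of four points form two clusters and the isolated point at (50, 50) is labeled as noise. Nearly all of the running time goes into `region_query`, which is why an efficient spatial index matters when this kind of algorithm is scaled to large databases.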

Metadata
Title
A Fast Parallel Clustering Algorithm for Large Spatial Databases
Written by
Xiaowei Xu
Jochen Jäger
Hans-Peter Kriegel
Copyright year
2002
Publisher
Springer US
DOI
https://doi.org/10.1007/0-306-47011-X_3
