Skip to main content
Top

2014 | OriginalPaper | Chapter

Fast SNN-Based Clustering Approach for Large Geospatial Data Sets

Authors : Arménio Antunes, Maribel Yasmina Santos, Adriano Moreira

Published in: Connecting a Digital Europe Through Location and Place

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Current positioning and sensing technologies enable the collection of very large spatio-temporal data sets. When analysing movement data, researchers often resort to clustering techniques to extract useful patterns from these data. Density-based clustering algorithms, although being very adequate to the analysis of this type of data, can be very inefficient when analysing huge amounts of data. The Shared Nearest Neighbour (SNN) algorithm presents low efficiency when dealing with high quantities of data due to its complexity evaluated in the worst case by O(n\(^{2}\)). This chapter presents a clustering method, based on the SNN algorithm that significantly reduces the processing time by segmenting the spatial dimension of the data into a set of cells, and by minimizing the number of cells that have to be visited while searching for the k-nearest neighbours of each vector. The obtained results show an expressive reduction of the time needed to find the k-nearest neighbours and to compute the clusters, while producing results that are equal to those produced by the original SNN algorithm. Experimental results obtained with three different data sets (2D and 3D), one synthetic and two real, show that the proposed method enables the analysis of much larger data sets within reasonable amount of time.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
go back to reference Bentley JL (1975) Multidimensional binary search trees used for associative searching. Commun ACM 18(9):509–517CrossRef Bentley JL (1975) Multidimensional binary search trees used for associative searching. Commun ACM 18(9):509–517CrossRef
go back to reference Bhavsar HB, Jivani A G (2009) The shared nearest neighbor algorithm with enclosures (SNNAE). In: 2009 WRI world congress on computer science and information engineering. doi:10.1109/CSIE.2009.997 Bhavsar HB, Jivani A G (2009) The shared nearest neighbor algorithm with enclosures (SNNAE). In: 2009 WRI world congress on computer science and information engineering. doi:10.​1109/​CSIE.​2009.​997
go back to reference Chávez E, Navarro G, Baeza-Yates R, Marroquín JL (2001) Searching in metric spaces. ACM Comput Surv 33(3):273–321CrossRef Chávez E, Navarro G, Baeza-Yates R, Marroquín JL (2001) Searching in metric spaces. ACM Comput Surv 33(3):273–321CrossRef
go back to reference Ertoz L, Steinbach M, Kumar V (2002) A new shared nearest neighbor clustering algorithm and its applications. In: Proceedings of the workshop on clustering high dimensional data and its applications at 2nd SIAM international conference on data mining, p 105–115 Ertoz L, Steinbach M, Kumar V (2002) A new shared nearest neighbor clustering algorithm and its applications. In: Proceedings of the workshop on clustering high dimensional data and its applications at 2nd SIAM international conference on data mining, p 105–115
go back to reference Ertoz L, Steinbach M, Kumar V (2003) Finding clusters of different sizes, shapes, and densities in noisy, high dimensional data. In: Proceedings of the 3rd SIAM international conference on data mining Ertoz L, Steinbach M, Kumar V (2003) Finding clusters of different sizes, shapes, and densities in noisy, high dimensional data. In: Proceedings of the 3rd SIAM international conference on data mining
go back to reference Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of 2nd international conference on knowledge discovery and data mining, KDD 1996, p 226–231 Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of 2nd international conference on knowledge discovery and data mining, KDD 1996, p 226–231
go back to reference Faustino B (2012) Implementation for spatial data of the shared nearest neighbour with metric data structures. M.Sc. thesis, New University of Lisbon Faustino B (2012) Implementation for spatial data of the shared nearest neighbour with metric data structures. M.Sc. thesis, New University of Lisbon
go back to reference Li X, Jiang SY, Su XK (2009) A novel fast clustering algorithm. In: Proceedings of the international conference on artificial intelligence and computational intelligence, pp 284–288 Li X, Jiang SY, Su XK (2009) A novel fast clustering algorithm. In: Proceedings of the international conference on artificial intelligence and computational intelligence, pp 284–288
go back to reference Moreira A, Santos MY, Wachowicz M, Orellana D (2010) The impact of data quality in the context of pedestrian movement analysis. In: Painho M, Santos MY, Pundt H (eds) Geospatial thinking. Springer, Berlin, pp 61–78 Moreira A, Santos MY, Wachowicz M, Orellana D (2010) The impact of data quality in the context of pedestrian movement analysis. In: Painho M, Santos MY, Pundt H (eds) Geospatial thinking. Springer, Berlin, pp 61–78
go back to reference Moreira G, Santos MY, Moura-Pires J (2013) SNN Input Parameters: how are they related? In: Proceedings of the 19th IEEE international conference on parallel and distributed systems (ICPADS’2013). Seoul, Korea, pp 15–18 (IEEE computer society, December) Moreira G, Santos MY, Moura-Pires J (2013) SNN Input Parameters: how are they related? In: Proceedings of the 19th IEEE international conference on parallel and distributed systems (ICPADS’2013). Seoul, Korea, pp 15–18 (IEEE computer society, December)
go back to reference Traina C, Traina A, Faloutsos C, Seeger B (2002) Fast indexing and visualization of metric data sets using slim-trees. IEEE Trans Knowl Data Eng 14(2):244–260CrossRef Traina C, Traina A, Faloutsos C, Seeger B (2002) Fast indexing and visualization of metric data sets using slim-trees. IEEE Trans Knowl Data Eng 14(2):244–260CrossRef
go back to reference Tripathy A, Maji SK, Patra PK (2011) FDCA: A fast density based clustering algorithm for spatial database system. In: Proceedings of the 2nd international conference on computer and communication technology, pp 21–26 Tripathy A, Maji SK, Patra PK (2011) FDCA: A fast density based clustering algorithm for spatial database system. In: Proceedings of the 2nd international conference on computer and communication technology, pp 21–26
Metadata
Title
Fast SNN-Based Clustering Approach for Large Geospatial Data Sets
Authors
Arménio Antunes
Maribel Yasmina Santos
Adriano Moreira
Copyright Year
2014
DOI
https://doi.org/10.1007/978-3-319-03611-3_11

Premium Partner