Published in: Journal of Geographical Systems 4/2015

1 October 2015 | Original Article

Optimizing distance-based methods for large data sets

Authors: Tobias Scholl, Thomas Brenner


Abstract

Distance-based methods for measuring the spatial concentration of industries have become increasingly popular in the spatial econometrics community. However, their computational complexity limits their use, since both their memory requirements and their running times are in \({\mathcal {O}}(n^2)\). In this paper, we present an algorithm with constant memory requirements and shorter running time, enabling distance-based methods to deal with large data sets. We discuss three recent distance-based methods in spatial econometrics: the D&O-Index by Duranton and Overman (Rev Econ Stud 72(4):1077–1106, 2005), the M-function by Marcon and Puech (J Econ Geogr 10(5):745–762, 2010) and the Cluster-Index by Scholl and Brenner (Reg Stud (ahead-of-print):1–15, 2014). Finally, we present an alternative calculation for the Cluster-Index that allows the use of data sets with millions of firms.
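
To illustrate the memory argument, the following minimal sketch contrasts two ways of counting point pairs per distance bin, a building block common to distance-based measures. It is not the algorithm presented in the paper, which the abstract does not spell out. The function names, the fixed-width binning and the toy coordinates are assumptions made for illustration only. The streaming version keeps only the bin counters, so its working memory does not grow with the number of points, while the reference version first materialises the full \(n \times n\) distance matrix. Both versions still visit every pair, so this sketch does not shorten the running time.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def pair_histogram_streaming(points: Sequence[Point],
                             bin_width: float,
                             n_bins: int) -> List[int]:
    """Count unordered pairs per distance bin without storing distances.

    Working memory is n_bins counters, independent of len(points).
    Hypothetical helper for illustration, not the authors' method.
    """
    counts = [0] * n_bins
    for i in range(len(points)):
        xi, yi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            k = int(d // bin_width)
            if k < n_bins:  # pairs beyond the last bin are ignored
                counts[k] += 1
    return counts


def pair_histogram_matrix(points: Sequence[Point],
                          bin_width: float,
                          n_bins: int) -> List[int]:
    """Reference version that builds the full n x n distance matrix first."""
    n = len(points)
    dist = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in points]
            for p in points]
    counts = [0] * n_bins
    for i in range(n):
        for j in range(i + 1, n):
            k = int(dist[i][j] // bin_width)
            if k < n_bins:
                counts[k] += 1
    return counts


# Toy data (assumed): four firm locations in arbitrary units.
firms = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
print(pair_histogram_streaming(firms, bin_width=2.0, n_bins=5))
```

Both functions return the same counts; only the streaming variant avoids holding \(n^2\) distances in memory at once.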


Footnotes
1
Note that Eq. (2) is modified for the observation of an unmarked point pattern. See Marcon and Puech (2010, p. 749) for the original formula of the M-function.

2
Note that the latest version of dbmss already includes computational improvements. See Sect. 4 for more information.

References
Baddeley A, Møller J, Waagepetersen RP (2000) Non- and semi-parametric estimation of interaction in inhomogeneous point patterns. Stat Neerl 54(3):329–350
Barlet M, Briant A, Crusson L (2013) Location patterns of service industries in France: a distance-based approach. Reg Sci Urban Econ 43(2):338–351
Duque JC, Aldstadt J, Velasquez E, Franco JL, Betancourt A (2011) A computationally efficient method for delineating irregularly shaped spatial clusters. J Geogr Syst 13(4):355–372
Ellison G, Glaeser EL, Kerr WR (2010) What causes industry agglomeration? Evidence from coagglomeration patterns. Am Econ Rev 100(3):1195–1213
Espa G, Arbia G, Giuliani D et al (2010) Measuring industrial agglomeration with inhomogeneous K-function: the case of ICT firms in Milan (Italy). Artículo de trabajo 14:1–11
German Federal Ministry of Economics and Technology (2010) Möglichkeiten und Grenzen einer Verbesserung der Wettbewerbssituation der Automobilindustrie durch Abbau von branchenspezifischen Kosten aus Informationspflichten. BMBF, Stuttgart
Getis A, Ord JK (1992) The analysis of spatial association by use of distance statistics. Geogr Anal 24(3):189–206
Hjaltason GR, Samet H (1999) Distance browsing in spatial databases. ACM Trans Database Syst 24(2):265–318
Koh HJ, Riedel N (2014) Assessing the localization pattern of German manufacturing and service industries: a distance-based approach. Reg Stud 48(5):823–843
Kosfeld R, Eckey HF, Lauridsen J (2011) Spatial point pattern analysis and industry concentration. Ann Reg Sci 47(2):311–328
Marcon E, Traissac S, Lang G (2013) A statistical test for Ripley's function rejection of Poisson null hypothesis. Int Sch Res Not 1:1–9
Marcon E, Puech F (2003) Evaluating the geographic concentration of industries using distance-based methods. J Econ Geogr 3(4):409–428
Marcon E, Puech F (2010) Measures of the geographic concentration of industries: improving distance-based methods. J Econ Geogr 10(5):745–762
Miller HJ (2010) The data avalanche is here. Shouldn't we be digging? J Reg Sci 50(1):181–201
Neill DB, Moore AW (2004) Rapid detection of significant spatial clusters. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 256–265
Openshaw S (1984) The modifiable areal unit problem. Institute of British Geographers, Norwich
Ripley BD (2005) Spatial statistics. Wiley, New Jersey
Sankaranarayanan J, Samet H, Varshney A (2007) A fast all nearest neighbor algorithm for applications involving large point-clouds. Comput Graph 31(2):157–174
Vitali S, Napoletano M, Fagiolo G (2013) Spatial localization in manufacturing: a cross-country analysis. Reg Stud 47(9):1534–1554
Metadata
Title
Optimizing distance-based methods for large data sets
Authors
Tobias Scholl
Thomas Brenner
Publication date
1 October 2015
Publisher
Springer Berlin Heidelberg
Published in
Journal of Geographical Systems / Issue 4/2015
Print ISSN: 1435-5930
Electronic ISSN: 1435-5949
DOI
https://doi.org/10.1007/s10109-015-0219-1
