Data-driven geography

GeoJournal

Abstract

The context for geographic research has shifted from a data-scarce to a data-rich environment, in which the most fundamental changes are not just the volume of data but also the variety of georeferenced data and the velocity at which it can be captured, trends often associated with the concept of Big Data. A data-driven geography may be emerging in response to the wealth of georeferenced data flowing from sensors and people in the environment. Although this may seem revolutionary, it may be better described as evolutionary. Some of the issues raised by data-driven geography are longstanding in geographic research, namely large data volumes, dealing with populations and messy data, and tensions between idiographic and nomothetic knowledge. The belief that spatial context matters is a major theme in geographic thought and a major motivation behind approaches such as time geography, disaggregate spatial statistics and GIScience. There is potential to use Big Data to inform both geographic knowledge discovery and spatial modeling. However, there are challenges, such as how to formalize geographic knowledge to clean data and to ignore spurious patterns, and how to build data-driven models that are both true and understandable.

Author information

Correspondence to Harvey J. Miller.

Cite this article

Miller, H.J., Goodchild, M.F. Data-driven geography. GeoJournal 80, 449–461 (2015). https://doi.org/10.1007/s10708-014-9602-6

  • DOI: https://doi.org/10.1007/s10708-014-9602-6
