DOI: 10.1145/2447481.2447482
research-article

Spatiotemporal data mining in the era of big spatial data: algorithms and applications

Published: 06 November 2012

ABSTRACT

Spatial data mining is the process of discovering interesting, previously unknown, and potentially useful patterns from spatial and spatiotemporal data. However, the explosive growth of spatial and spatiotemporal data, together with the emergence of social media and location-sensing technologies, emphasizes the need for new, computationally efficient methods tailored to analyzing big data. In this paper, we review major spatial data mining algorithms, looking closely at their computational and I/O requirements, and allude to a few applications dealing with big spatial data.


      Published in

      BigSpatial '12: Proceedings of the 1st ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data
      November 2012
      116 pages
      ISBN: 9781450316927
      DOI: 10.1145/2447481

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

      Publisher: Association for Computing Machinery, New York, NY, United States

      Acceptance Rates

      Overall acceptance rate: 32 of 58 submissions, 55%
