Published in: International Journal of Data Science and Analytics, Issue 2-3/2018

30.09.2017 | Regular Paper

A data mining framework for environmental and geo-spatial data analysis

Authors: Sujing Wang, Christoph F. Eick

Abstract

Mining geo-spatial data is an important task in many application domains, such as environmental science, geographic information science, and social networks. In this paper, we introduce a data mining framework that covers pre-processing of environmental and geo-spatial data, geo-spatial data mining techniques, and visual analysis of the results. In particular, we propose new density-based clustering algorithms to identify interesting distribution patterns in geo-spatial data, a change pattern discovery technique to detect dynamic change patterns within spatial clusters, and a post-processing technique to extract interesting patterns and useful knowledge from geo-spatial data. Our density-based clustering algorithms build on the well-established density-based shared nearest neighbor clustering algorithm, which can find clusters of different shapes, sizes, and densities in high-dimensional data. The post-processing analysis technique allows automatic screening of interesting spatial clusters. The change pattern discovery algorithm detects and analyzes dynamic patterns of change within spatial clusters. This paper focuses on developing a framework that integrates these steps into a single data mining process: clustering, post-processing analysis, and change pattern discovery. In contrast to previous work in this area, our approaches can cluster and analyze dynamically evolving complex objects, i.e., polygons. We evaluate the effectiveness of our techniques through a challenging real-world case study involving ozone pollution events in the Houston–Galveston–Brazoria area. The experimental results show that our approaches can discover interesting patterns and useful information from geo-spatial air-quality data.
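
To make the shared nearest neighbor (SNN) idea mentioned in the abstract more concrete, the following is a minimal sketch of generic density-based SNN clustering applied to polygons. It is not the authors' algorithm. The dissimilarity between polygons (a discrete Hausdorff distance over vertices), the function names `hausdorff`, `snn_cluster`, and `square`, and the parameters `k`, `eps`, and `min_pts` are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import dist


def hausdorff(a, b):
    """Discrete Hausdorff distance between two vertex lists (assumed dissimilarity)."""
    def directed(p, q):
        return max(min(dist(u, v) for v in q) for u in p)
    return max(directed(a, b), directed(b, a))


def snn_cluster(objects, distance, k, eps, min_pts):
    """Generic SNN density-based clustering; returns one label per object, -1 = noise."""
    n = len(objects)
    # Step 1: k-nearest-neighbor set of every object (the object itself is excluded).
    knn = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: distance(objects[i], objects[j]))
        knn.append(set(others[:k]))

    # Step 2: SNN similarity = number of shared neighbors, counted only
    # when the two objects appear in each other's neighbor sets.
    def sim(i, j):
        if i in knn[j] and j in knn[i]:
            return len(knn[i] & knn[j])
        return 0

    # Step 3: SNN density = number of objects with similarity >= eps;
    # objects whose density reaches min_pts are core objects.
    density = [sum(1 for j in range(n) if j != i and sim(i, j) >= eps)
               for i in range(n)]
    core = [i for i in range(n) if density[i] >= min_pts]

    # Step 4: merge core objects that are sufficiently similar (union-find).
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(core, 2):
        if sim(i, j) >= eps:
            parent[find(i)] = find(j)

    # Step 5: label core objects, then attach each non-core object to its
    # most similar core object if the similarity reaches eps; otherwise noise.
    labels = [-1] * n
    roots = {}
    for i in core:
        labels[i] = roots.setdefault(find(i), len(roots))
    for i in range(n):
        if labels[i] == -1 and core:
            best = max(core, key=lambda c: sim(i, c))
            if sim(i, best) >= eps:
                labels[i] = labels[best]
    return labels


def square(x, y, side=1.0):
    """Hypothetical helper: axis-aligned square polygon as a vertex list."""
    return [(x, y), (x + side, y), (x + side, y + side), (x, y + side)]


# Toy data: two groups of overlapping squares plus one isolated polygon.
polygons = [square(0, 0), square(0.5, 0), square(0, 0.5), square(0.5, 0.5),
            square(10, 10), square(10.5, 10), square(10, 10.5), square(10.5, 10.5),
            square(50, -50)]
labels = snn_cluster(polygons, hausdorff, k=3, eps=2, min_pts=2)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Because similarity is measured by shared neighbors rather than raw distance, the isolated polygon is marked as noise even though it has nearest neighbors of its own: none of those neighbors list it in return. A production system would likely use a true polygon distance from a geometry library instead of the vertex-only approximation used here.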


Metadata
Title
A data mining framework for environmental and geo-spatial data analysis
Authors
Sujing Wang
Christoph F. Eick
Publication date
30.09.2017
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics / Issue 2-3/2018
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-017-0075-9
