Published in: International Journal of Data Science and Analytics 4/2017

7 September 2017 | Regular Paper

A spectral clustering approach for multivariate geostatistical data

Author: Francky Fouedjio


Abstract

Spectral clustering has recently become one of the most popular modern clustering methods for conventional data. Applied to geostatistical data, however, the general spectral clustering method produces spatially non-contiguous clusters, which is undesirable for many geoscience applications. This paper proposes a spectral clustering approach that discovers spatially contiguous and meaningful clusters in multivariate geostatistical data, in which spatial dependence plays an important role. The approach relies on a similarity measure built from a nonparametric kernel estimator of the multivariate spatial dependence structure of the data, which emphasizes the spatial correlation among data locations. It integrates existing methods to find the relevant number of clusters and to assess the contribution of each variable to the formation of the clusters. Results on both synthetic and real-world datasets demonstrate that the proposed approach can effectively provide spatially contiguous and meaningful clusters.
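
The general idea of spectral clustering on a spatially aware similarity can be sketched as follows. This is only an illustrative sketch, not the method of the paper: the paper builds its similarity from a nonparametric kernel estimator of the multivariate spatial dependence structure, whereas the code below substitutes a much simpler stand-in, a product of Gaussian kernels on coordinates and on attribute values. The function names and the bandwidths `h_space` and `h_attr` are assumptions introduced for this example, and the clustering step follows the generic normalized spectral clustering recipe (eigenvectors of the normalized affinity matrix followed by k-means).

```python
import numpy as np


def spatial_similarity(coords, values, h_space, h_attr):
    """Stand-in similarity (an assumption, not the paper's estimator):
    product of Gaussian kernels on locations and on attribute values."""
    d_space = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d_attr = np.linalg.norm(values[:, None, :] - values[None, :, :], axis=2)
    w = np.exp(-(d_space / h_space) ** 2) * np.exp(-(d_attr / h_attr) ** 2)
    np.fill_diagonal(w, 0.0)  # no self-loops in the similarity graph
    return w


def spectral_embedding(w, k):
    """Row-normalized top-k eigenvectors of D^{-1/2} W D^{-1/2}."""
    deg = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    m = d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(m)  # eigenvalues are returned in ascending order
    u = vecs[:, -k:]             # keep the k eigenvectors with largest eigenvalues
    norms = np.linalg.norm(u, axis=1, keepdims=True)
    return u / np.maximum(norms, 1e-12)


def kmeans(x, k, n_iter=50):
    """Plain Lloyd iterations with a deterministic farthest-point start."""
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels


def spectral_cluster(coords, values, k, h_space, h_attr):
    """Cluster data locations from coordinates (n, d) and attributes (n, p)."""
    w = spatial_similarity(coords, values, h_space, h_attr)
    return kmeans(spectral_embedding(w, k), k)
```

Calling `spectral_cluster(coords, values, k=2, h_space=0.3, h_attr=1.0)` on an array of coordinates and an array of measured variables returns one integer label per data location. Because the spatial kernel shrinks the similarity between distant locations, clusters in this sketch tend to be spatially compact, which is the behaviour the abstract argues a plain attribute-only similarity lacks.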


Metadata
Title
A spectral clustering approach for multivariate geostatistical data
Author
Francky Fouedjio
Publication date
7 September 2017
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics / Issue 4/2017
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-017-0069-7
