Published in: International Journal of Data Science and Analytics 4/2017

07-09-2017 | Regular Paper

A spectral clustering approach for multivariate geostatistical data

Author: Francky Fouedjio

Abstract

Spectral clustering has recently become one of the most popular modern clustering methods for conventional data. Applied to geostatistical data, however, the general spectral clustering method produces spatially non-contiguous clusters, which is undesirable for many geoscience applications. This paper proposes a spectral clustering approach that discovers spatially contiguous and meaningful clusters in multivariate geostatistical data, in which spatial dependence plays an important role. The approach relies on a similarity measure built from a nonparametric kernel estimator of the multivariate spatial dependence structure of the data, which emphasizes the spatial correlation among data locations. It integrates existing methods to find the relevant number of clusters and to assess the contribution of each variable to the formation of the clusters. Results on both synthetic and real-world datasets demonstrate that the proposed approach can effectively provide spatially contiguous and meaningful clusters.
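
To make the general idea concrete, the following is a minimal sketch of spectral clustering on a precomputed similarity matrix, in the style of normalized spectral clustering (Ng, Jordan and Weiss). It is not the paper's method: the function `stand_in_similarity` is a hypothetical placeholder that multiplies a Gaussian kernel on spatial distance by a Gaussian kernel on attribute distance, whereas the paper builds its similarity from a nonparametric kernel estimator of the multivariate spatial dependence structure. The bandwidths, the toy data and the simple k-means routine are likewise assumptions chosen for illustration.

```python
import numpy as np

def stand_in_similarity(coords, values, spatial_bw, attr_bw):
    """Hypothetical similarity (NOT the paper's estimator): product of a
    Gaussian kernel on spatial distance and one on attribute distance."""
    d_space = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d_attr = np.linalg.norm(values[:, None, :] - values[None, :, :], axis=-1)
    return np.exp(-(d_space / spatial_bw) ** 2) * np.exp(-(d_attr / attr_bw) ** 2)

def spectral_embedding(W, k):
    """Row-normalized top-k eigenvectors of D^{-1/2} W D^{-1/2}."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    M = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(M)   # eigenvalues come back in ascending order
    U = eigvecs[:, -k:]              # eigenvectors of the k largest eigenvalues
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(X, k, n_iter=100):
    """Plain k-means with deterministic farthest-first initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(d2, axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data (assumed): two spatially separated groups of 20 locations,
# each with three measured variables.
rng = np.random.default_rng(42)
coords = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(10.0, 0.5, (20, 2))])
values = np.vstack([rng.normal(0.0, 0.3, (20, 3)), rng.normal(5.0, 0.3, (20, 3))])

W = stand_in_similarity(coords, values, spatial_bw=2.0, attr_bw=2.0)
labels = kmeans(spectral_embedding(W, k=2), k=2)
```

Because the stand-in similarity decays with spatial distance, distant locations are only weakly connected in the similarity graph, which tends to favor spatially contiguous clusters. Swapping in a different similarity matrix `W` leaves the rest of the pipeline unchanged.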

Metadata
Title
A spectral clustering approach for multivariate geostatistical data
Author
Francky Fouedjio
Publication date
07-09-2017
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics / Issue 4/2017
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-017-0069-7