Skip to main content
Erschienen in: Pattern Analysis and Applications 1-2/2005

01.09.2005

Clustering spatial data with a hybrid EM approach

verfasst von: Tianming Hu, Sam Yuan Sung

Erschienen in: Pattern Analysis and Applications | Ausgabe 1-2/2005

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

In spatial clustering, in addition to the object similarity in the normal attribute space, similarity in the spatial space needs to be considered and objects assigned to the same cluster should usually be close to one another in the spatial space. The conventional expectation maximization (EM) algorithm is not suited for spatial clustering because it does not consider spatial information. Although neighborhood EM (NEM) algorithm incorporates a spatial penalty term to the criterion function, it involves much more iterations in every E-step. In this paper, we propose a Hybrid EM (HEM) approach that combines EM and NEM. Its computational complexity for every pass is between EM and NEM. Experiments also show that its clustering quality is better than EM and comparable to NEM.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Ambroise C, Govaert G (1998) Convergence of an EM-type algorithm for spatial clustering. Pattern Recognit Lett 19(10):919–927 Ambroise C, Govaert G (1998) Convergence of an EM-type algorithm for spatial clustering. Pattern Recognit Lett 19(10):919–927
2.
Zurück zum Zitat Basu S, Bilenko M, Mooney RJ (2004) A probabilistic framework for semi-supervised clustering. In: Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining, pp 59–68 Basu S, Bilenko M, Mooney RJ (2004) A probabilistic framework for semi-supervised clustering. In: Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining, pp 59–68
3.
Zurück zum Zitat Celeux G, Govaert G (1992) A classification EM algorithm for clustering and two stochastic versions. Comput Stat Data Anal 14(3):315–332 Celeux G, Govaert G (1992) A classification EM algorithm for clustering and two stochastic versions. Comput Stat Data Anal 14(3):315–332
4.
Zurück zum Zitat Cressie NA (1993) Statistics for spatial data. Wiley, New York Cressie NA (1993) Statistics for spatial data. Wiley, New York
5.
Zurück zum Zitat Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B(39):1–38 Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B(39):1–38
6.
Zurück zum Zitat Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of 2nd international conference on knowledge discovery and data mining, pp 226–231 Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of 2nd international conference on knowledge discovery and data mining, pp 226–231
7.
Zurück zum Zitat Estivill-Castro V, Lee I (2001) Fast spatial clustering with different metrics and in the presence of obstacles. In: Proceedings of the 9th ACM international symposium on aAdvances in geographic information systems, pp 142 – 147 Estivill-Castro V, Lee I (2001) Fast spatial clustering with different metrics and in the presence of obstacles. In: Proceedings of the 9th ACM international symposium on aAdvances in geographic information systems, pp 142 – 147
8.
Zurück zum Zitat Friedman N (1998) The Bayesian structural EM algorithm. In: Proceedings of the 14th conference on uncertainty in artificial intelligence, pp 129–138 Friedman N (1998) The Bayesian structural EM algorithm. In: Proceedings of the 14th conference on uncertainty in artificial intelligence, pp 129–138
9.
Zurück zum Zitat Geman S, Geman G (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans Pattern Anal Machine Intell 6:721–741 Geman S, Geman G (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans Pattern Anal Machine Intell 6:721–741
10.
Zurück zum Zitat Gilley OW, Pace RK (1996) On the harrison and rubinfeld data. J Environ Econ Management 31:403–405 Gilley OW, Pace RK (1996) On the harrison and rubinfeld data. J Environ Econ Management 31:403–405
11.
Zurück zum Zitat Guo D, Peuquet D, Gahegan M (2002) Opening the black box: Interactive hierarchical clustering for multivariate spatial patterns. In: Proceedings of the 10th ACM international symposium on advances in geographic information systems, pp 131 – 136 Guo D, Peuquet D, Gahegan M (2002) Opening the black box: Interactive hierarchical clustering for multivariate spatial patterns. In: Proceedings of the 10th ACM international symposium on advances in geographic information systems, pp 131 – 136
12.
Zurück zum Zitat Hathaway RJ (1986) Another interpretation of the EM algorithm for mixture distributions. Stat Probabil Lett 4:53–56 Hathaway RJ (1986) Another interpretation of the EM algorithm for mixture distributions. Stat Probabil Lett 4:53–56
13.
Zurück zum Zitat Jain AK, Farrokhnia F (1991) Unsupervised texture segmentation using Gabor filters. Pattern Recognit 24(12):1167–1186 Jain AK, Farrokhnia F (1991) Unsupervised texture segmentation using Gabor filters. Pattern Recognit 24(12):1167–1186
14.
Zurück zum Zitat Karypis G, Han EH, Kumar V (1999) CHAMELEON: a hierarchical clustering algorithm using dynamic modeling. Computer 32(8):68–75 Karypis G, Han EH, Kumar V (1999) CHAMELEON: a hierarchical clustering algorithm using dynamic modeling. Computer 32(8):68–75
15.
Zurück zum Zitat Kaufman L, Rousseeuw PJ (1990) Finding groups in data: an introduction to cluster analysis. Wiley, New York Kaufman L, Rousseeuw PJ (1990) Finding groups in data: an introduction to cluster analysis. Wiley, New York
16.
Zurück zum Zitat Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22:79–86 Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22:79–86
17.
Zurück zum Zitat Legendre P (1987) Constrained clustering. In: Legendre P, Legendre L (eds) Developments in numerical ecology, pp 289–307. NATO ASI Series G 14 Legendre P (1987) Constrained clustering. In: Legendre P, Legendre L (eds) Developments in numerical ecology, pp 289–307. NATO ASI Series G 14
19.
Zurück zum Zitat Mceliece R (1977) Theory of information and coding. Addison-Wesley, Reading Mceliece R (1977) Theory of information and coding. Addison-Wesley, Reading
20.
Zurück zum Zitat Murphy PM, Aha DW (1994) UCI Repository of Machine Learning Databases. Department of Information and Computer Science, University of California at Irvine, http://www.ics.uci.edu/~ mlearn/MLRepository.html Murphy PM, Aha DW (1994) UCI Repository of Machine Learning Databases. Department of Information and Computer Science, University of California at Irvine, http://​www.​ics.​uci.​edu/​~ mlearn/MLRepository.html
21.
Zurück zum Zitat Neal R, Hinton G (1998) A view of the EM algorithm that justifies incremental, sparse, and other variants. In: Jordan M (ed) Learning in graphical models. Kluwer, Dordrecht, pp 355–368 Neal R, Hinton G (1998) A view of the EM algorithm that justifies incremental, sparse, and other variants. In: Jordan M (ed) Learning in graphical models. Kluwer, Dordrecht, pp 355–368
22.
Zurück zum Zitat Neukirchen C, Rottland J, Willett D, Rigoll G (2001) A continuous density interpretation of discrete HMM systems and MMI-neural networks. IEEE Trans Speech Audio Processing 9(4):367–377 Neukirchen C, Rottland J, Willett D, Rigoll G (2001) A continuous density interpretation of discrete HMM systems and MMI-neural networks. IEEE Trans Speech Audio Processing 9(4):367–377
23.
Zurück zum Zitat Ng R, Han J (2002) CLARANS: A method for clustering objects for spatial data mining. IEEE Trans Knowl Data Eng 14(5):1003–1016 Ng R, Han J (2002) CLARANS: A method for clustering objects for spatial data mining. IEEE Trans Knowl Data Eng 14(5):1003–1016
24.
Zurück zum Zitat Pena JM, Lozano JA, Larranaga P (2000) An improved Bayesian structural EM algorithm for learning Bayesian networks for clustering. Pattern Recognit Lett 21(8):779–786 Pena JM, Lozano JA, Larranaga P (2000) An improved Bayesian structural EM algorithm for learning Bayesian networks for clustering. Pattern Recognit Lett 21(8):779–786
25.
Zurück zum Zitat Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Mateo Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Mateo
26.
Zurück zum Zitat Rasson JP, Granville V (1995) Multivariate discriminant analysis and maximum penalized likelihood density estimation. J R Stat Soc B(57):501–517 Rasson JP, Granville V (1995) Multivariate discriminant analysis and maximum penalized likelihood density estimation. J R Stat Soc B(57):501–517
26.
Zurück zum Zitat Tung A, Hou J, Han J (2001) Spatial clustering in the presence of obstacles. In: Proceedings of 17th international conference on data engineering, pp 359–367 Tung A, Hou J, Han J (2001) Spatial clustering in the presence of obstacles. In: Proceedings of 17th international conference on data engineering, pp 359–367
Metadaten
Titel
Clustering spatial data with a hybrid EM approach
verfasst von
Tianming Hu
Sam Yuan Sung
Publikationsdatum
01.09.2005
Erschienen in
Pattern Analysis and Applications / Ausgabe 1-2/2005
Print ISSN: 1433-7541
Elektronische ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-005-0251-8

Weitere Artikel der Ausgabe 1-2/2005

Pattern Analysis and Applications 1-2/2005 Zur Ausgabe

Premium Partner