
2020 | Original Paper | Book Chapter

6. L2 Normalized Data Clustering Through the Dirichlet Process Mixture Model of von Mises Distributions with Localized Feature Selection

Authors: Wentao Fan, Nizar Bouguila, Yewang Chen, Ziyi Chen

Published in: Mixture Models and Applications

Publisher: Springer International Publishing


Abstract

In this chapter, we propose a probabilistic model-based approach for clustering L2-normalized data. Our approach is based on the Dirichlet process mixture model of von Mises (VM) distributions. Since it assumes an infinite number of clusters (i.e., mixture components), the Dirichlet process mixture model of VM distributions can also be regarded as the infinite VM mixture model. In contrast to finite mixture models, in which the number of mixture components has to be determined through extra effort, the infinite VM mixture model is nonparametric: the number of mixture components is assumed to be infinite initially and is inferred automatically during the learning process. To improve clustering performance on high-dimensional data, a localized feature selection scheme is integrated into the infinite VM mixture model; it can effectively detect irrelevant features based on the estimated feature saliencies. To learn the proposed infinite mixture model with localized feature selection, we develop an effective approach using variational inference that estimates model parameters and feature saliencies with closed-form solutions. Our model-based clustering approach is validated on two challenging applications, namely topic novelty detection and unsupervised image categorization.
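The chapter's full variational treatment of the infinite VM mixture with localized feature selection is involved; as a minimal, hypothetical illustration of the underlying clustering idea only, the sketch below fits a *finite* mixture of von Mises distributions to angles on the circle by EM. This is a truncated stand-in for the infinite model, not the chapter's algorithm: the farthest-point initialization, the function name `vm_mixture_em`, and the standard closed-form approximation used for the concentration update are all our own illustrative choices.

```python
import numpy as np
from scipy.stats import vonmises

def vm_mixture_em(theta, K=2, iters=100):
    """EM for a finite mixture of von Mises distributions on the circle.

    A truncated, finite stand-in for the chapter's infinite (Dirichlet
    process) VM mixture; it omits the variational treatment and the
    localized feature selection.
    """
    # Farthest-point initialization of component means (circular distance).
    mu = [theta[0]]
    for _ in range(1, K):
        d = np.min(np.abs(np.angle(np.exp(1j * (theta[:, None] - np.array(mu)[None, :])))), axis=1)
        mu.append(theta[int(np.argmax(d))])
    mu = np.array(mu)
    kappa = np.ones(K)
    w = np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each angle.
        dens = np.stack([w[k] * vonmises.pdf(theta, kappa[k], loc=mu[k])
                         for k in range(K)], axis=1)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: circular mean and (approximate) concentration per component.
        for k in range(K):
            C = r[:, k] @ np.cos(theta)
            S = r[:, k] @ np.sin(theta)
            mu[k] = np.arctan2(S, C)
            Rbar = np.hypot(C, S) / r[:, k].sum()
            # Standard approximation of the inverse of A(kappa) = I1/I0.
            kappa[k] = min(Rbar * (2 - Rbar**2) / (1 - Rbar**2 + 1e-9), 500.0)
        w = r.mean(axis=0)
    return mu, kappa, w, r

# Demo on two well-separated synthetic clusters of angles.
theta = np.concatenate([
    vonmises.rvs(8.0, loc=0.0, size=200, random_state=1),
    vonmises.rvs(8.0, loc=np.pi, size=200, random_state=2),
])
mu, kappa, w, r = vm_mixture_em(theta)
```

Unlike this finite sketch, the chapter's nonparametric model does not require choosing `K` up front, and its localized feature-selection scheme additionally weights each feature per component via estimated saliencies.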


Metadata
Title
L2 Normalized Data Clustering Through the Dirichlet Process Mixture Model of von Mises Distributions with Localized Feature Selection
Authors
Wentao Fan
Nizar Bouguila
Yewang Chen
Ziyi Chen
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-23876-6_6
