Published in: Pattern Analysis and Applications 1/2023

27.07.2022 | Theoretical Advances

A novel minorization–maximization framework for simultaneous feature selection and clustering of high-dimensional count data

Authors: Nuha Zamzami, Nizar Bouguila


Abstract

Count data are commonly used in machine learning and computer vision applications; however, they often suffer from the well-known curse of dimensionality, which dramatically degrades the performance of clustering algorithms. Feature selection is a major technique for handling large numbers of features, most of which are often redundant and noisy. In this paper, we propose a probabilistic approach for count data based on the concept of feature saliency in the context of mixture-based clustering using the generalized Dirichlet multinomial distribution. The saliency of irrelevant features is driven toward zero by minimizing the message length, which amounts to performing feature selection and model selection simultaneously. Experiments on a range of challenging applications, including text and image clustering, show that the proposed approach is effective in identifying both the optimal number of clusters and the most relevant features, thereby improving clustering performance significantly.
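
The approach described above couples a feature-saliency model with a mixture of generalized Dirichlet multinomial distributions, estimated by minorization–maximization under a minimum message length (MML) criterion. As a rough illustration of the feature-saliency idea only, the sketch below runs a plain EM on a deliberately simplified model in which each count feature is drawn either from a cluster-specific Poisson (if relevant) or from a shared background Poisson (if irrelevant), in the spirit of Law et al. (2004); it omits the generalized Dirichlet multinomial likelihood, the MML penalty, and the MM updates, so it is not the authors' algorithm.

```python
# A rough, simplified sketch of the feature-saliency idea for count data, in the
# spirit of Law et al. (2004): each feature l has a saliency rho_l, and its counts
# come from a cluster-specific Poisson if the feature is relevant, or from a shared
# background Poisson otherwise. This toy EM omits the generalized Dirichlet
# multinomial likelihood, the MML penalty and the MM updates used in the paper,
# so it is NOT the authors' algorithm.
import numpy as np
from scipy.stats import poisson

def saliency_em(X, M, n_iter=100, seed=0, eps=1e-12):
    """EM for a feature-saliency mixture of independent Poissons.

    X: (N, D) count matrix; M: number of clusters.
    Returns mixing weights, cluster rates, background rates and feature saliencies.
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pi = np.full(M, 1.0 / M)                               # mixing weights
    mu = X.mean(0) * rng.uniform(0.5, 1.5, (M, D)) + eps   # cluster-specific rates
    lam = X.mean(0) + eps                                  # shared background rates
    rho = np.full(D, 0.5)                                  # feature saliencies

    for _ in range(n_iter):
        # E-step: per-feature "relevant" vs. "irrelevant" likelihood terms
        a = rho * poisson.pmf(X[:, None, :], mu) + eps           # (N, M, D)
        b = (1.0 - rho) * poisson.pmf(X[:, None, :], lam) + eps  # (N, 1, D)
        c = a + b
        log_w = np.log(pi) + np.log(c).sum(-1)                   # (N, M)
        w = np.exp(log_w - log_w.max(1, keepdims=True))
        w /= w.sum(1, keepdims=True)                             # responsibilities
        u = w[:, :, None] * (a / c)                              # relevant mass
        v = w[:, :, None] * (b / c)                              # irrelevant mass

        # M-step
        pi = w.mean(0)
        mu = (u * X[:, None, :]).sum(0) / (u.sum(0) + eps)
        lam = (v * X[:, None, :]).sum((0, 1)) / (v.sum((0, 1)) + eps)
        rho = u.sum((0, 1)) / N                                  # saliency update
    return pi, mu, lam, rho
```

Features whose estimated saliency stays close to zero behave like background noise; in the paper it is the message-length criterion that drives such saliencies toward zero, yielding simultaneous feature and model selection.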


Footnotes
1
In our experiments, the values for \(M_{min}\) and \(M_{max}\) have been set to 2 and 50, respectively.
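
These bounds presumably delimit the number of mixture components explored during model selection. The snippet below is a generic, hypothetical sketch of one common way such bounds are used, namely fitting a candidate mixture for each component count and keeping the one with the smallest message length; the helpers fit_mixture and message_length are placeholders rather than functions from the paper, and the authors' actual procedure may differ (for example, by pruning components within a single run).

```python
# A generic, hypothetical sketch of model selection over a bounded range of
# component counts. fit_mixture(X, m) and message_length(model, X) are
# placeholder callables, not functions from the paper.
def select_num_components(X, fit_mixture, message_length, m_min=2, m_max=50):
    best_model, best_len = None, float("inf")
    for m in range(m_min, m_max + 1):
        model = fit_mixture(X, m)          # candidate mixture with m components
        length = message_length(model, X)  # MML criterion: shorter is better
        if length < best_len:
            best_model, best_len = model, length
    return best_model, best_len
```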
 
Metadata
Title
A novel minorization–maximization framework for simultaneous feature selection and clustering of high-dimensional count data
Authors
Nuha Zamzami
Nizar Bouguila
Publication date
27.07.2022
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 1/2023
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-022-01094-z
