
2018 | Original Paper | Book Chapter

Probabilistic Feature Selection in Machine Learning

Author: Indrajit Ghosh

Published in: Artificial Intelligence and Soft Computing

Publisher: Springer International Publishing


Abstract

In machine learning, Case-Based Reasoning is a prominent technique for harvesting knowledge from past experience. Past experiences are represented as a repository of cases, each described by a set of features. However, not every feature is equally relevant in describing a case, and measuring the relevancy of each feature is a central issue. A subset of relevant features describes a case with adequate accuracy, so an appropriate subset of relevant features should be selected to improve the performance of the system and to reduce dimensionality. In case-based domains, feature selection is the process of choosing such a subset. Many real domains are inherently case based, with features expressed in terms of linguistic variables. Numerous feature subset selection algorithms have been proposed to assign a numerical weight to each linguistic feature, but the weighting values are usually determined by subjective judgement or on a trial-and-error basis.
This work presents an alternative concept in this direction. It can be applied efficiently to select the relevant linguistic features by measuring their relevance probabilistically, in terms of numerical values, and it can also rule out irrelevant and noisy features. Applications of this approach in various real-world domains show excellent performance.
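The abstract does not spell out the chapter's formula, but the general idea of scoring a categorical (linguistic) feature by how strongly its values predict the case label can be sketched as follows. This is a generic, illustrative probabilistic relevance score (the expected purity of the label distribution conditioned on each feature value), not the author's exact method; the function name and data layout are assumptions for the example.

```python
from collections import Counter, defaultdict

def relevance_scores(cases, labels):
    """Score each linguistic (categorical) feature of a case repository.

    Illustrative score, not the chapter's formula:
        score(f) = sum over values v of  P(v) * max over classes c of P(c | f=v)
    An irrelevant feature scores near the majority-class frequency;
    a feature that fully determines the label scores 1.0.
    """
    n = len(cases)
    n_features = len(cases[0])
    scores = []
    for f in range(n_features):
        # count labels separately for each value of feature f
        joint = defaultdict(Counter)
        for case, label in zip(cases, labels):
            joint[case[f]][label] += 1
        score = 0.0
        for value, label_counts in joint.items():
            n_value = sum(label_counts.values())
            p_value = n_value / n                       # P(f = value)
            p_best = max(label_counts.values()) / n_value  # max_c P(c | f = value)
            score += p_value * p_best
        scores.append(score)
    return scores

# toy repository: feature 0 determines the label, feature 1 is pure noise
cases = [("low", "a"), ("low", "b"), ("high", "a"), ("high", "b")]
labels = ["sick", "sick", "well", "well"]
print(relevance_scores(cases, labels))  # → [1.0, 0.5]
```

A threshold on such scores (or simply ranking the features) would then separate the relevant linguistic features from the irrelevant and noisy ones, which is the role the abstract assigns to the probabilistic measure.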


Metadata
Title
Probabilistic Feature Selection in Machine Learning
Author
Indrajit Ghosh
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-91253-0_58