2019 | OriginalPaper | Chapter

Unsupervised Feature Selection Using Correlation Score

Authors: Tanuja Pattanshetti, Vahida Attar

Published in: Computing, Communication and Signal Processing

Publisher: Springer Singapore

Abstract

The wide application of technology generates data of very high dimensionality. Using such data for decision-making is hampered by the curse of dimensionality: selecting all features leads to overfitting, while ignoring relevant ones leads to information loss. Feature selection algorithms address this problem by identifying a subset of the original features that retains the relevant features and removes the redundant ones. This paper evaluates and analyzes some of the most popular feature selection algorithms on different benchmark datasets. Relief, ReliefF, and Random Forest are evaluated as combinations of different rankers and classifiers. Empirically, the accuracy of a ranker and classifier combination varies from dataset to dataset. The paper also introduces the application of multivariate correlation analysis (MCA) to feature selection. The results suggest that MCA performs better than the legacy feature selection algorithms.
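To make the idea of removing redundant features without class labels concrete, the following is a minimal sketch of a pairwise correlation filter. It is an illustration only and not the MCA method of the paper, whose details are not given in the abstract. The function names `pearson` and `select_features`, the 0.9 threshold, and the toy data are all assumptions introduced for this example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences.
    Returns 0.0 if either sequence is constant (correlation undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def select_features(columns, threshold=0.9):
    """Greedy unsupervised filter (hypothetical helper, not from the paper).
    `columns` maps feature name -> list of values; no class labels are used.
    A feature is kept unless |r| with an already-kept feature exceeds
    `threshold`, in which case it is treated as redundant."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Toy data (made up): height_in is almost a rescaled copy of height_cm.
data = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age":       [34.0, 21.0, 45.0, 29.0, 52.0],
}
selected = select_features(data)
print(selected)
```

The filter depends on the order in which features are visited, and the threshold is a tuning choice; both would need to be set per dataset, in line with the observation that results vary from dataset to dataset.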

Metadata
Copyright Year
2019
DOI
https://doi.org/10.1007/978-981-13-1513-8_37