2019 | OriginalPaper | Book chapter

Unsupervised Feature Selection Using Correlation Score

Authors: Tanuja Pattanshetti, Vahida Attar

Published in: Computing, Communication and Signal Processing

Publisher: Springer Singapore

Abstract

Widespread use of technology generates data of very high dimensionality. Using such data for decision-making is hampered by the curse of dimensionality: selecting all features tends to cause overfitting, while ignoring relevant ones can lead to information loss. Feature selection algorithms address this problem by identifying a subset of the original features that retains the relevant features and removes the redundant ones. This paper evaluates and analyzes some of the most popular feature selection algorithms on several benchmark datasets. Relief, ReliefF, and Random Forest are evaluated as combinations of different rankers and classifiers. Empirically, the accuracy of a ranker-classifier combination varies from dataset to dataset. The paper also introduces the application of multivariate correlation analysis (MCA) to feature selection. The results suggest that MCA performs better than the legacy feature selection algorithms.
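
The abstract does not define MCA, so the following is not the authors' method. It is only a minimal sketch of the general idea of removing redundant features without class labels, assuming a simple greedy filter based on pairwise Pearson correlation. The function name drop_redundant_features, the 0.9 threshold, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def drop_redundant_features(X, threshold=0.9):
    """Greedy unsupervised filter (illustrative assumption, not the paper's MCA).

    A feature is kept only if its absolute Pearson correlation with every
    already-kept feature is at most `threshold`. Assumes no constant columns,
    for which the correlation is undefined.
    """
    X = np.asarray(X, dtype=float)
    # Absolute pairwise correlations between columns (features).
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return kept

# Synthetic data: column 1 is almost a rescaled copy of column 0,
# while column 2 is generated independently of both.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, 2.0 * a + 0.01 * rng.normal(size=200), b])

selected = drop_redundant_features(X, threshold=0.9)
print(selected)  # indices of the features that survive the filter
```

The filter needs no class labels, which is what makes it unsupervised. The result depends on column order, because the first feature of a correlated group is the one that is kept.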

Metadata
Title
Unsupervised Feature Selection Using Correlation Score
Authors
Tanuja Pattanshetti
Vahida Attar
Copyright year
2019
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-13-1513-8_37