Published in: Wireless Personal Communications 3/2019

8 May 2019

Correlation Based Feature Selection Algorithms for Varying Datasets of Different Dimensionality

Authors: A. Meena Kowshalya, R. Madhumathi, N. Gopika


Abstract

The curse of dimensionality must be addressed carefully when designing a classifier. Given a very high-dimensional dataset, a central problem is choosing an optimal subset of features for classification. Feature selection is an effective remedy for the curse of dimensionality. Numerous feature selection algorithms have been proposed in the recent past, but no one-stop solution prevails. This paper proposes two novel feature selection algorithms, Reverse Piece-wise Correlation Based Feature Selection (RPwCBFS) and Shuffled Piece-wise Correlation Based Feature Selection (SPwCBFS), which divide the feature space into pieces and compute the similarity of feature subsets in reverse order and in randomly shuffled order, respectively. The proposed algorithms are compared with Fast Correlation Based Feature Selection (FCBF), Fast Correlation Based Feature Selection # (FCBF#), and Fast Correlation Based Feature Selection In Piece (FCBFiP). Standard medium- and high-dimensional datasets are used in the experiments. The experimental results show that RPwCBFS and SPwCBFS are strong choices for feature selection when the underlying dataset is medium sized. For very high-dimensional datasets, SPwCBFS proves to be an optimal choice.
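
The abstract names the idea (split the feature space into pieces, then visit the pieces in reverse or shuffled order) but does not give the procedure. The following is a minimal, hypothetical sketch of that idea, not the authors' algorithm. It assumes discrete features, symmetrical uncertainty as the correlation measure (the measure FCBF uses), contiguous pieces of equal size, and a simplified greedy redundancy check. All function names, the piece count, and the threshold are illustrative assumptions.

```python
import math
import random
from collections import Counter


def entropy(xs):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def symmetrical_uncertainty(xs, ys):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0
    hxy = entropy(list(zip(xs, ys)))  # joint entropy H(X, Y)
    return 2.0 * (hx + hy - hxy) / (hx + hy)


def piece_order(n_features, n_pieces, mode, seed=0):
    """Split feature indices into contiguous pieces and order the pieces.

    mode="reverse" visits the last piece first; mode="shuffled" visits
    pieces in a seeded random order; anything else keeps the original order.
    """
    size = math.ceil(n_features / n_pieces)
    pieces = [list(range(i, min(i + size, n_features)))
              for i in range(0, n_features, size)]
    if mode == "reverse":
        pieces.reverse()
    elif mode == "shuffled":
        random.Random(seed).shuffle(pieces)
    return pieces


def select(features, labels, n_pieces=2, mode="reverse", threshold=0.0, seed=0):
    """Greedy piece-wise selection (assumed simplification of FCBF's check).

    A feature is kept if its SU with the class exceeds `threshold` and no
    already-kept feature correlates with it at least as strongly as it
    correlates with the class.
    """
    relevance = [symmetrical_uncertainty(f, labels) for f in features]
    selected = []
    for piece in piece_order(len(features), n_pieces, mode, seed):
        # Within a piece, consider the most class-relevant features first.
        for j in sorted(piece, key=lambda k: -relevance[k]):
            if relevance[j] <= threshold:
                continue
            if all(symmetrical_uncertainty(features[i], features[j]) < relevance[j]
                   for i in selected):
                selected.append(j)
    return selected


labels = [0, 0, 1, 1, 0, 0, 1, 1]
copy_a = list(labels)              # perfectly relevant
copy_b = list(labels)              # redundant duplicate of copy_a
noise = [0, 1, 0, 1, 0, 1, 0, 1]   # independent of the labels
const = [0] * 8                    # carries no information

picked = select([copy_a, copy_b, noise, const], labels, n_pieces=2, mode="reverse")
```

In this toy data the irrelevant features are skipped and only one of the two duplicates survives. Which duplicate survives depends on the visiting order of the pieces, which is the degree of freedom the reverse and shuffled variants change.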


Metadata
Title
Correlation Based Feature Selection Algorithms for Varying Datasets of Different Dimensionality
Authors
A. Meena Kowshalya
R. Madhumathi
N. Gopika
Publication date
8 May 2019
Publisher
Springer US
Published in
Wireless Personal Communications / Issue 3/2019
Print ISSN: 0929-6212
Electronic ISSN: 1572-834X
DOI
https://doi.org/10.1007/s11277-019-06504-w
