Published in: Wireless Personal Communications 3/2019

08-05-2019

Correlation Based Feature Selection Algorithms for Varying Datasets of Different Dimensionality

Authors: A. Meena Kowshalya, R. Madhumathi, N. Gopika

Abstract

The curse of dimensionality must be addressed carefully when designing a classifier. Given a high-dimensional dataset, a key problem is choosing an optimal subset of features for classification. Feature selection is an effective solution to the curse of dimensionality. Numerous feature selection algorithms have been proposed in the recent past, but no one-stop solution prevails. This paper proposes two novel feature selection algorithms, Reverse Piece-wise Correlation Based Feature Selection (RPwCBFS) and Shuffled Piece-wise Correlation Based Feature Selection (SPwCBFS), which divide the feature space into pieces and compute the similarity of feature subsets in reverse order and in randomly shuffled order, respectively. The proposed algorithms are compared with Fast Correlation Based Feature Selection (FCBF), Fast Correlation Based Feature Selection # (FCBF#), and Fast Correlation Based Feature Selection In Piece (FCBFiP). Standard medium- and high-dimensional datasets are used in the experiments. The experimental results show that both RPwCBFS and SPwCBFS are strong choices for feature selection when the underlying dataset is medium sized. For high-dimensional datasets, SPwCBFS proves to be the best choice.
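
The abstract does not reproduce the authors' pseudocode, so the following is only a minimal Python sketch of the general idea, not the published RPwCBFS or SPwCBFS procedures. It assumes discrete feature values and borrows the symmetrical uncertainty measure and the redundancy rule commonly described for FCBF. The function names, the parameters `piece_size`, `delta`, and `seed`, and the way pieces are merged are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def prune_redundant(candidates, columns, relevance):
    """FCBF-style pass: drop a feature when an already kept feature is at
    least as correlated with it as the feature is with the class."""
    kept = []
    for f in sorted(candidates, key=lambda j: relevance[j], reverse=True):
        if all(symmetrical_uncertainty(columns[k], columns[f]) < relevance[f]
               for k in kept):
            kept.append(f)
    return kept


def piecewise_select(columns, labels, piece_size, order="reverse",
                     delta=0.0, seed=0):
    """Hypothetical piece-wise selection; returns indices of kept features.

    ASSUMPTION: pieces are chunks of the relevance-ranked feature list, and
    each piece is merged into the running selection before pruning again.
    """
    relevance = [symmetrical_uncertainty(col, labels) for col in columns]
    relevant = sorted((j for j in range(len(columns)) if relevance[j] > delta),
                      key=lambda j: relevance[j], reverse=True)
    pieces = [relevant[i:i + piece_size]
              for i in range(0, len(relevant), piece_size)]
    if order == "reverse":
        pieces.reverse()                       # visit the last piece first
    elif order == "shuffled":
        random.Random(seed).shuffle(pieces)    # visit pieces in random order
    else:
        raise ValueError("order must be 'reverse' or 'shuffled'")
    selected = []
    for piece in pieces:
        selected = prune_redundant(selected + piece, columns, relevance)
    return selected
```

Here `columns` is a list of feature columns, each a list of discrete values aligned with `labels`. Because features discarded in an early piece are never reconsidered, the order in which pieces are visited can change the final subset, which is the property the reverse and shuffled variants in this sketch are meant to illustrate.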


Metadata

Title: Correlation Based Feature Selection Algorithms for Varying Datasets of Different Dimensionality
Authors: A. Meena Kowshalya, R. Madhumathi, N. Gopika
Publication date: 08-05-2019
Publisher: Springer US
Published in: Wireless Personal Communications, Issue 3/2019
Print ISSN: 0929-6212
Electronic ISSN: 1572-834X
DOI: https://doi.org/10.1007/s11277-019-06504-w
