The curse of dimensionality must be addressed carefully when designing a classifier. Given a high-dimensional dataset, one important problem is selecting an optimal subset of features for classification. Feature selection is an effective remedy for the curse of dimensionality. Numerous feature selection algorithms have been proposed in the recent past, but no one-stop solution prevails. This paper proposes two novel feature selection algorithms, Reverse Piece-wise Correlation Based Feature Selection (RPwCBFS) and Shuffled Piece-wise Correlation Based Feature Selection (SPwCBFS), which divide the feature space into pieces and compute the similarity of feature subsets in reverse order and in randomly shuffled order, respectively. The proposed algorithms are compared with Fast Correlation Based Feature Selection (FCBF), Fast Correlation Based Feature Selection # (FCBF#) and Fast Correlation Based Feature Selection In Piece (FCBFiP). Standard medium-dimensional and high-dimensional datasets are used for the experiments. The experimental results show that RPwCBFS and SPwCBFS are both strong choices for feature selection when the underlying dataset is medium sized. For high-dimensional datasets, SPwCBFS proves to be the better choice.
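The abstract does not spell out the similarity measure or the exact piece-wise procedure, so the following is only a minimal sketch under stated assumptions. It assumes symmetrical uncertainty, the correlation measure used by FCBF, and discrete-valued features. The function `piecewise_select`, its parameters (`piece_size`, `order`, `seed`) and its redundancy rule are hypothetical illustrations of "split the ranked features into pieces, then visit the pieces in reverse or shuffled order"; they are not the authors' RPwCBFS or SPwCBFS.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(x, y) = 2 * I(x; y) / (H(x) + H(y)); lies in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:  # both variables are constant
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def piecewise_select(features, target, piece_size, order="reverse", seed=0):
    """Hypothetical piece-wise correlation filter (illustrative assumption).

    features: list of columns, each a list of discrete values.
    Returns the sorted indices of the features that were kept.
    """
    # Relevance of each feature to the class label.
    relevance = [symmetrical_uncertainty(f, target) for f in features]
    # Rank features by relevance, most relevant first (as FCBF does).
    ranked = sorted(range(len(features)), key=lambda i: relevance[i], reverse=True)
    # Divide the ranked feature indices into consecutive pieces.
    pieces = [ranked[i:i + piece_size] for i in range(0, len(ranked), piece_size)]

    if order == "reverse":
        pieces = pieces[::-1]                  # visit the last piece first
    elif order == "shuffled":
        random.Random(seed).shuffle(pieces)    # visit pieces in random order
    else:
        raise ValueError("order must be 'reverse' or 'shuffled'")

    selected = []
    for piece in pieces:
        for i in piece:
            # Assumed rule: drop feature i if it is at least as correlated
            # with an already kept feature as it is with the class.
            redundant = any(
                symmetrical_uncertainty(features[i], features[j]) >= relevance[i]
                for j in selected
            )
            if not redundant:
                selected.append(i)
    return sorted(selected)
```

As a usage example, calling `piecewise_select(columns, labels, piece_size=2, order="shuffled", seed=42)` on a list of discrete feature columns returns the indices of the retained features. Because the visiting order changes which features are already kept when a candidate is tested, the reverse and shuffled variants can retain different subsets on the same data.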
Title: Correlation Based Feature Selection Algorithms for Varying Datasets of Different Dimensionality
Author: A. Meena Kowshalya
Publisher: Springer US