Published in: Cluster Computing 3/2019

16-01-2018

Improving medical diagnosis performance using hybrid feature selection via relieff and entropy based genetic search (RF-EGA) approach: application to breast cancer prediction

Authors: Ilangovan Sangaiah, A. Vincent Antony Kumar


Abstract

This research proposes a new hybrid prediction algorithm for breast cancer, developed on a breast cancer data set. Many approaches are available for diagnosing diseases, such as genetic algorithms, ant colony optimization, particle swarm optimization, and the cuckoo search algorithm. The proposed algorithm combines ReliefF attribute reduction with an entropy-based genetic algorithm for breast cancer detection. This hybrid combination is used to handle datasets with high dimensionality and uncertainty. The data are obtained from the Wisconsin breast cancer dataset and have been categorized based on different properties. The performance of the proposed method is evaluated and the results are compared with other well-known feature selection methods. The results show that the proposed method can generate a reduced-size subset of salient features while yielding significant classification accuracy for large datasets.
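The two-stage idea in the abstract (ReliefF ranking, then a genetic search over the surviving attributes) can be illustrated with a minimal Python sketch. This is not the authors' RF-EGA implementation: the abstract does not say how entropy enters the genetic search, so the fitness used here (information gain of the class given the discretized subset, minus a size penalty) is an assumption, as are the toy data, the 3-bin discretization, and every function and parameter name.

```python
import math
import random
from collections import Counter, defaultdict

def relieff_weights(X, y, k=3):
    """Simplified ReliefF for numeric features already scaled to [0, 1]."""
    n, d = len(X), len(X[0])
    priors = {c: cnt / n for c, cnt in Counter(y).items()}
    w = [0.0] * d
    for i in range(n):
        for cls, p in priors.items():
            # k nearest neighbours of sample i inside class `cls` (Manhattan distance)
            idx = [j for j in range(n) if j != i and y[j] == cls]
            idx.sort(key=lambda j: sum(abs(a - b) for a, b in zip(X[i], X[j])))
            neigh = idx[:k]
            if not neigh:
                continue
            # hits lower the weight; misses raise it, scaled by the class prior
            scale = -1.0 if cls == y[i] else p / (1.0 - priors[y[i]])
            for f in range(d):
                diff = sum(abs(X[i][f] - X[j][f]) for j in neigh) / len(neigh)
                w[f] += scale * diff / n
    return w

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def conditional_entropy(X, y, subset, bins=3):
    # H(class | discretized values of the features in `subset`)
    groups = defaultdict(list)
    for row, label in zip(X, y):
        key = tuple(min(int(row[f] * bins), bins - 1) for f in subset)
        groups[key].append(label)
    return sum(len(g) / len(y) * entropy(g) for g in groups.values())

def fitness(mask, X, y, candidates, penalty=0.02):
    # ASSUMED fitness: information gain of the subset minus a size penalty
    subset = [f for f, bit in zip(candidates, mask) if bit]
    if not subset:
        return float("-inf")
    return entropy(y) - conditional_entropy(X, y, subset) - penalty * len(subset)

def genetic_search(X, y, candidates, pop_size=20, generations=30, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    m = len(candidates)
    score = lambda mask: fitness(mask, X, y, candidates)
    pop = [[rng.randint(0, 1) for _ in range(m)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]            # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randint(1, m - 1)         # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = elite + children
    best = max(pop, key=score)
    return [f for f, bit in zip(candidates, best) if bit]

def make_toy_data(n=120, seed=1):
    # Hypothetical data: features 0 and 1 track the label, features 2..5 are noise
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [min(max(0.25 + 0.5 * label + rng.gauss(0, 0.1), 0.0), 1.0)
                       for _ in range(2)]
        X.append(informative + [rng.random() for _ in range(4)])
        y.append(label)
    return X, y

X, y = make_toy_data()
weights = relieff_weights(X, y)
top = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)[:4]
selected = genetic_search(X, y, top)
print("ReliefF weights:", [round(v, 3) for v in weights])
print("Selected features:", selected)
```

On this toy data the two informative features (indices 0 and 1) should receive the largest ReliefF weights, and the genetic search then chooses a subset from the four top-ranked candidates. To try the sketch on real data such as the Wisconsin breast cancer dataset, the features would first need min-max scaling to [0, 1], since both the distance computation and the binning assume that range.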

Metadata
Title
Improving medical diagnosis performance using hybrid feature selection via relieff and entropy based genetic search (RF-EGA) approach: application to breast cancer prediction
Authors
Ilangovan Sangaiah
A. Vincent Antony Kumar
Publication date
16-01-2018
Publisher
Springer US
Published in
Cluster Computing / Special Issue 3/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-018-1702-5
