2020 | OriginalPaper | Chapter

Prediction Model of Breast Cancer Based on mRMR Feature Selection

Authors: Junwen Di, Zhiguo Shi

Published in: Neural Information Processing

Publisher: Springer International Publishing

Abstract

Real-world data are often imbalanced: the number of samples differs greatly between classes. The problem is especially pronounced in medical data, where class sizes reflect disease prevalence. This paper proposes P-mRMR, an algorithm based on mRMR, to improve feature selection on imbalanced data. P-mRMR also handles attributes with many missing values by integrating the missing values into feature selection, which reduces the complexity of data pre-processing. The experiments compare the algorithms using AUC, the confusion matrix, and the probability of missing values. The results show that classifiers perform better with the features selected by the improved algorithm.
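
For readers unfamiliar with the baseline, the following is a minimal sketch of classic mRMR (max-relevance, min-redundancy) greedy selection on discrete features. It is not the P-mRMR algorithm proposed in the chapter, whose details the abstract does not give. The function names `mutual_information` and `mrmr_select`, the difference form of the score, and the toy data are assumptions made for illustration only.

```python
from collections import Counter
from math import log


def mutual_information(x, y):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # Sum over observed value pairs of p(a, b) * log(p(a, b) / (p(a) * p(b))).
    return sum((c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def mrmr_select(features, labels, k):
    """Greedy mRMR, difference form (illustrative, hypothetical helper).

    features: dict mapping feature name -> list of discrete values.
    At each step, pick the feature that maximises its relevance to the
    labels minus its mean redundancy with the features already selected.
    """
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected, remaining = [], list(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "a_copy" duplicates "a", "b" is weaker but not redundant.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "a": [0, 0, 0, 1, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 1, 1, 1, 1, 1],
    "b": [0, 1, 0, 0, 1, 0, 1, 1],
}
print(mrmr_select(toy_features, toy_labels, 2))  # ['a', 'b']
```

Ranking by relevance alone would pick `a` and its duplicate `a_copy`. The redundancy term is what lets the less relevant but complementary feature `b` take the second slot.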


Metadata
Title
Prediction Model of Breast Cancer Based on mRMR Feature Selection
Authors
Junwen Di
Zhiguo Shi
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-63820-7_4