2018 | OriginalPaper | Chapter

Assessing Feature Selection Techniques for a Colorectal Cancer Prediction Model

Authors: Nahúm Cueto-López, Rocío Alaiz-Rodríguez, María Teresa García-Ordás, Carmen González-Donquiles, Vicente Martín

Published in: International Joint Conference SOCO’17-CISIS’17-ICEUTE’17 León, Spain, September 6–8, 2017, Proceeding

Publisher: Springer International Publishing

Abstract

Risk prediction models for colorectal cancer play an important role in identifying people at higher risk of developing the disease, as well as the risk factors associated with it. Feature selection techniques help to improve prediction model performance and to gain insight into the data itself. When the aim is to analyze the most relevant features, assessing the stability of feature selection/ranking algorithms becomes an important issue. This work assesses several feature ranking algorithms in terms of performance and robustness for a set of risk prediction models. Experimental results show that stability and model performance should be studied jointly: RF turned out to be the most stable algorithm but was outperformed by others in terms of model performance, whereas the SVM-wrapper and the Pearson correlation coefficient are moderately stable while achieving good model performance.
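
To make the notion of ranking stability concrete, the following is a minimal sketch and not the authors' procedure. It assumes that stability is measured as the mean pairwise Spearman correlation between feature rankings obtained on bootstrap resamples, and it uses a Pearson-correlation filter as the ranker. The stability metric actually used in the chapter is not stated in this abstract. The function names (`pearson_ranking`, `spearman`, `ranking_stability`) and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np


def pearson_ranking(X, y):
    """Rank features by |Pearson correlation| with the label (rank 0 = most relevant)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns (or a constant label) give a zero denominator; score them as 0.
    scores = np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)
    order = np.argsort(-scores)            # feature indices, best first
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(order))   # ranks[j] = position of feature j
    return ranks


def spearman(r1, r2):
    """Spearman correlation between two tie-free rank vectors."""
    n = len(r1)
    d = np.asarray(r1, dtype=float) - np.asarray(r2, dtype=float)
    return 1.0 - 6.0 * np.sum(d ** 2) / (n * (n ** 2 - 1))


def ranking_stability(X, y, rank_fn, n_resamples=20, seed=0):
    """Mean pairwise Spearman correlation of rankings over bootstrap resamples.

    ASSUMPTION: this is one common way to quantify stability, chosen for
    illustration; it is not claimed to be the metric used in the chapter.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    rankings = []
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)   # sample rows with replacement
        rankings.append(rank_fn(X[idx], y[idx]))
    pairs = [
        spearman(rankings[i], rankings[j])
        for i in range(n_resamples)
        for j in range(i + 1, n_resamples)
    ]
    return float(np.mean(pairs))


# Synthetic data (hypothetical): only feature 0 carries signal about the label.
rng = np.random.default_rng(42)
y = rng.integers(0, 2, size=300)
X = rng.normal(size=(300, 10))
X[:, 0] += 2.0 * y

stability = ranking_stability(X, y, pearson_ranking)
print(f"Mean pairwise Spearman stability: {stability:.3f}")
```

A value close to 1 means the ranker orders the features almost identically across resamples. As the abstract stresses, such a score says nothing about predictive quality, so it would be read alongside a model performance measure rather than in place of it.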

Metadata
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-67180-2_46
