2017 | Original Paper | Book chapter

Combination of Feature Selection Methods for the Effective Classification of Microarray Gene Expression Data

Authors: T. Sheela, Lalitha Rangarajan

Published in: Recent Trends in Image Processing and Pattern Recognition

Publisher: Springer Singapore

Abstract

Gene selection from microarray gene expression data is difficult because the data are high-dimensional: the number of samples in a microarray data set is very small compared to the number of genes, which serve as features. Reducing the dimensionality therefore requires selecting the significant genes. An effective gene selection method reduces dimensionality and improves the performance of sample classification. In this work, we examine whether combining feature selection methods can improve the performance of classification algorithms. We propose two methods for combining feature selection techniques. Experimental results suggest that an appropriate combination of filter gene selection methods is more effective than the individual techniques for microarray data classification. We compare our combination methods using different learning algorithms.
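
The abstract does not say which filters are combined or how, so the following is only a minimal sketch of the general idea, not the authors' two methods. It assumes two common filter scores (an absolute Welch t-statistic and a signal-to-noise ratio) and two illustrative combination strategies (mean-rank aggregation and the union of top-k lists). All function names, the synthetic data and the parameter `k` are hypothetical.

```python
import numpy as np

def t_score(X, y):
    """Absolute Welch t-statistic per gene for a two-class problem."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)

def snr_score(X, y):
    """Absolute signal-to-noise ratio per gene: mean difference over summed std devs."""
    a, b = X[y == 0], X[y == 1]
    spread = a.std(axis=0, ddof=1) + b.std(axis=0, ddof=1)
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (spread + 1e-12)

def ranks(scores):
    """Rank genes so that the highest score receives rank 0."""
    order = np.argsort(-scores)
    r = np.empty_like(order)
    r[order] = np.arange(len(scores))
    return r

def combine_by_mean_rank(score_list, k):
    """Illustrative combination 1: average the per-filter ranks, keep the k best genes."""
    mean_rank = np.mean([ranks(s) for s in score_list], axis=0)
    return np.argsort(mean_rank)[:k]

def combine_by_union(score_list, k):
    """Illustrative combination 2: union of each filter's top-k genes."""
    top = [set(np.argsort(-s)[:k].tolist()) for s in score_list]
    return sorted(set.union(*top))

# Synthetic stand-in for a microarray data set: few samples, many genes.
rng = np.random.default_rng(0)
n_samples, n_genes = 40, 500
y = np.repeat([0, 1], n_samples // 2)
X = rng.normal(size=(n_samples, n_genes))
X[y == 1, :5] += 4.0  # assumption: genes 0..4 are made strongly informative

scores = [t_score(X, y), snr_score(X, y)]
selected_rank = combine_by_mean_rank(scores, k=5)
selected_union = combine_by_union(scores, k=5)
```

The selected gene indices would then be used to subset `X` before training a classifier. In a real evaluation the selection step belongs inside each cross-validation fold, so that test samples do not influence which genes are chosen.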

Metadata
Title
Combination of Feature Selection Methods for the Effective Classification of Microarray Gene Expression Data
Authors
T. Sheela
Lalitha Rangarajan
Copyright year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-4859-3_13