Published in: Annals of Data Science 4/2018

9 March 2018

Artificial Neural Network Classification of High Dimensional Data with Novel Optimization Approach of Dimension Reduction

Authors: Rabia Aziz, C. K. Verma, Namita Srivastava


Abstract

Classification of high-dimensional data is a crucial task in bioinformatics. Cancer classification from microarray data is a typical machine learning application because of the large number of genes involved. Feature (gene) selection and classification with computational intelligence techniques play an important role in the diagnosis and prediction of disease from microarray data. An artificial neural network (ANN) is an artificial intelligence technique used for classification, image processing, and prediction. This paper evaluates the performance of an ANN classifier with six different hybrid feature selection techniques for gene selection on microarray data. These hybrid techniques combine independent component analysis (ICA) as the extraction technique, popular filter techniques, and bio-inspired algorithms that optimize the ICA feature vector. Five binary gene expression microarray datasets are used to compare the techniques and to determine how they improve the performance of the ANN classifier. The techniques are useful for feature selection because they achieve the highest classification accuracy together with the lowest average number of selected genes. Furthermore, a statistical hypothesis test at a chosen confidence level was used to check whether the differences between the algorithms are significant. The experimental results show that the combination of ICA with the genetic bee colony algorithm performs best, as it heuristically removes non-contributing features to improve classifier performance.
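
The trade-off described in the abstract, high classification accuracy with few selected features, can be illustrated with a small wrapper-style search. The sketch below is not the authors' method: it assumes a toy dataset standing in for ICA feature vectors, a nearest-centroid classifier standing in for the ANN, and a single-bit mutation search standing in for a bio-inspired optimizer such as the genetic bee colony algorithm. All function names and the `penalty` weight are hypothetical.

```python
import random
from statistics import mean


def make_toy_data(seed=1, n_per_class=10, n_features=6):
    """Toy two-class data (assumption): only feature 0 separates the classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += 4.0 * label  # shift the informative feature for class 1
            X.append(row)
            y.append(label)
    return X, y


def nearest_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # no features selected: treat as useless
    correct = 0
    for i in range(len(X)):
        # Build class centroids from every sample except the held-out one.
        centroids = {}
        for label in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroids[label] = [mean(r[j] for r in rows) for j in cols]

        # Predict the class whose centroid is closest (squared Euclidean distance).
        def dist(label):
            return sum((X[i][j] - c) ** 2 for j, c in zip(cols, centroids[label]))

        correct += min(centroids, key=dist) == y[i]
    return correct / len(X)


def fitness(X, y, mask, penalty=0.1):
    """Reward accuracy and penalise the fraction of features kept (hypothetical weight)."""
    return nearest_centroid_accuracy(X, y, mask) - penalty * sum(mask) / len(mask)


def mutation_search(X, y, n_features, iterations=200, seed=0):
    """Flip one bit at a time and keep the change only if fitness improves."""
    rng = random.Random(seed)
    best = [1] * n_features  # start with every feature selected
    best_fit = fitness(X, y, best)
    for _ in range(iterations):
        candidate = best[:]
        j = rng.randrange(n_features)
        candidate[j] = 1 - candidate[j]
        candidate_fit = fitness(X, y, candidate)
        if candidate_fit > best_fit:
            best, best_fit = candidate, candidate_fit
    return best, best_fit


X, y = make_toy_data()
mask, score = mutation_search(X, y, n_features=6)
print(mask, round(score, 3))
```

The penalty term is what pushes the search toward smaller subsets: two masks with equal accuracy are ranked by how few features they keep. A population-based optimizer would explore many masks in parallel, but the fitness function could stay the same.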


Metadata
Title
Artificial Neural Network Classification of High Dimensional Data with Novel Optimization Approach of Dimension Reduction
Authors
Rabia Aziz
C. K. Verma
Namita Srivastava
Publication date
9 March 2018
Publisher
Springer Berlin Heidelberg
Published in
Annals of Data Science / Issue 4/2018
Print ISSN: 2198-5804
Electronic ISSN: 2198-5812
DOI
https://doi.org/10.1007/s40745-018-0155-2
