Published in: Medical & Biological Engineering & Computing 6/2016

1 June 2016 | Original Article

A novel sparse coding algorithm for classification of tumors based on gene expression data

Authors: Morteza Kolali Khormuji, Mehrnoosh Bazrafkan

Abstract

High-dimensional genomic and proteomic data play an important role in many areas of medicine and molecular biology, including disease prognosis, diagnosis and prevention. Classifying such data is challenging because of the curse of dimensionality, noise and redundancy. Recently, researchers have used sparse representation (SR) techniques to analyze high-dimensional biological data, notably to classify cancer patients from gene expression datasets. A common problem with SR-based biological data classification methods is that they cannot exploit the topological (geometrical) structure of the data: they map the data into a sparse feature space without preserving the local structure of the data points. In this paper, we propose a novel SR-based cancer classification algorithm for gene expression data that takes the geometrical information of all data into account. Specifically, we incorporate the locally linear embedding (LLE) algorithm into the sparse coding framework, which preserves the geometrical structure of the data. For performance comparison, we applied our algorithm to six tumor gene expression datasets and demonstrate that the proposed method achieves higher classification accuracy than state-of-the-art SR-based tumor classification algorithms.
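
The abstract does not give the objective function, so the following is only a minimal sketch of one common way to combine locally linear embedding with sparse coding, not the authors' algorithm. It assumes a fixed dictionary `D`, LLE reconstruction weights `W` computed from k nearest neighbours, and a penalty that asks each sparse code to be reconstructed from its neighbours' codes with the same weights. The function names, the parameters `lam`, `beta`, `k` and `reg`, the ISTA solver and the random data are all illustrative assumptions.

```python
import numpy as np

def lle_weights(X, k=3, reg=1e-3):
    """LLE reconstruction weights for X of shape (n_samples, n_features).

    Row i of W holds the weights that best rebuild sample i from its
    k nearest neighbours; each row sums to 1 and W[i, i] == 0.
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf                      # a sample is not its own neighbour
        nbrs = np.argsort(dist)[:k]
        Z = X[nbrs] - X[i]                    # neighbours centred on sample i
        G = Z @ Z.T                           # local Gram matrix, shape (k, k)
        G += reg * (np.trace(G) + 1e-12) * np.eye(k)  # regularise for stability
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()              # enforce the sum-to-one constraint
    return W

def soft_threshold(V, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(V) * np.maximum(np.abs(V) - t, 0.0)

def objective(Y, D, A, W, lam, beta):
    """Assumed objective: data fit + LLE penalty on the codes + l1 sparsity."""
    R = A - A @ W.T                           # column i: a_i - sum_j W[i, j] a_j
    return (0.5 * np.sum((Y - D @ A) ** 2)
            + 0.5 * beta * np.sum(R ** 2)
            + lam * np.sum(np.abs(A)))

def lle_sparse_codes(Y, D, W, lam=0.1, beta=0.5, n_iter=200):
    """Minimise the assumed objective over the codes A with ISTA.

    Y is (n_features, n_samples), D is (n_features, n_atoms),
    and the returned A is (n_atoms, n_samples).
    """
    n = Y.shape[1]
    I_W = np.eye(n) - W
    M = I_W.T @ I_W                           # ||A - A W^T||_F^2 = tr(A M A^T)
    # Lipschitz constant of the gradient of the smooth part.
    L = np.linalg.norm(D, 2) ** 2 + beta * np.linalg.norm(M, 2)
    A = np.zeros((D.shape[1], n))
    for _ in range(n_iter):
        grad = D.T @ (D @ A - Y) + beta * A @ M
        A = soft_threshold(A - grad / L, lam / L)
    return A

# Synthetic stand-in for expression data: 20 samples, 50 "genes" (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
D = rng.normal(size=(50, 30))
D /= np.linalg.norm(D, axis=0)                # unit-norm dictionary atoms
W = lle_weights(X, k=4)
A = lle_sparse_codes(X.T, D, W)
print("fraction of non-zero coefficients:", np.mean(A != 0))
```

Setting `beta=0` reduces the sketch to ordinary l1-regularised sparse coding, which makes it easy to compare codes with and without the neighbourhood term. The classification step itself (assigning a label from the codes) is omitted here.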

Metadata
Title
A novel sparse coding algorithm for classification of tumors based on gene expression data
Authors
Morteza Kolali Khormuji
Mehrnoosh Bazrafkan
Publication date
1 June 2016
Publisher
Springer Berlin Heidelberg
Published in
Medical & Biological Engineering & Computing / Issue 6/2016
Print ISSN: 0140-0118
Electronic ISSN: 1741-0444
DOI
https://doi.org/10.1007/s11517-015-1382-8