2018 | OriginalPaper | Chapter

Analyzing Feature Importance for Metabolomics Using Genetic Programming

Authors: Ting Hu, Karoliina Oksanen, Weidong Zhang, Edward Randell, Andrew Furey, Guangju Zhai

Published in: Genetic Programming

Publisher: Springer International Publishing

Abstract

The emerging and fast-developing field of metabolomics examines the abundance of small-molecule metabolites in body fluids to study the cellular processes by which the human body responds to genetic and environmental perturbations. Because metabolism is complex, metabolites and the cellular processes they represent can be correlated and can contribute synergistically to a phenotypic status. Genetic programming (GP) provides advanced analytical instruments for investigating the multifactorial causes of metabolic diseases. In this article, we analyzed a population-based metabolomics dataset on osteoarthritis (OA) and developed a Linear GP (LGP) algorithm to search for classification models that best predict the disease outcome and to identify the most important metabolic markers associated with the disease. The LGP algorithm was able to evolve prediction models with high accuracy, especially when the search was focused on a reduced feature set that includes only potentially relevant metabolites. We also identified a set of key metabolic markers that may improve our understanding of the biochemistry and pathogenesis of the disease.

Metadata
Title
Analyzing Feature Importance for Metabolomics Using Genetic Programming
Authors
Ting Hu
Karoliina Oksanen
Weidong Zhang
Edward Randell
Andrew Furey
Guangju Zhai
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-77553-1_5
