2020 | OriginalPaper | Chapter

Profiling Environmental Conditions from DNA

Authors: Sambriddhi Mainali, Max H. Garzon, Fredy A. Colorado

Published in: Bioinformatics and Biomedical Engineering

Publisher: Springer International Publishing

Abstract

DNA is essential for organisms to carry out basic functions, as it encodes information necessary for metabolomics and proteomics, among others. In particular, it is now common to use DNA to profile living organisms based on their phenotypic traits. These traits are the outcome of the genetic makeup, constrained by the interaction between living organisms and their surrounding environment over time. Environmental conditions, however, are conventionally assumed to be too random and ephemeral to be encoded in the DNA of an organism. Here, we demonstrate that, to the contrary, genomic DNA may also encode sufficient information about some environmental features of an organism’s habitat for a machine learning model to reveal them. There seem to be exceptions, however: some environmental features do not appear to be coded in DNA, unless our methods miss that information. Nevertheless, we demonstrate that these features can be used to train models that better predict other environmental factors. These results lead directly to the question of whether, over evolutionary history, DNA itself is also a repository of information about the environment in which the lineage has developed, perhaps even more cryptically than the way it encodes phenotypic information.

Metadata
Title
Profiling Environmental Conditions from DNA
Authors
Sambriddhi Mainali
Max H. Garzon
Fredy A. Colorado
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-45385-5_58
