Abstract
Advances in hardware and software, together with the accumulation of large amounts of data in recent years, have spurred remarkable growth in the application of neural networks to various scientific fields. Machine learning based on neural networks with multiple (hidden) layers is becoming an extremely powerful approach for analyzing data. As large amounts of protein data, such as structural and functional assay data, accumulate, the impact of such approaches on protein informatics is increasing. Here, we introduce our recent studies that apply neural networks to protein structure and function prediction and to dynamic analysis: (i) inter-residue contact prediction based on a multiple sequence alignment (MSA) of amino acid sequences, (ii) prediction of protein–compound interactions using assay data, and (iii) detection of protein allostery from trajectories of molecular dynamics (MD) simulations.
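To give a rough sense of the co-evolution signal that MSA-based contact prediction in (i) builds on, the sketch below computes the mutual information between two alignment columns. This is a classical and much simpler statistic, not the neural-network approach introduced here. The function names and the toy alignment are hypothetical and serve only as an illustration.

```python
from collections import Counter
from math import log


def column(msa, i):
    """Return column i of the alignment as a list of characters."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(msa)
    ci, cj = column(msa, i), column(msa, j)
    count_i, count_j = Counter(ci), Counter(cj)
    count_ij = Counter(zip(ci, cj))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log(p_ab / (p_a * p_b))
    return mi


# Toy alignment (hypothetical): columns 0 and 2 co-vary, column 1 is conserved.
toy_msa = ["AKD", "AKD", "GKE", "GKE"]
mi_02 = mutual_information(toy_msa, 0, 2)  # co-varying pair
mi_01 = mutual_information(toy_msa, 0, 1)  # one column fully conserved
```

In this toy case the co-varying pair scores higher than the pair that includes a conserved column. Real alignments need corrections for phylogenetic bias and indirect couplings, which is part of the motivation for learned models.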
Abbreviations
- MSA: multiple sequence alignment
- MD: molecular dynamics
- RNN: residual neural network
- GNN: graph neural network
- CNN: convolutional neural network
- DIO: differences between the input and output
- NMR: nuclear magnetic resonance
Funding
This research was partially supported as a Platform Project for Supporting Drug Discovery and Life Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research (BINDS)) from AMED under Grant number JP19am0101110.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Tsuchiya, Y., Tomii, K. Neural networks for protein structure and function prediction and dynamic analysis. Biophys Rev 12, 569–573 (2020). https://doi.org/10.1007/s12551-020-00685-6
DOI: https://doi.org/10.1007/s12551-020-00685-6