Direct Coupling Analysis for Protein Contact Prediction

Morcos, Faruck; Hwa, Terence; Onuchic, José N.; Weigt, Martin

doi:10.1007/978-1-4939-0366-5_5

Part of the book series: Methods in Molecular Biology (MIMB, volume 1137)

Abstract

During evolution, the structure and function of proteins are remarkably conserved, whereas amino-acid sequences vary strongly between homologous proteins. Structural conservation constrains sequence variability and forces different residues to coevolve, i.e., to show correlated patterns of amino-acid occurrences. However, a correlation between two residues may result from direct coupling, e.g., through a contact in the folded protein, or it may be induced indirectly via intermediate residues. To use empirically observed correlations for predicting residue–residue contacts, direct and indirect effects have to be disentangled. Here we present mechanistic details on how to achieve this using a methodology called Direct Coupling Analysis (DCA). DCA has been shown to produce highly accurate estimates of amino-acid pairs that have direct reciprocal constraints in evolution. Specifically, we provide instructions and protocols on how to use the algorithmic implementations of DCA, from data extraction to the visualization of predicted contacts in contact maps or representative protein structures.
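To make the distinction between direct and indirect correlations concrete, here is a minimal Python sketch of a mean-field-style coupling estimate on a toy alignment. It is an illustration under stated assumptions, not the chapter's implementation: the function name `mean_field_coupling_scores`, the toy data, and the small pseudocount are all hypothetical choices, sequence reweighting is omitted, and pairs are ranked by a plain Frobenius norm of the coupling block rather than by the scoring used in the published DCA protocols. In the toy data, site 1 is a noisy copy of site 0 and site 2 is a noisy copy of site 1, so sites 0 and 2 are correlated only indirectly; site 3 is independent.

```python
import numpy as np

def mean_field_coupling_scores(msa, q, pseudocount=0.05):
    """Rough coupling strengths from an integer-coded alignment.

    msa: (M, L) array with states 0..q-1 (M sequences, L columns).
    Returns an (L, L) matrix of Frobenius norms of the estimated couplings.
    Assumption: no sequence reweighting, and a simple uniform pseudocount.
    """
    M, L = msa.shape
    onehot = np.eye(q)[msa]                                # (M, L, q)
    fi = onehot.mean(axis=0)                               # single-site frequencies
    fij = np.einsum("mia,mjb->iajb", onehot, onehot) / M   # pair frequencies

    # Regularize frequencies so the covariance matrix is invertible.
    fi = (1 - pseudocount) * fi + pseudocount / q
    fij = (1 - pseudocount) * fij + pseudocount / q**2
    for i in range(L):
        fij[i, :, i, :] = np.diag(fi[i])                   # same-site blocks

    # Connected correlations; drop the last state of each site (gauge choice).
    C = fij - fi[:, :, None, None] * fi[None, None, :, :]
    C = C[:, : q - 1, :, : q - 1].reshape(L * (q - 1), L * (q - 1))

    # Mean-field approximation: couplings are minus the inverse covariance.
    J = -np.linalg.inv(C).reshape(L, q - 1, L, q - 1)
    scores = np.sqrt((J**2).sum(axis=(1, 3)))
    np.fill_diagonal(scores, 0.0)
    return scores

# Toy alignment: a chain 0 -> 1 -> 2 of noisy copies plus an independent site 3.
rng = np.random.default_rng(0)
M, q = 5000, 3

def noisy_copy(x, keep=0.8):
    """Copy each entry with probability `keep`, otherwise draw a random state."""
    keep_mask = rng.random(x.size) < keep
    return np.where(keep_mask, x, rng.integers(0, q, x.size))

x0 = rng.integers(0, q, M)
x1 = noisy_copy(x0)
x2 = noisy_copy(x1)
x3 = rng.integers(0, q, M)
msa = np.stack([x0, x1, x2, x3], axis=1)

scores = mean_field_coupling_scores(msa, q)
print(np.round(scores, 2))
```

In this toy setting the pairs (0, 1) and (1, 2) should receive clearly larger scores than the indirectly correlated pair (0, 2), even though sites 0 and 2 are strongly correlated in the raw alignment. Real alignments are far more sparsely sampled than this example, so the pseudocount and the treatment of redundant sequences matter much more there.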



Acknowledgments

This work was supported by the Center for Theoretical Biological Physics sponsored by the NSF (Grant PHY-0822283) and by NSF-MCB-1214457. JNO is a CPRIT Scholar in Cancer Research sponsored by the Cancer Prevention and Research Institute of Texas.

Author information

Authors and Affiliations

Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
Faruck Morcos & José N. Onuchic
Center for Theoretical Biological Physics, University of California at San Diego, La Jolla, CA, USA
Terence Hwa
UMR7238—Laboratoire de Génomique des Microorganismes, Université Pierre et Marie Curie, Paris, France
Martin Weigt

Editor information

Editors and Affiliations

Purdue University Dept. Biological Science, West Lafayette, Indiana, USA
Daisuke Kihara

About this protocol

Cite this protocol

Morcos, F., Hwa, T., Onuchic, J.N., Weigt, M. (2014). Direct Coupling Analysis for Protein Contact Prediction. In: Kihara, D. (eds) Protein Structure Prediction. Methods in Molecular Biology, vol 1137. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0366-5_5

DOI: https://doi.org/10.1007/978-1-4939-0366-5_5
Published: 07 February 2014
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-0365-8
Online ISBN: 978-1-4939-0366-5
eBook Packages: Springer Protocols
