Direct Coupling Analysis for Protein Contact Prediction

Protein Structure Prediction

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1137))

Abstract

During evolution, the structure and function of proteins are remarkably conserved, whereas amino-acid sequences vary strongly between homologous proteins. Structural conservation constrains sequence variability and forces different residues to coevolve, i.e., to show correlated patterns of amino-acid occurrences. However, a correlation between two residues may result from direct coupling, e.g., through a contact in the folded protein, or be induced indirectly via intermediate residues. To use empirically observed correlations for predicting residue–residue contacts, direct and indirect effects have to be disentangled. Here we present mechanistic details on how to achieve this using a methodology called Direct Coupling Analysis (DCA). DCA has been shown to produce highly accurate estimates of amino-acid pairs that have direct reciprocal constraints in evolution. Specifically, we provide instructions and protocols on how to use the algorithmic implementations of DCA, from data extraction to the visualization of predicted contacts in contact maps or representative protein structures.
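The distinction between direct and indirect correlation can be illustrated with a toy numerical sketch. The example below is not the DCA implementation described in the chapter: it assumes three continuous Gaussian variables arranged in a chain (a influences b, b influences c) instead of amino-acid sequences, and the function name, coupling strength, and sample size are hypothetical choices made for illustration only. It compares the plain correlation matrix with partial correlations obtained from the inverse covariance matrix, which is one standard way to separate direct from mediated dependence in a Gaussian setting.

```python
import numpy as np


def direct_vs_indirect(n_samples=20000, coupling=0.8, seed=0):
    """Toy chain a -> b -> c: a and c are linked only through b.

    Returns (corr, partial), both 3x3 arrays ordered as (a, b, c).
    All names and parameter values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(size=n_samples)
    b = coupling * a + rng.normal(size=n_samples)  # direct coupling a-b
    c = coupling * b + rng.normal(size=n_samples)  # direct coupling b-c
    x = np.stack([a, b, c], axis=1)                # shape (n_samples, 3)

    # Plain pairwise correlations mix direct and indirect effects.
    corr = np.corrcoef(x, rowvar=False)

    # Inverse covariance (precision) matrix: an off-diagonal entry is zero
    # when two variables are independent given all the others.
    precision = np.linalg.inv(np.cov(x, rowvar=False))

    # Rescale the precision matrix to partial correlations.
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return corr, partial


corr, partial = direct_vs_indirect()
print("correlation a-c:        ", round(float(corr[0, 2]), 3))
print("partial correlation a-c:", round(float(partial[0, 2]), 3))
print("partial correlation a-b:", round(float(partial[0, 1]), 3))
```

In this toy setting, a and c show a clear correlation even though they never interact directly, while their partial correlation is close to zero; the directly coupled pairs a-b and b-c keep a sizable partial correlation. Applying the same idea to categorical amino-acid data in a multiple sequence alignment requires additional steps that this sketch does not attempt to reproduce.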

Acknowledgments

This work was supported by the Center for Theoretical Biological Physics sponsored by the NSF (Grant PHY-0822283) and by NSF-MCB-1214457. JNO is a CPRIT Scholar in Cancer Research sponsored by the Cancer Prevention and Research Institute of Texas.

© 2014 Springer Science+Business Media New York

About this protocol

Cite this protocol

Morcos, F., Hwa, T., Onuchic, J.N., Weigt, M. (2014). Direct Coupling Analysis for Protein Contact Prediction. In: Kihara, D. (eds) Protein Structure Prediction. Methods in Molecular Biology, vol 1137. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0366-5_5

  • DOI: https://doi.org/10.1007/978-1-4939-0366-5_5

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-0365-8

  • Online ISBN: 978-1-4939-0366-5

  • eBook Packages: Springer Protocols
