
2018 | Original Paper | Book Chapter

13. Application of Methods from Information Theory in Protein-Interaction Analysis

Written by: Arno G. Stefani, Achim Sandmann, Andreas Burkovski, Johannes B. Huber, Heinrich Sticht, Christophe Jardin

Published in: Information- and Communication Theory in Molecular Biology

Publisher: Springer International Publishing


Abstract

The interactions of proteins with other biomolecules play a central role in various aspects of the structural and functional organization of the cell. Elucidating these interactions is crucial for understanding processes such as metabolic control, signal transduction, and gene regulation. However, an experimental structural characterization of all such interactions is impractical, and only a small fraction of the potential complexes will be amenable to direct experimental analysis. Docking is a versatile and powerful method for predicting the geometry of protein–protein complexes. Despite significant methodological advances, identifying good docking solutions among a large number of false solutions remains a difficult task. In the present work, the formalism of mutual information (MI) from information theory was adapted to protein docking. In this context, we developed a method that finds a lower bound for the MI between a binary and an arbitrary finite random variable, valid for all joint distributions whose variational distance to a known joint distribution does not exceed a known value. This lower bound can be applied to MI estimation with confidence intervals. In contrast to previous results, these confidence intervals require no assumptions about the distribution or the sample size. An MI-based optimization protocol, in conjunction with a clustering procedure, was used to define reduced amino acid alphabets describing the interface properties of protein complexes. The reduced alphabets were subsequently converted into a scoring function for the evaluation of docking solutions, which is available for public use via a web service. This approach has recently been extended to the analysis of protein–DNA complexes by also taking geometrical parameters of the DNA into account.
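To make the two quantities named in the abstract concrete, the following is a minimal sketch of how the MI between a binary variable X and a finite variable Y, and the variational distance between two joint distributions, could be computed. It is an illustration only and does not reproduce the chapter's lower bound or its confidence intervals. The function names, the table layout `joint[x][y]`, the use of base-2 logarithms, and the convention that the variational distance is the unnormalized sum of absolute differences are all assumptions made here.

```python
import math


def mutual_information(joint):
    """Mutual information in bits of a joint distribution joint[x][y].

    Assumed layout: one row per value of X (two rows for a binary X),
    one column per symbol of Y; entries are probabilities summing to 1.
    """
    px = [sum(row) for row in joint]        # marginal of X
    py = [sum(col) for col in zip(*joint)]  # marginal of Y
    mi = 0.0
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0:  # terms with p = 0 contribute nothing
                mi += p * math.log2(p / (px[x] * py[y]))
    return mi


def variational_distance(p, q):
    """Sum of absolute differences between two joint tables of equal shape.

    Assumption: the unnormalized L1 convention; some texts use half of it.
    """
    return sum(abs(a - b) for rp, rq in zip(p, q) for a, b in zip(rp, rq))


def empirical_joint(samples, n_y):
    """Relative frequencies of (x, y) pairs, x in {0, 1}, y in range(n_y)."""
    counts = [[0] * n_y for _ in range(2)]
    for x, y in samples:
        counts[x][y] += 1
    n = len(samples)
    return [[c / n for c in row] for row in counts]


if __name__ == "__main__":
    dependent = [[0.5, 0.0], [0.0, 0.5]]        # Y determines X completely
    independent = [[0.25, 0.25], [0.25, 0.25]]  # X and Y independent
    print(mutual_information(dependent))
    print(mutual_information(independent))
    print(variational_distance(dependent, independent))
```

In this reading, an estimated joint distribution (for example from `empirical_joint`) plays the role of the known distribution, and the lower bound described in the abstract concerns the smallest MI among all joint distributions within a given variational distance of it.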


Footnotes
1
A finite random variable is a discrete random variable with finite symbol alphabet.
 
2
The superscripts 1 and 2 are indices and should not be confused with powers.
 
3
Please note: \(\mathbf {R}\) corresponds to \(r_{XY}\), not to \(R_{XY}\).
 
4
The letter \(\mathrm {l}\) in \(\varepsilon _\mathrm {l}\) and \(\mathbf {Q}^\mathrm {l}\) stands for lower value and should not be confused with the digit 1.
 
5
The letter \(\mathrm {l}\) in \(\varepsilon _\mathrm {ld}\) and \(\mathbf {Q}^\mathrm {ld}\) again stands for lower value and the letter \(\mathrm {d}\) for determinant.
 
6
The letter \(\mathrm {u}\) in \(\varepsilon _\mathrm {ud}\) and \(\mathbf {Q}^\mathrm {ud}\) stands for upper value and the letter \(\mathrm {d}\) again for determinant.
 
7
The superscripts 1 and 2 of \(q_{Y|X}^1\), \(q_{Y|X}^2\) are indices, not powers.
 
Metadata
Title
Application of Methods from Information Theory in Protein-Interaction Analysis
Written by
Arno G. Stefani
Achim Sandmann
Andreas Burkovski
Johannes B. Huber
Heinrich Sticht
Christophe Jardin
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-54729-9_13
