Skip to main content

2018 | OriginalPaper | Buchkapitel

Application of Graph Clustering and Visualisation Methods to Analysis of Biomolecular Data

verfasst von : Edgars Celms, Kārlis Čerāns, Kārlis Freivalds, Paulis Ķikusts, Lelde Lāce, Gatis Melkus, Mārtiņš Opmanis, Dārta Rituma, Pēteris Ručevskis, Juris Vīksna

Erschienen in: Databases and Information Systems

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

In this paper we present an approach based on integrated use of graph clustering and visualisation methods for semi-supervised discovery of biologically significant features in biomolecular data sets. We describe several clustering algorithms that have been custom designed for analysis of biomolecular data and feature an iterated two step approach involving initial computation of thresholds and other parameters used in clustering algorithms, which is followed by identification of connected graph components, and, if needed, by adjustment of clustering parameters for processing of individual subgraphs.
We demonstrate the applications of these algorithms to two concrete use cases: (1) analysis of protein coexpression in colorectal cancer cell lines; and (2) protein homology identification from, both sequence and structural similarity, data.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
2.
Zurück zum Zitat Choudhari, J., et al.: Genomic determinants of protein abundance variation in colorectal cancer cells. Cell Rep. 20, 2201–2214 (2017)CrossRef Choudhari, J., et al.: Genomic determinants of protein abundance variation in colorectal cancer cells. Cell Rep. 20, 2201–2214 (2017)CrossRef
3.
Zurück zum Zitat Enright, A., et al.: An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575–1584 (2002)CrossRef Enright, A., et al.: An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575–1584 (2002)CrossRef
7.
Zurück zum Zitat Grishin, N.: Fold change in evolution of protein structures. Struct. Biol. 134, 167–185 (2001)CrossRef Grishin, N.: Fold change in evolution of protein structures. Struct. Biol. 134, 167–185 (2001)CrossRef
8.
Zurück zum Zitat Higgins, D., Sievers, F.: Clustal Omega, accurate alignment of very large numbers of sequences. Methods Mol. Biol. 1079, 105–116 (2014)CrossRef Higgins, D., Sievers, F.: Clustal Omega, accurate alignment of very large numbers of sequences. Methods Mol. Biol. 1079, 105–116 (2014)CrossRef
9.
Zurück zum Zitat Higgins, D., et al.: ClustalW and ClustalX version 2.0. Bioinformatics 23, 2947–2948 (2007)CrossRef Higgins, D., et al.: ClustalW and ClustalX version 2.0. Bioinformatics 23, 2947–2948 (2007)CrossRef
10.
Zurück zum Zitat Jonsson, P., et al.: Cluster analysis of networks generated through homology: automatic identification of important protein communities involved in cancer metastasis. BMC Bioinform. 7(1), 2 (2006)CrossRef Jonsson, P., et al.: Cluster analysis of networks generated through homology: automatic identification of important protein communities involved in cancer metastasis. BMC Bioinform. 7(1), 2 (2006)CrossRef
11.
Zurück zum Zitat Kurbatova, N., Mancinska, L., Viksna, J.: Protein structure comparison based on fold evolution. Lect. Notes Inform. 115, 78–89 (2007) Kurbatova, N., Mancinska, L., Viksna, J.: Protein structure comparison based on fold evolution. Lect. Notes Inform. 115, 78–89 (2007)
12.
Zurück zum Zitat Kurbatova, N., Viksna, J.: Exploration of evolutionary relations between protein structures. Commun. Comput. Inf. Sci. 13, 154–166 (2008) Kurbatova, N., Viksna, J.: Exploration of evolutionary relations between protein structures. Commun. Comput. Inf. Sci. 13, 154–166 (2008)
13.
Zurück zum Zitat Langfelder, P., Horwath, S.: WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 9, 559 (2008)CrossRef Langfelder, P., Horwath, S.: WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 9, 559 (2008)CrossRef
14.
Zurück zum Zitat Maddi, A., Eslahchi, C.: Discovering overlapped protein complexes from weighted PPI networks by removing inter-module hubs. Sci. Rep. 7, 3247 (2017)CrossRef Maddi, A., Eslahchi, C.: Discovering overlapped protein complexes from weighted PPI networks by removing inter-module hubs. Sci. Rep. 7, 3247 (2017)CrossRef
15.
Zurück zum Zitat Nepusz, T., Yu, H., Paccanaro, A.: Detecting overlapping protein complexes in protein-protein interaction networks. Nat. Methods 9, 471–472 (2012)CrossRef Nepusz, T., Yu, H., Paccanaro, A.: Detecting overlapping protein complexes in protein-protein interaction networks. Nat. Methods 9, 471–472 (2012)CrossRef
16.
Zurück zum Zitat Orengo, C., et al.: New functional families in CATH to improve the mapping of conserved functional sites to 3D structures. Nucleic Acids Res. 44, 490–498 (2013) Orengo, C., et al.: New functional families in CATH to improve the mapping of conserved functional sites to 3D structures. Nucleic Acids Res. 44, 490–498 (2013)
17.
Zurück zum Zitat Pearson, R.: Effective protein sequence comparison. Methods Enzymol. 266, 227–258 (1996)CrossRef Pearson, R.: Effective protein sequence comparison. Methods Enzymol. 266, 227–258 (1996)CrossRef
18.
Zurück zum Zitat Petryszak, R., et al.: Expression Atlas update - an integrated database of gene and protein expression in humans, animals and plants. Nucleic Acids Res. 44(D1), 746–752 (2016)CrossRef Petryszak, R., et al.: Expression Atlas update - an integrated database of gene and protein expression in humans, animals and plants. Nucleic Acids Res. 44(D1), 746–752 (2016)CrossRef
19.
Zurück zum Zitat Pirim, H., Eksioglu, B., Perkins, A.: Clustering high throughput biological data with B-MST, a minimum spanning tree based heuristic. Comput. Biol. Med. 62, 94–102 (2015)CrossRef Pirim, H., Eksioglu, B., Perkins, A.: Clustering high throughput biological data with B-MST, a minimum spanning tree based heuristic. Comput. Biol. Med. 62, 94–102 (2015)CrossRef
20.
Zurück zum Zitat Rung, J., Schlitt, T., Brazma, A., Freivalds, K., Vilo, J.: Building and analysing genome-wide gene disruption networks. Bioinformatics 18, S202–S210 (2002)CrossRef Rung, J., Schlitt, T., Brazma, A., Freivalds, K., Vilo, J.: Building and analysing genome-wide gene disruption networks. Bioinformatics 18, S202–S210 (2002)CrossRef
21.
22.
Zurück zum Zitat Smith, T., Waterman, M.: Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197 (1981)CrossRef Smith, T., Waterman, M.: Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197 (1981)CrossRef
24.
25.
Zurück zum Zitat Vihrovs, J., Prusis, K., Freivalds, K., Rucevskis, P., Krebs, V.: A potential field function for overlapping point set and graph cluster visualization. Commun. Comput. Inf. Sci. 550, 136–152 (2015) Vihrovs, J., Prusis, K., Freivalds, K., Rucevskis, P., Krebs, V.: A potential field function for overlapping point set and graph cluster visualization. Commun. Comput. Inf. Sci. 550, 136–152 (2015)
26.
Zurück zum Zitat Viksna, J., Gilbert, D.: Assessment of the probabilities for evolutionary structural changes in protein folds. Bioinformatics 23, 832–841 (2007)CrossRef Viksna, J., Gilbert, D.: Assessment of the probabilities for evolutionary structural changes in protein folds. Bioinformatics 23, 832–841 (2007)CrossRef
Metadaten
Titel
Application of Graph Clustering and Visualisation Methods to Analysis of Biomolecular Data
verfasst von
Edgars Celms
Kārlis Čerāns
Kārlis Freivalds
Paulis Ķikusts
Lelde Lāce
Gatis Melkus
Mārtiņš Opmanis
Dārta Rituma
Pēteris Ručevskis
Juris Vīksna
Copyright-Jahr
2018
DOI
https://doi.org/10.1007/978-3-319-97571-9_20

Premium Partner