2018 | OriginalPaper | Book chapter

Deep Learning the Protein Function in Protein Interaction Networks

Authors: Kire Trivodaliev, Martin Josifoski, Slobodan Kalajdziski

Published in: ICT Innovations 2018. Engineering and Life Sciences

Publisher: Springer International Publishing

Abstract

A central challenge in proteomics is the computational prediction of protein function. In protein interaction networks (PINs), this problem amounts to correctly labeling the corresponding nodes. This paper proposes a novel three-step approach for supervised learning of protein function in PINs. The first step derives a continuous vector representation for each PIN node using semi-supervised learning; the vectors are constructed to maximize the likelihood of preserving both the local and the global topology of the graph. The second step binarizes the PIN nodes (proteins): for each protein function derived from the Gene Ontology (GO), it determines the positive and the negative set of nodes. The challenge of determining the negative sets is addressed by random walks on the GO acyclic graph, weighted by a semantic similarity metric. In the final step, a simple six-layer deep learning model is built to learn protein function. Experiments are performed on a highly reliable human protein interaction network. Even with a very simple experimental setup, the approach achieves high Area Under the Curve values (>0.79), which indicates that it can be very successful in determining protein function, and its performance is comparable to that of state-of-the-art competing methods.
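The binarization step can be illustrated with a small sketch. The abstract does not specify how the walk is run, so the code below makes several assumptions that are not taken from the paper: the walk is a random walk with restart computed by power iteration, the GO fragment is a five-term toy graph, and the term names, similarity weights, protein annotations, and the 0.1 proximity threshold are all hypothetical. It is meant only to show the idea that proteins annotated exclusively with terms far from the target term are candidate negatives.

```python
def random_walk_with_restart(weights, start, restart=0.3, iters=100):
    """Approximate the visiting probabilities of a walker that jumps back
    to `start` with probability `restart` at every step.

    weights: dict term -> {neighbor: similarity weight}; assumed symmetric,
    and every term is assumed to have at least one neighbor.
    """
    nodes = sorted(weights)
    prob = {n: 1.0 if n == start else 0.0 for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        nxt[start] = restart  # restart mass (valid because prob sums to 1)
        for n in nodes:
            total = sum(weights[n].values())
            for m, w in weights[n].items():
                # move to a neighbor in proportion to the similarity weight
                nxt[m] += (1.0 - restart) * prob[n] * w / total
        prob = nxt
    return prob


def binarize(annotations, term, proximity, threshold):
    """Split proteins into positives and negatives for one GO term.

    Simplification: a protein is positive only if annotated with `term`
    itself. It is negative if every term it carries has a proximity to
    `term` below `threshold`. All other proteins stay unlabeled.
    """
    positives = {p for p, terms in annotations.items() if term in terms}
    negatives = {
        p for p, terms in annotations.items()
        if p not in positives and all(proximity[t] < threshold for t in terms)
    }
    return positives, negatives


# Hypothetical GO fragment: term_A is the root, term_B and term_C are its
# children, term_D is a child of term_B, term_E is a child of term_C.
# The third value of each edge is a made-up semantic similarity weight.
edges = [
    ("term_D", "term_B", 0.8),
    ("term_B", "term_A", 0.5),
    ("term_A", "term_C", 0.5),
    ("term_C", "term_E", 0.8),
]
weights = {}
for u, v, w in edges:
    weights.setdefault(u, {})[v] = w
    weights.setdefault(v, {})[u] = w

# Hypothetical protein annotations.
annotations = {
    "P1": {"term_D"},
    "P2": {"term_B"},
    "P3": {"term_E"},
    "P4": {"term_C", "term_E"},
}

proximity = random_walk_with_restart(weights, start="term_D")
positives, negatives = binarize(annotations, "term_D", proximity, threshold=0.1)
print(sorted(positives), sorted(negatives))
```

In this toy setup P2 ends up in neither set, because its only annotation (term_B) is too close to the target term to count as evidence against it. Whether the original method leaves such proteins unlabeled is not stated in the abstract.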

Metadata
Title
Deep Learning the Protein Function in Protein Interaction Networks
Authors
Kire Trivodaliev
Martin Josifoski
Slobodan Kalajdziski
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-00825-3_16