
2017 | OriginalPaper | Chapter

Predicting Essential Proteins Based on Gene Expression Data, Subcellular Localization and PPI Data

Authors: Xiujuan Lei, Siguo Wang, Linqiang Pan

Published in: Bio-inspired Computing: Theories and Applications

Publisher: Springer Singapore


Abstract

Predicting essential proteins is indispensable for understanding the minimal requirements of cellular survival and development. In recent years, many methods based on the topological features of PPI networks have been proposed. However, most of these approaches ignore the intrinsic biological attributes of proteins. This paper proposes a method named GSP, which integrates Gene expression data, Subcellular localization and PPI networks to identify essential proteins. We combine local average connectivity and the edge clustering coefficient with gene expression data to measure the centrality of nodes. Compared with non-essential proteins, essential proteins appear more frequently in some subcellular localizations, such as the nucleus, and different compartments play different roles; we therefore integrate subcellular localization information into the identification of essential proteins. Computational experiments on yeast PPI networks show that the proposed method GSP outperforms other state-of-the-art methods, including DC, EC, IC, SC, NC, LAC, PeC, WDC and UDoNC.
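The two topological quantities named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions: the edge clustering coefficient of an edge is the number of triangles containing it divided by the smaller of the two endpoint degrees minus one, and the local average connectivity of a node is the average degree of its neighbours within the subgraph induced by those neighbours. The function names, the adjacency-dictionary representation and the toy network are hypothetical, and the sketch does not reproduce the GSP score itself, since the abstract does not state how gene expression and subcellular localization are weighted in.

```python
from typing import Dict, Set

# Undirected PPI network as an adjacency dictionary (hypothetical representation).
Graph = Dict[str, Set[str]]


def edge_clustering_coefficient(adj: Graph, u: str, v: str) -> float:
    """Triangles through edge (u, v) divided by min(deg(u) - 1, deg(v) - 1)."""
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        # An endpoint of degree 1 cannot take part in any triangle.
        return 0.0
    # Each common neighbour of u and v closes one triangle on the edge.
    triangles = len(adj[u] & adj[v])
    return triangles / denom


def local_average_connectivity(adj: Graph, v: str) -> float:
    """Average degree of v's neighbours inside the subgraph they induce."""
    neighbours = adj[v]
    if not neighbours:
        return 0.0
    # For neighbour w, its degree in the induced subgraph is the number of
    # its own neighbours that are also neighbours of v.
    total = sum(len(adj[w] & neighbours) for w in neighbours)
    return total / len(neighbours)


# Toy network (assumed for illustration): triangle a-b-c plus a pendant node d on c.
toy: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

if __name__ == "__main__":
    print(edge_clustering_coefficient(toy, "a", "b"))
    print(local_average_connectivity(toy, "c"))
```

In this toy network the pendant edge between c and d belongs to no triangle, so its edge clustering coefficient is zero, while the edges of the triangle score highest.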


Metadata
DOI
https://doi.org/10.1007/978-981-10-7179-9_8
