
2018 | OriginalPaper | Chapter

Predicting Disease Genes from Clinical Single Sample-Based PPI Networks

Authors: Ping Luo, Li-Ping Tian, Bolin Chen, Qianghua Xiao, Fang-Xiang Wu

Published in: Bioinformatics and Biomedical Engineering

Publisher: Springer International Publishing


Abstract

Experimentally identifying disease genes is time-consuming and expensive, so computational methods for predicting disease genes are appealing. Many existing methods predict new disease genes from protein-protein interaction (PPI) networks. However, PPIs change over a cell's lifetime, so relying only on static PPI networks may degrade the performance of these algorithms. In this study, we propose an algorithm for predicting disease genes based on centrality features extracted from clinical single sample-based PPI networks (dgCSN). dgCSN first constructs a single sample-based network for each case sample from a universal static PPI network and that sample's clinical gene expression data, and then fuses these networks into one network according to how frequently each edge appears across all single sample-based networks. Next, centrality-based features are extracted from the fused network to characterize each gene. Finally, regression analysis is performed to predict the probability that each gene is disease-associated. The experiments show that dgCSN achieves AUC values of 0.893 and 0.807 on breast cancer and Alzheimer's disease, respectively, which are higher than those of two competing methods. Further analysis of the top 10 prioritized genes also demonstrates that dgCSN is effective for predicting new disease genes.
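The fusion step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not say how an edge is selected for a single sample, nor which centrality measures are used, so the rule below (keep an edge when both genes exceed an expression threshold) and the weighted degree feature are assumptions chosen only for illustration. The function names and toy data are hypothetical, and the final regression step is omitted.

```python
from collections import Counter


def single_sample_network(ppi_edges, expression, threshold):
    """Keep a static PPI edge if both genes are expressed above `threshold`.

    Assumed rule for illustration; the abstract does not specify it.
    """
    return {
        (u, v)
        for u, v in ppi_edges
        if expression.get(u, 0.0) > threshold
        and expression.get(v, 0.0) > threshold
    }


def fuse_networks(sample_networks):
    """Weight each edge by the fraction of single-sample networks containing it."""
    counts = Counter(edge for net in sample_networks for edge in net)
    n_samples = len(sample_networks)
    return {edge: count / n_samples for edge, count in counts.items()}


def weighted_degree(fused):
    """Sum the fused edge weights incident to each gene (one possible centrality)."""
    degree = Counter()
    for (u, v), weight in fused.items():
        degree[u] += weight
        degree[v] += weight
    return dict(degree)


# Toy static PPI network and three hypothetical case samples.
ppi_edges = [("A", "B"), ("B", "C"), ("C", "D")]
samples = [
    {"A": 5.0, "B": 6.0, "C": 0.5, "D": 4.0},
    {"A": 5.0, "B": 6.0, "C": 3.0, "D": 0.2},
    {"A": 0.1, "B": 6.0, "C": 3.0, "D": 4.0},
]

networks = [single_sample_network(ppi_edges, s, threshold=1.0) for s in samples]
fused = fuse_networks(networks)
features = weighted_degree(fused)

for gene in sorted(features):
    print(gene, round(features[gene], 3))
```

In a fuller pipeline, per-gene features such as these would be passed to a regression model trained on known disease genes. The single feature here only shows how edge frequency across samples can turn a static network into a weighted one.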


Metadata
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-319-78723-7_21
