2016 | OriginalPaper | Chapter

Gene-Disease Prioritization Through Cost-Sensitive Graph-Based Methodologies

Authors: Marco Frasca, Simone Bassis

Published in: Bioinformatics and Biomedical Engineering

Publisher: Springer International Publishing

Abstract

Finding genes associated with human genetic disorders is one of the most challenging problems in biomedicine. To guide researchers toward the most reliable candidate causative genes for a disease of interest, gene prioritization methods provide necessary support by automatically ranking genes according to their involvement in the disease under study. The problem is characterized by highly unbalanced classes (few causative and many more non-causative genes) and requires cost-sensitive techniques to achieve reliable solutions. In this work we propose a network-based methodology for disease-gene prioritization designed expressly to cope with the data imbalance. Its validation on a benchmark of 708 selected Medical Subject Headings (MeSH) diseases shows that our approach is competitive with state-of-the-art methodologies, and its reduced time complexity makes its application feasible on large datasets.

Footnotes
2
Actually, the number of predictors, including the two-way interaction term (i.e., the product of the two features), is equal to 3.
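As a minimal sketch of the count stated in this footnote, assuming two generic numeric features (the names `x1`, `x2`, the helper `design_row`, and the sample values are hypothetical and not taken from the chapter), each sample contributes the two features plus their product:

```python
def design_row(x1, x2):
    """Return the predictors for one sample: two features plus their product."""
    # The third entry is the two-way interaction term x1 * x2.
    return [x1, x2, x1 * x2]


# Hypothetical feature values, chosen only for illustration.
samples = [(0.5, 2.0), (1.0, 3.0)]

# One row of predictors per sample.
design = [design_row(x1, x2) for x1, x2 in samples]

# Two original features + one interaction term.
n_predictors = len(design[0])
```

Here `n_predictors` counts only the feature columns; whether a model also adds an intercept is a separate choice not covered by the footnote.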
 
Metadata
Title
Gene-Disease Prioritization Through Cost-Sensitive Graph-Based Methodologies
Authors
Marco Frasca
Simone Bassis
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-31744-1_64
