
2016 | OriginalPaper | Book chapter

Gene-Disease Prioritization Through Cost-Sensitive Graph-Based Methodologies

Authors: Marco Frasca, Simone Bassis

Published in: Bioinformatics and Biomedical Engineering

Publisher: Springer International Publishing

Abstract

Finding genes associated with human genetic disorders is one of the most challenging problems in biomedicine. Gene prioritization methods support researchers in identifying the most reliable candidate causative genes for a disease of interest by automatically ranking genes according to their involvement in that disease. The problem is characterized by highly unbalanced classes (few causative genes and many more non-causative ones) and requires cost-sensitive techniques to achieve reliable solutions. In this work we propose a network-based methodology for disease-gene prioritization expressly designed to cope with this data imbalance. Validation on a benchmark of 708 selected Medical Subject Headings (MeSH) diseases shows that our approach is competitive with state-of-the-art methodologies, and its reduced time complexity makes it applicable to large datasets.
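
To make the idea of cost-sensitive ranking on a gene network concrete, the following is a minimal sketch and not the method proposed in the chapter, whose details are not given in this abstract. It assumes a simple weighted neighbor vote in which each known disease gene counts as much as n_neg / n_pos non-disease genes. The function names, gene identifiers, and edge weights are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def cost_sensitive_scores(
    edges: Dict[Tuple[str, str], float],
    positives: Set[str],
    negatives: Set[str],
) -> Dict[str, float]:
    """Score unlabeled genes by a class-weighted vote of their labeled neighbors.

    Assumption (not from the source): the cost of a positive label is set to
    n_neg / n_pos so that the rare class is not swamped by the frequent one.
    """
    pos_weight = len(negatives) / max(len(positives), 1)
    scores: Dict[str, float] = {}
    for (u, v), w in edges.items():
        # Treat every edge as undirected: visit it from both endpoints.
        for gene, neighbor in ((u, v), (v, u)):
            if gene in positives or gene in negatives:
                continue  # only unlabeled genes are scored
            scores.setdefault(gene, 0.0)
            if neighbor in positives:
                scores[gene] += pos_weight * w
            elif neighbor in negatives:
                scores[gene] -= w
    return scores


def rank_genes(scores: Dict[str, float]) -> List[str]:
    """Return genes sorted by decreasing score (ties broken by name)."""
    return sorted(scores, key=lambda g: (-scores[g], g))


# Hypothetical toy network: one known disease gene, four known non-disease genes.
toy_edges = {
    ("g1", "c1"): 0.9,
    ("g1", "n1"): 0.5,
    ("g2", "n1"): 0.8,
    ("g2", "n2"): 0.7,
    ("g3", "c1"): 0.2,
    ("g3", "n3"): 0.2,
}
toy_scores = cost_sensitive_scores(toy_edges, {"c1"}, {"n1", "n2", "n3", "n4"})
print(rank_genes(toy_scores))
```

In this toy network, candidate g3 has equally weighted links to one disease gene and one non-disease gene. An unweighted vote would score it zero, while the cost-sensitive vote gives it a positive score because positive labels are four times rarer.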

Footnotes
2
Strictly speaking, the number of predictors is 3, counting the two-way interaction term (i.e., the product of the two features).
Metadata
Title
Gene-Disease Prioritization Through Cost-Sensitive Graph-Based Methodologies
Authors
Marco Frasca
Simone Bassis
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-31744-1_64
