
2020 | Original Paper | Book Chapter

Isoform-Disease Association Prediction by Data Fusion

Authors: Qiuyue Huang, Jun Wang, Xiangliang Zhang, Guoxian Yu

Published in: Bioinformatics Research and Applications

Publisher: Springer International Publishing


Abstract

Alternative splicing enables a gene to be spliced into different isoforms, which are closely related to diverse developmental abnormalities. Identifying isoform-disease associations helps to uncover the underlying pathology of various complex diseases and to develop precise treatments and drugs for them. Although many approaches have been proposed for predicting gene-disease associations and isoform functions, few efforts have been made toward predicting isoform-disease associations at large scale; the main bottleneck is the lack of ground-truth isoform-disease associations. To bridge this gap, we propose IDAPred, a computational approach inspired by multi-instance learning that fuses genomics and transcriptomics data for isoform-disease association prediction. Given the bag-instance relationship between a gene and its spliced isoforms, IDAPred introduces a dispatch and aggregation term to dispatch gene-disease associations to individual isoforms and, in reverse, to aggregate these dispatched associations to the affiliated genes. Next, it fuses different genomics and transcriptomics data to replenish gene-disease associations and to induce a linear classifier for predicting isoform-disease associations in a coherent way. In addition, to alleviate the bias toward observed gene-disease associations, it adds a regularization term to differentiate the currently observed associations from the unobserved (potential) ones. Experimental results show that IDAPred significantly outperforms the related state-of-the-art methods.
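The bag-instance idea in the abstract can be illustrated with a minimal sketch. It is not IDAPred's actual formulation, which the abstract does not spell out. It assumes the standard multi-instance convention that a bag (gene) is scored by the maximum over its instances (isoforms), and a naive dispatch in which every isoform initially inherits its gene's labels. The function names, gene and isoform identifiers, and scores are all hypothetical.

```python
from typing import Dict, List


def dispatch_to_isoforms(
    gene_labels: Dict[str, List[int]],
    gene_to_isoforms: Dict[str, List[str]],
) -> Dict[str, List[int]]:
    """Naive dispatch (assumption): each isoform inherits its gene's disease labels."""
    return {
        isoform: list(gene_labels[gene])
        for gene, isoforms in gene_to_isoforms.items()
        for isoform in isoforms
    }


def aggregate_to_genes(
    isoform_scores: Dict[str, List[float]],
    gene_to_isoforms: Dict[str, List[str]],
) -> Dict[str, List[float]]:
    """Max aggregation (assumption): a gene's score for a disease is the
    highest score among its isoforms for that disease."""
    gene_scores = {}
    for gene, isoforms in gene_to_isoforms.items():
        rows = [isoform_scores[isoform] for isoform in isoforms]
        # zip(*rows) iterates over diseases (columns) across the gene's isoforms.
        gene_scores[gene] = [max(column) for column in zip(*rows)]
    return gene_scores


# Toy data: two genes, three isoforms, two diseases (all values hypothetical).
gene_to_isoforms = {"G1": ["G1.a", "G1.b"], "G2": ["G2.a"]}
gene_labels = {"G1": [1, 0], "G2": [0, 1]}

initial_isoform_labels = dispatch_to_isoforms(gene_labels, gene_to_isoforms)

# Hypothetical isoform-level scores, e.g. from some classifier.
isoform_scores = {
    "G1.a": [0.9, 0.1],
    "G1.b": [0.2, 0.05],
    "G2.a": [0.3, 0.8],
}
gene_scores = aggregate_to_genes(isoform_scores, gene_to_isoforms)
```

In this toy example, gene G1 keeps a high score for the first disease because one of its isoforms (G1.a) scores highly, even though G1.b does not. This is the kind of isoform-level differentiation that gene-level annotations alone cannot express.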


Metadata
Title
Isoform-Disease Association Prediction by Data Fusion
Authors
Qiuyue Huang
Jun Wang
Xiangliang Zhang
Guoxian Yu
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-57821-3_5