2020 | Original Paper | Book Chapter

EpIntMC: Detecting Epistatic Interactions Using Multiple Clusterings

Authors: Huiling Zhang, Guoxian Yu, Wei Ren, Maozu Guo, Jun Wang

Published in: Bioinformatics Research and Applications

Publisher: Springer International Publishing

Abstract

Detecting epistatic interactions between multiple single nucleotide polymorphisms (SNPs) is crucial for identifying susceptibility genes associated with complex human diseases. Stepwise search approaches have been studied extensively as a way to reduce the search space for subsequent SNP interaction detection. However, most of these stepwise methods are prone to filtering out significant polymorphism combinations and therefore have low detection power. In this paper, we propose a two-stage approach called EpIntMC, which uses multiple clusterings to significantly shrink the search space while reducing the risk of filtering out significant combinations before the follow-up detection. EpIntMC first introduces a matrix-factorization-based approach that generates multiple diverse clusterings, grouping SNPs into different clusters from different aspects; this helps explore the genotype data more comprehensively and reduces the chance of discarding potential candidates that a single clustering would overlook. In the search stage, EpIntMC applies an entropy score to screen SNPs in each cluster and uses Jaccard similarity to merge the most similar clusters into candidate sets. EpIntMC then performs an exhaustive search on these candidate sets to precisely detect epistatic interactions. Extensive simulation experiments show that EpIntMC has higher (or comparable) power than related competitive solutions, and results on the Wellcome Trust Case Control Consortium (WTCCC) dataset also demonstrate its effectiveness.
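
The search stage described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not define the entropy score, so information gain (the reduction in entropy of the case/control label given a genotype) is used here as an assumed stand-in. The function names, the SNP identifiers (rs1, rs2, ...) and the toy data are all hypothetical. Only the Jaccard similarity, |A ∩ B| / |A ∪ B|, follows its standard definition.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(genotypes, labels):
    """H(labels) - H(labels | genotypes); an ASSUMED entropy-based score."""
    n = len(labels)
    conditional = 0.0
    for g in set(genotypes):
        subset = [y for x, y in zip(genotypes, labels) if x == g]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def jaccard(a, b):
    """Jaccard similarity of two SNP sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar_pair(clusters):
    """Indices (i, j) of the two clusters with the highest Jaccard similarity."""
    return max(combinations(range(len(clusters)), 2),
               key=lambda p: jaccard(clusters[p[0]], clusters[p[1]]))


def best_pair(candidates, data, labels):
    """Exhaustively score every SNP pair in the candidate set by the
    information gain of the joint genotype (hypothetical scoring choice)."""
    return max(combinations(sorted(candidates), 2),
               key=lambda p: information_gain(
                   list(zip(data[p[0]], data[p[1]])), labels))


# Toy data: the label is the XOR of rs1 and rs2, so neither SNP has a
# main effect on its own, but the pair determines the label completely.
data = {
    "rs1": [0, 0, 1, 1, 0, 0, 1, 1],
    "rs2": [0, 1, 0, 1, 0, 1, 0, 1],
    "rs3": [0, 0, 0, 0, 1, 1, 1, 1],
}
labels = [a ^ b for a, b in zip(data["rs1"], data["rs2"])]

# Clusters taken from (hypothetical) different clusterings of the same SNPs.
clusters = [{"rs1", "rs2", "rs3"}, {"rs2", "rs3", "rs4"}, {"rs7", "rs8"}]
i, j = most_similar_pair(clusters)
merged = clusters[i] | clusters[j]          # candidate set after merging

# Restrict to SNPs with genotype data, then search all pairs exhaustively.
candidates = merged & data.keys()
print(best_pair(candidates, data, labels))
```

In this toy example each SNP scores zero on its own, which shows why screening on single-SNP effects alone can filter out an interacting pair, the risk the abstract attributes to stepwise methods.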

Metadata
Title
EpIntMC: Detecting Epistatic Interactions Using Multiple Clusterings
Authors
Huiling Zhang
Guoxian Yu
Wei Ren
Maozu Guo
Jun Wang
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-57821-3_6