Published in: Cluster Computing 3/2017

26-05-2017

Speed and accuracy improvement of higher-order epistasis detection on CUDA-enabled GPUs

Authors: Daniel Jünger, Christian Hundt, Jorge González Domínguez, Bertil Schmidt

Abstract

The discovery of higher-order epistatic interactions is an important task in genome-wide association studies, as it allows the identification of complex interaction patterns between multiple genetic markers. Some existing brute-force approaches explore the whole space of k-interactions exhaustively, resulting in almost intractable execution times. Computational cost can be reduced drastically by restricting the search space with suitable preprocessing filters that prune unpromising candidates. Other approaches reduce execution time by employing massively parallel accelerators to benefit from the vast computational resources of these architectures. In this paper, we combine a novel preprocessing filter, namely SingleMI, with massively parallel computation on modern GPUs to further accelerate epistasis discovery. Our implementation improves both runtime and accuracy compared to a previous GPU counterpart that employs mutual information clustering for prefiltering. SingleMI is open-source software and publicly available at: https://github.com/sleeepyjack/singlemi/.
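The abstract's central argument, that pruning markers before an exhaustive k-way search sharply reduces the work, can be illustrated with a minimal sketch. The abstract does not describe how SingleMI scores markers, so the code below makes assumptions throughout: each marker is scored on its own by its mutual information with the case/control phenotype, and only the best-scoring markers are kept. The function names (`search_space_size`, `mutual_information`, `keep_top_snps`) are hypothetical and do not come from the SingleMI repository, and this is plain CPU Python rather than the authors' CUDA implementation.

```python
import math
from collections import Counter


def search_space_size(num_snps: int, order: int) -> int:
    """Number of distinct k-way marker combinations an exhaustive search visits."""
    return math.comb(num_snps, order)


def mutual_information(genotypes, phenotypes) -> float:
    """Mutual information (in bits) between one marker and the phenotype.

    Assumption: both inputs are equal-length sequences of discrete labels,
    e.g. genotypes in {0, 1, 2} and phenotypes in {0, 1}.
    """
    n = len(genotypes)
    joint = Counter(zip(genotypes, phenotypes))
    g_counts = Counter(genotypes)
    p_counts = Counter(phenotypes)
    mi = 0.0
    for (g, p), c in joint.items():
        # p(g, p) * log2( p(g, p) / (p(g) * p(p)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (g_counts[g] * p_counts[p]))
    return mi


def keep_top_snps(snp_matrix, phenotypes, keep: int):
    """Return the indices of the `keep` markers with the highest score.

    Assumption: snp_matrix holds one genotype sequence per marker.
    This is an illustrative single-marker filter, not the published method.
    """
    scores = [
        (mutual_information(column, phenotypes), index)
        for index, column in enumerate(snp_matrix)
    ]
    # Highest score first; ties are broken by the lower marker index.
    scores.sort(key=lambda item: (-item[0], item[1]))
    return sorted(index for _, index in scores[:keep])


if __name__ == "__main__":
    # Toy data: marker 1 matches the phenotype exactly, marker 0 is independent.
    phenotypes = [0, 0, 1, 1]
    snp_matrix = [
        [0, 1, 0, 1],
        [0, 0, 1, 1],
        [0, 0, 0, 1],
    ]
    print(keep_top_snps(snp_matrix, phenotypes, keep=2))
    # Third-order search space before and after keeping 100 of 1000 markers.
    print(search_space_size(1000, 3), search_space_size(100, 3))
```

Under these assumptions, keeping 100 of 1000 markers shrinks a third-order search from 166,167,000 combinations to 161,700. Note that a filter that scores each marker in isolation can discard markers whose effect only appears in combination with others, which is one reason filter design affects accuracy as well as runtime.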


Metadata
Title
Speed and accuracy improvement of higher-order epistasis detection on CUDA-enabled GPUs
Authors
Daniel Jünger
Christian Hundt
Jorge González Domínguez
Bertil Schmidt
Publication date
26-05-2017
Publisher
Springer US
Published in
Cluster Computing / Issue 3/2017
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-017-0938-9
