Skip to main content
Top
Published in: International Journal of Parallel Programming 4/2013

01-08-2013

GPU-Friendly Parallel Genome Matching with Tiled Access and Reduced State Transition Table

Authors: Yunho Oh, Doohwan Oh, Won W. Ro

Published in: International Journal of Parallel Programming | Issue 4/2013

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

In this paper, we propose a new parallel genome matching algorithm using graphics processing units (GPUs). Our proposed approach is based on the Aho–Corasick algorithm and it was developed based on a consideration of the architectural features of existing GPUs with a hundred or more cores. Thus, we provide an appropriate task partitioning method that runs on multiple threads and we fully utilize the cache memory and the shared memory structures available in GPUs. Especially, we propose a tiled access method for rapid data transfer from the global memory to the shared memory. We also provide new models for cache-friendly state transition table to improve performance of pattern matching operations on GPUs. The maximum throughput we achieved in various experiments was 15.3 Gbps. Moreover, we showed that our proposed design outperformed an earlier approach with a 15.4 % performance improvement.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
2.
3.
go back to reference NVIDIA CUDA Programming Guide 3.0. NVIDIA (2009) NVIDIA CUDA Programming Guide 3.0. NVIDIA (2009)
6.
go back to reference Baker, Z.K., Prasanna, V.K.: A Methodology for Synthesis of Efficient Intrusion Detection Systems on FPGAs. In: Proceedings of 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’04), pp. 135–144. IEEE Computer Society (2004) Baker, Z.K., Prasanna, V.K.: A Methodology for Synthesis of Efficient Intrusion Detection Systems on FPGAs. In: Proceedings of 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’04), pp. 135–144. IEEE Computer Society (2004)
7.
go back to reference Bakhoda, A., Yuan, G.L., Fung, W.W.L., Wong, H., Aamodt, T.M.: Analyzing CUDA Workloads Using a Detailed GPU Simulator. In: Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2009), pp. 163–174. IEEE Computer Society (2009) Bakhoda, A., Yuan, G.L., Fung, W.W.L., Wong, H., Aamodt, T.M.: Analyzing CUDA Workloads Using a Detailed GPU Simulator. In: Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2009), pp. 163–174. IEEE Computer Society (2009)
8.
go back to reference Boeva, V., Clement, J., Regnier, M., Roytberg, M., Makeev, V.: Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules. Algorithms Mol Biol 2(1), 13 (2007)CrossRef Boeva, V., Clement, J., Regnier, M., Roytberg, M., Makeev, V.: Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules. Algorithms Mol Biol 2(1), 13 (2007)CrossRef
9.
go back to reference Bos, H., Huang, K.: A Network Intrusion Detection System on IXP1200 Network Processors with Support for Large Rule Sets. Leiden Universiteit, Technical report (2004) Bos, H., Huang, K.: A Network Intrusion Detection System on IXP1200 Network Processors with Support for Large Rule Sets. Leiden Universiteit, Technical report (2004)
10.
go back to reference Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Commun. ACM 20(10), 762–772 (1977)MATHCrossRef Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Commun. ACM 20(10), 762–772 (1977)MATHCrossRef
11.
go back to reference Brudno, M., Morgenstern, B.: Fast and sensitive alignment of large genomic sequences. In: Proceedings of 2002 Bioinformatics Conference, pp. 138–147. IEEE Computer Society (2002) Brudno, M., Morgenstern, B.: Fast and sensitive alignment of large genomic sequences. In: Proceedings of 2002 Bioinformatics Conference, pp. 138–147. IEEE Computer Society (2002)
12.
go back to reference Buhler, J., Keich, U., Sun, Y.: Designing seeds for similarity search in genomic DNA. In: Proceedings of 7th Annual International Conference on Research in Computational Molecular Biology, RECOMB ’03, pp. 67–75. ACM (2003) Buhler, J., Keich, U., Sun, Y.: Designing seeds for similarity search in genomic DNA. In: Proceedings of 7th Annual International Conference on Research in Computational Molecular Biology, RECOMB ’03, pp. 67–75. ACM (2003)
13.
go back to reference Castelo, A.T., Martins, W., Gao, G.R.: TROLL-tandem repeat occurrence locator. Bioinformatics 18(4), 634–636 (2002)CrossRef Castelo, A.T., Martins, W., Gao, G.R.: TROLL-tandem repeat occurrence locator. Bioinformatics 18(4), 634–636 (2002)CrossRef
14.
go back to reference Dandass, Y., Burgess, S., Lawrence, M., Bridges, S.: Accelerating string set matching in FPGA hardware for bioinformatics research. BMC Bioinformatics 9(1), 197–207 (2008)CrossRef Dandass, Y., Burgess, S., Lawrence, M., Bridges, S.: Accelerating string set matching in FPGA hardware for bioinformatics research. BMC Bioinformatics 9(1), 197–207 (2008)CrossRef
15.
go back to reference Hong, S., Kim, H.: An analytical model for a gpu architecture with memory-level and thread-level parallelism awareness. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, ISCA ’09, pp. 152–163. ACM, New York (2009) Hong, S., Kim, H.: An analytical model for a gpu architecture with memory-level and thread-level parallelism awareness. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, ISCA ’09, pp. 152–163. ACM, New York (2009)
16.
17.
go back to reference Majumder, A., Rastogi, R., Vanama, S.: Scalable regular expression matching on data streams. In: Proceedings of 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD ’08), pp. 161–172. ACM (2008) Majumder, A., Rastogi, R., Vanama, S.: Scalable regular expression matching on data streams. In: Proceedings of 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD ’08), pp. 161–172. ACM (2008)
18.
go back to reference Michael, M., Dieterich, C., Vingron, M.: Siteblast-rapid and sensitive local alignment of genomic sequences employing motif anchors. Bioinformatics 21(9), 2093–2094 (2005)CrossRef Michael, M., Dieterich, C., Vingron, M.: Siteblast-rapid and sensitive local alignment of genomic sequences employing motif anchors. Bioinformatics 21(9), 2093–2094 (2005)CrossRef
20.
go back to reference NVIDIA: Tuning CUDA Applications for Fermi 1.3 (2010) NVIDIA: Tuning CUDA Applications for Fermi 1.3 (2010)
21.
go back to reference Scarpazza, D.: Top-performance tokenization and small-ruleset regular expression matching. Int. J. Parallel Program. 39(1), 3–32 (2011)CrossRef Scarpazza, D.: Top-performance tokenization and small-ruleset regular expression matching. Int. J. Parallel Program. 39(1), 3–32 (2011)CrossRef
22.
go back to reference Scarpazza, D., Villa, O., Petrini, F.: High-speed string searching against large dictionaries on the cell/B.E. Processor. In: Proceedings of IEEE International Symposium on Parallel and Distributed Processing, 2008 (IPDPS 2008), pp. 1–12. IEEE Computer Society (2008) Scarpazza, D., Villa, O., Petrini, F.: High-speed string searching against large dictionaries on the cell/B.E. Processor. In: Proceedings of IEEE International Symposium on Parallel and Distributed Processing, 2008 (IPDPS 2008), pp. 1–12. IEEE Computer Society (2008)
23.
go back to reference Tan, L., Sherwood, T.: A high throughput string matching architecture for intrusion detection and prevention. In: Proceedings of 32nd Annual International Symposium on Computer Architecture (ISCA ’05), pp. 112–122. IEEE Computer Society (2005) Tan, L., Sherwood, T.: A high throughput string matching architecture for intrusion detection and prevention. In: Proceedings of 32nd Annual International Symposium on Computer Architecture (ISCA ’05), pp. 112–122. IEEE Computer Society (2005)
24.
go back to reference Trapnell, C., Schatz, M.C.: Optimizing data intensive GPGPU computations for DNA sequence alignment. Parallel Comput. 35(8–9), 429–440 (2009)CrossRef Trapnell, C., Schatz, M.C.: Optimizing data intensive GPGPU computations for DNA sequence alignment. Parallel Comput. 35(8–9), 429–440 (2009)CrossRef
25.
go back to reference Tumeo, A., Villa, O.: Accelerating DNA analysis applications on GPU clusters. In: Proceedings of 2010 IEEE 8th Symposium on Application Specific Processors (SASP), pp. 71–76. IEEE Computer Society (2010) Tumeo, A., Villa, O.: Accelerating DNA analysis applications on GPU clusters. In: Proceedings of 2010 IEEE 8th Symposium on Application Specific Processors (SASP), pp. 71–76. IEEE Computer Society (2010)
26.
go back to reference Tumeo, A., Villa, O., Sciuto, D.: Efficient pattern matching on GPUs for intrusion detection systems. In: Proceedings of 7th ACM International Conference on Computing Frontiers, CF ’10, pp. 87–88. ACM (2010) Tumeo, A., Villa, O., Sciuto, D.: Efficient pattern matching on GPUs for intrusion detection systems. In: Proceedings of 7th ACM International Conference on Computing Frontiers, CF ’10, pp. 87–88. ACM (2010)
28.
go back to reference Vasiliadis, G., Antonatos, S., Polychronakis, M., Markatos, E.P., Ioannidis, S.: Gnort: high performance network intrusion detection using graphics processors. In: Proceedings of 11th International Symposium on Recent Advances in Intrusion Detection (RAID ’08), pp. 116–134. Springer-Verlag (2008) Vasiliadis, G., Antonatos, S., Polychronakis, M., Markatos, E.P., Ioannidis, S.: Gnort: high performance network intrusion detection using graphics processors. In: Proceedings of 11th International Symposium on Recent Advances in Intrusion Detection (RAID ’08), pp. 116–134. Springer-Verlag (2008)
29.
go back to reference Vespa, L., Weng, N., Ramaswamy, R.: MS-DFA: multiple-stride pattern matching for scalable deep packet inspection. Comput J 54(2), 285–303 (2011)CrossRef Vespa, L., Weng, N., Ramaswamy, R.: MS-DFA: multiple-stride pattern matching for scalable deep packet inspection. Comput J 54(2), 285–303 (2011)CrossRef
30.
go back to reference Villa, O., Chavarria-Miranda, D., Maschhoff, K.: Input-independent, scalable and fast string matching on the cray XMT. In: Proceedings of IEEE International Symposium on Parallel Distributed Processing, 2009. IPDPS 2009, pp. 1–12 (2009) Villa, O., Chavarria-Miranda, D., Maschhoff, K.: Input-independent, scalable and fast string matching on the cray XMT. In: Proceedings of IEEE International Symposium on Parallel Distributed Processing, 2009. IPDPS 2009, pp. 1–12 (2009)
31.
go back to reference Wu, S., Manber, U.: Agrep - A fast approximate pattern-matching tool. In: Proceedings of USENIX Technical Conference, pp. 153–162. USENIX (1992) Wu, S., Manber, U.: Agrep - A fast approximate pattern-matching tool. In: Proceedings of USENIX Technical Conference, pp. 153–162. USENIX (1992)
32.
go back to reference Xu, C., Kirk, S., Jenkins, S.: Tiling for performance tuning on different models of GPUs. In: Proceedings of 2009 Second International Symposium on Information Science and Engineering (ISISE), pp. 500–504 (2009) Xu, C., Kirk, S., Jenkins, S.: Tiling for performance tuning on different models of GPUs. In: Proceedings of 2009 Second International Symposium on Information Science and Engineering (ISISE), pp. 500–504 (2009)
33.
go back to reference Yang, Y.H.E., Jiang, W., Prasanna, V.K.: Compact architecture for high-throughput regular expression matching on FPGA. In: Proceedings of 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS ’08), pp. 30–39. ACM (2008) Yang, Y.H.E., Jiang, W., Prasanna, V.K.: Compact architecture for high-throughput regular expression matching on FPGA. In: Proceedings of 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS ’08), pp. 30–39. ACM (2008)
34.
go back to reference Zhongqiang, C., Yuan, Z., Zhongrong, C., Alex, D.: A digest and pattern matching-based intrusion detection engine. Comput. J. 52(6), 699–723 (2009)CrossRef Zhongqiang, C., Yuan, Z., Zhongrong, C., Alex, D.: A digest and pattern matching-based intrusion detection engine. Comput. J. 52(6), 699–723 (2009)CrossRef
Metadata
Title
GPU-Friendly Parallel Genome Matching with Tiled Access and Reduced State Transition Table
Authors
Yunho Oh
Doohwan Oh
Won W. Ro
Publication date
01-08-2013
Publisher
Springer US
Published in
International Journal of Parallel Programming / Issue 4/2013
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-012-0234-5

Premium Partner