Skip to main content
Erschienen in: International Journal of Parallel Programming 4/2013

01.08.2013

GPU-Friendly Parallel Genome Matching with Tiled Access and Reduced State Transition Table

verfasst von: Yunho Oh, Doohwan Oh, Won W. Ro

Erschienen in: International Journal of Parallel Programming | Ausgabe 4/2013

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

In this paper, we propose a new parallel genome matching algorithm using graphics processing units (GPUs). Our proposed approach is based on the Aho–Corasick algorithm and it was developed based on a consideration of the architectural features of existing GPUs with a hundred or more cores. Thus, we provide an appropriate task partitioning method that runs on multiple threads and we fully utilize the cache memory and the shared memory structures available in GPUs. Especially, we propose a tiled access method for rapid data transfer from the global memory to the shared memory. We also provide new models for cache-friendly state transition table to improve performance of pattern matching operations on GPUs. The maximum throughput we achieved in various experiments was 15.3 Gbps. Moreover, we showed that our proposed design outperformed an earlier approach with a 15.4 % performance improvement.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
2.
Zurück zum Zitat Clam AntiVirus 0.96 User Manual (2007) Clam AntiVirus 0.96 User Manual (2007)
3.
Zurück zum Zitat NVIDIA CUDA Programming Guide 3.0. NVIDIA (2009) NVIDIA CUDA Programming Guide 3.0. NVIDIA (2009)
4.
6.
Zurück zum Zitat Baker, Z.K., Prasanna, V.K.: A Methodology for Synthesis of Efficient Intrusion Detection Systems on FPGAs. In: Proceedings of 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’04), pp. 135–144. IEEE Computer Society (2004) Baker, Z.K., Prasanna, V.K.: A Methodology for Synthesis of Efficient Intrusion Detection Systems on FPGAs. In: Proceedings of 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’04), pp. 135–144. IEEE Computer Society (2004)
7.
Zurück zum Zitat Bakhoda, A., Yuan, G.L., Fung, W.W.L., Wong, H., Aamodt, T.M.: Analyzing CUDA Workloads Using a Detailed GPU Simulator. In: Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2009), pp. 163–174. IEEE Computer Society (2009) Bakhoda, A., Yuan, G.L., Fung, W.W.L., Wong, H., Aamodt, T.M.: Analyzing CUDA Workloads Using a Detailed GPU Simulator. In: Proceedings of IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2009), pp. 163–174. IEEE Computer Society (2009)
8.
Zurück zum Zitat Boeva, V., Clement, J., Regnier, M., Roytberg, M., Makeev, V.: Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules. Algorithms Mol Biol 2(1), 13 (2007)CrossRef Boeva, V., Clement, J., Regnier, M., Roytberg, M., Makeev, V.: Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules. Algorithms Mol Biol 2(1), 13 (2007)CrossRef
9.
Zurück zum Zitat Bos, H., Huang, K.: A Network Intrusion Detection System on IXP1200 Network Processors with Support for Large Rule Sets. Leiden Universiteit, Technical report (2004) Bos, H., Huang, K.: A Network Intrusion Detection System on IXP1200 Network Processors with Support for Large Rule Sets. Leiden Universiteit, Technical report (2004)
10.
Zurück zum Zitat Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Commun. ACM 20(10), 762–772 (1977)MATHCrossRef Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Commun. ACM 20(10), 762–772 (1977)MATHCrossRef
11.
Zurück zum Zitat Brudno, M., Morgenstern, B.: Fast and sensitive alignment of large genomic sequences. In: Proceedings of 2002 Bioinformatics Conference, pp. 138–147. IEEE Computer Society (2002) Brudno, M., Morgenstern, B.: Fast and sensitive alignment of large genomic sequences. In: Proceedings of 2002 Bioinformatics Conference, pp. 138–147. IEEE Computer Society (2002)
12.
Zurück zum Zitat Buhler, J., Keich, U., Sun, Y.: Designing seeds for similarity search in genomic DNA. In: Proceedings of 7th Annual International Conference on Research in Computational Molecular Biology, RECOMB ’03, pp. 67–75. ACM (2003) Buhler, J., Keich, U., Sun, Y.: Designing seeds for similarity search in genomic DNA. In: Proceedings of 7th Annual International Conference on Research in Computational Molecular Biology, RECOMB ’03, pp. 67–75. ACM (2003)
13.
Zurück zum Zitat Castelo, A.T., Martins, W., Gao, G.R.: TROLL-tandem repeat occurrence locator. Bioinformatics 18(4), 634–636 (2002)CrossRef Castelo, A.T., Martins, W., Gao, G.R.: TROLL-tandem repeat occurrence locator. Bioinformatics 18(4), 634–636 (2002)CrossRef
14.
Zurück zum Zitat Dandass, Y., Burgess, S., Lawrence, M., Bridges, S.: Accelerating string set matching in FPGA hardware for bioinformatics research. BMC Bioinformatics 9(1), 197–207 (2008)CrossRef Dandass, Y., Burgess, S., Lawrence, M., Bridges, S.: Accelerating string set matching in FPGA hardware for bioinformatics research. BMC Bioinformatics 9(1), 197–207 (2008)CrossRef
15.
Zurück zum Zitat Hong, S., Kim, H.: An analytical model for a gpu architecture with memory-level and thread-level parallelism awareness. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, ISCA ’09, pp. 152–163. ACM, New York (2009) Hong, S., Kim, H.: An analytical model for a gpu architecture with memory-level and thread-level parallelism awareness. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, ISCA ’09, pp. 152–163. ACM, New York (2009)
16.
17.
Zurück zum Zitat Majumder, A., Rastogi, R., Vanama, S.: Scalable regular expression matching on data streams. In: Proceedings of 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD ’08), pp. 161–172. ACM (2008) Majumder, A., Rastogi, R., Vanama, S.: Scalable regular expression matching on data streams. In: Proceedings of 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD ’08), pp. 161–172. ACM (2008)
18.
Zurück zum Zitat Michael, M., Dieterich, C., Vingron, M.: Siteblast-rapid and sensitive local alignment of genomic sequences employing motif anchors. Bioinformatics 21(9), 2093–2094 (2005)CrossRef Michael, M., Dieterich, C., Vingron, M.: Siteblast-rapid and sensitive local alignment of genomic sequences employing motif anchors. Bioinformatics 21(9), 2093–2094 (2005)CrossRef
20.
Zurück zum Zitat NVIDIA: Tuning CUDA Applications for Fermi 1.3 (2010) NVIDIA: Tuning CUDA Applications for Fermi 1.3 (2010)
21.
Zurück zum Zitat Scarpazza, D.: Top-performance tokenization and small-ruleset regular expression matching. Int. J. Parallel Program. 39(1), 3–32 (2011)CrossRef Scarpazza, D.: Top-performance tokenization and small-ruleset regular expression matching. Int. J. Parallel Program. 39(1), 3–32 (2011)CrossRef
22.
Zurück zum Zitat Scarpazza, D., Villa, O., Petrini, F.: High-speed string searching against large dictionaries on the cell/B.E. Processor. In: Proceedings of IEEE International Symposium on Parallel and Distributed Processing, 2008 (IPDPS 2008), pp. 1–12. IEEE Computer Society (2008) Scarpazza, D., Villa, O., Petrini, F.: High-speed string searching against large dictionaries on the cell/B.E. Processor. In: Proceedings of IEEE International Symposium on Parallel and Distributed Processing, 2008 (IPDPS 2008), pp. 1–12. IEEE Computer Society (2008)
23.
Zurück zum Zitat Tan, L., Sherwood, T.: A high throughput string matching architecture for intrusion detection and prevention. In: Proceedings of 32nd Annual International Symposium on Computer Architecture (ISCA ’05), pp. 112–122. IEEE Computer Society (2005) Tan, L., Sherwood, T.: A high throughput string matching architecture for intrusion detection and prevention. In: Proceedings of 32nd Annual International Symposium on Computer Architecture (ISCA ’05), pp. 112–122. IEEE Computer Society (2005)
24.
Zurück zum Zitat Trapnell, C., Schatz, M.C.: Optimizing data intensive GPGPU computations for DNA sequence alignment. Parallel Comput. 35(8–9), 429–440 (2009)CrossRef Trapnell, C., Schatz, M.C.: Optimizing data intensive GPGPU computations for DNA sequence alignment. Parallel Comput. 35(8–9), 429–440 (2009)CrossRef
25.
Zurück zum Zitat Tumeo, A., Villa, O.: Accelerating DNA analysis applications on GPU clusters. In: Proceedings of 2010 IEEE 8th Symposium on Application Specific Processors (SASP), pp. 71–76. IEEE Computer Society (2010) Tumeo, A., Villa, O.: Accelerating DNA analysis applications on GPU clusters. In: Proceedings of 2010 IEEE 8th Symposium on Application Specific Processors (SASP), pp. 71–76. IEEE Computer Society (2010)
26.
Zurück zum Zitat Tumeo, A., Villa, O., Sciuto, D.: Efficient pattern matching on GPUs for intrusion detection systems. In: Proceedings of 7th ACM International Conference on Computing Frontiers, CF ’10, pp. 87–88. ACM (2010) Tumeo, A., Villa, O., Sciuto, D.: Efficient pattern matching on GPUs for intrusion detection systems. In: Proceedings of 7th ACM International Conference on Computing Frontiers, CF ’10, pp. 87–88. ACM (2010)
28.
Zurück zum Zitat Vasiliadis, G., Antonatos, S., Polychronakis, M., Markatos, E.P., Ioannidis, S.: Gnort: high performance network intrusion detection using graphics processors. In: Proceedings of 11th International Symposium on Recent Advances in Intrusion Detection (RAID ’08), pp. 116–134. Springer-Verlag (2008) Vasiliadis, G., Antonatos, S., Polychronakis, M., Markatos, E.P., Ioannidis, S.: Gnort: high performance network intrusion detection using graphics processors. In: Proceedings of 11th International Symposium on Recent Advances in Intrusion Detection (RAID ’08), pp. 116–134. Springer-Verlag (2008)
29.
Zurück zum Zitat Vespa, L., Weng, N., Ramaswamy, R.: MS-DFA: multiple-stride pattern matching for scalable deep packet inspection. Comput J 54(2), 285–303 (2011)CrossRef Vespa, L., Weng, N., Ramaswamy, R.: MS-DFA: multiple-stride pattern matching for scalable deep packet inspection. Comput J 54(2), 285–303 (2011)CrossRef
30.
Zurück zum Zitat Villa, O., Chavarria-Miranda, D., Maschhoff, K.: Input-independent, scalable and fast string matching on the cray XMT. In: Proceedings of IEEE International Symposium on Parallel Distributed Processing, 2009. IPDPS 2009, pp. 1–12 (2009) Villa, O., Chavarria-Miranda, D., Maschhoff, K.: Input-independent, scalable and fast string matching on the cray XMT. In: Proceedings of IEEE International Symposium on Parallel Distributed Processing, 2009. IPDPS 2009, pp. 1–12 (2009)
31.
Zurück zum Zitat Wu, S., Manber, U.: Agrep - A fast approximate pattern-matching tool. In: Proceedings of USENIX Technical Conference, pp. 153–162. USENIX (1992) Wu, S., Manber, U.: Agrep - A fast approximate pattern-matching tool. In: Proceedings of USENIX Technical Conference, pp. 153–162. USENIX (1992)
32.
Zurück zum Zitat Xu, C., Kirk, S., Jenkins, S.: Tiling for performance tuning on different models of GPUs. In: Proceedings of 2009 Second International Symposium on Information Science and Engineering (ISISE), pp. 500–504 (2009) Xu, C., Kirk, S., Jenkins, S.: Tiling for performance tuning on different models of GPUs. In: Proceedings of 2009 Second International Symposium on Information Science and Engineering (ISISE), pp. 500–504 (2009)
33.
Zurück zum Zitat Yang, Y.H.E., Jiang, W., Prasanna, V.K.: Compact architecture for high-throughput regular expression matching on FPGA. In: Proceedings of 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS ’08), pp. 30–39. ACM (2008) Yang, Y.H.E., Jiang, W., Prasanna, V.K.: Compact architecture for high-throughput regular expression matching on FPGA. In: Proceedings of 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS ’08), pp. 30–39. ACM (2008)
34.
Zurück zum Zitat Zhongqiang, C., Yuan, Z., Zhongrong, C., Alex, D.: A digest and pattern matching-based intrusion detection engine. Comput. J. 52(6), 699–723 (2009)CrossRef Zhongqiang, C., Yuan, Z., Zhongrong, C., Alex, D.: A digest and pattern matching-based intrusion detection engine. Comput. J. 52(6), 699–723 (2009)CrossRef
Metadaten
Titel
GPU-Friendly Parallel Genome Matching with Tiled Access and Reduced State Transition Table
verfasst von
Yunho Oh
Doohwan Oh
Won W. Ro
Publikationsdatum
01.08.2013
Verlag
Springer US
Erschienen in
International Journal of Parallel Programming / Ausgabe 4/2013
Print ISSN: 0885-7458
Elektronische ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-012-0234-5