Skip to main content
Top
Published in: Annals of Data Science 4/2016

30-08-2016

N-Folded Parallel String Matching Mechanism

Authors: Butchi Raju Katari, S. Viswanadha Raju

Published in: Annals of Data Science | Issue 4/2016

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

A massive requirement of information vitalized the importance of managing enormous amount of data. It becomes a herculean task to fetch the anticipated data from large data storage as it includes text processing, text mining, pattern recognition, data cleaning etc., The need for concurrent events and coming up with high performance processing models to extract data is a challenge to the researchers. One of the solutions to this challenge is concurrent process to match string on processing models. While, some of the mechanisms do perform very well in practice. Frequent works have been published on this subject and research is still active in this area as the scope and opportunities to develop the new techniques is perennial. This paper proposes N-folded parallel string matching mechanism. This mechanism would be able to divide the input sequence files into various parts and the same would be distributed to the processors. Considering this mechanism as a model, experiments have been conducted considering chloroplast, mitochondria and different categories of plants genome sequence file as input for different sizes with seven possible patterns. The results of the experiment made evident that N-folded parallel string matching mechanism can reduce the processing time on a multi processor system.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Crochemore M, Rytter W, Crochemore M (1994) Text algorithms, vol 698. Oxford University Press, New York Crochemore M, Rytter W, Crochemore M (1994) Text algorithms, vol 698. Oxford University Press, New York
2.
go back to reference Simon Y, Inayatullah M (2004) Improving approximate matching capabilities for meta map transfer applications. In: Proceedings of symposium on principles and practice of programming in java, pp 143–147 Simon Y, Inayatullah M (2004) Improving approximate matching capabilities for meta map transfer applications. In: Proceedings of symposium on principles and practice of programming in java, pp 143–147
3.
go back to reference Fredriksson K, Grabowski S (2009) Average-optimal string matching. J Discret Algorithms 7(4):579–594CrossRef Fredriksson K, Grabowski S (2009) Average-optimal string matching. J Discret Algorithms 7(4):579–594CrossRef
4.
go back to reference Luis Russo L, Navarro G, Oliveira A, Morales P (2009) Approximate string matching with compressed indexes algorithm. Algorithms 2(3):1105–1136CrossRef Luis Russo L, Navarro G, Oliveira A, Morales P (2009) Approximate string matching with compressed indexes algorithm. Algorithms 2(3):1105–1136CrossRef
5.
go back to reference Ilie L, Navarro G, Tinta L (2010) The longest common extension problem, revisited and applications to approximate string searching. J Discret Algorithms 8(4):418–428CrossRef Ilie L, Navarro G, Tinta L (2010) The longest common extension problem, revisited and applications to approximate string searching. J Discret Algorithms 8(4):418–428CrossRef
6.
go back to reference Viswanadha Raju S et al. (2011) Recent advancement is parallel algorithms for string matching on computing models—a survey and experimental results. LNCS, Springer, pp 270–278, ISBN: 978-3-642-29279-8 Viswanadha Raju S et al. (2011) Recent advancement is parallel algorithms for string matching on computing models—a survey and experimental results. LNCS, Springer, pp 270–278, ISBN: 978-3-642-29279-8
7.
go back to reference Viswanadha Raju S et al. (2011) PDM data classification from STEP–an object oriented String matching approach. In: IEEE conference on application of information and communication technologies, pp 1–9. ISBN: 978-1-61284-831-0 Viswanadha Raju S et al. (2011) PDM data classification from STEP–an object oriented String matching approach. In: IEEE conference on application of information and communication technologies, pp 1–9. ISBN: 978-1-61284-831-0
8.
go back to reference Boyer RS, Moore JS (1977) A fast string searching algorithm, Carom. Assoc Comput Mach 20(10):262–277 Boyer RS, Moore JS (1977) A fast string searching algorithm, Carom. Assoc Comput Mach 20(10):262–277
9.
go back to reference Knuth D, Morris JH Jr, Pratt V (1977) Fast pattern matching in strings. SIAM J Comput 6(2):323–350CrossRef Knuth D, Morris JH Jr, Pratt V (1977) Fast pattern matching in strings. SIAM J Comput 6(2):323–350CrossRef
14.
go back to reference Sunday DM (1990) A very fast substring search algorithm. Commun ACM 33(8):132–142CrossRef Sunday DM (1990) A very fast substring search algorithm. Commun ACM 33(8):132–142CrossRef
15.
go back to reference Charras C, Lecrog T, Pehoushek JD (1998) A very fast string matching algorithm for small alphabets and long patterns. In: Combinatorial pattern matching. Springer, Berlin, pp 55–64 Charras C, Lecrog T, Pehoushek JD (1998) A very fast string matching algorithm for small alphabets and long patterns. In: Combinatorial pattern matching. Springer, Berlin, pp 55–64
16.
go back to reference Rasool A, Khare N (2013) Performance improvement of BMH and BMHS using PDJ (possible double jump) and MValue (match value). Int J Comput Appl 72(1):1–6 Rasool A, Khare N (2013) Performance improvement of BMH and BMHS using PDJ (possible double jump) and MValue (match value). Int J Comput Appl 72(1):1–6
17.
go back to reference Kim HJ, Lee S-W (2013) A hardware-based string matching using state transition compression for deep packet inspection. ETRI J 35(1):154–157CrossRef Kim HJ, Lee S-W (2013) A hardware-based string matching using state transition compression for deep packet inspection. ETRI J 35(1):154–157CrossRef
18.
go back to reference Dharmapurikar S, Krishnamurthy P, Sproull TS, Lockwood JW (2004) Deep packet inspection using parallel bloom filters. IEEE Micro 24(1):52–61CrossRef Dharmapurikar S, Krishnamurthy P, Sproull TS, Lockwood JW (2004) Deep packet inspection using parallel bloom filters. IEEE Micro 24(1):52–61CrossRef
19.
go back to reference Al-Mamory SO, Hamid A, Abdul-Razak A, Falah Z (2010) String matching enhancement for snort IDS. In: 2010 5th international conference on computer sciences and convergence information technology (ICCIT), pp 1020–1023 Al-Mamory SO, Hamid A, Abdul-Razak A, Falah Z (2010) String matching enhancement for snort IDS. In: 2010 5th international conference on computer sciences and convergence information technology (ICCIT), pp 1020–1023
20.
go back to reference Kun B, Nai-jie G, Kun T, Xiao-hu L, Gang L (2005) A practical distributed string matching algorithm architecture and implementation. World Acad Sci Eng Technol 10:1307–6884 Kun B, Nai-jie G, Kun T, Xiao-hu L, Gang L (2005) A practical distributed string matching algorithm architecture and implementation. World Acad Sci Eng Technol 10:1307–6884
21.
go back to reference Park JH, George KM (1999) Parallel string matching algorithms based on dataflow. In: IEEE Hawaii international conference on system sciences Park JH, George KM (1999) Parallel string matching algorithms based on dataflow. In: IEEE Hawaii international conference on system sciences
22.
go back to reference Kawulok J (2013) Approximate string matching for searching DNA sequences. Int J Biosci Biochem Bioinform 3(2):145–148 Kawulok J (2013) Approximate string matching for searching DNA sequences. Int J Biosci Biochem Bioinform 3(2):145–148
23.
go back to reference dos Reis CCT, Cruz O (2005) “Approximate String Matching Algorithm Using Parallel Methods for Molecular Sequence Camparisons. In IEEE data compression conference, pp 140–143 dos Reis CCT, Cruz O (2005) “Approximate String Matching Algorithm Using Parallel Methods for Molecular Sequence Camparisons. In IEEE data compression conference, pp 140–143
24.
go back to reference Zubair M et al. (2010) Text scanning approach for exact string matching. In: International conference on networking and information technology, pp 118–121 Zubair M et al. (2010) Text scanning approach for exact string matching. In: International conference on networking and information technology, pp 118–121
25.
go back to reference Tomohiro I, Inenaga S, Takeda M (2013) Palindrome pattern matching. Theor Comput Sci 483:162–170CrossRef Tomohiro I, Inenaga S, Takeda M (2013) Palindrome pattern matching. Theor Comput Sci 483:162–170CrossRef
26.
go back to reference Oh Y, Oh D, Ro WW (2013) GPU-friendly parallel genome matching with tiled access and reduced state transition table. Int J Parallel Prog 41(4):526–551CrossRef Oh Y, Oh D, Ro WW (2013) GPU-friendly parallel genome matching with tiled access and reduced state transition table. Int J Parallel Prog 41(4):526–551CrossRef
27.
go back to reference Nolte J, Horton P (2001) Parallel sequence matching with TACO’s distributed object groups—a case study from molecular biology. Clust Comput 4(1):71–77CrossRef Nolte J, Horton P (2001) Parallel sequence matching with TACO’s distributed object groups—a case study from molecular biology. Clust Comput 4(1):71–77CrossRef
28.
go back to reference Tseng K-K, Lin Y-D, Lee T-H, Lai Y-C (2005) A parallel automaton string matching with pre-hashing and root-indexing techniques for content filtering coprocessor. In: IEEE international conference on application-specific systems, pp 113–118 Tseng K-K, Lin Y-D, Lee T-H, Lai Y-C (2005) A parallel automaton string matching with pre-hashing and root-indexing techniques for content filtering coprocessor. In: IEEE international conference on application-specific systems, pp 113–118
29.
go back to reference Dandass YS, Burgess SC, Lawrence M, Bridges SM (2008) Accelerating string set matching in FPGA hardware for bioinformatics research. BMC Bioinform 9(1):1–11CrossRef Dandass YS, Burgess SC, Lawrence M, Bridges SM (2008) Accelerating string set matching in FPGA hardware for bioinformatics research. BMC Bioinform 9(1):1–11CrossRef
30.
go back to reference Hart SN, Sarangi V, Moore R, Baheti S, Bhavsar JD, Couch FJ, Kocher JP (2013) SoftSearch: integration of multiple sequence features to identify breakpoints of structural variations. PLoS One 8(12):e83356CrossRef Hart SN, Sarangi V, Moore R, Baheti S, Bhavsar JD, Couch FJ, Kocher JP (2013) SoftSearch: integration of multiple sequence features to identify breakpoints of structural variations. PLoS One 8(12):e83356CrossRef
31.
go back to reference Ounit R, Wanamaker S, Close TJ, Lonardi S (2015) CLARK: fast and accurate classification of metagenomic and genomic sequences using discriminative k-mers. BMC Genom 16(1):1CrossRef Ounit R, Wanamaker S, Close TJ, Lonardi S (2015) CLARK: fast and accurate classification of metagenomic and genomic sequences using discriminative k-mers. BMC Genom 16(1):1CrossRef
32.
go back to reference Alam KK, Chang JL, Burke DH (2015) FASTaptamer: a bioinformatic toolkit for high-throughput sequence analysis of combinatorial selections. Mol Therapy 4(3):e230 Alam KK, Chang JL, Burke DH (2015) FASTaptamer: a bioinformatic toolkit for high-throughput sequence analysis of combinatorial selections. Mol Therapy 4(3):e230
33.
go back to reference Yang X, Liu D, Lv N, Zhao F, Liu F, Zou J, Chen Y, Xiao X, Wu J, Liu P, Gao J (2015) TCRklass: a new k-string–based algorithm for human and mouse TCR repertoire characterization. J Immunol 194(1):446–454CrossRef Yang X, Liu D, Lv N, Zhao F, Liu F, Zou J, Chen Y, Xiao X, Wu J, Liu P, Gao J (2015) TCRklass: a new k-string–based algorithm for human and mouse TCR repertoire characterization. J Immunol 194(1):446–454CrossRef
34.
go back to reference Rao CS, Raju SV (2016) Next generation sequencing (NGS) database for tandem repeats with multiple pattern \(2^{\circ }\)-shaft multicore string matching. Genom Data 7:307–317CrossRef Rao CS, Raju SV (2016) Next generation sequencing (NGS) database for tandem repeats with multiple pattern \(2^{\circ }\)-shaft multicore string matching. Genom Data 7:307–317CrossRef
35.
go back to reference Sun D, Wang X (2016) A Q-gram filter for local alignment in large genomic database. Int J Hybrid Inf Technol 9(1):221–232CrossRef Sun D, Wang X (2016) A Q-gram filter for local alignment in large genomic database. Int J Hybrid Inf Technol 9(1):221–232CrossRef
Metadata
Title
N-Folded Parallel String Matching Mechanism
Authors
Butchi Raju Katari
S. Viswanadha Raju
Publication date
30-08-2016
Publisher
Springer Berlin Heidelberg
Published in
Annals of Data Science / Issue 4/2016
Print ISSN: 2198-5804
Electronic ISSN: 2198-5812
DOI
https://doi.org/10.1007/s40745-016-0086-8

Other articles of this Issue 4/2016

Annals of Data Science 4/2016 Go to the issue