Skip to main content
Erschienen in: Discover Computing 4/2008

01.08.2008

Efficient algorithms for pattern matching with general gaps, character classes, and transposition invariance

verfasst von: Kimmo Fredriksson, Szymon Grabowski

Erschienen in: Discover Computing | Ausgabe 4/2008

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

We develop efficient dynamic programming algorithms for pattern matching with general gaps and character classes. We consider patterns of the form p 0 g(a 0,b 0)p 1 g(a 1,b 1)…p m−1, where p i ⊂ Σ, Σ is some finite alphabet, and g(a i ,b i ) denotes a gap of length a i b i between symbols p i and p i+1. The text symbol t j matches p i iff t j  ∈ p i . Moreover, we require that if p i matches t j , then p i+1 should match one of the text symbols \( t_{j+a_i+1} \ldots t_{j+b_i+1}.\) Either or both of a i and b i can be negative. We also consider transposition invariant matching, i.e., the matching condition becomes t j  ∈ p i τ, for some constant τ determined by the algorithms. We give algorithms that have efficient average and worst case running times. The algorithms have important applications in music information retrieval and computational biology. We give experimental results showing that the algorithms work well in practice.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Baeza-Yates, R. A., & Gonnet, G. H. (1992). A new approach to text searching. Communications of the ACM, 35(10), 74–82.CrossRef Baeza-Yates, R. A., & Gonnet, G. H. (1992). A new approach to text searching. Communications of the ACM, 35(10), 74–82.CrossRef
Zurück zum Zitat Cantone, D., Cristofaro, S., & Faro, S. (2005a). An efficient algorithm for δ-approximate matching with α-bounded gaps in musical sequences. In Proceesings of WEA’05. LNCS (Vol. 3503, pp. 428–439). Springer. Cantone, D., Cristofaro, S., & Faro, S. (2005a). An efficient algorithm for δ-approximate matching with α-bounded gaps in musical sequences. In Proceesings of WEA’05. LNCS (Vol. 3503, pp. 428–439). Springer.
Zurück zum Zitat Cantone, D., Cristofaro, S., & Faro, S. (2005b). On tuning the (δ,α)-sequential-sampling algorithm for δ-approximate matching with α-bounded gaps in musical sequences. In Proceedings of ISMIR’05. Cantone, D., Cristofaro, S., & Faro, S. (2005b). On tuning the (δ,α)-sequential-sampling algorithm for δ-approximate matching with α-bounded gaps in musical sequences. In Proceedings of ISMIR’05.
Zurück zum Zitat Crochemore, M., Czumaj, A., Gąsieniec, L., Jarominek, S., Lecroq, T., Plandowski, W., & Rytter, W. (1994). Speeding up two string matching algorithms. Algorithmica, 12(4–5), 247–267.CrossRefMathSciNetMATH Crochemore, M., Czumaj, A., Gąsieniec, L., Jarominek, S., Lecroq, T., Plandowski, W., & Rytter, W. (1994). Speeding up two string matching algorithms. Algorithmica, 12(4–5), 247–267.CrossRefMathSciNetMATH
Zurück zum Zitat Crochemore, M., Iliopoulos, C., Makris, C., Rytter, W., Tsakalidis, A., & Tsichlas, K. (2002). Approximate string matching with gaps. Nordic Journal of Computing, 9(1), 54–65.MathSciNetMATH Crochemore, M., Iliopoulos, C., Makris, C., Rytter, W., Tsakalidis, A., & Tsichlas, K. (2002). Approximate string matching with gaps. Nordic Journal of Computing, 9(1), 54–65.MathSciNetMATH
Zurück zum Zitat Crochemore, M., Iliopoulos, C., Navarro, G., Pinzon, Y., & Salinger, A. (2005). Bit-parallel (δ,γ)-matching suffix automata. Journal of Discrete Algorithms (JDA), 3(2–4), 198–214.CrossRefMathSciNetMATH Crochemore, M., Iliopoulos, C., Navarro, G., Pinzon, Y., & Salinger, A. (2005). Bit-parallel (δ,γ)-matching suffix automata. Journal of Discrete Algorithms (JDA), 3(2–4), 198–214.CrossRefMathSciNetMATH
Zurück zum Zitat Fredriksson, K., & Grabowski, S. (2006). Efficient bit-parallel algorithms for (δ,α)-matching. In Proceedings of WEA’06. LNCS (Vol. 4007, pp. 170–181). Springer. Fredriksson, K., & Grabowski, S. (2006). Efficient bit-parallel algorithms for (δ,α)-matching. In Proceedings of WEA’06. LNCS (Vol. 4007, pp. 170–181). Springer.
Zurück zum Zitat Horspool, R. N. (1980). Practical fast searching in strings. Software Practice and Experience, 10(6), 501–506.CrossRef Horspool, R. N. (1980). Practical fast searching in strings. Software Practice and Experience, 10(6), 501–506.CrossRef
Zurück zum Zitat Johnson, D. B. (1982). A priority queue in which initialization and queue operations take O(log log D) time. Mathematical Systems Theory, 15, 295–309.CrossRefMATH Johnson, D. B. (1982). A priority queue in which initialization and queue operations take O(log log D) time. Mathematical Systems Theory, 15, 295–309.CrossRefMATH
Zurück zum Zitat Mäkinen, V. (2003). Parameterized approximate string matching and local-similarity- based point-pattern matching. PhD thesis, Department of Computer Science, University of Helsinki. Mäkinen, V. (2003). Parameterized approximate string matching and local-similarity- based point-pattern matching. PhD thesis, Department of Computer Science, University of Helsinki.
Zurück zum Zitat Mäkinen, V., Navarro, G., & Ukkonen, E. (2005). Transposition invariant string matching. Journal of Algorithms, 56(2), 124–153.CrossRefMathSciNetMATH Mäkinen, V., Navarro, G., & Ukkonen, E. (2005). Transposition invariant string matching. Journal of Algorithms, 56(2), 124–153.CrossRefMathSciNetMATH
Zurück zum Zitat Mehldau, G., & Myers, G. (1993). A system for pattern matching applications on biosequences. Computer Application in the Bioscience, 9(3), 299–314. Mehldau, G., & Myers, G. (1993). A system for pattern matching applications on biosequences. Computer Application in the Bioscience, 9(3), 299–314.
Zurück zum Zitat Myers, G. (1996). Approximate matching of network expression with spacers. Journal of Computational Biology, 3(1), 33–51.CrossRef Myers, G. (1996). Approximate matching of network expression with spacers. Journal of Computational Biology, 3(1), 33–51.CrossRef
Zurück zum Zitat Navarro, G., & Raffinot, M. (2003). Fast and simple character classes and bounded gaps pattern matching, with applications to protein searching. Journal of Computational Biology, 10(6), 903–923.CrossRef Navarro, G., & Raffinot, M. (2003). Fast and simple character classes and bounded gaps pattern matching, with applications to protein searching. Journal of Computational Biology, 10(6), 903–923.CrossRef
Zurück zum Zitat Paul, W., & Simon, J. (1980). Decision trees and random access machines. In ZUERICH: Proceedings of the Symposium Logik und Algorithmik (pp. 331–340). Paul, W., & Simon, J. (1980). Decision trees and random access machines. In ZUERICH: Proceedings of the Symposium Logik und Algorithmik (pp. 331–340).
Zurück zum Zitat Pinzón, Y. J., & Wang, S. (2005). Simple algorithm for pattern-matching with bounded gaps in genomic sequences. In Proceedings of ICNAAM’05 (pp. 827–831). Pinzón, Y. J., & Wang, S. (2005). Simple algorithm for pattern-matching with bounded gaps in genomic sequences. In Proceedings of ICNAAM’05 (pp. 827–831).
Metadaten
Titel
Efficient algorithms for pattern matching with general gaps, character classes, and transposition invariance
verfasst von
Kimmo Fredriksson
Szymon Grabowski
Publikationsdatum
01.08.2008
Verlag
Springer Netherlands
Erschienen in
Discover Computing / Ausgabe 4/2008
Print ISSN: 2948-2984
Elektronische ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-008-9054-z

Weitere Artikel der Ausgabe 4/2008

Discover Computing 4/2008 Zur Ausgabe

Preface

Preface