skip to main content
10.1145/1099554.1099731acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
Article

Efficient evaluation of parameterized pattern queries

Published:31 October 2005Publication History

ABSTRACT

Many applications rely on sequence databases and use extensively pattern-matching queries to retrieve data of interest. This paper extends the traditional pattern-matching expressions to parameterized patterns, featuring variables. Parameterized patterns are more expressive and allow to define concisely regular expressions that would be very complex to describe without variables. They can also be used to express additional constraints on patterns' variables.We show that they can be evaluated without additional cost with respect to traditional techniques (e.g., the Knuth-Morris-Pratt algorithm). We describe an algorithm that enjoys low memory and CPU time requirements, and provide experimental results which illustrate the gain of the optimized solution.

References

  1. R. Agrawal and R. Srikant. Mining sequential patterns. In Proc. IEEE Intl. Conf. on Data Engineering (ICDE), pages 3--14, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. A. Apostolico and R. Giancarlo. The boyer-moore-galil string searching strategies revisited. SIAM J. Comput., 15(1):98--105, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. B. S. Baker. Parameterized Pattern Matching by Boyer-Moore Type Algorithms. In Proc. of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 541--550, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. S. Boyer and J. S. Moore. A fast string searching algorithm. Commun. ACM, 20(10):762--772, 1977. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. Crochemore, A. Czumaj, L. Gasieniec, T. Lecroq, W. Plandowski, and W. Rytter. Fast practical multi-pattern matching. Inf. Process. Lett., 71(3-4):107--113, 1999.Google ScholarGoogle Scholar
  6. M. Crochemore and W. Rytter. Text algorithms. Oxford University Press, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. C. du Mouza and P. Rigaux. Multi-scale Classification of Moving Objects Trajectories. In Proc. Intl. Conf. on Scientific and Statistical Databases (SSDBM), 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. C. du Mouza and P. Rigaux. Mobility Patterns. GeoInformatica, 2005. To appear. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. C. I. Ezeife and M. Chen. Mining Web Sequential Patterns Incrementally with Revised PLWAP Tree. In WAIM, pages 539--548, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  10. D. Q. Goldin and P. C. Kanellakis. On similarity queries for time-series data: Constraint specification and implementation. In Proc. Intl. Conf. on Principles and Practice of Constraint Programming (CP'95), 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S.-W. Kim, J. Yoon, S. Park, and T.-H. Kim. Shape-based Retrieval of Similar Subsequences in Time-Series Databases. In Proc. ACM Symposium on Applied Computing, pages 438--445, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. D. Knuth, J. Morris, and V. Pratt. Fast Pattern Matching in Strings. SIAM J. Computing, 6(2):323--350, 1977.Google ScholarGoogle ScholarCross RefCross Ref
  13. G. M. Landau and U. Vishkin. Fast string matching with k differences. J. Comput. Syst. Sci., 37(1):63--78, 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Y.-N. Law, H. Wang, and C. Zaniolo. Query Languages and Data Models for Database Sequences and Data Streams. In Proc. Intl. Conf. on Very Large Data Bases (VLDB), pages 492--503, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. H.-F. Li, S.-Y. Lee, and M.-K. Shan. On mining webclick streams for path traversal patterns. In WWW (Alternate Track Papers & Posters), pages 404--405, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. N. Mamoulis, H. Cao, G. Kollios, M. Hadjieleftheriou, Y. Tao, and D. W. Cheung. Mining, indexing, and querying historical spatiotemporal data. In ACM SIGKDD international conference on Knowledge discovery and data mining (KDD'04), pages 236--245, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. D. W. Mount. Bioinformatics Sequence and Genome Analysis, Second Edition. CSHL Press, 2004.Google ScholarGoogle Scholar
  18. R. Ramakrishnan, D. Donjerkovic, A. Ranganathan, K. S. Beyer, and M. Krishnaprasad. Srql: Sorted relational query language. In Proc. Intl. Conf. on Scientific and Statistical Databases (SSDBM), pages 84--95. IEEE Computer Society, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. R. Sadri, C. Zaniolo, A. Zarkesh, and J. Adibi. Expressing and Optimizing Sequence Queries in Database Systems. ACM Trans. on Database Systems, 29(2), 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. R. Sadri, C. Zaniolo, A. M. Zarkesh, and J. Adibi. Optimization of Sequence Queries in Database Systems. In Proc. ACM Symp. on Principles of Database Systems (PODS), 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. P. Seshadri, M. Livny, and R. Ramakrishnan. SEQ: A Model for Sequence Databases. In Proc. IEEE Intl. Conf. on Data Engineering (ICDE), pages 232--239, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. I. Simon. String matching algorithms and automata. In Results and Trends in Theoretical Computer Science, pages 386--395, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. A. P. Sistla, T. Hu, and V. Chowdhry. Similarity based retrieval from sequence databases using automata as queries. In Proc. Intl. Conf. on Information and Knowledge Management (CIKM), pages 237--244, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. E. Ukkonen. Finding approximate patterns in strings. J. Algorithms, 6(1):132--137, 1985.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Efficient evaluation of parameterized pattern queries

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management
          October 2005
          854 pages
          ISBN:1595931406
          DOI:10.1145/1099554

          Copyright © 2005 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 31 October 2005

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • Article

          Acceptance Rates

          CIKM '05 Paper Acceptance Rate77of425submissions,18%Overall Acceptance Rate1,861of8,427submissions,22%

          Upcoming Conference

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader