ABSTRACT
Many applications rely on sequence databases and use extensively pattern-matching queries to retrieve data of interest. This paper extends the traditional pattern-matching expressions to parameterized patterns, featuring variables. Parameterized patterns are more expressive and allow to define concisely regular expressions that would be very complex to describe without variables. They can also be used to express additional constraints on patterns' variables.We show that they can be evaluated without additional cost with respect to traditional techniques (e.g., the Knuth-Morris-Pratt algorithm). We describe an algorithm that enjoys low memory and CPU time requirements, and provide experimental results which illustrate the gain of the optimized solution.
- R. Agrawal and R. Srikant. Mining sequential patterns. In Proc. IEEE Intl. Conf. on Data Engineering (ICDE), pages 3--14, 1995. Google ScholarDigital Library
- A. Apostolico and R. Giancarlo. The boyer-moore-galil string searching strategies revisited. SIAM J. Comput., 15(1):98--105, 1986. Google ScholarDigital Library
- B. S. Baker. Parameterized Pattern Matching by Boyer-Moore Type Algorithms. In Proc. of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 541--550, 1995. Google ScholarDigital Library
- R. S. Boyer and J. S. Moore. A fast string searching algorithm. Commun. ACM, 20(10):762--772, 1977. Google ScholarDigital Library
- M. Crochemore, A. Czumaj, L. Gasieniec, T. Lecroq, W. Plandowski, and W. Rytter. Fast practical multi-pattern matching. Inf. Process. Lett., 71(3-4):107--113, 1999.Google Scholar
- M. Crochemore and W. Rytter. Text algorithms. Oxford University Press, 1994. Google ScholarDigital Library
- C. du Mouza and P. Rigaux. Multi-scale Classification of Moving Objects Trajectories. In Proc. Intl. Conf. on Scientific and Statistical Databases (SSDBM), 2004. Google ScholarDigital Library
- C. du Mouza and P. Rigaux. Mobility Patterns. GeoInformatica, 2005. To appear. Google ScholarDigital Library
- C. I. Ezeife and M. Chen. Mining Web Sequential Patterns Incrementally with Revised PLWAP Tree. In WAIM, pages 539--548, 2004.Google ScholarCross Ref
- D. Q. Goldin and P. C. Kanellakis. On similarity queries for time-series data: Constraint specification and implementation. In Proc. Intl. Conf. on Principles and Practice of Constraint Programming (CP'95), 1995. Google ScholarDigital Library
- S.-W. Kim, J. Yoon, S. Park, and T.-H. Kim. Shape-based Retrieval of Similar Subsequences in Time-Series Databases. In Proc. ACM Symposium on Applied Computing, pages 438--445, 2002. Google ScholarDigital Library
- D. Knuth, J. Morris, and V. Pratt. Fast Pattern Matching in Strings. SIAM J. Computing, 6(2):323--350, 1977.Google ScholarCross Ref
- G. M. Landau and U. Vishkin. Fast string matching with k differences. J. Comput. Syst. Sci., 37(1):63--78, 1988. Google ScholarDigital Library
- Y.-N. Law, H. Wang, and C. Zaniolo. Query Languages and Data Models for Database Sequences and Data Streams. In Proc. Intl. Conf. on Very Large Data Bases (VLDB), pages 492--503, 2004. Google ScholarDigital Library
- H.-F. Li, S.-Y. Lee, and M.-K. Shan. On mining webclick streams for path traversal patterns. In WWW (Alternate Track Papers & Posters), pages 404--405, 2004. Google ScholarDigital Library
- N. Mamoulis, H. Cao, G. Kollios, M. Hadjieleftheriou, Y. Tao, and D. W. Cheung. Mining, indexing, and querying historical spatiotemporal data. In ACM SIGKDD international conference on Knowledge discovery and data mining (KDD'04), pages 236--245, 2004. Google ScholarDigital Library
- D. W. Mount. Bioinformatics Sequence and Genome Analysis, Second Edition. CSHL Press, 2004.Google Scholar
- R. Ramakrishnan, D. Donjerkovic, A. Ranganathan, K. S. Beyer, and M. Krishnaprasad. Srql: Sorted relational query language. In Proc. Intl. Conf. on Scientific and Statistical Databases (SSDBM), pages 84--95. IEEE Computer Society, 1998. Google ScholarDigital Library
- R. Sadri, C. Zaniolo, A. Zarkesh, and J. Adibi. Expressing and Optimizing Sequence Queries in Database Systems. ACM Trans. on Database Systems, 29(2), 2004. Google ScholarDigital Library
- R. Sadri, C. Zaniolo, A. M. Zarkesh, and J. Adibi. Optimization of Sequence Queries in Database Systems. In Proc. ACM Symp. on Principles of Database Systems (PODS), 2001. Google ScholarDigital Library
- P. Seshadri, M. Livny, and R. Ramakrishnan. SEQ: A Model for Sequence Databases. In Proc. IEEE Intl. Conf. on Data Engineering (ICDE), pages 232--239, 1995. Google ScholarDigital Library
- I. Simon. String matching algorithms and automata. In Results and Trends in Theoretical Computer Science, pages 386--395, 1994. Google ScholarDigital Library
- A. P. Sistla, T. Hu, and V. Chowdhry. Similarity based retrieval from sequence databases using automata as queries. In Proc. Intl. Conf. on Information and Knowledge Management (CIKM), pages 237--244, 2002. Google ScholarDigital Library
- E. Ukkonen. Finding approximate patterns in strings. J. Algorithms, 6(1):132--137, 1985.Google ScholarCross Ref
Index Terms
- Efficient evaluation of parameterized pattern queries
Recommendations
Efficient approximations of conjunctive queries
PODS '12: Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database SystemsWhen finding exact answers to a query over a large database is infeasible, it is natural to approximate the query by a more efficient one that comes from a class with good bounds on the complexity of query evaluation. In this paper we study such ...
Answering Conjunctive Queries with Inequalities
In this paper, we study the complexity of answering conjunctive queries (CQ) with inequalities (ź). In particular, we are interested in comparing the complexity of the query with and without inequalities. The main contribution of our work is a novel ...
Evaluation of partial path queries on xml data
CIKM '07: Proceedings of the sixteenth ACM conference on Conference on information and knowledge managementXML query languages typically allow the specification of structural patterns of elements. Finding the occurrences of such patterns in an XML tree is the key operation in XML query processing. Many algorithms have been presented for this operation. These ...
Comments