ABSTRACT
A defining characteristic of continuous queries over on-line data streams, possibly bounded by sliding windows, is the potentially infinite and time-evolving nature of their inputs and outputs. New items continually arrive on the input streams and new results are continually produced. Additionally, inputs expire by falling out of range of their sliding windows and results expire when they cease to satisfy the query. This impacts continuous query processing in two ways. First, data stream systems allow tables to be queried alongside data streams, but in terms of query semantics, it is not clear how updates of tables are different from insertions and deletions caused by the movement of the sliding windows. Second, many interesting queries need to store state, which must be kept up-to-date as time goes on. Therefore, query processing efficiency depends highly on the amount of overhead involved in state maintenance.In this paper, we show that the above issues can be solved by understanding the update patterns of continuous queries and exploiting them during query processing. We propose a classification that defines four types of update characteristics. Using our classification, we present a definition of continuous query semantics that clearly states the role of relations. We then propose the notion of update-pattern-aware query processing, where physical implementations of query operators, including the data structures used for storing intermediate state, vary depending on the update patterns of their inputs and outputs. When tested on IP traffic logs, our update-pattern-aware query plans routinely outperform the existing techniques by an order of magnitude.
- D. Abadi et al. Aurora: A new model and architecture for data stream management. The VLDB Journal, 12(2):120--139, 2003. Google ScholarDigital Library
- A. Arasu, B. Babcock, S. Babu, J. McAlister, and J. Widom. Characterizing memory requirements for queries over continuous data streams. ACM Trans. Database Sys., 29(1):162--194, 2004. Google ScholarDigital Library
- A. Arasu, S. Babu, and J. Widom. The CQL continuous query language: Semantic foundations and query execution. The VLDB Journal, 14(1), 2005, to appear. Google ScholarDigital Library
- A. Ayad and J. Naughton. Static optimization of conjunctive queries with sliding windows over unbounded streaming information sources. SIGMOD 2004, pages 419--430. Google ScholarDigital Library
- B. Babcock, S. Babu, M. Datar, R. Motwani, and D. Thomas. Operator scheduling in data stream systems. The VLDB Journal, 13(4):333--353, 2004. Google ScholarDigital Library
- S. Babu, R. Motwani, K. Munagala, I. Nishizawa, and J. Widom. Adaptive ordering of pipelined stream filters. SIGMOD 2004, pages 407--418. Google ScholarDigital Library
- S. Babu and J. Widom. Exploiting k-constraints to reduce memory overhead in continuous queries over data streams. ACM Trans. Database Sys., 29(3):545--580, 2004. Google ScholarDigital Library
- R. Brown. Calendar queues: A fast O(1) priority queue implementation for the simulation event set problem. Communications of the ACM, 31(10):1220--1227, 1988. Google ScholarDigital Library
- D. Carney, U. Cetintemel, A. Rasin, S. Zdonik, M. Cherniack, and M. Stonebraker. Operator scheduling in a data stream manager. VLDB 2003, pages 838--849. Google ScholarDigital Library
- L. Golab and M. T. Özsu. Processing sliding window multi-joins in continuous queries over data streams. VLDB 2003, pages 500--511. Google ScholarDigital Library
- M. Hammad, W. Aref, M. Franklin, M. Mokbel, and A. Elmagarmid. Efficient execution of sliding window queries over data streams. Technical Report CSD TR 03-medulla035, Purdue University, 2003.Google Scholar
- M. Hammad et al. Nile: a query processing engine for data streams. ICDE 2004, page 851. Google ScholarDigital Library
- Q. Jiang and S. Chakravarthy. Scheduling strategies for processing continuous queries over streams. BNCOD 2004, pages 16--30.Google ScholarCross Ref
- J. Kang, J. Naughton, and S. Viglas. Evaluating windows joins over unbounded streams. ICDE 2003, pages 341--352.Google ScholarCross Ref
- J. Krämer and B. Seeger. A temporal foundation for continuous queries over data streams. COMAD 2005, pages 70--82.Google Scholar
- Y.-N. Law, H. Wang, and C. Zaniolo. Query languages and data models for database sequences and data streams. VLDB 2004, pages 492--503. Google ScholarDigital Library
- S. Madden and M. J. Franklin. Fjording the stream: An architecture for queries over streaming sensor data. ICDE 2002, pages 555--566.Google ScholarDigital Library
- R. Motwani et al. Query processing, approximation, and resource management in a data stream management system. CIDR 2003, pages 245--256.Google Scholar
- V. Paxson and S. Floyd. Wide-area traffic: The failure of Poisson modeling. IEEE/ACM Trans. on Networking, 3(3):226--244, 1995. Google ScholarDigital Library
- U. Srivastava and J. Widom. Flexible time management in data stream systems. PODS 2004, pages 263--274. Google ScholarDigital Library
- D. Terry, D. Goldberg, D. Nichols, and B. Oki. Continuous queries over append-only databases. SIGMOD 1992, pages 321--330. Google ScholarDigital Library
- P. Tucker, D. Maier, T. Sheard, and L. Faragas. Exploiting punctuation semantics in continuous data streams. IEEE Trans. Knowledge and Data Eng., 15(3):555--568, 2003. Google ScholarDigital Library
- S. Viglas, J. Naughton, and J. Burger. Maximizing the output rate of multi-join queries over streaming information sources. VLDB 2003, pages 285--296. Google ScholarDigital Library
- Update-pattern-aware modeling and processing of continuous queries
Recommendations
Processing sliding window multi-joins in continuous queries over data streams
VLDB '03: Proceedings of the 29th international conference on Very large data bases - Volume 29We study sliding window multi-join processing in continuous queries over data streams. Several algorithms are reported for performing continuous, incremental joins, under the assumption that all the sliding windows fit in main memory. The algorithms ...
Optimized processing of multiple aggregate continuous queries
CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge managementData Streams Management Systems are designed to support monitoring applications, which require the processing of hundreds of Aggregate Continuous Queries (ACQs). These ACQs typically have different time granularities, with possibly different selection ...
Processing continuous join queries in sensor networks: a filtering approach
SIGMOD '10: Proceedings of the 2010 ACM SIGMOD International Conference on Management of dataWhile join processing in wireless sensor networks has received a lot of attention recently, current solutions do not work well for continuous queries. In those networks however, continuous queries are the rule. To minimize the communication costs of ...
Comments