ABSTRACT
Traditional Data Stream Management Systems (DSMS) segment data streams using windows that are defined either by a time interval or by a number of tuples. Such windows are fixed: their definition does not vary over the course of a stream, and it is based on external properties unrelated to the data content of the stream. However, streams and their content do vary over time: the rate of a data stream may vary, or the data distribution of its content may vary. The mismatch between a fixed stream segmentation and a variable stream motivates the need for a more flexible, expressive, and physically independent stream segmentation. We introduce a new stream segmentation technique, called frames, which segments streams based on data content. We present a theory and implementation of frames and show the utility of frames for a variety of applications.
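To make the contrast between fixed windows and content-based segmentation concrete, the following minimal Python sketch compares a count-based tumbling window with one possible data-driven segmentation. It is an illustration only: the function names `count_windows` and `threshold_frames`, the `(timestamp, value)` tuple layout, the threshold rule, and the sample readings are all assumptions made for this example and are not taken from the paper's theory or implementation.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical tuple layout for this sketch: (timestamp, value).
Reading = Tuple[int, float]


def count_windows(stream: Iterable[Reading], size: int) -> Iterator[List[Reading]]:
    """Fixed segmentation: emit a window every `size` tuples, whatever they contain."""
    window: List[Reading] = []
    for item in stream:
        window.append(item)
        if len(window) == size:
            yield window
            window = []
    if window:  # emit the trailing partial window
        yield window


def threshold_frames(stream: Iterable[Reading], threshold: float) -> Iterator[List[Reading]]:
    """Content-based segmentation (assumed rule for illustration): emit each
    maximal run of consecutive tuples whose value exceeds `threshold`."""
    frame: List[Reading] = []
    for item in stream:
        if item[1] > threshold:
            frame.append(item)
        elif frame:  # a value at or below the threshold closes the open run
            yield frame
            frame = []
    if frame:  # emit a run that is still open when the stream ends
        yield frame


# Made-up sample data, not measurements from the paper.
readings = [(0, 1.0), (1, 5.0), (2, 6.0), (3, 2.0),
            (4, 7.0), (5, 8.0), (6, 9.0), (7, 1.5)]

windows = list(count_windows(readings, 3))
frames = list(threshold_frames(readings, 4.0))

print([len(w) for w in windows])  # segment sizes are fixed by the window definition
print([len(f) for f in frames])   # segment sizes follow the data
```

In this sketch the count-based windows cut the stream every three tuples and split the second run of high values across two windows, whereas the threshold-based segments start and end where the values themselves change.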
Index Terms
- Frames: data-driven windows