skip to main content
10.1145/2806416.2806450acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article

F1: Accelerating the Optimization of Aggregate Continuous Queries

Published:17 October 2015Publication History

ABSTRACT

Data Stream Management Systems performing on-line analytics rely on the efficient execution of large numbers of Aggregate Continuous Queries (ACQs). The state-of-the-art WeaveShare optimizer uses the Weavability concept in order to selectively combine ACQs for partial aggregation and produce high quality execution plans. However, WeaveShare does not scale well with the number of ACQs. In this paper we propose a novel closed formula, F1, that accelerates Weavability calculations, and thus allows WeaveShare to achieve exceptional scalability in systems with heavy workloads. In general, F1 can reduce the computation time of any technique that combines partial aggregations within composite slides of multiple ACQs. We theoretically analyze the Bit Set approach currently used by WeaveShare and show that F1 is superior in both time and space complexities. We show that F1 performs 1062 times less operations compared to Bit Set to produce the same execution plan for the same input. We experimentally show that F1 executes up to 60,000 times faster and can handle 1,000,000 ACQs in a setting where the limit for the current technique is 550.

References

  1. D. J. Abadi et al. Aurora: a new model and architecture for data stream management. VLDBJ, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. D. J. Abadi et al. The design of the borealis stream processing engine. In CIDR, 2005.Google ScholarGoogle Scholar
  3. T. Akidau et al. Millwheel: Fault-tolerant stream processing at internet scale. In VLDB, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. C. Chung, S. Guirguis, and A. Kurdia. Competitive cost-savings in data stream management systems. In COCOON. 2014.Google ScholarGoogle Scholar
  5. T. Condie, N. Conway, P. Alvaro, J. Hellerstein, J. Gerth, J. Talbot, K. Elmeleegy, R. Sears. Online aggregation and continuous query support in mapreduce. In SIGMOD, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. T. M. Ghanem, M. A. Hammad, M. F. Mokbel, W. G. Aref, and A. K. Elmagarmid. Incremental evaluation of sliding-window queries over data streams. TKDE, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Guirguis, M. Sharaf, P. K. Chrysanthis, and A. Labrinidis. Three-level processing of multiple aggregate continuous queries. In ICDE, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. S. Guirguis, M. A. Sharaf, P. K. Chrysanthis, and A. Labrinidis. Optimized processing of multiple aggregate continuous queries. In CIKM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. V. Gulisano, R. Jimenez-Peris, M. Patino-Martinez, C. Soriente, and P. Valduriez. Streamcloud: An elastic and scalable data streaming system. IEEE TPDS, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. N. R. Katsipoulakis, C. Thoma, E. A. Gratta, A. Labrinidis, A. J. Lee, and P. K. Chrysanthis. Confidential elastic processing of data streams. In SIGMOD, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Krishnamurthy, C. Wu, and M. Franklin. On-the-fly sharing for streamed aggregation. In SIGMOD, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. Li, D. Maier, K. Tufte, V. Papadimos, and P. A. Tucker. No pane, no gain: efficient evaluation of sliding-window aggregates over data streams. SIGMOD, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. J. Li, D. Maier, K. Tufte, V. Papadimos, and P. A. Tucker. Semantics and evaluation techniques for window aggregates in data streams. In SIGMOD, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, , and R. Varma. Query processing, approximation, and resource management in a data stream management system. In CIDR, 2003.Google ScholarGoogle Scholar
  15. K. Naidu, R. Rastogi, S. Satkin, and A. Srinivasan. Memory-constrained aggregate computation over data streams. In ICDE, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. P. Roy, S. Seshadri, S. Sudarshan, and S. Bhobe. Efficient and extensible algorithms for multi query optimization. In SIGMOD, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. A. U. Shein, P. K. Chrysanthis, and A. Labrinidis. Processing of aggregate continuous queries in a distributed environment. In BIRTE. 2015.Google ScholarGoogle Scholar
  18. K. Tangwongsan, M. Hirzel, S. Schneider, K-L. Wu. General incremental sliding-window aggregation. VLDB, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. A. Toshniwal et al. Storm@twitter. In SIGMOD, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. E. Weisstein. Binomial series. Wolfram MathWorld.Google ScholarGoogle Scholar
  21. E. Weisstein. Euclidean algorithm. Wolfram MathWorld.Google ScholarGoogle Scholar
  22. E. Weisstein. Harmonic number. Wolfram MathWorld.Google ScholarGoogle Scholar
  23. E. Weisstein. Prime number theorem. Wolfram MathWorld.Google ScholarGoogle Scholar
  24. R. Zhang, N. Koudas, B. C. Ooi, D. Srivastava. Multiple aggregations over data streams. In SIGMOD, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. R. Zhang, N. Koudas, B. C. Ooi, D. Srivastava, and P. Zhou. Streaming multiple aggregations using phantoms. VLDBJ, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. F1: Accelerating the Optimization of Aggregate Continuous Queries

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader