skip to main content
10.1145/3318464.3389756acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Public Access

Thrifty Query Execution via Incrementability

Published:31 May 2020Publication History

ABSTRACT

Many applications schedule queries before all data is ready. To return fast query results, database systems can eagerly process existing data and incrementally incorporate new data into prior intermediate results, which often relies on incremental view maintenance (IVM) techniques. However, incrementally maintaining a query result can increase the total amount of work mainly as some early work is not useful for computing the final query result. In this paper, we propose a new metric incrementability to quantify the cost-effectiveness of IVM to decide how eagerly or lazily databases should incrementally execute a query. We further observe that different parts of a query have different levels of incrementability and the query execution should have a decomposed control flow based on the difference. Therefore, to address these needs, we propose a new query processing method Incrementability-aware Query Processing (InQP). We build a prototype InQP system based on Spark and show that InQP significantly reduces resource consumption with a similar latency compared with incrementability-oblivious approaches.

Skip Supplemental Material Section

Supplemental Material

3318464.3389756.mp4

mp4

133 MB

References

  1. Apache kafka. https://kafka.apache.org/.Google ScholarGoogle Scholar
  2. Spark structured streaming. https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html.Google ScholarGoogle Scholar
  3. Y. Ahmad, O. Kennedy, C. Koch, and M. Nikolic. DBToaster: Higher-order delta processing for dynamic, frequently fresh views. Proc. VLDB Endow., 5(10):968--979, 2012.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. A. Ayad and J. F. Naughton. Static optimization of conjunctive queries with sliding windows over infinite streams. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Paris, France, June 13--18, 2004, pages 419--430, 2004.Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. B. Babcock, S. Babu, M. Datar, and R. Motwani. Chain: Operator scheduling for memory minimization in data stream systems. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, San Diego, California, USA, June 9--12, 2003, pages 253--264, 2003.Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J. A. Blakeley, P. Larson, and F. W. Tompa. Efficiently updating materialized views. In Proceedings of the 1986 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, May 28--30, 1986, pages 61--71, 1986.Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. W. Cai, M. Balazinska, and D. Suciu. Pessimistic cardinality estimation: Tighter upper bounds for intermediate join cardinalities. In Proceedings of the 2019 International Conference on Management of Data, SIGMOD Conference 2019, Amsterdam, The Netherlands, June 30 - July 5, 2019, pages 18--35, 2019.Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. P. Carbone, A. Katsifodimos, S. Ewen, V. Markl, S. Haridi, and K. Tzoumas. Apache flink#8482;: Stream and batch processing in a single engine. IEEE Data Eng. Bull., 38(4):28--38, 2015.Google ScholarGoogle Scholar
  9. D. Carney, U. cC etintemel, A. Rasin, S. B. Zdonik, M. Cherniack, and M. Stonebraker. Operator scheduling in a data stream manager. In Proceedings of 29th International Conference on Very Large Data Bases, VLDB 2003, Berlin, Germany, September 9--12, 2003, pages 838--849, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  10. B. Chandramouli, J. Goldstein, M. Barnett, R. DeLine, J. C. Platt, J. F. Terwilliger, and J. Wernsing. Trill: A high-performance incremental query processor for diverse analytics. Proc. VLDB Endow., 8(4):401--412, 2014.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Chandrasekaran, O. Cooper, A. Deshpande, M. J. Franklin, J. M. Hellerstein, W. Hong, S. Krishnamurthy, S. Madden, V. Raman, F. Reiss, and M. A. Shah. Telegraphcq: Continuous dataflow processing for an uncertain world. In CIDR 2003, First Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 5--8, 2003, Online Proceedings, 2003.Google ScholarGoogle Scholar
  12. S. Chaudhuri, V. R. Narasayya, and R. Ramamurthy. Estimating progress of long running SQL queries. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Paris, France, June 13--18, 2004, pages 803--814, 2004.Google ScholarGoogle Scholar
  13. J. Chen, D. J. DeWitt, F. Tian, and Y. Wang. Niagaracq: A scalable continuous query system for internet databases. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, May 16--18, 2000, Dallas, Texas, USA., pages 379--390, 2000.Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. R. Chirkova and J. Yang. Materialized views. Foundations and Trends in Databases, 4(4):295--405, 2012.Google ScholarGoogle ScholarCross RefCross Ref
  15. Y. Chung, M. L. Mortensen, C. Binnig, and T. Kraska. Estimating the impact of unknown unknowns on aggregate query results. In Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, pages 861--876, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. L. S. Colby, T. Griffin, L. Libkin, I. S. Mumick, and H. Trickey. Algorithms for deferred view maintenance. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Montreal, Quebec, Canada, June 4--6, 1996, pages 469--480, 1996.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. L. S. Colby, A. Kawaguchi, D. F. Lieuwen, I. S. Mumick, and K. A. Ross. Supporting multiple view maintenance policies. In SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13--15, 1997, Tucson, Arizona, USA., pages 405--416, 1997.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. C. Estan and J. F. Naughton. End-biased samples for join cardinality estimation. In Proceedings of the 22nd International Conference on Data Engineering, ICDE 2006, 3--8 April 2006, Atlanta, GA, USA, page 20, 2006.Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. L. Gao, M. Wang, X. S. Wang, and S. Padmanabhan. A learning-based approach to estimate statistics of operators in continuous queries: a case study. In Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, DMKD 2003, San Diego, California, USA, June 13, 2003, pages 66--72, 2003.Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. G. Graefe. Query evaluation techniques for large databases. ACM Computing Surveys (CSUR), 25(2):73--169, 1993.Google ScholarGoogle Scholar
  21. T. Griffin and B. Kumar. Algebraic change propagation for semijoin and outerjoin queries. SIGMOD Record, 27(3):22--27, 1998.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. T. Griffin and L. Libkin. Incremental maintenance of views with duplicates. In Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, San Jose, California, USA, May 22--25, 1995, pages 328--339, 1995.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. A. Gupta, I. S. Mumick, and V. S. Subrahmanian. Maintaining views incrementally. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, May 26--28, 1993, pages 157--166, 1993.Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. H. He, J. Xie, J. Yang, and H. Yu. Asymmetric batch incremental view maintenance. In K. Aberer, M. J. Franklin, and S. Nishio, editors, Proceedings of the 21st International Conference on Data Engineering, ICDE 2005, 5--8 April 2005, Tokyo, Japan, pages 106--117. IEEE Computer Society, 2005.Google ScholarGoogle Scholar
  25. J. M. Hellerstein, P. J. Haas, and H. J. Wang. Online aggregation. In J. Peckham, editor, SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13--15, 1997, Tucson, Arizona, USA, pages 171--182. ACM Press, 1997.Google ScholarGoogle Scholar
  26. M. Idris, M. Ugarte, and S. Vansummeren. The dynamic yannakakis algorithm: Compact and efficient query processing under updates. In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14--19, 2017, pages 1259--1274, 2017.Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. S. Krishnan, Z. Yang, K. Goldberg, J. M. Hellerstein, and I. Stoica. Learning to optimize join queries with deep reinforcement learning. CoRR, abs/1808.03196, 2018.Google ScholarGoogle Scholar
  28. P. Larson and J. Zhou. Efficient maintenance of materialized outer-join views. In Proceedings of the 23rd International Conference on Data Engineering, ICDE 2007, The Marmara Hotel, Istanbul, Turkey, April 15--20, 2007, pages 56--65, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  29. V. Leis, A. Gubichev, A. Mirchev, P. A. Boncz, A. Kemper, and T. Neumann. How good are query optimizers, really? Proc. VLDB Endow., 9(3):204--215, 2015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. V. Leis, B. Radke, A. Gubichev, A. Kemper, and T. Neumann. Cardinality estimation done right: Index-based join sampling. In CIDR 2017, 8th Biennial Conference on Innovative Data Systems Research, Chaminade, CA, USA, January 8--11, 2017, Online Proceedings, 2017.Google ScholarGoogle Scholar
  31. L. F. Mackert and G. M. Lohman. R* optimizer validation and performance evaluation for local queries. In Proceedings of the 1986 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, May 28--30, 1986, pages 84--95, 1986.Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. T. Malik, R. C. Burns, and N. V. Chawla. A black-box approach to query cardinality estimation. In CIDR 2007, Third Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 7--10, 2007, Online Proceedings, pages 56--67, 2007.Google ScholarGoogle Scholar
  33. R. C. Marcus, P. Negi, H. Mao, C. Zhang, M. Alizadeh, T. Kraska, O. Papaemmanouil, and N. Tatbul. Neo: A learned query optimizer. Proc. VLDB Endow., 12(11):1705--1718, 2019.Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. M. Nikolic, M. Dashti, and C. Koch. How to win a hot dog eating contest: Distributed incremental view maintenance with batch updates. In Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, pages 511--526, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. J. Ortiz, M. Balazinska, J. Gehrke, and S. S. Keerthi. An empirical analysis of deep learning for cardinality estimation. CoRR, abs/1905.06425, 2019.Google ScholarGoogle Scholar
  36. P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data, Boston, Massachusetts, USA, May 30 - June 1., pages 23--34, 1979.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. M. A. Sharaf, P. K. Chrysanthis, A. Labrinidis, and K. Pruhs. Algorithms and metrics for processing multiple heterogeneous continuous queries. ACM Trans. Database Syst., 33(1):5:1--5:44, 2008.Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. M. Stillger, G. M. Lohman, V. Markl, and M. Kandil. LEO - db2's learning optimizer. In VLDB 2001, Proceedings of 27th International Conference on Very Large Data Bases, September 11--14, 2001, Roma, Italy, pages 19--28, 2001.Google ScholarGoogle Scholar
  39. R. Taft, N. El-Sayed, M. Serafini, Y. Lu, A. Aboulnaga, M. Stonebraker, R. Mayerhofer, and F. Andrade. P-store: An elastic database system with predictive provisioning. In Proceedings of the 2018 International Conference on Management of Data, pages 205--219. ACM, 2018.Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. D. Tang, Z. Shang, A. J. Elmore, S. Krishnan, and M. J. Franklin. Intermittent query processing. Proc. VLDB Endow., 12(11):1427--1441, July 2019.Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Y. Tao, M. L. Yiu, D. Papadias, M. Hadjieleftheriou, and N. Mamoulis. RPJ: producing fast join results on streams through rate-based optimization. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Baltimore, Maryland, USA, June 14--16, 2005, pages 371--382, 2005.Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. I. Trummer. Exact cardinality query optimization with bounded execution cost. In Proceedings of the 2019 International Conference on Management of Data, SIGMOD Conference 2019, Amsterdam, The Netherlands, June 30 - July 5, 2019, pages 2--17, 2019.Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. S. Venkataraman, A. Panda, K. Ousterhout, M. Armbrust, A. Ghodsi, M. J. Franklin, B. Recht, and I. Stoica. Drizzle: Fast and adaptable stream processing at scale. In Proceedings of the 26th Symposium on Operating Systems Principles, Shanghai, China, October 28--31, 2017, pages 374--389, 2017.Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. S. Viglas and J. F. Naughton. Rate-based query optimization for streaming information sources. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, Madison, Wisconsin, USA, June 3--6, 2002, pages 37--48, 2002.Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. S. Viglas, J. F. Naughton, and J. Burger. Maximizing the output rate of multi-way join queries over streaming information sources. In Proceedings of 29th International Conference on Very Large Data Bases, VLDB 2003, Berlin, Germany, September 9--12, 2003, pages 285--296, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  46. A. Wilschut and P. Apers. Pipelining in query execution. In Proceedings of the International Conference on Databases, Parallel Architectures and Their Applications (PARBASE 1990), pages 562--562, United States, 3 1990. IEEE Computer Society.Google ScholarGoogle ScholarCross RefCross Ref
  47. K. Zeng, S. Agarwal, and I. Stoica. iOLAP: Managing uncertainty for efficient incremental OLAP. In Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, pages 1347--1361, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. J. Zhou, P. Larson, and H. G. Elmongui. Lazy maintenance of materialized views. In Proceedings of the 33rd International Conference on Very Large Data Bases, University of Vienna, Austria, September 23--27, 2007, pages 231--242, 2007.Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Thrifty Query Execution via Incrementability

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
        June 2020
        2925 pages
        ISBN:9781450367356
        DOI:10.1145/3318464

        Copyright © 2020 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 31 May 2020

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate785of4,003submissions,20%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader