skip to main content
article

Towards predicting query execution time for concurrent and dynamic database workloads

Published:01 August 2013Publication History
Skip Abstract Section

Abstract

Predicting query execution time is crucial for many database management tasks including admission control, query scheduling, and progress monitoring. While a number of recent papers have explored this problem, the bulk of the existing work either considers prediction for a single query, or prediction for a static workload of concurrent queries, where by "static" we mean that the queries to be run are fixed and known. In this paper, we consider the more general problem of dynamic concurrent workloads. Unlike most previous work on query execution time prediction, our proposed framework is based on analytic modeling rather than machine learning. We first use the optimizer's cost model to estimate the I/O and CPU requirements for each pipeline of each query in isolation, and then use a combination queueing model and buffer pool model that merges the I/O and CPU requests from concurrent queries to predict running times. We compare the proposed approach with a machine-learning based approach that is a variant of previous work. Our experiments show that our analytic-model based approach can lead to competitive and often better prediction accuracy than its machine-learning based counterpart.

References

  1. M. Ahmad, A. Aboulnaga, S. Babu, and K. Munagala. Interaction-aware scheduling of report-generation workloads. The VLDB Journal, 20:589-615, 2011. Google ScholarGoogle Scholar
  2. M. Ahmad, S. Duan, A. Aboulnaga, and S. Babu. Predicting completion times of batch query workloads using interaction-aware models and simulation. In EDBT, pages 449-460, 2011. Google ScholarGoogle Scholar
  3. M. Akdere, U. Çetintemel, M. Riondato, E. Upfal, and S. B. Zdonik. Learning-based query performance modeling and prediction. In ICDE, pages 390-401, 2012. Google ScholarGoogle Scholar
  4. F. R. Bach and M. I. Jordan. Kernel independent component analysis. Journal of Machine Learning Research, 3:1-48, 2002. Google ScholarGoogle Scholar
  5. S. Chaudhuri, V. R. Narasayya, and R. Ramamurthy. Estimating progress of execution for SQL queries. In SIGMOD, 2004. Google ScholarGoogle Scholar
  6. J. Duggan, U. Çetintemel, O. Papaemmanouil, and E. Upfal. Performance prediction for concurrent database workloads. In SIGMOD, pages 337-348, 2011. Google ScholarGoogle Scholar
  7. J. H. Friedman. Stochastic gradient boosting. Comput. Stat. Data Anal., 38(4):367-378, 2002. Google ScholarGoogle Scholar
  8. A. Ganapathi, H. A. Kuno, U. Dayal, J. L.Wiener, A. Fox, M. I. Jordan, and D. A. Patterson. Predicting multiple metrics for queries: Better decisions enabled by machine learning. In ICDE, 2009. Google ScholarGoogle Scholar
  9. G. Graefe. Query evaluation techniques for large databases. ACM Comput. Surv., 25(2):73-170, 1993. Google ScholarGoogle Scholar
  10. S. Guirguis, M. A. Sharaf, P. K. Chrysanthis, A. Labrinidis, and K. Pruhs. Adaptive scheduling of web transactions. In ICDE, 2009. Google ScholarGoogle Scholar
  11. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten. The WEKA data mining software: an update. SIGKDD Explorations, 11(1):10-18, 2009. Google ScholarGoogle Scholar
  12. E. D. Lazowska, J. Zahorjan, G. S. Graham, and K. C. Sevcik. Quantitative system performance - computer system analysis using queueing network models. Prentice Hall, 1984. Google ScholarGoogle Scholar
  13. J. Li, A. C. König, V. R. Narasayya, and S. Chaudhuri. Robust estimation of resource consumption for sql queries using statistical techniques. PVLDB, 5(11):1555-1566, 2012. Google ScholarGoogle Scholar
  14. G. Luo, J. F. Naughton, C. J. Ellmann, and M. Watzke. Toward a progress indicator for database queries. In SIGMOD, 2004. Google ScholarGoogle Scholar
  15. C. Mishra and N. Koudas. The design of a query monitoring system. ACM Trans. Database Syst., 34(1):1-51, 2009. Google ScholarGoogle Scholar
  16. V. F. Nicola, A. Dan, and D. M. Dias. Analysis of the generalized clock buffer replacement scheme for database transaction processing. In SIGMETRICS, pages 35-46, 1992. Google ScholarGoogle Scholar
  17. R. Osman, I. Awan, and M. E. Woodward. Application of queueing network models in the performance evaluation of database designs. Electr. Notes Theor. Comput. Sci., 232:101-124, 2009. Google ScholarGoogle Scholar
  18. J. R. Quinlan. Simplifying decision trees, 1986.Google ScholarGoogle Scholar
  19. R. Ramamurthy and D. J. DeWitt. Buffer-pool aware query optimization. In CIDR, pages 250-261, 2005.Google ScholarGoogle Scholar
  20. M. Reiser and S. S. Lavenberg. Mean-value analysis of closed multichain queuing networks. J. ACM, 27(2):313-322, 1980. Google ScholarGoogle Scholar
  21. Scilab Enterprises. Scilab: Free and Open Source software for numerical computation. Scilab Enterprises, Orsay, France, 2012.Google ScholarGoogle Scholar
  22. K. C. Sevcik. Data base system performance prediction using an analytical model (invited paper). In VLDB, pages 182-198, 1981. Google ScholarGoogle Scholar
  23. R. Suri, S. Sahu, and M. Vernon. Approximate mean value analysis for closed queueing networks with multiple-server stations. In Proc. of Industrial Engineering Research Conf. (IERC), 2007.Google ScholarGoogle Scholar
  24. N. Tomov, E. W. Dempster, M. H. Williams, A. Burger, H. Taylor, P. J. B. King, and P. Broughton. Analytical response time estimation in parallel relational database systems. Parallel Computing, 30(2):249-283, 2004. Google ScholarGoogle Scholar
  25. S. Tozer, T. Brecht, and A. Aboulnaga. Q-Cop: Avoiding bad query mixes to minimize client timeouts under heavy loads. In ICDE, 2010.Google ScholarGoogle Scholar
  26. T. J. Wasserman, P. Martin, D. B. Skillicorn, and H. Rizvi. Developing a characterization of business intelligence workloads for sizing new database systems. In DOLAP, pages 7-13, 2004. Google ScholarGoogle Scholar
  27. W. Wu, Y. Chi, H. Hacigümüs, and J. F. Naughton. Towards predicting query execution time for concurrent and dynamic database workloads. Technical Report TR-NECLA-DM-2013-13, NEC Laboratories America, 2013.Google ScholarGoogle Scholar
  28. W. Wu, Y. Chi, S. Zhu, J. Tatemura, H. Hacigümüs, and J. F. Naughton. Predicting query execution time: are optimizer cost models really unusable? In ICDE, pages 1081-1092, 2013. Google ScholarGoogle Scholar

Index Terms

  1. Towards predicting query execution time for concurrent and dynamic database workloads
      Index terms have been assigned to the content through auto-classification.

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image Proceedings of the VLDB Endowment
        Proceedings of the VLDB Endowment  Volume 6, Issue 10
        August 2013
        180 pages

        Publisher

        VLDB Endowment

        Publication History

        • Published: 1 August 2013
        Published in pvldb Volume 6, Issue 10

        Qualifiers

        • article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader