skip to main content
10.1145/3035918.3064016acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Public Access

A Top-Down Approach to Achieving Performance Predictability in Database Systems

Published:09 May 2017Publication History

ABSTRACT

While much of the research on transaction processing has focused on improving overall performance in terms of throughput and mean latency, surprisingly less attention has been given to performance predictability: how often individual transactions exhibit execution latency far from the mean. Performance predictability is increasingly important when transactions lie on the critical path of latency-sensitive applications, enterprise software, or interactive web services.

In this paper, we focus on understanding and mitigating the sources of performance unpredictability in today's transactional databases. We conduct the first quantitative study of major sources of variance in MySQL, Postgres (two of the largest and most popular open-source products on the market), and VoltDB (a non-conventional database). We carry out our study with a tool called TProfiler that, given the source code of a database system and programmer annotations indicating the start and end of a transaction, is able to identify the dominant sources of variance in transaction latency. Based on our findings, we investigate alternative algorithms, implementations, and tuning strategies to reduce latency variance without compromising mean latency or throughput. Most notably, we propose a new lock scheduling algorithm, called Variance-Aware Transaction Scheduling (VATS), and a lazy buffer pool replacement policy. In particular, our modified MySQL exhibits significantly lower variance and 99th percentile latencies by up to 5.6× and 6.3×, respectively. Our proposal has been welcomed by the open-source community, and our VATS algorithm has already been adopted as of MySQL's 5.7.17 release (and been made the default scheduling policy in MariaDB).

References

  1. 15.5.2 InnoDB Transaction Model. http://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-model.html.Google ScholarGoogle Scholar
  2. Google Cloud SQL. http://code.google.com/apis/sql.Google ScholarGoogle Scholar
  3. MariaDB source repository. https://github.com/MariaDB/server/pull/248.Google ScholarGoogle Scholar
  4. Oracle database cloud service. http://cloud.oracle.com.Google ScholarGoogle Scholar
  5. Program2 Two-Phase Locking (2PL) vs. Timestamp Ordering (TSO) vs. A Real Protocol. http://cobweb.cs.uga.edu/shasha/course/csci8370/prog2/prog2.pdf.Google ScholarGoogle Scholar
  6. XtraDB Performance Improvements for I/O-Bound Highly-Concurrent Workloads. https://www.percona.com/doc/percona-server/5.7/performance/xtradb performance improvements for io-bound highly-concurrent workloads.html.Google ScholarGoogle Scholar
  7. SQL Server Locking and You! https://www.brentozar.com/archive/2011/06/sql-server-locking/, 2011.Google ScholarGoogle Scholar
  8. MT LRU flusher. https://blueprints.launchpad.net/percona-server/+spec/mt-lru, 2016.Google ScholarGoogle Scholar
  9. MySQL Transaction Lock Manager Source Code. https://github.com/mysql/mysql-server/blob/5.7/storage/innobase/lock/lock0lock.cc, 2016.Google ScholarGoogle Scholar
  10. Parallel doublewrite buffer. https://blueprints.launchpad.net/percona-server/+spec/parallel-doublewrite, 2016.Google ScholarGoogle Scholar
  11. Percona Server 5.7: multi-threaded LRU flushing. https://www.percona.com/blog/2016/05/05/percona-server-5--7-multi-threaded-lru-flushing/, 2016.Google ScholarGoogle Scholar
  12. Postgre Transaction Lock Manager Source Code. https://github.com/postgres/postgres/blob/master/src/backend/storage/lmgr/proc.c, 2016.Google ScholarGoogle Scholar
  13. R. K. Abbott et al. Scheduling real-time transactions: A performance evaluation. TODS, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. R. Agrawal. A parallel logging algorithm for multiprocessor database machines. In Database Machines. 1985.Google ScholarGoogle ScholarCross RefCross Ref
  15. R. Agrawal and D. J. DeWitt. Recovery architectures for multiprocessor database machines. ACM, 1985.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. M. Armbrust, K. Curtis, T. Kraska, A. Fox, M. J. Franklin, and D. A. Patterson. Piql: Success-tolerant query processing in the cloud. PVLDB, 5, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. M. Armbrust, E. Liang, T. Kraska, A. Fox, M. J. Franklin, and D. A. Patterson. Generalized scale independence through incremental precomputation. In SIGMOD, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. J. Arulraj, A. Pavlo, and S. R. Dulloor. Let's talk about storage; recovery methods for non-volatile memory database systems. In SIGMOD, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. Arumugam, A. Dobra, C. M. Jermaine, N. Pansare, and L. Perez. The datapath system: a data-centric analytic processing engine for large data warehouses. In SIGMOD, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Attariyan et al. X-ray: Automating root-cause diagnosis of performance anomalies in production software. In OSDI, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. P. D. Bailis. Coordination Avoidance in Distributed Databases. PhD thesis, UC Berkeley, 2015.Google ScholarGoogle Scholar
  22. Bector et al. V-shape property of optimal sequence of jobs about a common due date on a single machine. Computers & operations research, 1989.Google ScholarGoogle Scholar
  23. A. R. Bernat and B. P. Miller. Incremental call-path profiling. Concurrency and Computation: Practice and Experience, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  24. Bhatia et al. Lightweight, high-resolution monitoring for troubleshooting production systems. In OSDI, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. G. Candea, N. Polyzotis, and R. Vingralek. A scalable, predictable join operator for highly concurrent data warehouses. PVLDB, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. S. Chaudhuri, H. Lee, and V. R. Narasayya. Variance aware optimization of parameterized queries. In SIGMOD, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Chen et al. Sequencing heuristic for bicriteria scheduling in a single machine problem. Journal of Information and Optimization Sciences, 2006.Google ScholarGoogle Scholar
  28. S. Chen. Flashlogging: exploiting flash devices for synchronous logging performance. In SIGMOD, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. F. Chu et al. Least expected cost query optimization: An exercise in utility. In PODS, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. B. F. Cooper et al. Benchmarking cloud serving systems with ycsb. In SoCC, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. P. De et al. On the minimization of completion time variance with a bicriteria extension. Operations Research, 1992.Google ScholarGoogle Scholar
  32. D. E. Difallah, A. Pavlo, C. Curino, and P. Cudre-Mauroux. Oltp-bench: An extensible testbed for bench-marking relational databases. PVLDB, 7, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. S. Eilon and I. Chowdhury. Minimising waiting time variance in the single machine problem. Management Science, 1977. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. D. Florescu and D. Kossmann. Rethinking cost and performance of database systems. Sigmod Record, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. G. Giannikis et al. Shareddb: killing one thousand queries with one stone. PVLDB, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. S. L. Graham et al. Gprof: A call graph execution profiler. In Sigplan Notices, 1982. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. B. Gregg. DTrace pid Provider return. http://tinyurl.com/jzpphne, 2011.Google ScholarGoogle Scholar
  38. R. J. Hall. Call path profiling. In ICSE, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. S. Harizopoulos, D. J. Abadi, S. Madden, and M. Stone-braker. OLTP through the looking glass, and what we found there. In SIGMOD, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. S. Harizopoulos, V. Shkapenyuk, and A. Ailamaki. Qpipe: a simultaneously pipelined relational query engine. In SIGMOD, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. P. Helland, H. Sammer, J. Lyon, R. Carr, P. Garrett, and A. Reuter. Group commit timers and high volume transaction systems. In HPTS. 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. J. Huang, B. Mozafari, and T. Wenisch. Statistical analysis of latency through semantic profiling. In EuroSys, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Y.-K. Kim and S. H. Son. Supporting predictability in real-time database systems. In RTAS, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. W. Kubiak. Completion time variance minimization on a single machine is difficult. Operations Research Letters, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. W. Kubiak. New results on the completion time variance minimization. Discrete Applied Mathematics, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. W. Kubiak et al. Fast fully polynomial approximation schemes for minimizing completion time variance. Eur. Journal of Operational Research, 2002.Google ScholarGoogle Scholar
  47. T. Lahiri, M.-A. Neimat, and S. Folkman. Oracle timesten: An in-memory database for enterprise applications. IEEE Data Eng. Bull., 2013.Google ScholarGoogle Scholar
  48. P. Massa and P. Avesani. An experimental study on epinions.com community. In NCAI, 2005.Google ScholarGoogle Scholar
  49. B. Mozafari, C. Curino, A. Jindal, and S. Madden. Performance and resource modeling in highly-concurrent OLTP workloads. In SIGMOD, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. B. Mozafari, C. Curino, and S. Madden. DBSeer: Resource and performance prediction for building a next generation database cloud. In CIDR, 2013.Google ScholarGoogle Scholar
  51. B. Mozafari, E. Z. Y. Goh, and D. Y. Yoon. CliffGuard: A principled framework for finding robust database designs. In SIGMOD, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. B. Mozafari, E. Z. Y. Goh, and D. Y. Yoon. Cliffguard: An extended report. Technical report, University of Michigan, Ann Arbor, 2015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. V. Narasayya, I. Menache, M. Singh, F. Li, M. Syamala, and S. Chaudhuri. Sharing buffer pool memory in multitenant relational database-as-a-service. PVLDB, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. V. R. Narasayya, S. Das, M. Syamala, B. Chandramouli, and S. Chaudhuri. SQLVM: performance isolation in multi-tenant relational database-as-a-service. In CIDR, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. P. O'Neil et al. A two-phase approach to predictably scheduling real-time transactions., 1996.Google ScholarGoogle Scholar
  56. S. Pelley et al. Storage management in the nvram era. PVLDB, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. M. Pinedo. Scheduling: theory, algorithms, and systems. Springer Science, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. L. Qiao, V. Raman, F. Reiss, P. Haas, and G. Lohman. Main-memory scan sharing for multi-core cpus. PVLDB, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. V. Raman, G. Swart, L. Qiao, F. Reiss, V. Dialani, D. Kossmann, I. Narang, and R. Sidle. Constant-time query processing. In ICDE, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. M. Sadoghi et al. Making updates disk-i/o friendly using ssds. PVLDB, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. J. M. Spivey. Fast, accurate call graph profiling.Software: Practice and Experience, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. M. Stonebraker and A. Pavlo. The seats airline ticketingsystems benchmark.Google ScholarGoogle Scholar
  63. R. Strom and S. Yemini. Optimistic recovery in distribute systems. TODS, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. Z. Szebenyi et al. Space-efficient time-series call-path profiling of parallel applications. In SC09, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  65. P. Unterbrunner et al. Predictable performance for unpredictable workloads. PVLDB, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. M. Wainwright. Chapter 2: Basic tail and concentration bounds. http://www.stat.berkeley.edu/~mjwain/stat210b/Chap2TailBounds Jan22 2015.pdf.Google ScholarGoogle Scholar
  67. T. Wang and R. Johnson. Scalable logging through emerging non-volatile memory. PVLDB, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. A. Wolski. Tatp benchmark description, 2009.Google ScholarGoogle Scholar
  69. R. J. Yang et al. PTL: Partitioned logging for database storage on flash solid state drives. In WAIM. 2012.Google ScholarGoogle ScholarCross RefCross Ref
  70. N. Ye, X. Li, T. Farley, and X. Xu. Job scheduling methods for reducing waiting time variance. Computers & Operations Research, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  71. D. Y. Yoon, B. Mozafari, and D. P. Brown. DBSeer: Pain-free database administration through workload intelligence. PVLDB, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  72. D. Y. Yoon, N. Niu, and B. Mozafari. DBSherlock: A performance diagnostic tool for transactional databases. In SIGMOD, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  73. X. Yu, G. Bezerra, A. Pavlo, S. Devadas, and M. Stone-braker. Staring into the abyss: An evaluation of concurrency control with one thousand cores. PVLDB, 2014 Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. A Top-Down Approach to Achieving Performance Predictability in Database Systems

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
      May 2017
      1810 pages
      ISBN:9781450341974
      DOI:10.1145/3035918

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 May 2017

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate785of4,003submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader