skip to main content
article
Free Access

R* optimizer validation and performance evaluation for local queries

Authors Info & Claims
Published:15 June 1986Publication History
Skip Abstract Section

Abstract

Few database query optimizer models have been validated against actual performance. This paper presents the methodology and results of a thorough validation of the optimizer and evaluation of the performance of the experimental distributed relational database management system R*, which inherited and extended to a distributed environment the optimization algorithms of System R. Optimizer estimated costs and actual R* resources consumed were written to database tables using new SQL commands, permitting automated control from SQL application programs of test data collection and reduction. A number of tests were run over a wide variety of dynamically-created test databases, SQL queries, and system parameters. The results for single-table access, sorting, and local 2-table joins are reported here. The tests confirmed the accuracy of the majority of the I/O cost model, the significant contribution of CPU cost to total cost, and the need to model CPU cost in more detail than was done in System R. The R* optimizer now retains cost components separately and estimates the number of CPU instructions, including those for applying different kinds of predicates. The sensitivity of I/O cost to buffer space motivated the development of more detailed models of buffer utilization unclustered index scans and nested-loop joins often benefit from pages remaining in the buffers, whereas concurrent scans of the data pages and the index pages for multiple tables during joins compete for buffer share. Without an index on the join column of the inner table, the optimizer correctly avoids the nested-loop join, confirming the need for merge-scan joins. When the join column of the inner is indexed, the optimizer overestimates the cost of the nested-loop join, whose actual performance is very sensitive to three parameters that are extremely difficult to estimate (1) the join (result) cardinality, (2) the outer table's cardinality, and (3) the number of buffer pages available to store the inner table. Suggestions are given for improved database statistics, prefetch and page replacement strategies for the buffer manager, and the use of temporary indexes and Bloom filters (hashed semijoins) to reduce access of unneeded data.

References

  1. APER 83 P M G Apers, A R Hevner, and S B Yao, Optnntzmg Algorithms for Dtstnbuted Queries, IEEE Trans on Software Engmeermg SE-9 (January 1983) pp 57-68]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. ASTR 80 M M Astrahan, M Schkolmck, and W Kun, Performance of the System R Access Path Selecuon Mechamsm, Informatton Processing 80 (1980) pp 487-491]]Google ScholarGoogle Scholar
  3. BERN 81 P A Bernstem, N Goodman, E Wong, C L Reeve, J Rothme, Query Processing m a System for D#stnbuted Databases (SDD-1), ACM Trans on Database Systems6,4 (December 1981) pp 602-625]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. BLAS 77 M W Blasgen and K P Eswaren, Storage and Access m RelaUonal Data Bases, IBM Systems Journal 16,4 (1977) Also available as "On the EvaluaUon of Queries m a RelaUonal Data Base System", IBM Research Report RJ1745, San Jose, CA, April 1976]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. BLOO 70 B H Bloom, Space/Trine Trade-offs m Hash Codmg wRh Allowable Errors, Commumcatlons of the A CM 13,7 (July 1970) pp 422-426]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. BRAT 84 K Bratbergsengen, Hashmg Methods and RelaUonal Algebra Operations, Procs of the Tenth lnternattonal Conf on Very Large Data Bases (Smgapore, 1984) pp 323-333]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. CARE 85 M J Carey and H Lu, Some Expertmental Results on Dtstnbuted Jom Algorithms m a Local Network, Procs of the Elewnth International Conf on Very Large Data Bases (Stockholm, Sweden, August 1985)]]Google ScholarGoogle Scholar
  8. CHAM 81 D D Chamberlm, M M Astrahan, W F Kmg, R A Lone, J W Mehl, T G Price, M Sehkolmck, P Gnfflths Selmger, D R Slutz, B W Wade, and R A Yost, Support for RepeUttve TransacUons and Ad Hoc Queries m System R, A CM Trans on Database Systems 6,1 (March 1981) pp 70-94]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. CHAN 82 J-M Chang, A Heurtstlc Approach to Distributed Query Processmg, Procs of the Etghth lnternattonal Conf on Very Large Data Bases (Mexico City, VLDB Endowment, Saratoga, pp 54-61]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. CHOU 85 H-T Chou and D J DeWltt, An Evaluatmn of Buffer Management Strategtes for Relauonal Database Systems, Procs of the Eleventh International Conf on Very Large Data Bases (Stockholm, Sweden, September 1985) pp 127-141]]Google ScholarGoogle Scholar
  11. CHU 82 W W Chu and P Hurley, Optunal Query Processing for D#stnbuted Database Systems, IEEE Trans on Computers C-31 (September 1982) pp 835-850]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. DANI 82 D Darnels, P G Selmger, L M Haas, B G Lmdsay, C Mohan, A Walker, and P Wdms, An Introducuon to D#stnbuted Query Compdataon m R*, Procs Second International Conf on D#trtbuted Databases (Berlm, September 1982) Also avatlable as IBM Research Report RJ3497, San Jose, CA, June 1982]]Google ScholarGoogle Scholar
  13. DEWI 84 D J DeWltt, R H Katz, F Olken, L D Shapiro, M R Stonebraker, and D Wood, Implementation Techmques for Mare Memory Database Systems, Proc SIGMOD 84 (Boston, MA, May 1984) pp 1-8]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. DEWI 85 D J DeWltt and R Gerber, Multi-Processor Hash- Based Join Algortthms, Procs of the Eleventh Internatzonal Conf on Very Large Data Bases (Stockholm, Sweden, September 1985) pp 151-164]]Google ScholarGoogle Scholar
  15. EFFE 84 Effelsberg, W and Haerder, T, Principles of Database Buffer Management, A CM Trans Database Systems 9 (December 1984) pp 560-595]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. EPST 78 R Epstein, M Stonebraker, and E Wong, Distributed Query Processing m a Relational Data Base System, Procs ofACM-SIGMOD (Austm,TX, May 1978) pp 169-180]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. EPST 80 R Epstein and M Stonebraker, Analysis of Distributed Data Base Processing Strategies, Procs of the Szxth lnternatwnal Conf on Very Large Data Bases (Montreal,IEEE, October 1980) pp 92-101]]Google ScholarGoogle Scholar
  18. FINK 82 S Fmkelstem, M Schkolmck, and P Tlbeno, DBDSGN -- a Physical Database Design Tool for System R, IEEE Database Engineering 5,1 (1982) pp 228-235]]Google ScholarGoogle Scholar
  19. GDDM 84 Graphtcal Data D#splay Manager Presentatwn Graphzcs Feature Interactive Chart Utzhty User's GuMe, Release 4, IBM Reference Manual SC33-0111-3 (IBM Corp, October 1984)]]Google ScholarGoogle Scholar
  20. HEVN 79 A R Hevner and S B Yao, Query Processing m Dlstnbuted Database Systems, IEEE Trans Software Engineering SE-5 (May 1979) pp 177-187]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. KERS 82 L Kerschberg, P D Tmg, and S B Yao, Query Optmuzauon m Star Computer Networks, A CM Trans on Database Systems 7,4 (December 1982) pp 678-711]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. LOHM 84 G M Lohman, D Darnels, L M Haas, R gastler, P G Sehnger, OptlmtzaUon of Nested Quenes m a Distributed Relational Database, Procs of the Tenth lnternatwnal Conf on Very Large Data Bases (Singapore, 1984) pp 403-415 Also avadable as IBM Research Report RJ4260, San Jose, CA, Aprd 1984]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. LOHM 85 G M Lohman, C Mohan, L M Haas, B G Lmdsay, P G Sehnger, P F Wllms, and D Darnels, Query Processing m R*, Query Processing m Database Systems, Sprmger-Verlag (Klm, Batory, & Remer (eds), 1985) pp 31-47 Also available as IBM Research Report R.14272, San Jose, CA, April 1984]]Google ScholarGoogle Scholar
  24. MACK 85 L F Mackert and G M Lohman, Index Scans using a Flmte LRU Buffer A Validated I/O Model, IBM Research Report RJ4836 (Almaden Research Center, San Jose, CA,]]Google ScholarGoogle Scholar
  25. MACK 86 L F Mackert and G M Lohman, R* Optmmmzer Validation and Performance Evaluation for Distributed Quenes, IBM Research Report RJ5050 (Almaden Research Center, San jose, CA,]]Google ScholarGoogle Scholar
  26. ONUE 83 E Onuegbe, S Rahum, and A R Hevner, Local Query Translation and Optmuzation m a Dmtnbuted System, Procs NCC 1983 (July 1983) pp 229-239]]Google ScholarGoogle Scholar
  27. PALE 74 F P Palermo, A Data Base Search Problem, Informatwn Systems COINS IV (Procs of COINS-72 Symp) (Mmnu, FL, 1974) pp 67-101 Plenum Press, JuhusT Tou, ed]]Google ScholarGoogle Scholar
  28. RDT 84 RDT Relational Design Tool, IBM Reference Manual SH20-6415 (IBM Corp, June 1984)]]Google ScholarGoogle Scholar
  29. SACC 82 G M Sacco and M Schkolmck, A Mechamsm for Managing the Buffer Pool m a Relational Database System using the Hot Set Model, Procs of the Eighth International Conf on Very Large Data Bases (Mexico City, 1982) pp 257-262]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. SCHK 79 M Schkolmck and P Ttbeno, Considerations m Developing a Design Tool for a Relational DBMS, Proc IEEE COMPSAC Conf (Cbacago, November 1979) pp 228-235]]Google ScholarGoogle Scholar
  31. SCHK 85 M Schkolmck and P Tlbeno, Estimating the Cost of Updates m a Relational Database, A CM Tram on Database Systems 10,2 (June 1985) pp 163-179 Also available as IBM Research Report RJ3327, San Jose, CA, 1981]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. SELI 79 P G Selmger, M M Astrahan, D D Chamberhn, R A Lone, and T G Price, Access Path Selection m a Relational Database Management System, Procs of ACM-SIGMOD 20 (1979) pp 23-34]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. SELI 80 P G Sehnger and M Adiba, Access Path Selection m Dtstnbuted Database Management Systems, Procs lnternatwnal Conf on Data Bases (Deen and Hammersly, ed, Umv of Abet pp 204-215]]Google ScholarGoogle Scholar
  34. SEVE 76 D G Severance and G M Lohman, Differential Fdes Their Apphcation to the Maintenance of Large Databases, ACM Trans on Database Systems 1,3 (September 1976) pp 256-267]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. SQL 84 SQL/Data System Application Programming for VM/ System Product, Release 3, IBM Reference Manual SH24-5068-0 (IBM Corp, December 1984)]]Google ScholarGoogle Scholar
  36. STON 80 M Stonebraker, Retrospection on a Data Base System, ACM Trans on Database Systems 5,2 (June 1980) pp225-240]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. STON 81 M Stonebraker, Operating System Support for Database Management, Commumcatwns of the A CM 24 (July 1981) pp 412-418]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. STON 82 M Stonebraker, J V#oodfdl, J Ranstrom, M Murphy, J Kalash, M Carey, K Arnold, Performance Analysts of Distributed Data Base Systems, Database Engineermg 5 (IEEE Computer Society, December 1982) pp 58-65]]Google ScholarGoogle Scholar
  39. VTAM 85 Network Program Products Planning (MVS, VSE, and VM), IBM Reference Manual SC23-0110-1 (IBM Corp, April 1985)]]Google ScholarGoogle Scholar
  40. WONG 76 E Wong and K Youssefl, Decomposition E a Strategy for Query Processing, A CM Tram on Database Systems 1,3 (September 1976) pp 223-241]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. YAO 79 S B Yao, Optmnzation of Query Algorithms, ACM Trans on Database Systems 4,2 (June 1979) pp 133-155]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. YU 83 C T Yu, and C C Chang, On the Design of a Query Processing Strategy m a D#stnbuted Database Enwronment, Proc SIGMOD 83 (San Jose, CA, May 1983) pp 30-39]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. R* optimizer validation and performance evaluation for local queries

                  Recommendations

                  Comments

                  Login options

                  Check if you have access through your login credentials or your institution to get full access on this article.

                  Sign in

                  Full Access

                  • Published in

                    cover image ACM SIGMOD Record
                    ACM SIGMOD Record  Volume 15, Issue 2
                    June 1986
                    407 pages
                    ISSN:0163-5808
                    DOI:10.1145/16856
                    Issue’s Table of Contents
                    • cover image ACM Conferences
                      SIGMOD '86: Proceedings of the 1986 ACM SIGMOD international conference on Management of data
                      June 1986
                      407 pages
                      ISBN:0897911911
                      DOI:10.1145/16894

                    Copyright © 1986 ACM

                    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                    Publisher

                    Association for Computing Machinery

                    New York, NY, United States

                    Publication History

                    • Published: 15 June 1986

                    Check for updates

                    Qualifiers

                    • article

                  PDF Format

                  View or Download as a PDF file.

                  PDF

                  eReader

                  View online with eReader.

                  eReader