ABSTRACT
Query performance is a critical factor in modern business intelligence and data warehouse systems. An increasing number of companies use detailed analyses to conduct daily business and support management decisions. Thus, several techniques have been developed to achieve near real-time response times. These techniques try to alleviate I/O bottlenecks while increasing the throughput of the available processing units, i.e., by keeping relevant data in compressed main-memory data structures and exploiting the read-only characteristics of analytical workloads.
However, update processing and skew in the data distribution cause these densely packed and highly compressed data structures to degenerate, which negatively affects memory efficiency and query performance. Reorganization tasks can repair these data structures, but since they are usually costly operations, they require a well-considered decision about which of several possible strategies should be executed and when, in order to reduce system downtime.
In this paper, we address these problems by presenting an approach for online reorganization in main-memory database systems (MMDBS). Based on a discussion of the necessary reorganization strategies in IBM Smart Analytics Optimizer, a read-optimized parallel MMDBS, we introduce a framework for executing arbitrary reorganization tasks online, i.e., in the background of normal user workloads without disrupting query results or performance.
Index Terms
- Online reorganization in read optimized MMDBS