ABSTRACT
The read-mostly environment of data warehousing makes it possible to use more complex indexes to speed up queries than in situations where concurrent updates are present. The current paper presents a short review of current indexing technology, including row-set representation by Bitmaps, and then introduces two approaches we call Bit-Sliced indexing and Projection indexing. A Projection index materializes all values of a column in RID order, and a Bit-Sliced index essentially takes an orthogonal bit-by-bit view of the same data. While some of these concepts started with the MODEL 204 product, and both Bit-Sliced and Projection indexing are now fully realized in Sybase IQ, this is the first rigorous examination of such indexing capabilities in the literature. We compare algorithms that become feasible with these variant index types against algorithms using more conventional indexes. The analysis demonstrates important performance advantages for variant indexes in some types of SQL aggregation, predicate evaluation, and grouping. The paper concludes by introducing a new method whereby multi-dimensional group-by queries, reminiscent of OLAP/Datacube queries but with more flexibility, can be very efficiently performed.
- COMER79.Comer, D. The Ubiquitous B-tree. Comput. Surv. 11 (1979), pp. 121-137. Google ScholarDigital Library
- EDEL95.Herb Edelstein. Faster Data Warehouses. Information Week, Dec. 4, 1995, pp. 77-88. Give title and author on http://www.techweb.com/search/advsearch.html.Google Scholar
- FREN95.Clark D. French. "One Size Fits All" Database Architectures Do Not Work for DSS. Proceedings of the 1995 ACM SIGMOD Conference, pp. 449-450. Google ScholarDigital Library
- GBLP96.Jim Gray, Adam Bosworth, Andrew Layman, and Hamid Pirahesh. Data Cube: A Relational Operator Generalizing Group-By, Cross-Tab, and Sub-Totals. Proc. 12th Int. Conf. on Data Eng., pp. 152-159, 1996. Google ScholarDigital Library
- GP87.Jim Gray and Franco Putzolu. The Five Minute Rule for Trading Memory for Disk Accesses and The 10 Byte Rule for Trading Memory for CPU Time. Proc. 1987 ACM SIGMOD, pp. 395-398. Google ScholarDigital Library
- HRU96.Venky Harinarayan, Anand Rajaraman, and Jeffrey D. Ullman. Implementing Data Cubes Efficiently. Proc. 1996 ACM SIGMOD, pp. 205-216. Google ScholarDigital Library
- KIMB96.Ralph Kimball. The Data Warehouse Toolkit. John Wiley & Sons, 1996.Google Scholar
- M204.MODEL 204 File Manager's Guide, Version 2, Release 1.0, April 1989, Computer Corporation of America.Google Scholar
- O'NEI87.Patrick O'Neil. Model 204 Architecture and Performance. Springer-Verlag Lecture Notes in Computer Science 359, 2nd Int. Workshop on High Performance Transactions Systems (HPTS), Asilomar, CA, 1987, pp. 40-59. Google ScholarDigital Library
- O'NEI91.Patrick O'Neil. The Set Query Benchmark. The Benchmark Handbook for Database and Transaction Processing Systems, Jim Gray (Ed.), Morgan Kaufmann, 2nd Ed. 1993, pp. 359-395.Google Scholar
- O'NEI96.Patrick O'Neil. Database: Principles, Programming, and Performance. Morgan Kaufmann, 3rd printing, 1996. Google ScholarDigital Library
- O'NGG95.Patrick O'Neil and Goetz Graefe. Multi-Table Joins Through Bitmapped Join Indices. SIGMOD Record, September, 1995, pp. 8-11, Google ScholarDigital Library
- O'NQUA.Patrick O'Neil and Dallan Quass. Improved Query Performance with Variant Indexes. Extended paper, available on h ttp :/www. c s. umb. edu/--po nei I/v ari nde xx. psGoogle Scholar
- PH96.D. A. Patterson and J. L. Hennessy. Computer Architecture, A Quantitative Approach. Morgan Kaufmann, 1996. Google ScholarDigital Library
- STG95.Stanford Technology Group, Inc., An INFORMIX Co. Designing the Data Warehouse on Relational Databases. lnformix White Paper, 1995, http://www.informix.com.Google Scholar
- TPC.TPC Home Page. Descriptions and results of TPC benchmarks, including the TPC-C and TPC-D benchmarks. http://www.tpc.org.Google Scholar
Index Terms
- Improved query performance with variant indexes
Recommendations
Improved query performance with variant indexes
The read-mostly environment of data warehousing makes it possible to use more complex indexes to speed up queries than in situations where concurrent updates are present. The current paper presents a short review of current indexing technology, ...
Inverted indexes vs. bitmap indexes in decision support systems
CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge managementBitmap indexes are widely used in Decision Support Systems (DSSs) to improve query performance. In this paper, we evaluate the use of compressed inverted indexes with adapted query processing strategies from Information Retrieval as an alternative. In a ...
Document identifier reassignment and run-length-compressed inverted indexes for improved search performance
SIGIR '13: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrievalText search engines are a fundamental tool nowadays. Their efficiency relies on a popular and simple data structure: the inverted indexes. Currently, inverted indexes can be represented very efficiently using index compression schemes. Recent ...
Comments