ABSTRACT
In main-memory databases, the number of processor cache misses has a critical impact on the performance of the system. Cache-conscious indices are designed to improve performance by reducing the number of processor cache misses that are incurred during a search operation. Conventional wisdom suggests that the index's node size should be equal to the cache line size in order to minimize the number of cache misses and improve performance. As we show in this paper, this design choice ignores additional effects, such as the number of instructions executed and the number of TLB misses, which play a significant role in determining the overall performance. To capture the impact of node size on the performance of a cache-conscious B+ tree (CSB+-tree), we first develop an analytical model based on the fundamental components of the search process. This model is then validated with an actual implementation, demonstrating that the model is accurate. Both the analytical model and experiments confirm that using node sizes much larger than the cache line size can result in better search performance for the CSB+-tree.
- Adve, S. V., Adve, V. S., Hill, M. D., and Vernon, M. K. Comparison of Hardware and Software Cache Coherence Schemes. ACM SIGARCH Computer Architecture News 19, 3 (1991), 298--308. Google ScholarDigital Library
- Ailamaki, A., DeWitt, D. J., Hill, M. D., and Wood, D. A. DBMSs on a Modern Processor: Where Does Time Go? In Proceedings of the 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, UK (September 1999), pp. 266--277. Google ScholarDigital Library
- Bernstein, P. A., Brodie, M. L., Ceri, S., DeWitt, D. J., Franklin, M. J., Garcia-Molina, H., Gray, J., Held, G., Hellerstein, J. M., Jagadish, H. V., Lesk, M., Maier, D., Naughton, J. F., Pirahesh, H., Stonebraker, M., and Ullman, J. D. The Asilomar Report on Database Research. SIGMOD Record 27, 4 (1998), 74--80. Google ScholarDigital Library
- Bitton, D., DeWitt, D. J., and Turbyfill, C. Benchmarking database systems a systematic approach. In Proceedings of the 9th International Conference on Very Large Data Bases, Florence, Italy. (1983), pp. 8--19. Google ScholarDigital Library
- Bohannon, P., McIlroy, P., and Rastogi, R. Main-Memory Index Structures with Fixed-Size Partial Keys. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, Santa Barbara, CA, USA (May 2001), pp. 163--174. Google ScholarDigital Library
- Boncz, P. A., Manegold, S., and Kersten, M. L. Database Architecture Optimized for the New Bottleneck: Memory Access. In Proceedings of the 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, UK (September 1999), pp. 54--65. Google ScholarDigital Library
- Browne, S., Dongarra, J., Garner, N., Ho, G., and Mucci, P. A Portable Programming Interface for Performance Evaluation on Modern Processors. The International Journal of High Performance Computing Applications 14, 3 (2000), 189--204. Google ScholarDigital Library
- Cardenas, A. Analysis and Performance of Inverted Data Base Structures. Communications of the ACM 18, 5 (May 1975), 253--264. Google ScholarDigital Library
- Cha, S. K., Hwang, S., Kim, K., and Kwon, K. Cache-Conscious Concurrency Control of Main Memory Indexes on Shared-Memory Multiprocessor Systems. In Proceedings of 27th International Conference on Very Large Data Bases, Roma, Italy (2001). Google ScholarDigital Library
- Chen, S., Gibbons, P. B., and Mowry, T. C. Improving Index Performance through Prefetching. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, Santa Barbara, CA, USA (May 2001), pp. 235--246. Google ScholarDigital Library
- Chen, S., Gibbons, P. B., Mowry, T. C., and Valentin, G. Fractal Prefetching B+trees: Optimizing Both Cache and Disk Performance. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, Madison, WI, USA (May 2002), pp. 157--168. Google ScholarDigital Library
- Chilimbi, T., Hill, M. D., and Larus, J. R. Cache-Conscious Structure Layout. In Proceedings of the ACM SIGPLAN '99 Conference on Programming Language Design and Implementation, Atlanta, Georgia, USA (May 1999), pp. 1--12. Google ScholarDigital Library
- Chou, H.-T., and DeWitt, D. J. An Evaluation of Buffer Management Strategies for Relational Database Systems. In Proceedings of the 11th International Conference on Very Large Data Bases, Stockholm, Sweden (Aug. 1985), Morgan Kaufmann, pp. 127--141.Google Scholar
- DeWitt, D. J. The Wisconsin Benchmark: Past, Present, and Future. In The Benchmark Handbook for Database and Transaction Systems (1993), J. Gray, Ed., Morgan Kaufmann.Google Scholar
- Doallo, R., Fraguela, B., and Zapata, E. Direct Mapped Cache Performance Modeling for Sparse Matrix Operations. In 7th EUROMICRO Workshop on Parallel and Distributed Processing (February 1999), pp. 331--338.Google Scholar
- Ghosh, S., Martonosi, M., and Malik, S. Cache Miss Equations: An Analytical Representation of Cache Misses. In Proceedings of the 11th International Conference on Supercomputing (1997), ACM Press, pp. 317--324. Google ScholarDigital Library
- Graefe, G., Bunker, R., and Cooper, S. Hash Joins and Hash Teams in Microsoft SQL Server. In Proceedings of the 24th International Conference on Very Large Data Bases, New York City, New York, USA (August 1998), pp. 86--97. Google ScholarDigital Library
- Gray, J., and Graefe, G. The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb. SIGMOD Record 26, 4 (1997). Google ScholarDigital Library
- Hankins, R. A., and Patel, J. M. Effect of Node Size on the Performance of Cache-Conscious Indices. Extended Report, http://www.eecs.umich.edu/quickstep/publ/ccindices.pdf (2003).Google Scholar
- Hennessy, J., and Patterson, D. Computer Architecture: A Quantitative Approach. Morgan Kauffman, 1998. Google ScholarDigital Library
- Kim, K., Cha, S. K., and Kwon, K. Optimizing Multidimensional Index Trees for Main Memory Access. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, Santa Barbara, CA, USA (May 2001), pp. 139--150. Google ScholarDigital Library
- Lehman, T. J., and Carey, M. J. A Study of Index Structures for Main Memory Database Management Systems. In Proceedings of the 12th International Conference on Very Large Data Bases, Kyoto, Japan (August 1986), pp. 294--303. Google ScholarDigital Library
- Lo, J. L., Barroso, L. A., Eggers, S. J., Gharachorloo, K., Levy, H. M., and Parekh, S. S. An Analysis of Database Workload Performance on Simultaneous Multithreaded Processors. In Proceedings of the 25th Annual International Symposium on Computer Architecture, Barcelona, Spain (1998), pp. 39--50. Google ScholarDigital Library
- Lomet, D. B-tree Page Size When Caching is Considered. SIGMOD Record 27, 3 (1998). Google ScholarDigital Library
- Nyberg, C., Barclay, T., Cvetanovic, Z., Gray, J., and Lomet, D. B. AlphaSort: A Cache Sensitive Parallel External Sort. VLDB Journal 4, 4 (1995), 603--627. Google ScholarDigital Library
- Rao, J., and Ross, K. A. Cache Conscious Indexing for Decision-Support in Main Memory. In Proceedings of the 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, UK (September 1999), pp. 78--89. Google ScholarDigital Library
- Rao, J., and Ross, K. A. Making B+-Trees Cache Conscious in Main Memory. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, Texas, USA (May 2000), pp. 475--486. Google ScholarDigital Library
- Shatdal, A., Kant, C., and Naughton, J. F. Cache Conscious Algorithms for Relational Query Processing. In Proceedings of the 20th International Conference on Very Large Data Bases, Santiago de Chile, Chile (September 1994), pp. 510--521. Google ScholarDigital Library
- Vernon, M., and Holliday, M. A. Performance Analysis of Multiprocessor Cache Consistency Protocols Using Generalized Timed Petri Nets. Performance Eval. Rev. (USA), Spec. Issue 14, 1 (May 1986), 9--17. Google ScholarDigital Library
- Vernon, M. K., Jog, R., and Sohi, G. S. Performance Analysis of Hierarchical Cache Consistent Multiprocessors. Performance Evaluation 9, 4 (1989), 287--302. Google ScholarDigital Library
- Yao, A. On Random 2-3 Trees. Acta Informatica 9 (1978), 159--170.Google ScholarDigital Library
- Yao, S. Approximating Block Accesses in Database Organization. Communications of the ACM 20, 4 (Apr. 1977), 260--261. Google ScholarDigital Library
Index Terms
- Effect of node size on the performance of cache-conscious B+-trees
Recommendations
Effect of node size on the performance of cache-conscious B+-trees
In main-memory databases, the number of processor cache misses has a critical impact on the performance of the system. Cache-conscious indices are designed to improve performance by reducing the number of processor cache misses that are incurred during ...
Node Compression Techniques Based on Cache-Sensitive B+-Tree
ICIS '10: Proceedings of the 2010 IEEE/ACIS 9th International Conference on Computer and Information ScienceCache-conscious indices have been researched for a decade. These index structures can enhance data locality and reduce cache misses. Cache-Sensitive B+-tree (CSB+-tree) is a state-of-the-art, high performance index tree for main-memory database systems. ...
Heterogeneous way-size cache
ICS '06: Proceedings of the 20th annual international conference on SupercomputingSet-associative cache architectures are commonly used. These caches consist of a number of ways, each of the same size. We have observed that the different ways have very different utilization, which motivates the design of caches with heterogeneous way ...
Comments