SIMD-scan: ultra fast in-memory table scan using on-chip vector processing units

Published: 01 August 2009

Abstract

The availability of huge system memory, even on standard servers, has generated a lot of interest in main-memory database engines. In data warehouse systems, highly compressed column-oriented data structures are quite prominent. In order to scale with the data volume and the system load, many of these systems are highly distributed with a shared-nothing approach. The fundamental operation in all of these systems is a full table scan over one or multiple compressed columns. Recent research has proposed different techniques to speed up table scans, such as intelligent compression or the use of additional hardware such as graphics cards or FPGAs. In this paper, we show that utilizing the embedded Vector Processing Units (VPUs) found in standard superscalar processors can speed up main-memory full table scans by factors. This is achieved without changing the hardware architecture and thereby without additional power consumption. Moreover, as on-chip VPUs directly access the system's RAM, no additional costly copy operations are needed to use the new SIMD-scan approach in standard main-memory database engines. Therefore, we propose this scan approach as the standard scan operator for compressed column-oriented main-memory storage. We then discuss how well our solution scales with the number of processor cores and, consequently, to what degree it can be applied in multi-threaded environments. To verify the feasibility of our approach, we implemented the proposed techniques on a modern Intel multi-core processor using Intel® Streaming SIMD Extensions (Intel® SSE). In addition, we integrated the new SIMD-scan approach into SAP® NetWeaver® Business Warehouse Accelerator. We conclude by describing the performance benefits of using our approach for processing and scanning compressed data using VPUs in column-oriented main-memory database systems.
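
To make the idea of a vectorized scan concrete, the following is a minimal sketch in C with SSE2 intrinsics. It is not the algorithm from the paper, which operates on compressed column data; as a simplifying assumption, the column here is an array of uncompressed 32-bit value codes. The function name `scan_equal` and its signature are hypothetical. The sketch compares four column values against a search key per instruction and records the row positions of the matches, falling back to a plain scalar loop for the tail (and for the whole column on targets without SSE2).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/*
 * Hypothetical helper: scan `column` (n uncompressed 32-bit codes, an
 * assumption made for brevity) and write the row index of every value
 * equal to `key` into `hits`, which must have room for n entries.
 * Returns the number of matching rows.
 */
size_t scan_equal(const uint32_t *column, size_t n, uint32_t key, size_t *hits)
{
    size_t count = 0;
    size_t i = 0;

    assert(column != NULL && hits != NULL);

#if defined(__SSE2__)
    /* Broadcast the search key into all four 32-bit lanes. */
    const __m128i keys = _mm_set1_epi32((int)key);

    for (; i + 4 <= n; i += 4) {
        /* Unaligned load of four consecutive column values. */
        __m128i vals = _mm_loadu_si128((const __m128i *)(column + i));
        /* Lane-wise equality: all-ones in each matching lane. */
        __m128i eq = _mm_cmpeq_epi32(vals, keys);
        /* Collapse the four lane results into a 4-bit mask (bit 0 = row i). */
        int mask = _mm_movemask_ps(_mm_castsi128_ps(eq));

        for (int lane = 0; mask != 0; lane++, mask >>= 1) {
            if (mask & 1) {
                hits[count++] = i + (size_t)lane;
            }
        }
    }
#endif

    /* Scalar loop for the remaining rows (or all rows without SSE2). */
    for (; i < n; i++) {
        if (column[i] == key) {
            hits[count++] = i;
        }
    }
    return count;
}
```

Because the vector unit loads directly from the column's memory, the sketch needs no copy of the data into a separate device buffer, which is the property the abstract contrasts with graphics-card or FPGA approaches. A caller would pass the column array, its length, the key, and an output buffer of the same length, then read the first `count` entries of `hits`.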

