skip to main content
article

Efficient sorting using registers and caches

Authors Info & Claims
Published:31 December 2002Publication History
Skip Abstract Section

Abstract

Modern computer systems have increasingly complex memory systems. Common machine models for algorithm analysis do not reflect many of the features of these systems, e.g., large register sets, lockup-free caches, cache hierarchies, associativity, cache line fetching, and streaming behavior. Inadequate models lead to poor algorithmic choices and an incomplete understanding of algorithm behavior on real machines.A key step toward developing better models is to quantify the performance effects of features not reflected in the models. This paper explores the effect of memory system features on sorting performance. We introduce a new cache-conscious sorting algorithm, R-MERGE, which achieves better performance in practice over algorithms that are superior in the theoretical models. R-MERGE is designed to minimize memory stall cycles rather than cache misses by considering features common to many system designs.

Skip Supplemental Material Section

Supplemental Material

References

  1. AGGARWAL, A., ALPERN, B., CHANDRA, A. K., AND SNIR, M. 1987. A model for hierarchical memory. Volume 19 (New York, 1987), pp. 305-314.]]Google ScholarGoogle Scholar
  2. AGGARWAL, A., CHANDRA, A., AND SNIR, M. 1987. Hierarchical memory with block transfer. Volume 28 (Los Angeles, 1987), pp. 204-216.]]Google ScholarGoogle Scholar
  3. AGGARWAL, A. AND VITTER, J.S. 1988. The Input/Output complexity of sorting and related problems. Commun. ACM 31, 9, 1116-1127.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. ALPERN, B., CARTER, L., FEIG, E., AND SELKER, T. 1994. The uniform memory hierarchy model of computation. 12, 2-3, 72-109.]]Google ScholarGoogle Scholar
  5. ANDERSON, J. M., BERC, L. M., DEAN, J., GHEMAWAT, S., HENZINGER, M. R., LEUNG, S. A., SITES, R. L., VANDERVOORDE, M. T., WALDSPURGER, C. A., AND WEIHL, W. E. 1997. Continuous profiling: Where have all the cycles gone? In Proceedings of the 16th Symposium on Operating Systems Principles (SOSP-97), Volume 31,5 of Operating Systems Review (New York, Oct. 5-8 1997), pp. 1-14. ACM Press.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. ARGE, L., CHASE, J., VITTER, J. S., AND WICKREMESINGHE, R. 2000. Efficient sorting using registers and caches. In WAE, Workshop on Algorithm Engineering, Lecture Notes in Computer Science (Sept. 2000). Springer. Journal version to appear in the ACM Journal of Experimental Algorithmics.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. CALLAHAN, D., CARR, S., AND KENNEDY, K. 1990. Improving register allocation for subscripted variables. In Proceedings of the SIGPLAN '90 Conference on Programming Language Design and Implementation (White Plains, NY, June 1990).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. EDMONDSON, J. H., RUBINFELD, P. I., BANNON, P. J., BENSCHNEIDER, B. J., BERNSTEIN, D., CASTELINO, R. W., COOPER, E. M., DEVER, D. E., DONCHIN, D. R., FISCHER, T. C., JAIN, A. K., MEHTA, S., MEYER, J. E., PRESTON, R. P., RAJAGOPALAN, V., SOMANATHAN, C., TAYLOR, S. A., AND WOLRICH, G.M. 1995. Internal organization of the Alpha 21164, a 300-MHz 64-bit quad-issue CMOS RISC microprocessor. Digital Technical Journal 7, 1, 119-135.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. FRIGO, LEISERSON, PROKOP, AND RAMACHANDRAN. 1999. Cache-oblivious algorithms. In FOCS: IEEE Symposium on Foundations of Computer Science (FOCS) (1999).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. HENNESSY, J. L. AND PATTERSON, D. A. 1995. Computer Architecture: A Quantitative Approach (second ed.). Morgan Kaufmann.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. KNUTH, D. E. 1998. Sorting and Searching (2nd ed.), Volume 3 of The Art of Computer Programming. Addison-Wesley, Reading, MA.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. LADNER, R., FIX, J., AND LAMARCA, A. 1999. Cache performance analysis of traversals and random accesses. In Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms (1999).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. LAM, M. S., ROTHBERG, E. E., AND WOLF, M. E. 1991. The cache performance and optimizations of blocked algorithms. In Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (April 1991).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. LAMARCA, A. AND LADNER, R. E. 1997. The influence of caches on the performance of sorting. Volume 7 (January 1997), pp. 370-379.]]Google ScholarGoogle Scholar
  15. RAHMAN, N. AND RAMAN, R. 1999. Analysing cache effects in distribution sorting. In 3rd Workshop on Algorithm Engineering (July 1999).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. RAHMAN, N. AND RAMAN, R. 2000. Adapting radix sort to the memory hierarchy. In ALENEX, Workshop on Algorithm Engineering and Experimentation (2000).]]Google ScholarGoogle Scholar
  17. RANADE, A., KOTHARI, S., AND UDUPA, R. 2000. Register efficient mergesorting. In International Conference on High Performance Computing (2000).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. SANDERS, P. 1999a. Accessing multiple sequences through set associative caches. In J. WIEDERMANN, P. VAN EMDE BOAS, AND M. NIELSEN Eds., Automata, Languages and Programruing, 26th International Colloquium, ICALP'99, Prague, Czech Republic, July 11-15, 1999, Proceedings, Volume 1644 of Lecture Notes in Computer Science (1999). Springer.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. SANDERS, P. 1999b. Fast priority queues for cached memory. In ALENEX, Workshop on Algorithm Engineering and Expermentation, Lecture Notes in Computer Science (1999).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. SEN, S. AND CHATTERJEE, S. 2000. Towards a theory of cache-efficient algorithms. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (2000).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. SRIVASTAVA, A. AND EUSTACE, A. 1994. ATOM: A system for building customized program analysis tools. In Proceedings of the SIGPLAN '93 Conference on Programming Language Design and Implementation (June 1994), pp. 196-205.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. VITTER, J. S. AND SHRIVER, E.A.M. 1994. Algorithms for parallel memory I: Two-level memories. 12, 2-3, 110-147.]]Google ScholarGoogle Scholar

Index Terms

  1. Efficient sorting using registers and caches

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Journal of Experimental Algorithmics
        ACM Journal of Experimental Algorithmics  Volume 7, Issue
        2002
        218 pages
        ISSN:1084-6654
        EISSN:1084-6654
        DOI:10.1145/944618
        Issue’s Table of Contents

        Copyright © 2002 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 31 December 2002
        Published in jea Volume 7, Issue

        Qualifiers

        • article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader