article

Efficient sorting using registers and caches

Authors:
Rajiv Wickremesinghe

Department of Computer Science, Duke University, Box 90129, Durham, NC

Department of Computer Science, Duke University, Box 90129, Durham, NC
View Profile

,
Lars Arge

Department of Computer Science, Duke University, Box 90129, Durham, NC

Department of Computer Science, Duke University, Box 90129, Durham, NC
View Profile

,
Jeffrey S. Chase

Department of Computer Science, Duke University, Box 90129, Durham, NC

Department of Computer Science, Duke University, Box 90129, Durham, NC
View Profile

,
Jeffrey Scott Vitter

Department of Computer Science, Duke University, Box 90129, Durham, NC

Department of Computer Science, Duke University, Box 90129, Durham, NC
View Profile

ACM Journal of Experimental Algorithmics Volume 7https://doi.org/10.1145/944618.944627

Published:31 December 2002Publication History

ACM Journal of Experimental Algorithmics

Abstract

Modern computer systems have increasingly complex memory systems. Common machine models for algorithm analysis do not reflect many of the features of these systems, e.g., large register sets, lockup-free caches, cache hierarchies, associativity, cache line fetching, and streaming behavior. Inadequate models lead to poor algorithmic choices and an incomplete understanding of algorithm behavior on real machines.A key step toward developing better models is to quantify the performance effects of features not reflected in the models. This paper explores the effect of memory system features on sorting performance. We introduce a new cache-conscious sorting algorithm, R-MERGE, which achieves better performance in practice over algorithms that are superior in the theoretical models. R-MERGE is designed to minimize memory stall cycles rather than cache misses by considering features common to many system designs.

Supplemental Material

Available for Download

p9-wickremesinghe.ps (262.3 KB)

tar

p9-wickremesinghe.tex.tar (299.6 KB)

References

AGGARWAL, A., ALPERN, B., CHANDRA, A. K., AND SNIR, M. 1987. A model for hierarchical memory. Volume 19 (New York, 1987), pp. 305-314.]]Google Scholar
AGGARWAL, A., CHANDRA, A., AND SNIR, M. 1987. Hierarchical memory with block transfer. Volume 28 (Los Angeles, 1987), pp. 204-216.]]Google Scholar
AGGARWAL, A. AND VITTER, J.S. 1988. The Input/Output complexity of sorting and related problems. Commun. ACM 31, 9, 1116-1127.]] Google ScholarDigital Library
ALPERN, B., CARTER, L., FEIG, E., AND SELKER, T. 1994. The uniform memory hierarchy model of computation. 12, 2-3, 72-109.]]Google Scholar
ANDERSON, J. M., BERC, L. M., DEAN, J., GHEMAWAT, S., HENZINGER, M. R., LEUNG, S. A., SITES, R. L., VANDERVOORDE, M. T., WALDSPURGER, C. A., AND WEIHL, W. E. 1997. Continuous profiling: Where have all the cycles gone? In Proceedings of the 16th Symposium on Operating Systems Principles (SOSP-97), Volume 31,5 of Operating Systems Review (New York, Oct. 5-8 1997), pp. 1-14. ACM Press.]] Google ScholarDigital Library
ARGE, L., CHASE, J., VITTER, J. S., AND WICKREMESINGHE, R. 2000. Efficient sorting using registers and caches. In WAE, Workshop on Algorithm Engineering, Lecture Notes in Computer Science (Sept. 2000). Springer. Journal version to appear in the ACM Journal of Experimental Algorithmics.]] Google ScholarDigital Library
CALLAHAN, D., CARR, S., AND KENNEDY, K. 1990. Improving register allocation for subscripted variables. In Proceedings of the SIGPLAN '90 Conference on Programming Language Design and Implementation (White Plains, NY, June 1990).]] Google ScholarDigital Library
EDMONDSON, J. H., RUBINFELD, P. I., BANNON, P. J., BENSCHNEIDER, B. J., BERNSTEIN, D., CASTELINO, R. W., COOPER, E. M., DEVER, D. E., DONCHIN, D. R., FISCHER, T. C., JAIN, A. K., MEHTA, S., MEYER, J. E., PRESTON, R. P., RAJAGOPALAN, V., SOMANATHAN, C., TAYLOR, S. A., AND WOLRICH, G.M. 1995. Internal organization of the Alpha 21164, a 300-MHz 64-bit quad-issue CMOS RISC microprocessor. Digital Technical Journal 7, 1, 119-135.]] Google ScholarDigital Library
FRIGO, LEISERSON, PROKOP, AND RAMACHANDRAN. 1999. Cache-oblivious algorithms. In FOCS: IEEE Symposium on Foundations of Computer Science (FOCS) (1999).]] Google ScholarDigital Library
HENNESSY, J. L. AND PATTERSON, D. A. 1995. Computer Architecture: A Quantitative Approach (second ed.). Morgan Kaufmann.]] Google ScholarDigital Library
KNUTH, D. E. 1998. Sorting and Searching (2nd ed.), Volume 3 of The Art of Computer Programming. Addison-Wesley, Reading, MA.]] Google ScholarDigital Library
LADNER, R., FIX, J., AND LAMARCA, A. 1999. Cache performance analysis of traversals and random accesses. In Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms (1999).]] Google ScholarDigital Library
LAM, M. S., ROTHBERG, E. E., AND WOLF, M. E. 1991. The cache performance and optimizations of blocked algorithms. In Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (April 1991).]] Google ScholarDigital Library
LAMARCA, A. AND LADNER, R. E. 1997. The influence of caches on the performance of sorting. Volume 7 (January 1997), pp. 370-379.]]Google Scholar
RAHMAN, N. AND RAMAN, R. 1999. Analysing cache effects in distribution sorting. In 3rd Workshop on Algorithm Engineering (July 1999).]] Google ScholarDigital Library
RAHMAN, N. AND RAMAN, R. 2000. Adapting radix sort to the memory hierarchy. In ALENEX, Workshop on Algorithm Engineering and Experimentation (2000).]]Google Scholar
RANADE, A., KOTHARI, S., AND UDUPA, R. 2000. Register efficient mergesorting. In International Conference on High Performance Computing (2000).]] Google ScholarDigital Library
SANDERS, P. 1999a. Accessing multiple sequences through set associative caches. In J. WIEDERMANN, P. VAN EMDE BOAS, AND M. NIELSEN Eds., Automata, Languages and Programruing, 26th International Colloquium, ICALP'99, Prague, Czech Republic, July 11-15, 1999, Proceedings, Volume 1644 of Lecture Notes in Computer Science (1999). Springer.]] Google ScholarDigital Library
SANDERS, P. 1999b. Fast priority queues for cached memory. In ALENEX, Workshop on Algorithm Engineering and Expermentation, Lecture Notes in Computer Science (1999).]] Google ScholarDigital Library
SEN, S. AND CHATTERJEE, S. 2000. Towards a theory of cache-efficient algorithms. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (2000).]] Google ScholarDigital Library
SRIVASTAVA, A. AND EUSTACE, A. 1994. ATOM: A system for building customized program analysis tools. In Proceedings of the SIGPLAN '93 Conference on Programming Language Design and Implementation (June 1994), pp. 196-205.]] Google ScholarDigital Library
VITTER, J. S. AND SHRIVER, E.A.M. 1994. Algorithms for parallel memory I: Two-level memories. 12, 2-3, 110-147.]]Google Scholar

Index Terms

Efficient sorting using registers and caches
1. Hardware
  1. Integrated circuits
    1. Semiconductor memory
      1. Dynamic memory
2. Theory of computation
  1. Design and analysis of algorithms
    1. Data structures design and analysis
      1. Sorting and searching

Recommendations

Efficient Sorting Using Registers and Caches
WAE '00: Proceedings of the 4th International Workshop on Algorithm Engineering

Modern computer systems have increasingly complex memory systems. Common machine models for algorithm analysis do not reflect many of the features of these systems, e.g., large register sets, lockup-free caches, cache hierarchies, associativity, cache ...
Read More
Reducing energy and delay using efficient victim caches
ISLPED '03: Proceedings of the 2003 international symposium on Low power electronics and design

In this paper, we investigate methods for improving the hit rates in the first level of memory hierarchy. Particularly, we propose victim cache structures to reduce the number of accesses to more power consuming structures such as level 2 caches. We ...
Read More
Fetch Caches
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Journal of Experimental Algorithmics Volume 7, Issue
2002
218 pages
ISSN:1084-6654
EISSN:1084-6654
DOI:10.1145/944618
Issue’s Table of Contents

Copyright © 2002 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 31 December 2002
Published in jea Volume 7, Issue
Qualifiers
- article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 19
  Total Citations
  View Citations
- 1,316
  Total Downloads
- Downloads (Last 12 months)5
- Downloads (Last 6 weeks)1
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Efficient sorting using registers and caches

ACM Journal of Experimental Algorithmics

Abstract

Supplemental Material

Available for Download

References

Cited By

Index Terms

Recommendations

Efficient Sorting Using Registers and Caches

Reducing energy and delay using efficient victim caches

Fetch Caches

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Efficient sorting using registers and caches

ACM Journal of Experimental Algorithmics

Abstract

Supplemental Material

Available for Download

References

Cited By

Index Terms

Recommendations

Efficient Sorting Using Registers and Caches

Reducing energy and delay using efficient victim caches

Fetch Caches

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media