Data transformations for eliminating conflict misses

Authors:
Gabriel Rivera

Department of Computer Science, University of Maryland, College Park, MD

Department of Computer Science, University of Maryland, College Park, MD
View Profile

,
Chau-Wen Tseng

Department of Computer Science, University of Maryland, College Park, MD

Department of Computer Science, University of Maryland, College Park, MD
View Profile

PLDI '98: Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementationMay 1998Pages 38–49https://doi.org/10.1145/277650.277661

Published:01 May 1998Publication History

PLDI '98: Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation

Pages 38–49

ABSTRACT

Many cache misses in scientific programs are due to conflicts caused by limited set associativity. We examine two compile-time data-layout transformations for eliminating conflict misses, concentrating on misses occuring on every loop iteration. Inter-variable padding adjusts variable base addresses, while intra-variable padding modifies array dimension sizes. Two levels of precision are evaluated. PADLITE only uses array and column dimension sizes, relying on assumptions about common array reference patterns. PAD analyzes programs, detecting conflict misses by linearizing array references and calculating conflict distances between uniformly-generated references. The Euclidean algorithm for computing the gcd of two numbers is used to predict conflicts between different array columns for linear algebra codes. Experiments on a range of programs indicate PADLITE can eliminate conflicts for benchmarks, but PAD is more effective over a range of cache and problem sizes. Padding reduces cache miss rates by 16% on average for a 16K direct-mapped cache. Execution times are reduced by 6% on average, with some SPEC95 programs improving up to 15%.

References

1.J. Anderson, S. Amarasinghe, and M. Lam. Data and computation transformation for multiprocessors. In Proceedings of the Fifth A UM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Santa Barbara, CA, July 1995. Google ScholarDigital Library
2.B. Bershad, D. Lee, T. Romer, and B. Chen. Avoiding confiict misses dynamically in large direct-mapped caches. In Proceedings oat the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), San Jose, CA, October 1994. Google ScholarDigital Library
3.W. Bolosky, R. Fitzgerald, and M. Scott. Simple but effective tecluxiques for NUMA memory management. In Proceedings of the Twelfth Symposium on Operating Systems Principles, Litchfield Park, AZ, December 1989. Google ScholarDigital Library
4.E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum, and M. Lam. Compiler-directed page coloring for multiprocessors. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), Boston, MA, October 1996. Google ScholarDigital Library
5.D. Calahan. Performance evaluation of static and dynamic memory systems on the Cray-2. In Proceedings of the Second international Conference on Supercomputing, St. Malo, France, July 1988. Google ScholarDigital Library
6.M. Cierniak and W. Li. Unifying data and control transformations for distributed shared-memory machines. In Proceedings of the SIGPLAN '95 Conference or& Programming Language Design and Implementation, La Jolla, CA, June 1995. Google ScholarDigital Library
7.S. Coleman and K. S. MCKinley. Tile size selection using cache organization and data layout. In Proceedings of the SIGPLAN '95 Conference on Programming Language Design and Implementation, La Jolla, CA, June 1995. Google ScholarDigital Library
8.J. Ferrante, V. Sarkar, and W. Thrash. On estimating and enhancing cache effectiveness. In U. Basmrjee, D. Gelernter, A. Nicolau, and D. Padua, editors, Languages and Compilers .for Parallel Computing, Fourth International Workshop, Santa Clara, CA, August 1991. Springer-Verlag. Google ScholarDigital Library
9.D. Gannon, W. Jalby, and K. Galliveal. Strategies for cache and local memory management by global program transformation. Journal o} Parallel and Distributed Computing, 5(5):587-616, October 1988. Google ScholarDigital Library
10.S. Ghosh, M. Martonosi, and S. Malik. Cache miss equations: An analytical representation of cache misses. In Proceedings of the 1997 ACM International Conference on Supercomputing, Vienna, Austria, July 1997. Google ScholarDigital Library
11.A. Gonzalez, M. Valero, N. Topham, and J. Parcerisa. ELiminating cache conflict misses through XOR-based placement functions. In Proceedings of the 1997 ACM International Conference on Supercompuiing, Vienna, Austria, July 1997. Google ScholarDigital Library
12.T. Jeremiassen and S. Eggers. Reducing false sharing on shared memory multiprocessors through compile time data transformations. In Proceedings of the Fifth A CM SIGPLAN Symposium on Principles and Practice o.f Parallel Programming, Santa Barbara, CA, July 1995. Google ScholarDigital Library
13.N. Jouppi. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. In Proceedings of the 17th International Symposium on Computer Architecture, Seattle, WA, May 1990. Google ScholarDigital Library
14.M. Kandemir, J. Ramanujam, and A. Choudhary. A compiler algorithm for optimizing locality in loop nests. In Proceedings of the 1997 A CM International Conference on Supercomputing, Vienna, Austria, July 1997. Google ScholarDigital Library
15.N. Manjikian and T. Abdelrahman. Fusion of loops for parallelism and locality. IEEE Transactions on Parallel and Distributed Systems, 8(2):193-209, February 1997. Google ScholarDigital Library
16.S. McFarling. Program optimization for instruction caches. In Proceedings of the Third International Conference on Architectural Support .for Programming Languages and Operating Systems (ASPLOS-III), pages 257-270, Boston, MA, April 1989. Google ScholarDigital Library
17.K. S. MCKinley, S. Carr, and C.-W. Tseng. Improving data locality with loop transformations. A CM Transactions on Programming Languages and Systems, 18(4):424-453, July 1996. Google ScholarDigital Library
18.K. S. MCKinley and O. Temam. A quantitative analysis of loop nest locality. In Proceedings o} the Eighth international Conference on Architectural Support }or Programming Languages and Operating Systems (ASPLOS-VIII), Boston, MA, October 1996. Google ScholarDigital Library
19.M. O'Boyle and P. Knijnenburg. Non-singular data transformations: Definition, validity, and applications. In Proceedings of the 1997 ACM International Conference on Supercomputing, Vienna, Austria, Jtdy 1997. Google ScholarDigital Library
20.O. Temam, C. Fricker, and W. Jalby. Cache interference phenomena. In Proceedings of the 1994 A CM SIGMET- RICS Conference on Measurement ~ Modeling Computer Systems, Santa Clara, CA, May 1994. Google ScholarDigital Library
21.J. TorrelIas, M. Lam, and J. Hennessy. Shared data placement optimizations to reduce multiprocessor cache miss rates. In Proceedings of the 1990 International Conference on Parallel Processing, St. Charles, IL, August 1990.Google Scholar
22.R. Wilson et al. SUIF: An infrastructure for research on parallelizing and optimizing compilers. A CM SIGPLAN Notices, 29(12):31-37, December 1994. Google ScholarDigital Library
23.M. E. Wolf and M. Lain. A data locality optimizing algorithm. In Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation, Toronto, Canada, June 1991. Google ScholarDigital Library

Index Terms

Data transformations for eliminating conflict misses

Recommendations

Effective padding of multidimensional arrays to avoid cache conflict misses
PLDI '16: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation

Caches are used to significantly improve performance. Even with high degrees of set associativity, the number of accessed data elements mapping to the same set in a cache can easily exceed the degree of associativity. This can cause conflict misses and ...
Read More
Effective padding of multidimensional arrays to avoid cache conflict misses
PLDI '16

Caches are used to significantly improve performance. Even with high degrees of set associativity, the number of accessed data elements mapping to the same set in a cache can easily exceed the degree of associativity. This can cause conflict misses and ...
Read More
Data transformations for eliminating conflict misses

Many cache misses in scientific programs are due to conflicts caused by limited set associativity. We examine two compile-time data-layout transformations for eliminating conflict misses, concentrating on misses occuring on every loop iteration. Inter-...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
PLDI '98: Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
May 1998
357 pages
ISBN:0897919874
DOI:10.1145/277650
Chairmen:
Jack W. Davidson
Univ. of Virginia
,
Keith D. Cooper
Rice Univ. Houston, TX
,
Editor:
A. Michael Berman
Rowan Univ. Glassboro, NJ
ACM SIGPLAN Notices Volume 33, Issue 5
May 1998
358 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/277652
Chairmen:
Jack W. Davidson
Univ. of Virginia
,
Keith D. Cooper
Rice Univ., Houston, TX
,
Editor:
A. Michael Berman
Rowan Univ., Glassboro, NJ
Issue’s Table of Contents
Copyright © 1998 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 May 1998
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Qualifiers
- Article
Conference

Acceptance Rates
PLDI '98 Paper Acceptance Rate31of136submissions,23%Overall Acceptance Rate406of2,067submissions,20%
More
Upcoming Conference
PLDI '24

Sponsor:

sigplan

ACM SIGPLAN Conference on Programming Language Design and Implementation

June 24 - 28, 2024

Copenhagen , Denmark
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 211
  Total Citations
  View Citations
- 792
  Total Downloads
- Downloads (Last 12 months)90
- Downloads (Last 6 weeks)15
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Data transformations for eliminating conflict misses

PLDI '98: Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation

ABSTRACT

References

Cited By

Index Terms

Recommendations

Effective padding of multidimensional arrays to avoid cache conflict misses

Effective padding of multidimensional arrays to avoid cache conflict misses

Data transformations for eliminating conflict misses