skip to main content
10.1145/277650.277661acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
Article
Free Access

Data transformations for eliminating conflict misses

Authors Info & Claims
Published:01 May 1998Publication History

ABSTRACT

Many cache misses in scientific programs are due to conflicts caused by limited set associativity. We examine two compile-time data-layout transformations for eliminating conflict misses, concentrating on misses occuring on every loop iteration. Inter-variable padding adjusts variable base addresses, while intra-variable padding modifies array dimension sizes. Two levels of precision are evaluated. PADLITE only uses array and column dimension sizes, relying on assumptions about common array reference patterns. PAD analyzes programs, detecting conflict misses by linearizing array references and calculating conflict distances between uniformly-generated references. The Euclidean algorithm for computing the gcd of two numbers is used to predict conflicts between different array columns for linear algebra codes. Experiments on a range of programs indicate PADLITE can eliminate conflicts for benchmarks, but PAD is more effective over a range of cache and problem sizes. Padding reduces cache miss rates by 16% on average for a 16K direct-mapped cache. Execution times are reduced by 6% on average, with some SPEC95 programs improving up to 15%.

References

  1. 1.J. Anderson, S. Amarasinghe, and M. Lam. Data and computation transformation for multiprocessors. In Proceedings of the Fifth A UM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Santa Barbara, CA, July 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. 2.B. Bershad, D. Lee, T. Romer, and B. Chen. Avoiding confiict misses dynamically in large direct-mapped caches. In Proceedings oat the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VI), San Jose, CA, October 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. 3.W. Bolosky, R. Fitzgerald, and M. Scott. Simple but effective tecluxiques for NUMA memory management. In Proceedings of the Twelfth Symposium on Operating Systems Principles, Litchfield Park, AZ, December 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. 4.E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum, and M. Lam. Compiler-directed page coloring for multiprocessors. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), Boston, MA, October 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. 5.D. Calahan. Performance evaluation of static and dynamic memory systems on the Cray-2. In Proceedings of the Second international Conference on Supercomputing, St. Malo, France, July 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. 6.M. Cierniak and W. Li. Unifying data and control transformations for distributed shared-memory machines. In Proceedings of the SIGPLAN '95 Conference or& Programming Language Design and Implementation, La Jolla, CA, June 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. 7.S. Coleman and K. S. MCKinley. Tile size selection using cache organization and data layout. In Proceedings of the SIGPLAN '95 Conference on Programming Language Design and Implementation, La Jolla, CA, June 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. 8.J. Ferrante, V. Sarkar, and W. Thrash. On estimating and enhancing cache effectiveness. In U. Basmrjee, D. Gelernter, A. Nicolau, and D. Padua, editors, Languages and Compilers .for Parallel Computing, Fourth International Workshop, Santa Clara, CA, August 1991. Springer-Verlag. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. 9.D. Gannon, W. Jalby, and K. Galliveal. Strategies for cache and local memory management by global program transformation. Journal o} Parallel and Distributed Computing, 5(5):587-616, October 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. 10.S. Ghosh, M. Martonosi, and S. Malik. Cache miss equations: An analytical representation of cache misses. In Proceedings of the 1997 ACM International Conference on Supercomputing, Vienna, Austria, July 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. 11.A. Gonzalez, M. Valero, N. Topham, and J. Parcerisa. ELiminating cache conflict misses through XOR-based placement functions. In Proceedings of the 1997 ACM International Conference on Supercompuiing, Vienna, Austria, July 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. 12.T. Jeremiassen and S. Eggers. Reducing false sharing on shared memory multiprocessors through compile time data transformations. In Proceedings of the Fifth A CM SIGPLAN Symposium on Principles and Practice o.f Parallel Programming, Santa Barbara, CA, July 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. 13.N. Jouppi. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. In Proceedings of the 17th International Symposium on Computer Architecture, Seattle, WA, May 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. 14.M. Kandemir, J. Ramanujam, and A. Choudhary. A compiler algorithm for optimizing locality in loop nests. In Proceedings of the 1997 A CM International Conference on Supercomputing, Vienna, Austria, July 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. 15.N. Manjikian and T. Abdelrahman. Fusion of loops for parallelism and locality. IEEE Transactions on Parallel and Distributed Systems, 8(2):193-209, February 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. 16.S. McFarling. Program optimization for instruction caches. In Proceedings of the Third International Conference on Architectural Support .for Programming Languages and Operating Systems (ASPLOS-III), pages 257-270, Boston, MA, April 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. 17.K. S. MCKinley, S. Carr, and C.-W. Tseng. Improving data locality with loop transformations. A CM Transactions on Programming Languages and Systems, 18(4):424-453, July 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. 18.K. S. MCKinley and O. Temam. A quantitative analysis of loop nest locality. In Proceedings o} the Eighth international Conference on Architectural Support }or Programming Languages and Operating Systems (ASPLOS-VIII), Boston, MA, October 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. 19.M. O'Boyle and P. Knijnenburg. Non-singular data transformations: Definition, validity, and applications. In Proceedings of the 1997 ACM International Conference on Supercomputing, Vienna, Austria, Jtdy 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. 20.O. Temam, C. Fricker, and W. Jalby. Cache interference phenomena. In Proceedings of the 1994 A CM SIGMET- RICS Conference on Measurement ~ Modeling Computer Systems, Santa Clara, CA, May 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. 21.J. TorrelIas, M. Lam, and J. Hennessy. Shared data placement optimizations to reduce multiprocessor cache miss rates. In Proceedings of the 1990 International Conference on Parallel Processing, St. Charles, IL, August 1990.Google ScholarGoogle Scholar
  22. 22.R. Wilson et al. SUIF: An infrastructure for research on parallelizing and optimizing compilers. A CM SIGPLAN Notices, 29(12):31-37, December 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. 23.M. E. Wolf and M. Lain. A data locality optimizing algorithm. In Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation, Toronto, Canada, June 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Data transformations for eliminating conflict misses

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          PLDI '98: Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
          May 1998
          357 pages
          ISBN:0897919874
          DOI:10.1145/277650

          Copyright © 1998 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 1 May 1998

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • Article

          Acceptance Rates

          PLDI '98 Paper Acceptance Rate31of136submissions,23%Overall Acceptance Rate406of2,067submissions,20%

          Upcoming Conference

          PLDI '24

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader