skip to main content
10.1145/781131.781140acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
Article

A comparison of empirical and model-driven optimization

Published:09 May 2003Publication History

ABSTRACT

Empirical program optimizers estimate the values of key optimization parameters by generating different program versions and running them on the actual hardware to determine which values give the best performance. In contrast, conventional compilers use models of programs and machines to choose these parameters. It is widely believed that model-driven optimization does not compete with empirical optimization, but few quantitative comparisons have been done to date. To make such a comparison, we replaced the empirical optimization engine in ATLAS (a system for generating a dense numerical linear algebra library called the BLAS) with a model-driven optimization engine that used detailed models to estimate values for optimization parameters, and then measured the relative performance of the two systems on three different hardware platforms. Our experiments show that model-driven optimization can be surprisingly effective, and can generate code whose performance is comparable to that of code generated by empirical optimizers for the BLAS.

References

  1. R. Allan and K. Kennedy, Optimizing Compilers for Modern Architectures, Morgan Kaufman Publishing, 2002 Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. P. Boulet, A. Darte, T. Risset, and Y. Robert, Pen-ultimate tiling?, INTEGRATION, the VLSI Journal, Volume 17, pages 33--51, 1994 Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. G. Cascaval and D. Padua, Estimating Cache Misses and Locality using Stack Distances. To appear in the International Conference on Supercomputing (ICS), 2003 Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. P. Clauss, Counting solutions to linear and nonlinear constraints through Ehrhart polynomials: Applications to analyze and transform scientific programs, In proceedings of the International Conference on Supercomputing (ICS), 1996 Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. S. Coleman and K. S. McKinley. Tile size selection using cache organization and data layout. In proceedings of Programming Languages Design and Implementation (PLDI), 1995 Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J. Dongarra and R. Schreiber, Automatic blocking of nested loops, Technical Report UT-CS-90-108, Department of Computer Science, University of Tennessee, May 1990 Google ScholarGoogle Scholar
  7. B. B. Fraguela, R. Doallo, and E. Zapata: Automatic Analytical Modeling for the Estimation of Cache Misses. In proceedings of Parallel Architectures and Compilation Techniques (PACT), pages 221--231, 1999 Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. Frigo and S. G. Johnson. FFTW: An adaptive software architecture for the FFT. In proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), volume 3, pages 1381--1384, 1998Google ScholarGoogle ScholarCross RefCross Ref
  9. D. A. Patterson and J. L. Hennessy, Computer Architecture: A Quantitative Approach, 2 Edition, Morgan Kaufman, 1996 Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. T. Kisuki, P. M. W. Knijnenburg, M. F. P. O'Boyle, and H. A. G. Wijshoff . Iterative compilation in program optimization. In proceedings of Compilers for Parallel Computers, pages (CPC), pages 35--44, 2000Google ScholarGoogle Scholar
  11. T. Kisuki, P. M. W. Knijnenburg, M. F. P. O'Boyle, F. Bodin, and H. A. G. Wijshoff. A feasibility study in iterative compilation. In proceedings of the International Symposium on High Performance Computing (ISHPC), 1999 Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. M. S. Lam, E. E. Rothberg, and M. E. Wolf. The cache performance and optimizations of blocked algorithms. In proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 63--74, 1999 Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM Systems Journal 9, 2, 78--117, 1970Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. A. C. McKellar and E. G. Coffman. Organizing matrices and matrix operations for paged memory systems. Communications of the ACM 12, 3, pages 153--165 Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. K. S. McKinley, S. Carr, and C. Tseng, Improving data locality with loop transformations, In ACM Transactions on Programming Languages and Systems, 18(4):424--453, 1996 Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. J. Navarro, T. Juan, and T. Lang, MOB forms: a class of multilevel block algorithms for dense linear algebra operations, In proceedings of the International Conference on Supercomputing (ICS), pages 354--362, 1994 Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. J. Ramanujam and P. Sadayappan, Tiling multidimensional iteration spaces for multicomputers, Journal of Parallel and Distributed Computing, 16(2):108--120, 1992Google ScholarGoogle ScholarCross RefCross Ref
  18. R. Whaley and J. Dongarra. Automatically Tuned Linear Algebra Software. Technical Report UT CS-97-366, LAPACK Working Note No.131, University of Tennessee, 1997 Google ScholarGoogle Scholar
  19. M. Wolfe, Iteration space tiling for memory hierarchies, In SIAM Conference on Parallel Processing for Scientific Computing, 1987 Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. M. E. Wolf, D. E. Maydan, and D. Chen, Combining loop transformations considering caches and scheduling, In proceeding of the International Symposium on Microarchitecture (MICRO 29), pages 274--286, 1996 Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. J. Xiong, J. Johnson, R. Johnson, and D. Padua, SPL: A Language and its Compiler for DSP Algorithms, In proceedings of Programming Languages Design and Implementation (PLDI), 2001 Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. A comparison of empirical and model-driven optimization

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        PLDI '03: Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
        June 2003
        360 pages
        ISBN:1581136625
        DOI:10.1145/781131

        Copyright © 2003 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 9 May 2003

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • Article

        Acceptance Rates

        PLDI '03 Paper Acceptance Rate28of131submissions,21%Overall Acceptance Rate406of2,067submissions,20%

        Upcoming Conference

        PLDI '24

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader