
Cross-architecture performance predictions for scientific applications using parameterized models

Published: 01 June 2004
Abstract

This paper describes a toolkit for semi-automatically measuring and modeling static and dynamic characteristics of applications in an architecture-neutral fashion. For predictable applications, models of dynamic characteristics have a convex and differentiable profile. Our toolkit operates on application binaries and succeeds in modeling key application characteristics that determine program performance. We use these characterizations to explore the interactions between an application and a target architecture. We apply our toolkit to SPARC binaries to develop architecture-neutral models of computation and memory access patterns of the ASCI Sweep3D and the NAS SP, BT and LU benchmarks. From our models, we predict the L1, L2 and TLB cache miss counts as well as the overall execution time of these applications on an Origin 2000 system. We evaluate our predictions by comparing them against measurements collected using hardware performance counters.
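The miss-count predictions described above rest on architecture-neutral models of memory access patterns. A standard device for this in the reuse-distance literature is LRU stack distance: for a fully associative cache of C blocks with LRU replacement, an access misses exactly when the number of distinct addresses touched since the last use of that address is at least C. The sketch below is illustrative only, assuming a simple address trace; the paper's toolkit instead instruments SPARC binaries and fits parameterized models over problem sizes.

```python
def reuse_distances(trace):
    """Return the LRU stack distance of each access in the trace.

    The distance is the number of distinct addresses referenced since
    the previous access to the same address (inf for a first access).
    A list models the LRU stack; fine for illustration, O(n^2) overall.
    """
    stack = []      # most recently used address at index 0
    dists = []
    for addr in trace:
        if addr in stack:
            d = stack.index(addr)   # depth in the stack = reuse distance
            stack.remove(addr)
        else:
            d = float('inf')        # cold (compulsory) miss
        stack.insert(0, addr)       # addr becomes most recently used
        dists.append(d)
    return dists

def predicted_misses(trace, capacity):
    """Misses of a fully associative LRU cache with `capacity` blocks."""
    return sum(1 for d in reuse_distances(trace) if d >= capacity)
```

Because the distance histogram is independent of any particular cache size, one trace characterization yields predictions for every capacity, which is what makes the approach architecture-neutral; real caches with limited associativity need corrections on top of this idealized model.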

Published in:
ACM SIGMETRICS Performance Evaluation Review, Volume 32, Issue 1, June 2004, 432 pages. ISSN: 0163-5999. DOI: 10.1145/1012888.
Also appears in: SIGMETRICS '04/Performance '04: Proceedings of the joint international conference on Measurement and modeling of computer systems, June 2004, 450 pages. ISBN: 1581138733. DOI: 10.1145/1005686.

Copyright © 2004 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States

