skip to main content
10.1145/1176760.1176765acmconferencesArticle/Chapter ViewAbstractPublication PagesesweekConference Proceedingsconference-collections
Article

Automatic performance model construction for the fast software exploration of new hardware designs

Published:22 October 2006Publication History

ABSTRACT

Developing an optimizing compiler for a newly proposed architecture is extremely difficult when there is only a simulator of the machine available. Designing such a compiler requires running many experiments in order to understand how different optimizations interact. Given that simulators are orders of magnitude slower than real processors, such experiments are highly restricted. This paper develops a technique to automatically build a performance model for predicting the impact of program transformations on any architecture, based on a limited number of automatically selected runs. As a result, the time for evaluating the impact of any compiler optimization in early design stages can be drastically reduced such that all selected potential compiler optimizations can be evaluated. This is achieved by first evaluating a small set of sample compiler optimizations on a prior set of benchmarks in order to train a model, followed by a very small number of evaluations, or probes, of the target program.We show that by training on less than 0. 7% of all possible transformations (640 samples collected from 10 benchmarks out of 880000 possible samples, 88000 per training benchmark) and probing the new program on only 4 transformations, we can predict the performance of all program transformations with an error of just 7. 3% on average. As each prediction takes almost no time to generate, this scheme provides an accurate method of evaluating compiler performance, which is several orders of magnitude faster than current approaches.

References

  1. F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin, M. O'Boyle, J. Thomson, M. Toussaint, and C. Williams. Using machine learning to focus iterative optimization. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. L. Almagor, K. Cooper, A. Grosul, T. Harvey, S. Reeves, D. Subramanian, L. Torczon, and T. Waterman. Finding effective compilation sequences. In Proceedings of the Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), pages 231--239, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. L. Bachega, J. Brunheroto, L. DeRose, P. Mindlin, and J. Moreira. The bluegene/l pseudo cycle-accurate simulator. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. H. Berry, D. G. Prez, and O. Temam. Chaos in computer performance. Chaos, 16(1), Dec. 2005.Google ScholarGoogle Scholar
  5. C. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Chapman and Hall, 1984.Google ScholarGoogle Scholar
  7. J. Cavazos and J. E. Moss. Inducing heuristics to decide whether to schedule. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. K. D. Cooper, A. Grosul, T. Harvey, S. Reeves, D. Subramanian, L. Torczon, , and T. Waterman. Searching for compilation sequences. Tech. report, Rice University, 2005.Google ScholarGoogle Scholar
  9. T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, New York, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. L. Eeckhout, R. H. B. Jr., B. Stougie, K. D. Bosschere, and L. K. John. Control flow modeling in statistical simulation for accurate and efficient processor design studies. In Proceedings of the International Symposium on Computer Architecture (ISCA), pages 350--363, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. A. Epshteyn, M. Garzaran, G. DeJong, D. Padua, G. Ren, X. Li, K. Yotov, and K. Pingali. Analytic models and empirical search: A hybrid approach to code optimization. In Proceedings of the International Workshop on Languages and Compilers for Parallel Computing (LCPC), Hawthorne, NY, USA, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. B. Franke, M. O'Boyle, J. Thomson, and G. Fursin. Probabilistic source-level optimisation of embedded programs. In Proceedings of the Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. M. Frigo and S. G. Johnson. FFTW: An adaptive software architecture for the FFT. In Proc. IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing, volume 3, pages 1381--1384, Seattle, WA, May 1998.Google ScholarGoogle ScholarCross RefCross Ref
  14. M. Hall, L. Anderson, S. Amarasinghe, B. Murphy, S. Liao, E. Bugnion, M., and Lam. Maximizing multiprocessor performance with the SUIF compiler. IEEE Computer, 29(12):84--89, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. E. Ipek, S. A. McKee, B. R. de Supinski, M. Schulz, and R. Caruana. Efficiently Exploring Architectural Design Spaces via Predictive Modeling. In ASPLOS-XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems, San Jose, CA, USA, 2006. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton. Adaptive Mixtures of Local Experts. Neural Computation, 3, 1991.Google ScholarGoogle Scholar
  17. T. Karkhanis and J. E. Smith. A first-order superscalar processor model. In Proceedings of the International Symposium on Computer Architecture (ISCA), pages 338--349, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. P. Kulkarni, W. Zhao, H. Moon, K. Cho, D. Whalley, J. Davidson, M. Bailey, Y. Paek, and K. Gallivan. Finding effective optimization phase sequences. In Proceedings of the Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), pages 12--23, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. C. Lee. Utdsp benchmark suite. In http://www. eecg. toronto. edu/~corinna/DSP/infrastructure/UTDSP. html, 1998.Google ScholarGoogle Scholar
  20. A. Monsifrot, F. Bodin, and R. Quiniou. A machine learning approach to automatic production of compiler heuristics. In Proceedings of the International Conference on Artificial Intelligence: Methodology, Systems, Applications, LNCS 2443, pages 41--50, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Z. Pan and R. Eigenmann. Fast automatic procedure-level performance tuning. In IEEE PACT, Seattle, WA, September 2006. IEEE Computer Society. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. D. Parello, O. Temam, A. Cohen, and J. -M. Verdun. Toward a systematic, pragmatic and architecture-aware program optimization process for complex processors. In Proceedings of the International Conference on Supercomputing, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. W. Singer, J. Xiong, F. Franchetti, A. Gacić, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code generation for DSP transforms. Proceedings of the IEEE, special issue on "Program Generation, Optimization, and Adaptation", 93(2):232--275, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  24. M. Saghir, P. Chow, and C. Lee. A comparison of traditional and vliw dsp architecture for compiled dsp applications. In Proceedings of the International Workshop on Compiler and Architecture Support for Embedded Systems (CASES), Washington, DC, USA, 1998.Google ScholarGoogle Scholar
  25. T. Sherwood, E. Perelman, G. Hamerly, and B. Calder. Automatically characterizing large scale program behavior. In ASPLOS-X: Proceedings of the 10th international conference on Architectural support for programming languages and operating systems, pages 45--57, New York, NY, USA, 2002. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. M. Stephenson and S. P. Amarasinghe. Predicting unroll factors using supervised classification. In Proceedings of International Symposium on Code Generation and Optimization (CGO), pages 123--134, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. M. Stephenson, M. Martin, and U. O'Reilly. Meta optimization: Improving compiler heuristics with machine learning. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 77--90, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. S. Triantafyllis, M. Vachharajani, N. Vachharajani, and D. August. Compiler optimization-space exploration. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pages 204--215, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. T. F. Wenisch, R. E. Wunderlich, B. Falsafi, and J. C. Hoe. Turbosmarts: accurate microarchitecture simulation sampling in minutes. In Proceedings of the ACM SIGMETRICS, pages 408--409, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. R. C. Whaley and J. Dongarra. Automatically tuned linear algebra software. In SuperComputing 1998: High Performance Networking and Computing, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. R. E. Wunderlich, T. F. Wenisch, B. Falsafi, and J. C. Hoe. Smarts: Accelerating microarchitecture simulation via rigorous statistical sampling. In Proceedings of the International Symposium on Computer Architecture (ISCA), pages 84--95, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. K. Yotov, X. Li, G. Ren, M. Cibulskis, G. DeJong, M. Garzaran, D. Padua, K. Pingali, P. Stodghill, and P. Wu. A comparison of empirical and model-driven optimization. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 63--76, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. K. Yotov, K. Pingali, and P. Stodghill. Think globally, search locally. In ICS '05: Proceedings of the 19th annual international conference on Supercomputing, pages 141--150, New York, NY, USA, 2005. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. M. Zhao, B. R. Childers, and M. L. Soffa. A model-based framework: an approach for profit-driven optimization. In Third Annual IEEE/ACM Interational Conference on Code Generation and Optimization, pages 317--327, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Automatic performance model construction for the fast software exploration of new hardware designs

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        CASES '06: Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
        October 2006
        448 pages
        ISBN:1595935436
        DOI:10.1145/1176760

        Copyright © 2006 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 22 October 2006

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate52of230submissions,23%

        Upcoming Conference

        ESWEEK '24
        Twentieth Embedded Systems Week
        September 29 - October 4, 2024
        Raleigh , NC , USA

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader