ABSTRACT
Recent research on processor microarchitecture suggests using instruction criticality as a metric to guide hardware control policies. Fields et al. [3, 4] have proposed a directed acyclic graph (DAG) model for characterizing program microexecutions on uniprocessors. Under such a model, critical path analysis can be applied and instructions' slack values can be used to quantify instruction criticality. In this paper, we extend the uniprocessor DAG model to characterize parallel program executions on shared memory multiprocessor systems. We describe how critical path analysis can be applied, at a fine grain, in a multiprocessor system running both finite and continuous workloads. We provide detailed evaluations for various aspects of multiprocessor executions under the DAG model. To enable efficient offline critical path analysis, we propose a novel graph reduction technique that reduces a DAG to an equivalent but significantly smaller DAG.
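As a rough illustration of how slack falls out of critical path analysis on a DAG, the following minimal sketch computes per-node slack with a forward and a backward pass. It is a generic longest-path computation, not the model or reduction technique proposed in the paper; the function name `node_slack`, the node names, and the edge latencies are all hypothetical.

```python
from collections import defaultdict

# Hypothetical example graph: (source, destination, latency in cycles).
EXAMPLE_EDGES = [
    ("fetch", "load", 1),
    ("load", "use", 10),
    ("fetch", "add", 1),
    ("add", "use", 1),
]


def node_slack(edges):
    """Return (critical_path_length, {node: slack}) for a weighted DAG."""
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v, w in edges:
        succ[u].append((v, w))
        indeg[v] += 1
        nodes.update((u, v))

    # Topological order (Kahn's algorithm).
    order = []
    ready = sorted(n for n in nodes if indeg[n] == 0)
    while ready:
        u = ready.pop()
        order.append(u)
        for v, _ in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")

    # Forward pass: earliest time each node can occur.
    earliest = {n: 0 for n in nodes}
    for u in order:
        for v, w in succ[u]:
            earliest[v] = max(earliest[v], earliest[u] + w)
    length = max(earliest.values())

    # Backward pass: latest time each node may occur without
    # lengthening the critical path.
    latest = {n: length for n in nodes}
    for u in reversed(order):
        for v, w in succ[u]:
            latest[u] = min(latest[u], latest[v] - w)

    # Slack is the delay a node can absorb; zero slack means critical.
    return length, {n: latest[n] - earliest[n] for n in nodes}


if __name__ == "__main__":
    length, slack = node_slack(EXAMPLE_EDGES)
    print(length, slack)
```

In this toy graph the path through `load` is critical, so `add` can be delayed by several cycles without changing the total execution time, which is the sense in which slack quantifies criticality.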
References
- Alaa R. Alameldeen, Carl J. Mauer, Min Xu, Pacia J. Harper, Milo M. K. Martin, Daniel J. Sorin, Mark D. Hill, and David A. Wood. Evaluating Non-deterministic Multithreaded Commercial Workloads. In Proceedings of the Fifth Workshop on Computer Architecture Evaluation Using Commercial Workloads, pages 30--38, February 2002.
- Carl J. Beckmann. Hardware and Software for Functional and Fine Grain Parallelism. PhD thesis, University of Illinois at Urbana-Champaign, April 1994.
- Brian Fields, Rastislav Bodik, and Mark D. Hill. Slack: Maximizing Performance Under Technological Constraints. In Proceedings of the 29th Annual International Symposium on Computer Architecture, pages 47--58, May 2002.
- Brian Fields, Shai Rubin, and Rastislav Bodik. Focusing Processor Policies via Critical-Path Prediction. In Proceedings of the 28th Annual International Symposium on Computer Architecture, pages 74--85, July 2001.
- Kourosh Gharachorloo, Madhu Sharma, Simon Steely, and Stephen Von Doren. Architecture and Design of AlphaServer GS320. In Proceedings of the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 13--24, November 2000.
- Jeffrey K. Hollingsworth. Critical Path Profiling of Message Passing and Shared-Memory Programs. IEEE Transactions on Parallel and Distributed Systems, 9(10):1029--1040, October 1998.
- Jeffrey K. Hollingsworth and Barton P. Miller. Slack: A New Performance Metric for Parallel Programs. Technical Report 1260, Computer Sciences Department, University of Wisconsin--Madison, December 1994.
- H. Kasahara and S. Narita. Practical Multiprocessor Scheduling Algorithms for Efficient Parallel Processing. IEEE Transactions on Computers, C-33(11):1023--1029, November 1984.
- Yu-Kwong Kwok and Ishfaq Ahmad. Static Scheduling Algorithms for Allocating Directed Task Graphs. ACM Computing Surveys, 31(4):406--471, December 1999.
- Peter S. Magnusson, Magnus Christensson, Jesper Eskilson, Daniel Forsgren, Gustav Hallberg, Johan Hogberg, Fredrik Larsson, Andreas Moestedt, and Bengt Werner. Simics: A Full System Simulation Platform. IEEE Computer, 35(2):50--58, February 2002.
- Robert H. B. Netzer. Optimal Tracing and Replay for Debugging Shared-Memory Parallel Programs. In Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging (PADD), pages 1--11, 1993.
- Srikanth T. Srinivasan, Roy Dz-ching Ju, Alvin R. Lebeck, and Chris Wilkerson. Locality vs. Criticality. In Proceedings of the 28th Annual International Symposium on Computer Architecture, pages 132--143, July 2001.
- E. Tune, D. Liang, D. M. Tullsen, and B. Calder. Dynamic Prediction of Critical Path Instructions. In Proceedings of the Seventh IEEE Symposium on High-Performance Computer Architecture, pages 185--196, January 2001.
- Eric S. Tune, Dean M. Tullsen, and Brad Calder. Quantifying Instruction Criticality. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, September 2002.
- Steven Cameron Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh, and Anoop Gupta. The SPLASH-2 Programs: Characterization and Methodological Considerations. In Proceedings of the 22nd Annual International Symposium on Computer Architecture, pages 24--37, June 1995.
- Cui-Qing Yang and Barton P. Miller. Critical Path Analysis for the Execution of Parallel and Distributed Programs. In Proceedings of the Seventh Conference on Distributed Memory Computer Systems, pages 366--373, June 1988.
Index Terms
- Quantifying instruction criticality for shared memory multiprocessors