Commutativity analysis for software parallelization: letting program transformations see the big picture

ABSTRACT
Extracting performance from many-core architectures requires software engineers to create multi-threaded applications, which significantly complicates the already daunting task of software development. One solution to this problem is automatic compile-time parallelization, which can ease the burden on software developers in many situations. Clearly, automatic parallelization in its present form is not suitable for many application domains, and new compiler analyses are needed to address its shortcomings.
In this paper, we present one such analysis: a new approach for detecting commutative functions. Commutative functions are sections of code that can be executed in any order without affecting the outcome of the application, e.g., inserting elements into a set. Previous research on this topic had one significant limitation: commutative functions were required to produce identical memory layouts. This prevented previous techniques from detecting functions like malloc, which may return different pointers depending on the order in which it is called, even though these differing results do not affect the overall output of the application. Our new commutativity analysis correctly identifies these situations to better facilitate automatic parallelization. We demonstrate that this analysis can automatically extract significant amounts of parallelism from many applications, and where it is ineffective it can provide software developers with a useful list of functions that may be commutative after semantic program changes that cannot be automated.