ABSTRACT
Partitioned Global Address Space (PGAS) languages combine the programming convenience of shared memory with the locality and performance control of message passing. One such language, Unified Parallel C (UPC), is an extension of ISO C defined by a consortium that boasts multiple proprietary and open source compilers. Another PGAS language, Titanium, is a dialect of Java designed for high performance scientific computation. In this paper, we describe some of the highlights of two related projects: the Titanium project centered at U.C. Berkeley and the UPC project centered at Lawrence Berkeley National Laboratory. Both compilers use a source-to-source strategy that translates the parallel languages to C with calls to a communication layer called GASNet. The result is portable high-performance compilers that run on a large variety of shared and distributed memory multiprocessors. Both projects combine compiler, runtime, and application efforts to demonstrate some of the performance and productivity advantages of these languages.