Supporting dynamic data structures on distributed-memory machines

Abstract
Compiling for distributed-memory machines has been a very active research area in recent years. Much of this work has concentrated on programs that use arrays as their primary data structures. To date, little work has been done to address the problem of supporting programs that use pointer-based dynamic data structures. The techniques developed for supporting SPMD execution of array-based programs rely on the fact that arrays are statically defined and directly addressable. Recursive data structures do not have these properties, so new techniques must be developed. In this article, we describe an execution model for supporting programs that use pointer-based dynamic data structures. This model uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy task creation. We intend to exploit this execution model using compiler analyses and automatic parallelization techniques. We have implemented a prototype system, which we call Olden, that runs on the Intel iPSC/860 and the Thinking Machines CM-5. We discuss our implementation and report on experiments with five benchmarks.