Abstract
We present the Stanford Parallel Applications for Shared-Memory (SPLASH), a set of parallel applications for use in the design and evaluation of shared-memory multiprocessing systems. Our goal is to provide a suite of realistic applications that will serve as a well-documented and consistent basis for evaluation studies. We describe the applications currently in the suite in detail, discuss some of their important characteristics, and explore their behavior by running them on a real multiprocessor as well as on a simulator of an idealized parallel architecture. We expect the current set of applications to act as a nucleus for a suite that will grow with time.