ABSTRACT
Purely functional programs should run well on parallel hardware because of the absence of side effects, but it has proved hard to realise this potential in practice. Plenty of papers describe promising ideas, but vastly fewer describe real implementations with good wall-clock performance. We describe just such an implementation, and quantitatively explore some of the complex design tradeoffs that make such implementations hard to build. Our measurements are necessarily detailed and specific, but they are reproducible, and we believe that they offer some general insights.
Supplemental Material
- J. R. Armstrong, R. Virding, C. Wikstrom, and M. Williams. Concurrent programming in ERLANG (2nd ed.). Prentice Hall International (UK) Ltd., Hertfordshire, UK, 1996. Google ScholarDigital Library
- Nimar S. Arora, Robert D. Blumofe, and C. Greg Plaxton. Thread scheduling for multiprogrammed multiprocessors. In In Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), Puerto Vallarta, pages 119--129, 1998. Google ScholarDigital Library
- J. Berthold, S. Marlow, A. Al Zain, and K. Hammond. Comparing and optimising parallel Haskell implementations on multicore. In IFL'08: International Symposium on Implementation and Application of Functional Languages (Draft Proceedings), Hatfield, UK, 2008.Google Scholar
- Stephen M. Blackburn and Antony L. Hosking. Barriers: friend or foe? In ISMM '04: Proceedings of the 4th international symposium on Memory management, pages 143--151, New York, NY, USA, 2004. ACM. ISBN 1-58113-945-4. doi: http://doi.acm.org/10.1145/1029873.1029891. Google ScholarDigital Library
- G. E. Blelloch, S. Chatterjee, J. C. Hardwick, J. Sipelstein, and M Zagha. Implementation of a portable nested data-parallel language. JDPC, 21(1):4--14, 1994. Google ScholarDigital Library
- R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, and C. E. Lieserson. Cilk: an efficient multithreaded runtime system. SIGNPLAN Not., 30(8):207--216, 2001. Google ScholarDigital Library
- M. T. Chakravarty, G. Keller, R. Leshinskiy, and W. Pfannenstiel. Nepal - Nested Data Parallelism in Haskell. LNCS, 2150, Aug 2001.Google ScholarCross Ref
- David Chase and Yossi Lev. Dynamic circular work-stealing deque. In SPAA '05: Proceedings of the seventeenth annual ACM symposium on Parallelism in algorithms and architectures, pages 21--28, New York, NY, USA, 2005. ACM. ISBN 1-58113-986-1. doi: http://doi.acm.org/10.1145/1073970.1073974. Google ScholarDigital Library
- Damien Doligez and Xavier Leroy. A concurrent, generational garbage collector for a multithreaded implementation of ml. In POPL '93: Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 113--123, New York, NY, USA, 1993. ACM. ISBN 0-89791-560-7. doi: http://doi.acm.org/10.1145/158511.158611. Google ScholarDigital Library
- Christine Flood, Dave Detlefs, Nir Shavit, and Catherine Zhang. Parallel garbage collection for shared memory multiprocessors. In Usenix Java Virtual Machine Research and Technology Symposium (JVM '01), Monterey, CA, 2001. URL citeseer.ist.psu.edu/flood01parallel.html. Google ScholarDigital Library
- Matthew Fluet, Mike Rainey, and John Reppy. A scheduling framework for general-purpose parallel languages. SIGPLAN Not., 43 (9):241--252, 2008a. ISSN 0362-1340. doi: http://doi.acm.org/10.1145/1411203.1411239. Google ScholarDigital Library
- Matthew Fluet, Mike Rainey, John Reppy, and Adam Shaw. Implicitly-threaded Parallelism in Manticore. International Conference on Functional Programming, pages 119--130, 2008b. Google ScholarDigital Library
- J. L. Gaudiot, T. DeBoni, J. Feo,W. Bohm,W. Najjar, and P. Miller. The Sisal model of functional programming and its implementation. In As '97, pages 112--123, Los Altimos, CA, March 1997. IEEE Computer Society Press. Google ScholarDigital Library
- Tim Harris and Satnam Singh. Feedback directed implicit parallelism. In ICFP '07: Proceedings of the 12th ACM SIGPLAN international conference on Functional programming, pages 251--264, New York, NY, USA, 2007. ACM. ISBN 978-1-59593-815-2. doi: http://doi.acm.org/10.1145/1291151.1291192. Google ScholarDigital Library
- Tim Harris, Simon Marlow, and Simon Peyton Jones. Haskell on a shared-memory multiprocessor. In Haskell '05: Proceedings of the 2005 ACM SIGPLAN workshop on Haskell, pages 49--61. ACM Press, September 2005. ISBN 1-59593-071-X. doi: http://doi.acm.org/10.1145/1088348.1088354. URL http://www.haskell.org/~simonmar/papers/multiproc.pdf. Google ScholarDigital Library
- S. Jagannathan and J. Philbin. A customizable substrate for concurrent languages. In ACM Conference on Programming Languages Design and Implementation (PLDI'92), pages 55--81. ACM Press, June 1992. Google ScholarDigital Library
- P. Li, Simon Marlow, Simon Peyton Jones, and A. Tolmach. Lightweight concurrency primitives for GHC. In Haskell '07: Proceedings of the 2007 ACM SIGPLAN workshop on Haskell, pages 107--118. ACM Press, September 2007a. Google ScholarDigital Library
- Peng Li, Simon Marlow, Simon Peyton Jones, and Andrew Tolmach. Lightweight concurrency primitives for GHC. Haskell '07: Proceedings of the ACM SIGPLAN workshop on Haskell workshop, June 2007b. URL http://www.haskell.org/~simonmar/papers/conc-substrate.pdf. Google ScholarDigital Library
- H-W. Loidl, P.W. Trinder, K. Hammond, S.B. Junaidu, R.G. Morgan, and S.L. Peyton Jones. Engineering Parallel Symbolic Programs in GPH. Concurrency -- Practice and Experience, 11:701--752, 1999. URL http://www.cee.hw.ac.uk/\~{}dsg/gph/papers/ps/cpe.ps.gz.Google Scholar
- H.-W. Loidl, F. Rubio, N. Scaife, K. Hammond, S. Horiguchi, U. Klusik, R. Loogen, G. J. Michaelson, R. Pe na, S. Priebe, A J. Rebon, and P. W. Trinder. Comparing parallel functional languages: Programming and performance. Higher Order Symbol. Comput., 16(3):203--251, 2003. ISSN 1388-3690. doi: http://dx.doi.org/10.1023/A:1025641323400. Google ScholarDigital Library
- Rita Loogen, Yolanda Ortega-Mallen, and Ricardo Pena. Parallel Functional Programming in Eden. Journal of Functional Programming, 15(3):431--475, 2005. Google ScholarDigital Library
- Simon Marlow, Simon Peyton Jones, and Wolfgang Thaller. Extending the haskell foreign function interface with concurrency. In Proceedings of the ACM SIGPLAN workshop on Haskell, pages 57--68, Snowbird, Utah, USA, September 2004. URL http://www.haskell.org/~simonmar/papers/conc-ffi.pdf. Google ScholarDigital Library
- Simon Marlow, Tim Harris, Roshan P. James, and Simon Peyton Jones. Parallel generational-copying garbage collection with a block-structured heap. In ISMM '08: Proceedings of the 7th international symposium on Memory management. ACM, June 2008. URL http://www.haskell.org/~simonmar/papers/parallel-gc.pdf. Google ScholarDigital Library
- R. S. Nikhl. ID language reference manual. Laboratory for Computer Science, MIT, Jul 1991.Google Scholar
- R. S. Nikhl and Arvind. Implicit Parallel Programming in pH. Morgan Kaufmann Publishers, San Francisco, CA, 2001. Google ScholarDigital Library
- WD Partain. The nofib benchmark suite of Haskell programs. In Functional Programming, Glasgow 1992, Workshops in Computing, pages 195--202. Springer Verlag, 1992. Google ScholarDigital Library
- S. Peyton Jones, A. Gordon, and S. Finne. Concurrent Haskell. In Proc. of POPL'96, pages 295--308. ACM Press, 1996. Google ScholarDigital Library
- Simon Peyton Jones, Roman Leshchinskiy, Gabriele Keller, and Manuel M. T. Chakravarty. Harnessing the multicores: Nested data parallelism in Haskell. In FSTTCS 2009: IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, 2009.Google Scholar
- Daniel Spoonhower, Guy E. Blelloch, Robert Harper, and Phillip B. Gibbons. Space profiling for parallel functional programs. In ICFP '08: Proceedings of the 12th ACM SIGPLAN international conference on Functional programming, pages 253--264, New York, NY, USA, 2008. ACM. Google ScholarDigital Library
- P. W. Trinder, H.-W. Loidl, and R. F. Pointon. Parallel and distributed Haskells. J. Funct. Program., 12(5):469--510, 2002. ISSN 0956-7968. doi: http://dx.doi.org/10.1017/S0956796802004343. Google ScholarDigital Library
- PW Trinder, K Hammond, JS Mattson, AS Partridge, and SL Peyton Jones. GUM: a portable parallel implementation of haskell. In ACM Conference on Programming Languages Design and Implementation (PLDI'96). ACM Press, Philadelphia, May 1996. Google ScholarDigital Library
- PW Trinder, K Hammond, H-W Loidl, and SL Peyton Jones. Algorithm + strategy = parallelism. Journal of Functional Programming, 8:23--60, January 1998. Google ScholarDigital Library
Index Terms
- Runtime support for multicore Haskell
Recommendations
Runtime support for multicore Haskell
ICFP '09Purely functional programs should run well on parallel hardware because of the absence of side effects, but it has proved hard to realise this potential in practice. Plenty of papers describe promising ideas, but vastly fewer describe real ...
Hybrid PGAS runtime support for multicore nodes
PGAS '10: Proceedings of the Fourth Conference on Partitioned Global Address Space Programming ModelWith multicore processors as the standard building block for high performance systems, parallel runtime systems need to provide excellent performance on shared memory, distributed memory, and hybrids. Conventional wisdom suggests that threads should be ...
Comparing and Optimising Parallel Haskell Implementations for Multicore Machines
ICPPW '09: Proceedings of the 2009 International Conference on Parallel Processing WorkshopsIn this paper, we investigate the differences and tradeoffs imposed by two parallel Haskell dialects running on multicore machines. GpH and Eden are both constructed using the highly-optimising sequential GHC compiler, and share thread scheduling, and ...
Comments