research-article

Multiverse: efficiently supporting distributed high-level speculation

Authors:
Kaushik Ravichandran

Georgia Institute of Technology, Atlanta, GA, USA

Georgia Institute of Technology, Atlanta, GA, USA
View Profile

,
Santosh Pande

Georgia Institute of Technology, Atlanta, GA, USA

Georgia Institute of Technology, Atlanta, GA, USA
View Profile

Authors Info & Claims

ACM SIGPLAN Notices Volume 48 Issue 10October 2013pp 533–552https://doi.org/10.1145/2544173.2509525

Published:29 October 2013Publication History

ACM SIGPLAN Notices

Abstract

Algorithmic speculation or high-level speculation is a promising programming paradigm which allows programmers to speculatively branch an execution into multiple independent parallel sections and then choose the best (perhaps fastest) amongst them. The continuing execution after the speculatively branched section sees only the modifications made by the best one. This programming paradigm allows programmers to harness parallelism and can provide dramatic performance improvements.

In this paper we present the Multiverse speculative programming model. Multiverse allows programmers to exploit parallelism through high-level speculation. It can effectively harness large amounts of parallelism by speculating across an entire cluster and is not bound by the parallelism available in a single machine. We present abstractions and a runtime which allow programmers to introduce large scale high-level speculative parallelism into applications with minimal effort. We introduce a novel on-demand address space sharing mechanism which provide speculations efficient transparent access to the original address space of the application (including the use of pointers) across machine boundaries. Multiverse provides single commit semantics across speculations while guaranteeing isolation between them. We also introduce novel mechanisms to deal with scalability bottlenecks when there are a large number of speculations.

We demonstrate that for several benchmarks, Multiverse achieves impressive speedups and good scalability across entire clusters. We study the overheads of the runtime and demonstrate how our special scalability mechanisms are crucial in scaling cluster wide.

References

H. Abdel-Shafi, E. Speight, and J. K. Bennett. Efficient user-level thread migration and checkpointing on windows nt clusters. In phProceedings of the 3rd conference on USENIX Windows NT Symposium - Volume 3, WINSYM'99, pages 1--1, Berkeley, CA, USA, 1999. USENIX Association. URL http://dl.acm.org/citation.cfm?id=1268427.1268428. Google ScholarDigital Library
J. Ansel, C. Chan, Y. L. Wong, M. Olszewski, Q. Zhao, A. Edelman, and S. Amarasinghe. Petabricks: a language and compiler for algorithmic choice. In phProceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation, PLDI '09, pages 38--49, New York, NY, USA, 2009. ACM. ISBN 978--1--60558--392--1. 10.1145/1542476.1542481. URL http://doi.acm.org/10.1145/1542476.1542481. Google ScholarDigital Library
G. Antoniu, L. Bougé, and R. Namyst. An efficient and transparent thread migration scheme in the pm2 runtime system. In phProceedings of the 11 IPPS/SPDP'99 Workshops Held in Conjunction with the 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing, pages 496--510, London, UK, UK, 1999. Springer-Verlag. ISBN 3--540--65831--9. URL http://dl.acm.org/citation.cfm?id=645611.662349. Google ScholarDigital Library
E. D. Berger, T. Yang, T. Liu, and G. Novark. Grace: safe multithreaded programming for c/cGoogle Scholar
. In phProceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications, OOPSLA '09, pages 81--96, New York, NY, USA, 2009. ACM. ISBN 978--1--60558--766-0. 10.1145/1640089.1640096. URL http://doi.acm.org/10.1145/1640089.1640096.Google Scholar
Y. Caniou, D. Diaz, F. Richoux, P. Codognet, and S. Abreu. Performance analysis of parallel constraint-based local search. In phProceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming, PPoPP '12, pages 337--338, New York, NY, USA, 2012. ACM. ISBN 978--1--4503--1160--1. 10.1145/2145816.2145883. URL http://doi.acm.org/10.1145/2145816.2145883. Google ScholarDigital Library
R. E. Cledat, T. Kumar, and S. Pande. Efficiently speeding up sequential computation through the n-way programming model. In phProceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications, OOPSLA '11, pages 537--554, New York, NY, USA, 2011. ACM. ISBN 978--1--4503-0940-0. 10.1145/2048066.2048109. URL http://doi.acm.org/10.1145/2048066.2048109. Google ScholarDigital Library
B. Cox, D. Evans, A. Filipi, J. Rowanhill, W. Hu, J. Davidson, J. Knight, A. Nguyen-Tuong, and J. Hiser. N-variant systems: a secretless framework for security through diversity. In phProceedings of the 15th conference on USENIX Security Symposium - Volume 15, USENIX-SS'06, Berkeley, CA, USA, 2006. USENIX Association. URL http://dl.acm.org/citation.cfm?id=1267336.1267344. Google ScholarDigital Library
H. Croft, K. Falconer, and R. Guy. phUnsolved problems in geometry. Problem books in mathematics. Springer-Verlag, 1991. ISBN 9780387975061. URL http://books.google.com/books?id=xF4PAQAAMAAJ.Google Scholar
D. Diaz, P. Codognet, and S. Abreu. Adaptive search distribution, http://cri-dist.univ-paris1.fr/diaz/adaptive/.Google Scholar
W. R. Dieter and J. E. Lumpp, Jr. User-level checkpointing for linuxthreads programs. In phProceedings of the FREENIX Track: 2001 USENIX Annual Technical Conference, pages 81--92, Berkeley, CA, USA, 2001. USENIX Association. ISBN 1--880446--10--3. URL http://dl.acm.org/citation.cfm?id=647054.715766. Google ScholarDigital Library
G. Fursin, A. Cohen, M. O'Boyle, and O. Temam. A practical method for quickly evaluating program optimizations. In phProceedings of the First international conference on High Performance Embedded Architectures and Compilers, HiPEAC'05, pages 29--46, Berlin, Heidelberg, 2005. Springer-Verlag. ISBN 3--540--30317-0, 978--3--540--30317--6. 10.1007/11587514_4. URL http://dx.doi.org/10.1007/11587514_4. Google ScholarDigital Library
R. P. Garg and I. Sharapov. phTechniques for Optimizing Applications: High Performance Computing. Prentice Hall Professional Technical Reference, 2002. ISBN 0130091189. Google ScholarDigital Library
S. Golomb and H. Taylor. Constructions and properties of costas arrays. phProceedings of the IEEE, 72 (9): 1143--1163, 1984. ISSN 0018--9219. 10.1109/PROC.1984.12994.Google Scholar
L. Hammond, M. Willey, and K. Olukotun. Data speculation support for a chip multiprocessor. In phProceedings of the eighth international conference on Architectural support for programming languages and operating systems, ASPLOS VIII, pages 58--69, New York, NY, USA, 1998. ACM. ISBN 1--58113--107-0. 10.1145/291069.291020. URL http://doi.acm.org/10.1145/291069.291020. Google ScholarDigital Library
T. Harris, J. Larus, and R. Rajwar. phTransactional Memory, 2nd Edition. Morgan and Claypool Publishers, 2nd edition, 2010. ISBN 1608452352, 9781608452354. Google ScholarDigital Library
K. Helsgaun. An effective implementation of the linkernighan traveling salesman heuristic. phEuropean Journal of Operational Research, 126 (1): 106--130, 2000. ISSN 0377--2217. http://dx.doi.org/10.1016/S0377--2217(99)00284--2. URL http://www.sciencedirect.com/science/article/pii/S0377221799002842.Google ScholarCross Ref
H. Hoos. phStochastic Local Search - Methods, Models, Applications. IOS Press. ISBN 9783898382151. URL http://books.google.com/books?id=xc\_4OtzmHUUC.Google Scholar
P. Hosek and C. Cadar. Safe software updates via multi-version execution. In phProceedings of the 2013 International Conference on Software Engineering, ICSE '13, pages 612--621, Piscataway, NJ, USA, 2013. IEEE Press. ISBN 978-1-4673-3076-3. URL http://dl.acm.org/citation.cfm?id=2486788.2486869. Google ScholarDigital Library
T. IEEE and T. O. Group. The open group base specifications, issue 6, ieee std 1003.1, 2004.Google Scholar
C. Jung, D. Lim, J. Lee, and S. Han. Adaptive execution techniques for smt multiprocessor architectures. In phProceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '05, pages 236--246, New York, NY, USA, 2005. ACM. ISBN 1--59593-080--9. 10.1145/1065944.1065976. URL http://doi.acm.org/10.1145/1065944.1065976. Google ScholarDigital Library
L. V. Kale and S. Krishnan. CharmGoogle Scholar
: a portable concurrent object oriented system based on cGoogle Scholar
. In phProceedings of the eighth annual conference on Object-oriented programming systems, languages, and applications, OOPSLA '93, pages 91--108, New York, NY, USA, 1993. ACM. ISBN 0--89791--587--9. 10.1145/165854.165874. URL http://doi.acm.org/10.1145/165854.165874.Google Scholar
A. P. Kamath, N. K. Karmarkar, K. G. Ramakrishnan, and M. G. C. Resende. A continuous approach to inductive inference. phMath. Program., 57 (2): 215--238, Nov. 1992. ISSN 0025--5610. 10.1007/BF01581082. URL http://dx.doi.org/10.1007/BF01581082. Google ScholarCross Ref
M. Kulkarni, K. Pingali, B. Walter, G. Ramanarayanan, K. Bala, and L. P. Chew. Optimistic parallelism requires abstractions. In phProceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation, PLDI '07, pages 211--222, New York, NY, USA, 2007. ACM. ISBN 978--1--59593--633--2. 10.1145/1250734.1250759. URL http://doi.acm.org/10.1145/1250734.1250759. Google ScholarDigital Library
J. Lau, M. Arnold, M. Hind, and B. Calder. Online performance auditing: using hot optimizations without getting burned. In phProceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation, PLDI '06, pages 239--251, New York, NY, USA, 2006. ACM. ISBN 1--59593--320--4. 10.1145/1133981.1134010. URL http://doi.acm.org/10.1145/1133981.1134010. Google ScholarDigital Library
M. Luby and W. Ertel. Optimal parallelization of las vegas algorithms. In phProceedings of the 11th Annual Symposium on Theoretical Aspects of Computer Science, STACS '94, pages 463--474, London, UK, UK, 1994. Springer-Verlag. ISBN 3--540--57785--8. URL http://dl.acm.org/citation.cfm?id=646510.695012. Google ScholarDigital Library
M. Mitzenmacher and E. Upfal. phProbability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005. ISBN 9780521835404. URL http://books.google.com/books?id=0bAYl6d7hvkC. Google ScholarDigital Library
S. Negara, G. Zheng, K.-C. Pan, N. Negara, R. E. Johnson, L. V. Kalé, and P. M. Ricker. Automatic mpi to ampi program transformation using photran. In phProceedings of the 2010 conference on Parallel processing, Euro-Par 2010, pages 531--539, Berlin, Heidelberg, 2011. Springer-Verlag. ISBN 978--3--642--21877--4. URL http://dl.acm.org/citation.cfm?id=2031978.2032050. Google ScholarDigital Library
AMD64 Architecture Programmer's Manual, Volume 2, 2010.Google Scholar
P. S. Pacheco. phParallel programming with MPI. 1996. ISBN 1--55860--339--5. Google ScholarDigital Library
P. Prabhu, G. Ramalingam, and K. Vaswani. Safe programmable speculative parallelism. In phProceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation, PLDI '10, pages 50--61, New York, NY, USA, 2010. ACM. ISBN 978--1--4503-0019--3. 10.1145/1806596.1806603. URL http://doi.acm.org/10.1145/1806596.1806603. Google ScholarDigital Library
H. K. Pyla, C. Ribbens, and S. Varadarajan. Exploiting coarse-grain speculative parallelism. In phProceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications, OOPSLA '11, pages 555--574, New York, NY, USA, 2011. ACM. ISBN 978--1--4503-0940-0. 10.1145/2048066.2048110. URL http://doi.acm.org/10.1145/2048066.2048110. Google ScholarDigital Library
M. Rieker, J. Ansel, and G. Cooperman. Transparent user-level checkpointing for the native posix thread library for linux. In phThe 2006 International Conference on Parallel and Distributed Processing Techniques and Applications, Jun 2006.Google Scholar
Rutgers. Dimacs benchmarks, http://dimacs.rutgers.edu/challenges/.Google Scholar
B. Selman, H. A. Kautz, and B. Cohen. Noise strategies for improving local search. In phProceedings of the twelfth national conference on Artificial intelligence (vol. 1), AAAI '94, pages 337--343, Menlo Park, CA, USA, 1994. American Association for Artificial Intelligence. ISBN 0--262--61102--3. URL http://dl.acm.org/citation.cfm?id=199288.178090. Google ScholarDigital Library
A. Shye, J. Blomstedt, T. Moseley, V. J. Reddi, and D. A. Connors. Plr: A software approach to transient fault tolerance for multicore architectures. phIEEE Trans. Dependable Secur. Comput., 6 (2): 135--148, Apr. 2009. ISSN 1545--5971. 10.1109/TDSC.2008.62. URL http://dx.doi.org/10.1109/TDSC.2008.62. Google ScholarDigital Library
J. Steffan and T. Mowry. The potential for using thread-level data speculation to facilitate automatic parallelization. In phProceedings of the 4th International Symposium on High-Performance Computer Architecture, HPCA '98, pages 2--, Washington, DC, USA, 1998. IEEE Computer Society. ISBN 0--8186--8323--6. URL http://dl.acm.org/citation.cfm?id=822079.822712. Google ScholarDigital Library
O. Trachsel and T. R. Gross. Variant-based competitive parallel execution of sequential programs. In phProceedings of the 7th ACM international conference on Computing frontiers, CF '10, pages 197--206, New York, NY, USA, 2010. ACM. ISBN 978--1--4503-0044--5. 10.1145/1787275.1787325. URL http://doi.acm.org/10.1145/1787275.1787325. Google ScholarDigital Library
S. P. Vanderwiel and D. J. Lilja. Data prefetch mechanisms. phACM Comput. Surv., 32 (2): 174--199, June 2000. ISSN 0360-0300. 10.1145/358923.358939. URL http://doi.acm.org/10.1145/358923.358939. Google ScholarDigital Library
M. J. Voss and R. Eigenmann. High-level adaptive program optimization with adapt. In phACM SIGPLAN Notices, pages 93--102. ACM Press, 2001. Google ScholarDigital Library
C. M. Wintersteiger, Y. Hamadi, and L. Moura. A concurrent portfolio approach to smt solving. In phProceedings of the 21st International Conference on Computer Aided Verification, CAV '09, pages 715--720, Berlin, Heidelberg, 2009. Springer-Verlag. ISBN 978--3--642-02657--7. 10.1007/978--3--642-02658--4_60. URL http://dx.doi.org/10.1007/978--3--642-02658--4_60. Google ScholarCross Ref
W. Wulf and M. Shaw. Global variable considered harmful. phSIGPLAN Not., 8 (2): 28--34, Feb. 1973. ISSN 0362--1340. 10.1145/953353.953355. URL http://doi.acm.org/10.1145/953353.953355. Google ScholarDigital Library
K. Xu and W. Li. Exact phase transitions in random constraint satisfaction problems. phJournal of Artificial Intelligence Research, 12: 93--103, 2000. Google ScholarDigital Library
T.-Y. Yeh and Y. N. Patt. A comparison of dynamic branch predictors that use two levels of branch history. In phProceedings of the 20th annual international symposium on computer architecture, ISCA '93, pages 257--266, New York, NY, USA, 1993. ACM. ISBN 0--8186--3810--9. 10.1145/165123.165161. URL http://doi.acm.org/10.1145/165123.165161. Google ScholarDigital Library

Index Terms

Multiverse: efficiently supporting distributed high-level speculation
1. Computing methodologies
  1. Distributed computing methodologies
    1. Distributed programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language features
        Concurrent programming structures
      2. Language types
        Distributed programming languages

Recommendations

Multiverse: efficiently supporting distributed high-level speculation
OOPSLA '13: Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications

Algorithmic speculation or high-level speculation is a promising programming paradigm which allows programmers to speculatively branch an execution into multiple independent parallel sections and then choose the best (perhaps fastest) amongst them. The ...
Read More
An evaluation of speculative instruction execution on simultaneous multithreaded processors

Modern superscalar processors rely heavily on speculative execution for performance. For example, our measurements show that on a 6-issue superscalar, 93% of committed instructions for SPECINT95 are speculative. Without speculation, processor resources ...
Read More
Tuning Compiler Optimizations for Simultaneous Multithreading
Special issue on the 30th annual ACM/IEEE international symposium on microarchitecture, part II

Simultaneous Multithreading (SMT) is a processor architectural technique that promises to significantly improve the utilization and performance of modern wide-issue superscalar processors. An SM T processor is capable of issuing multiple instructions ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM SIGPLAN Notices Volume 48, Issue 10
OOPSLA '13
October 2013
867 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2544173
Editor:
Mark W. Bailey
Hamilton College, Clinton, NY
Issue’s Table of Contents
OOPSLA '13: Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications
October 2013
904 pages
ISBN:9781450323741
DOI:10.1145/2509136
Co-chair:
Antony Hosking
Purdue University, USA
,
General Chair:
Patrick Eugster
Purdue University, USA
,
Program Chair:
Cristina V. Lopes
University of California, Irvine, USA
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 29 October 2013
Check for updates
Author Tags
algorithmic speculation
cluster
function speculation
multiverse
parallelism
speculation
Qualifiers
- research-article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 4
  Total Citations
  View Citations
- 374
  Total Downloads
- Downloads (Last 12 months)10
- Downloads (Last 6 weeks)1
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Multiverse: efficiently supporting distributed high-level speculation

ACM SIGPLAN Notices

Abstract

References

Cited By

Index Terms

Recommendations

Multiverse: efficiently supporting distributed high-level speculation

An evaluation of speculative instruction execution on simultaneous multithreaded processors

Tuning Compiler Optimizations for Simultaneous Multithreading