ABSTRACT
Existing dynamic race detectors suffer from at least one of the following three limitations:
(i)space overhead per memory location grows linearly with the number of parallel threads [13], severely limiting the parallelism that the algorithm can handle;
(ii)sequentialization: the parallel program must be processed in a sequential order, usually depth-first [12, 24]. This prevents the analysis from scaling with available hardware parallelism, inherently limiting its performance;
(iii) inefficiency: even though race detectors with good theoretical complexity exist, they do not admit efficient implem entations and are unsuitable for practical use [4, 18].
We present a new precise dynamic race detector that leverages structured parallelism in order to address these limitations. Our algorithm requires constant space per memory location, works in parallel, and is efficient in practice. We implemented and evaluated our algorithm on a set of 15 benchmarks. Our experimental results indicate an average (geometric mean) slowdown of 2.78x on a 16-core SMP system.
Supplemental Material
Available for Download
Included files: This folder includes the following files: the only high-level tex file, Figures and Building the document instructions (to create PS and PDF files): latex draft, dvips draft, ps2pdf draft.ps
- Intel Cilk++ Programmer's Guide. http://software.intel.com/enus/ articles/download-intel-cilk-sdk/.Google Scholar
- The Computer Language Benchmarks Game. http://shootout.alioth.debian.org/.Google Scholar
- BARIK, R., ET AL. The habanero multicore software research project. In Proceeding of the 24th ACM SIGPLAN conference companion on Object oriented programming systems languages and applications (2009), ACM, pp. 735--736. Google ScholarDigital Library
- BENDER, M. A., ET AL. On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs. In SPAA'04 (Barcelona, Spain, June27-30 2004), pp. 133--144. Google ScholarDigital Library
- BLUMOFE, R. D., ET AL. Cilk: an efficient multithreaded runtime system. In PPoPP'95 (Oct. 1995), pp. 207--216. Google ScholarDigital Library
- CAVE, V., ZHAO, J., SHIRAKO, J., AND SARKAR, V. Habanero-java: the new adventures of old x10. In PPPJ'11 (2011). Google ScholarDigital Library
- CHARLES, P., ET AL. X10: An object-oriented approach to nonuniform cluster computing. In OOPSLA 2005 Onward! Track (2005). Google ScholarDigital Library
- CHENG, G.-I., FENG, M., LEISERSON, C. E., RANDALL, K. H., AND STARK, A. F. Detecting data races in cilk programs that use locks. In SPAA'98 (1998), pp. 298--309. Google ScholarDigital Library
- CONG, J., SARKAR, V., REINMAN, G., AND BUI, A. Customizable Domain-Specific Computing. IEEE Design and Test, 2:28 (Mar 2011), 6--15. Google ScholarDigital Library
- DINNING, A., AND SCHONBERG, E. An empirical comparison of monitoring algorithms for access anomaly detection. In PPoPP'90 (1990), ACM, pp. 1--10. Google ScholarDigital Library
- DURAN, A., ET AL. Barcelona OpenMP tasks suite: A set of benchmarks targeting the exploitation of task parallelism in openmp. In ICPP'09 (2009), pp. 124--131. Google ScholarDigital Library
- FENG,M., AND LEISERSON,C. E. Efficient detection of determinacy races in cilk programs. In SPAA'97 (1997), ACM, pp. 1--11. Google ScholarDigital Library
- FLANAGAN, C., AND FREUND, S. N. FastTrack: efficient and precise dynamic race detection. In PLDI '09 (2009), ACM, pp. 121--133. Google ScholarDigital Library
- FLANAGAN, C., AND FREUND, S. N. The roadrunner dynamic analysis framework for concurrent programs. In PASTE'10 (2010), ACM, pp. 1--8. Google ScholarDigital Library
- FRIGO, M., LEISERSON, C. E., AND RANDALL, K. H. The implementation of the cilk-5 multithreaded language. In PLDI'98 (1998), ACM, pp. 212--223. Google ScholarDigital Library
- GUO, Y., ET AL. Work-first and help-first scheduling policies for async-finish task parallelism. In IPDPS'09 (2009), IEEE Computer Society, pp. 1--12. Google ScholarDigital Library
- GUO, Y., ZHAO, J., CAVE, V., AND SARKAR, V. Slaw: A scalable locality-aware adaptive work-stealing scheduler. In IPDPS (2010). Google ScholarDigital Library
- KARUNARATNA, T. C. Nondeterminator-3: A provably good datarace detector that runs in parallel. Master's thesis, Department of Electrical Engineering and Computer Science, MIT,, Sept. 2005.Google Scholar
- LAMPORT, L. Concurrent reading and writing. Commun. ACM 20 (November 1977), 806--811. Google ScholarDigital Library
- LEE, J. K., AND PALSBERG, J. Featherweight x10: a core calculus for async-finish parallelism. In PPoPP'10 (2010), ACM, pp. 25--36. Google ScholarDigital Library
- MELLOR-CRUMMEY, J. On-the-fly detection of data races for programs with nested fork-join parallelism. In Supercomputing'91 (1991), ACM, pp. 24--33. Google ScholarDigital Library
- MELLOR-CRUMMEY, J. Compile-time support for efficient data race detection in shared-memory parallel programs. In PADD '93: Proceedings of the ACM/ONR workshop on Parallel and distributed debugging (1993), ACM, pp. 129--139. Google ScholarDigital Library
- OpenMP Application Program Interface v 3.0, 2008.Google Scholar
- RAMAN, R., ET AL. Efficient data race detection for async-finish parallelism. In RV'10 (2010), Springer-Verlag, pp. 368--383. Google ScholarDigital Library
- RAMAN, R., ZHAO, J., SARKAR, V., VECHEV, M., AND YAHAV, E. Scalable and precise dynamic datarace detection for structured parallelism. Tech. Rep. TR12-01, Department of Computer Science, Rice University, Houston, TX, 2012.Google Scholar
- SAVAGE, S., ET AL. Eraser: a dynamic data race detector for multithreaded programs. ACM Trans. Comput. Syst. 15, 4 (1997), 391--411. Google ScholarDigital Library
- SCHONBERG, E. On-the-fly detection of access anomalies. In PLDI'98 (1998), pp. 285--297. Google ScholarDigital Library
- SMITH, L. A., AND BULL, J. M. A Parallel Java Grande Benchmark Suite. In In Supercomputing'01 (2001), ACM Press, p. 8. Google ScholarDigital Library
- VALLEE-RAI, R., ET AL. Soot - a Java Optimization Framework. In Proceedings of CASCON 1999 (1999), pp. 125--135. Google ScholarDigital Library
- WINSKEL, G. The Formal Semantics of Programming Languages. MIT Press, 1993. Google ScholarCross Ref
- ZHAO, J., AND SARKAR, V. Intermediate language extensions for parallelism. In VMIL'11 (2011), pp. 333--334. Google ScholarDigital Library
Index Terms
- Scalable and precise dynamic datarace detection for structured parallelism
Recommendations
Scalable and precise dynamic datarace detection for structured parallelism
PLDI '12Existing dynamic race detectors suffer from at least one of the following three limitations:
(i)space overhead per memory location grows linearly with the number of parallel threads [13], severely limiting the parallelism that the algorithm can handle;
(...
Efficient data race detection for async-finish parallelism
A major productivity hurdle for parallel programming is the presence of data races . Data races can lead to all kinds of harmful program behaviors, including determinism violations and corrupted memory. However, runtime overheads of current dynamic ...
Parallel data race detection for task parallel programs with locks
FSE 2016: Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software EngineeringProgramming with tasks is a promising approach to write performance portable parallel code. In this model, the programmer explicitly specifies tasks and the task parallel runtime employs work stealing to distribute tasks among threads. Similar to ...
Comments