Skip to main content

2016 | OriginalPaper | Buchkapitel

Fencing Programs with Self-Invalidation and Self-Downgrade

verfasst von : Parosh Aziz Abdulla, Mohamed Faouzi Atig, Stefanos Kaxiras, Carl Leonardsson, Alberto Ros, Yunyun Zhu

Erschienen in: Formal Techniques for Distributed Objects, Components, and Systems

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Cache coherence protocols using self-invalidation and self-downgrade have recently seen increased popularity due to their simplicity, potential performance efficiency, and low energy consumption. However, such protocols result in memory instruction reordering, thus causing extra program behaviors that are often not intended by the programmer. We propose a novel formal model that captures the semantics of programs running under such protocols, and employs a set of fences that interact with the coherence layer. Using the model, we perfform a reachability analysis that can check whether a program satisfies a given safety property with the current set of fences. Based on an algorithm in [19], we describe a method for insertion of optimal sets of fences that ensure correctness of the program under such protocols. The method relies on a counter-example guided fence insertion procedure. One feature of our method is that it can handle a variety of fences (with different costs). This diversity makes optimization more difficult since one has to optimize the total cost of the inserted fences, rather than just their number. To demonstrate the strength of our approach, we have implemented a prototype and run it on a wide range of examples and benchmarks. We have also, using simulation, evaluated the performance of the resulting fenced programs.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
This definition can be generalized. Our prototype tool does indeed support a more general definition of fence positions, which is left out of the article for simplicity.
 
2
Our methods could also run under a plain Sd protocol. However, to our knowledge, no cache coherence protocol employs only Sd without Si.
 
Literatur
1.
Zurück zum Zitat Abdulla, P.A., Atig, M.F., Chen, Y.-F., Leonardsson, C., Rezine, A.: Counter-example guided fence insertion under TSO. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 204–219. Springer, Heidelberg (2012)CrossRef Abdulla, P.A., Atig, M.F., Chen, Y.-F., Leonardsson, C., Rezine, A.: Counter-example guided fence insertion under TSO. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 204–219. Springer, Heidelberg (2012)CrossRef
2.
Zurück zum Zitat Adve, S.V., Hill, M.D.: Weak ordering - a new definition. In: ISCA, pp. 2–14 (1990) Adve, S.V., Hill, M.D.: Weak ordering - a new definition. In: ISCA, pp. 2–14 (1990)
3.
Zurück zum Zitat Alglave, J., Kroening, D., Nimal, V., Poetzl, D.: Don’t sit on the fence. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 508–524. Springer, Heidelberg (2014) Alglave, J., Kroening, D., Nimal, V., Poetzl, D.: Don’t sit on the fence. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 508–524. Springer, Heidelberg (2014)
4.
Zurück zum Zitat Bouajjani, A., Derevenetc, E., Meyer, R.: Checking and enforcing robustness against TSO. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 533–553. Springer, Heidelberg (2013)CrossRef Bouajjani, A., Derevenetc, E., Meyer, R.: Checking and enforcing robustness against TSO. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 533–553. Springer, Heidelberg (2013)CrossRef
5.
Zurück zum Zitat Bouajjani, A., Meyer, R., Möhlmann, E.: Deciding robustness against total store ordering. In: Aceto, L., Henzinger, M., Sgall, J. (eds.) ICALP 2011, Part II. LNCS, vol. 6756, pp. 428–440. Springer, Heidelberg (2011)CrossRef Bouajjani, A., Meyer, R., Möhlmann, E.: Deciding robustness against total store ordering. In: Aceto, L., Henzinger, M., Sgall, J. (eds.) ICALP 2011, Part II. LNCS, vol. 6756, pp. 428–440. Springer, Heidelberg (2011)CrossRef
6.
Zurück zum Zitat Chase, D., Lev, Y.: Dynamic circular work-stealing deque. In: SPAA, pp. 21–28 (2005) Chase, D., Lev, Y.: Dynamic circular work-stealing deque. In: SPAA, pp. 21–28 (2005)
7.
Zurück zum Zitat Choi, B., Komuravelli, R., Sung, H., Smolinski, R., Honarmand, N., Adve, S.V., Adve, V.S., Carter, N.P., Chou, C.T.: DeNovo: rethinking the memory hierarchy for disciplined parallelism. In: PACT, pp. 155–166 (2011) Choi, B., Komuravelli, R., Sung, H., Smolinski, R., Honarmand, N., Adve, S.V., Adve, V.S., Carter, N.P., Chou, C.T.: DeNovo: rethinking the memory hierarchy for disciplined parallelism. In: PACT, pp. 155–166 (2011)
8.
Zurück zum Zitat Davari, M., Ros, A., Hagersten, E., Kaxiras, S.: An efficient, self-contained, on-chip, directory: DIR\(_1\)-SISD. In: PACT, pp. 317–330 (2015) Davari, M., Ros, A., Hagersten, E., Kaxiras, S.: An efficient, self-contained, on-chip, directory: DIR\(_1\)-SISD. In: PACT, pp. 317–330 (2015)
9.
Zurück zum Zitat Dice, D., Shalev, O., Shavit, N.N.: Transactional locking II. In: Dolev, S. (ed.) DISC 2006. LNCS, vol. 4167, pp. 194–208. Springer, Heidelberg (2006)CrossRef Dice, D., Shalev, O., Shavit, N.N.: Transactional locking II. In: Dolev, S. (ed.) DISC 2006. LNCS, vol. 4167, pp. 194–208. Springer, Heidelberg (2006)CrossRef
10.
Zurück zum Zitat Dijkstra, E.W.: Cooperating sequential processes (2002) Dijkstra, E.W.: Cooperating sequential processes (2002)
11.
Zurück zum Zitat Herlihy, M., Shavit, N.: The Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc., San Francisco (2008) Herlihy, M., Shavit, N.: The Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc., San Francisco (2008)
12.
Zurück zum Zitat Hower, D.R., Hechtman, B.A., Beckmann, B.M., Gaster, B.R., Hill, M.D., Reinhardt, S.K., Wood, D.A.: Heterogeneous-race-free memory models. In: ASPLOS, pp. 427–440 (2014) Hower, D.R., Hechtman, B.A., Beckmann, B.M., Gaster, B.R., Hill, M.D., Reinhardt, S.K., Wood, D.A.: Heterogeneous-race-free memory models. In: ASPLOS, pp. 427–440 (2014)
13.
Zurück zum Zitat Kaxiras, S., Keramidas, G.: SARC coherence: scaling directory cache coherence in performance and power. IEEE Micro 30(5), 54–65 (2011)CrossRef Kaxiras, S., Keramidas, G.: SARC coherence: scaling directory cache coherence in performance and power. IEEE Micro 30(5), 54–65 (2011)CrossRef
14.
Zurück zum Zitat Kaxiras, S., Ros, A.: A new perspective for efficient virtual-cache coherence. In: ISCA, pp. 535–547 (2013) Kaxiras, S., Ros, A.: A new perspective for efficient virtual-cache coherence. In: ISCA, pp. 535–547 (2013)
15.
Zurück zum Zitat Koukos, K., Ros, A., Hagersten, E., Kaxiras, S.: Building heterogeneous unified virtual memories (UVMS) without the overhead. ACM TACO 13(1), 1:1–1:22 (2016) Koukos, K., Ros, A., Hagersten, E., Kaxiras, S.: Building heterogeneous unified virtual memories (UVMS) without the overhead. ACM TACO 13(1), 1:1–1:22 (2016)
16.
Zurück zum Zitat Kuperstein, M., Vechev, M., Yahav, E.: Automatic inference of memory fences. In: FMCAD, pp. 111–119. IEEE (2010) Kuperstein, M., Vechev, M., Yahav, E.: Automatic inference of memory fences. In: FMCAD, pp. 111–119. IEEE (2010)
18.
Zurück zum Zitat Lebeck, A.R., Wood, D.A.: Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors. In: ISCA, pp. 48–59 (1995) Lebeck, A.R., Wood, D.A.: Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors. In: ISCA, pp. 48–59 (1995)
19.
Zurück zum Zitat Liu, F., Nedev, N., Prisadnikov, N., Vechev, M.T., Yahav, E.: Dynamic synthesis for relaxed memory models. In: PLDI, pp. 429–440 (2012) Liu, F., Nedev, N., Prisadnikov, N., Vechev, M.T., Yahav, E.: Dynamic synthesis for relaxed memory models. In: PLDI, pp. 429–440 (2012)
20.
Zurück zum Zitat Magnusson, P., Landin, A., Hagersten, E.: Queue locks on cache coherent multiprocessors. In: Proceedings of Eighth International Parallel Processing Symposium, pp. 165–171. IEEE (1994) Magnusson, P., Landin, A., Hagersten, E.: Queue locks on cache coherent multiprocessors. In: Proceedings of Eighth International Parallel Processing Symposium, pp. 165–171. IEEE (1994)
21.
Zurück zum Zitat Martin, M.M., Sorin, D.J., Beckmann, B.M., Marty, M.R., Xu, M., Alameldeen, A.R., Moore, K.E., Hill, M.D., Wood, D.A.: Multifacet’s general execution-driven multiprocessor simulator (GEMS) toolset. Comput. Archit. News 33(4), 92–99 (2005)CrossRef Martin, M.M., Sorin, D.J., Beckmann, B.M., Marty, M.R., Xu, M., Alameldeen, A.R., Moore, K.E., Hill, M.D., Wood, D.A.: Multifacet’s general execution-driven multiprocessor simulator (GEMS) toolset. Comput. Archit. News 33(4), 92–99 (2005)CrossRef
22.
Zurück zum Zitat Mellor-Crummey, J.M., Scott, M.L.: Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Comput. Syst. (TOCS) 9(1), 21–65 (1991)CrossRef Mellor-Crummey, J.M., Scott, M.L.: Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Comput. Syst. (TOCS) 9(1), 21–65 (1991)CrossRef
23.
Zurück zum Zitat Ros, A., Davari, M., Kaxiras, S.: Hierarchical private/shared classification: the key to simple and efficient coherence for clustered cache hierarchies. In: HPCA, pp. 186–197 (2015) Ros, A., Davari, M., Kaxiras, S.: Hierarchical private/shared classification: the key to simple and efficient coherence for clustered cache hierarchies. In: HPCA, pp. 186–197 (2015)
24.
Zurück zum Zitat Ros, A., Kaxiras, S.: Complexity-effective multicore coherence. In: PACT, pp. 241–252 (2012) Ros, A., Kaxiras, S.: Complexity-effective multicore coherence. In: PACT, pp. 241–252 (2012)
25.
Zurück zum Zitat Ros, A., Kaxiras, S.: Callback: efficient synchronization without invalidation with a directory just for spin-waiting. In: ISCA, pp. 427–438 (2015) Ros, A., Kaxiras, S.: Callback: efficient synchronization without invalidation with a directory just for spin-waiting. In: ISCA, pp. 427–438 (2015)
26.
Zurück zum Zitat Ros, A., Kaxiras, S.: Fast & furious: a tool for detecting covert racing. In: PARMA and DITAM, pp. 1–6 (2015) Ros, A., Kaxiras, S.: Fast & furious: a tool for detecting covert racing. In: PARMA and DITAM, pp. 1–6 (2015)
27.
Zurück zum Zitat Sakalis, C., Leonardsson, C., Kaxiras, S., Ros, A.: Splash-3: a properly synchronized benchmark suite for contemporary research. In: ISPASS (2016) Sakalis, C., Leonardsson, C., Kaxiras, S., Ros, A.: Splash-3: a properly synchronized benchmark suite for contemporary research. In: ISPASS (2016)
28.
Zurück zum Zitat Schmidt, D.C., Harrison, T.: Double-checked locking - an optimization pattern for efficiently initializing and accessing thread-safe objects. In: PLoP (1996) Schmidt, D.C., Harrison, T.: Double-checked locking - an optimization pattern for efficiently initializing and accessing thread-safe objects. In: PLoP (1996)
29.
Zurück zum Zitat Scott, M.L.: Shared-Memory Synchronization. Morgan & Claypool, San Rafael (2013) Scott, M.L.: Shared-Memory Synchronization. Morgan & Claypool, San Rafael (2013)
30.
Zurück zum Zitat Shasha, D., Snir, M.: Efficient and correct execution of parallel programs that share memory. ACM Trans. Program. Lang. Syst. (TOPLAS) 10(2), 282–312 (1988)CrossRef Shasha, D., Snir, M.: Efficient and correct execution of parallel programs that share memory. ACM Trans. Program. Lang. Syst. (TOPLAS) 10(2), 282–312 (1988)CrossRef
31.
Zurück zum Zitat Sung, H., Adve, S.V.: DeNovoSync: efficient support for arbitrary synchronization without writer-initiated invalidations. In: ASPLOS, pp. 545–559 (2015) Sung, H., Adve, S.V.: DeNovoSync: efficient support for arbitrary synchronization without writer-initiated invalidations. In: ASPLOS, pp. 545–559 (2015)
32.
Zurück zum Zitat Sung, H., Komuravelli, R., Adve, S.V.: DeNovoND: efficient hardware support for disciplined non-determinism. In: ASPLOS, pp. 13–26 (2013) Sung, H., Komuravelli, R., Adve, S.V.: DeNovoND: efficient hardware support for disciplined non-determinism. In: ASPLOS, pp. 13–26 (2013)
33.
Zurück zum Zitat Woo, S.C., Ohara, M., Torrie, E., Singh, J.P., Gupta, A.: The SPLASH-2 programs: characterization and methodological considerations. In: ISCA, pp. 24–36 (1995) Woo, S.C., Ohara, M., Torrie, E., Singh, J.P., Gupta, A.: The SPLASH-2 programs: characterization and methodological considerations. In: ISCA, pp. 24–36 (1995)
Metadaten
Titel
Fencing Programs with Self-Invalidation and Self-Downgrade
verfasst von
Parosh Aziz Abdulla
Mohamed Faouzi Atig
Stefanos Kaxiras
Carl Leonardsson
Alberto Ros
Yunyun Zhu
Copyright-Jahr
2016
DOI
https://doi.org/10.1007/978-3-319-39570-8_2