Published in: The Journal of Supercomputing 1/2019

5 January 2019

Dynamic directory table with victim cache: on-demand allocation of directory entries for active shared cache blocks

Authors: Han Jun Bae, Lynn Choi

Abstract

In this paper, we present a novel directory architecture that allocates a directory entry for a cache block on demand at runtime, only when the block is shared by more than one core. Because we do not maintain coherence for private blocks, the number of directory entries is substantially reduced. Even for shared blocks, we allocate a directory entry dynamically only while the block is actively shared, further reducing the number of directory entries at runtime. To this end, we propose a new directory architecture called the dynamic directory table (DDT), a directory storage that is decoupled from the shared cache and dynamically maintains directory entries only for actively shared blocks. We also add a small victim cache to the original DDT to reduce the invalidation broadcasts caused by DDT evictions. Through detailed simulation on PARSEC benchmarks, we show that DDT can outperform the expensive full-map directory by a slight margin with only 16.09% of the directory area across a variety of workloads. This is achieved by the faster access and high hit rates of the small directory. In addition, we demonstrate that even smaller DDTs can give comparable or higher performance than recent directory optimization schemes such as SPACE and DGD with considerably less area.
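The allocation policy described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' design: the class name `DynamicDirectoryTable`, the LRU replacement in both structures, the `private_owner` dictionary standing in for whatever mechanism tracks the single owner of a private block, and the rule that a broadcast is counted whenever an entry falls out of the victim cache are all hypothetical choices made for illustration. The abstract does not specify these details.

```python
from collections import OrderedDict


class DynamicDirectoryTable:
    """Toy model: a directory entry exists only for blocks touched by >1 core.

    Assumptions (not from the paper): LRU replacement, ddt_entries >= 1,
    and one invalidation broadcast whenever sharer information is lost.
    """

    def __init__(self, ddt_entries, victim_entries):
        self.ddt_entries = ddt_entries
        self.victim_entries = victim_entries
        self.ddt = OrderedDict()     # block -> set of sharer cores, LRU order
        self.victim = OrderedDict()  # entries recently evicted from the DDT
        self.private_owner = {}      # block -> sole core (hypothetical stand-in)
        self.broadcasts = 0          # invalidation broadcasts issued so far

    def access(self, core, block):
        if block in self.ddt:
            # Actively shared block: update sharers and refresh recency.
            self.ddt[block].add(core)
            self.ddt.move_to_end(block)
            return "ddt_hit"
        if block in self.victim:
            # Sharer set survived eviction: restore it without a broadcast.
            sharers = self.victim.pop(block)
            sharers.add(core)
            self._allocate(block, sharers)
            return "victim_hit"
        owner = self.private_owner.get(block)
        if owner is None or owner == core:
            # Private block: no directory entry is allocated.
            self.private_owner[block] = core
            return "private"
        # A second core touches the block: allocate an entry on demand.
        del self.private_owner[block]
        self._allocate(block, {owner, core})
        return "allocated"

    def _allocate(self, block, sharers):
        if len(self.ddt) >= self.ddt_entries:
            old_block, old_sharers = self.ddt.popitem(last=False)
            self._to_victim(old_block, old_sharers)
        self.ddt[block] = sharers

    def _to_victim(self, block, sharers):
        if self.victim_entries == 0:
            self.broadcasts += 1  # no victim cache: sharer set is lost at once
            return
        if len(self.victim) >= self.victim_entries:
            self.victim.popitem(last=False)
            self.broadcasts += 1  # oldest victim entry loses its sharer set
        self.victim[block] = sharers
```

In this model, a block accessed repeatedly by one core never consumes a directory entry, and a shared block that is evicted from the table and then touched again while still in the victim cache is restored without any broadcast. Setting `victim_entries=0` gives a rough picture of the table without a victim cache, where every eviction costs a broadcast.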


Metadata
Title
Dynamic directory table with victim cache: on-demand allocation of directory entries for active shared cache blocks
Authors
Han Jun Bae
Lynn Choi
Publication date
5 January 2019
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 1/2019
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-018-02735-z
