Published in: The Journal of Supercomputing 10/2020

29.01.2020

Filter cache: filtering useless cache blocks for a small but efficient shared last-level cache

Authors: Han Jun Bae, Lynn Choi


Abstract

Although the shared last-level cache (SLLC) occupies a significant portion of the die area of a multicore CPU, more than 59% of SLLC cache blocks are not reused during their lifetime. If these useless blocks can be filtered out of the SLLC, its size can be reduced without sacrificing performance. For this purpose, we classify the reuse of cache blocks into temporal and spatial reuse and further analyze it using reuse interval and reuse count. Our experiments show that most spatially reused cache blocks are reused only once, after a short reuse interval, so managing them in the SLLC is inefficient. In this paper, we propose a small additional cache alongside the SLLC, called Filter Cache, which not only checks for temporal reuse but also prevents spatially reused blocks from entering the SLLC. As a result, the SLLC holds no data for non-reused or spatially reused blocks, which dramatically reduces its size. Through detailed simulation of the PARSEC benchmarks, we show that our new SLLC design with Filter Cache delivers performance comparable to the conventional SLLC while using only 24.21% of the SLLC area across a variety of workloads. This result stems from the faster access and high reuse rates of the small SLLC with Filter Cache.
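
The idea of filtering blocks by observed reuse can be illustrated with a toy model. The Python sketch below is not the authors' design: the names `reuse_stats` and `FilteredLLC`, the LRU policy, the entry counts, and the definitions of reuse count (accesses after the first) and reuse interval (distance in trace positions between consecutive accesses to the same block) are all assumptions made for illustration. It also does not model the distinction between temporal and spatial reuse that the abstract describes.

```python
from collections import OrderedDict


def reuse_stats(trace):
    """Return {block: (reuse_count, reuse_intervals)} for an access trace.

    Assumed definitions (not taken from the paper):
    reuse count    = number of accesses to a block after its first access
    reuse interval = distance in trace positions between consecutive accesses
    """
    last_seen = {}
    stats = {}
    for position, block in enumerate(trace):
        if block in last_seen:
            count, intervals = stats[block]
            stats[block] = (count + 1, intervals + [position - last_seen[block]])
        else:
            stats[block] = (0, [])
        last_seen[block] = position
    return stats


class FilteredLLC:
    """Toy cache that admits a block only after it is re-referenced
    while still resident in a small LRU filter (hypothetical policy)."""

    def __init__(self, filter_entries, llc_entries):
        self.filter = OrderedDict()  # blocks seen once, waiting for reuse
        self.llc = OrderedDict()     # blocks that have shown reuse
        self.filter_entries = filter_entries
        self.llc_entries = llc_entries

    def access(self, block):
        """Return 'llc_hit', 'filter_hit', or 'miss' for one access."""
        if block in self.llc:
            self.llc.move_to_end(block)       # refresh LRU position
            return "llc_hit"
        if block in self.filter:
            del self.filter[block]            # reuse detected: promote
            self.llc[block] = True
            if len(self.llc) > self.llc_entries:
                self.llc.popitem(last=False)  # evict least recently used
            return "filter_hit"
        self.filter[block] = True             # first touch: filter only
        if len(self.filter) > self.filter_entries:
            self.filter.popitem(last=False)   # drop block that was never reused
        return "miss"


trace = ["A", "B", "A", "C", "D", "A", "E"]
stats = reuse_stats(trace)
never_reused = sorted(b for b, (count, _) in stats.items() if count == 0)

cache = FilteredLLC(filter_entries=2, llc_entries=2)
results = [cache.access(block) for block in trace]
```

In this small trace only block `A` is ever reused, so it is the only block promoted out of the filter; the blocks touched once pass through the filter and are dropped without occupying the larger cache.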


Metadata
Title
Filter cache: filtering useless cache blocks for a small but efficient shared last-level cache
Authors
Han Jun Bae
Lynn Choi
Publication date
29.01.2020
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 10/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-020-03177-2
