Published in: The Journal of Supercomputing 2/2016

1 February 2016

Last level cache size heterogeneity in embedded systems

Authors: Mario D. Marino, Kuan-Ching Li

Abstract

In typical multicore processors, the last level cache is formed by distributed clusters of memory banks of the same size, referred to as homogeneous clusters. Shutting down part of these clusters to save power across generations of multicore processors produces clusters with non-uniform cache sizes, referred to as heterogeneous clusters. Because heterogeneous clusters are typically smaller than homogeneous ones, they present higher miss rates, which are likely to degrade performance. In this investigation, we study the impact of heterogeneous caches in embedded microprocessors that contain an arbitrary mix of homogeneous and heterogeneous clusters. Specifically, we evaluate the architectural implications of these heterogeneous caches and propose a flexible algorithm that can be used to explore them. In experimental benchmarks of scientific applications, microprocessors with heterogeneous clusters show a maximum performance degradation of about 10 % and a maximum performance improvement of 16 %, while the maximum reduction and the maximum improvement in miss/hit rate are each up to 10 %. In addition, coherence activity decreases by 10 %, with a maximum energy utilization of up to 50 % and a maximum energy reduction of 15 %.
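The link the abstract draws between smaller clusters and higher miss rates can be illustrated with a toy model. The sketch below is not the algorithm or simulator used in the paper. The bank capacity (`LINES_PER_BANK`), the cluster layouts, the fully associative LRU policy, and the synthetic access trace are all assumptions chosen for illustration.

```python
from collections import OrderedDict

LINES_PER_BANK = 2  # hypothetical toy capacity, not taken from the paper


def cluster_capacities(active_banks):
    """Capacity (in cache lines) of each cluster, given its powered-on bank count."""
    return [n * LINES_PER_BANK for n in active_banks]


def is_heterogeneous(active_banks):
    """True when the clusters do not all keep the same number of active banks."""
    return len(set(active_banks)) > 1


def lru_miss_rate(trace, capacity_lines):
    """Miss rate of a fully associative LRU cache that holds `capacity_lines` lines."""
    cache = OrderedDict()
    misses = 0
    for line in trace:
        if line in cache:
            cache.move_to_end(line)        # mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > capacity_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses / len(trace)


# Assumed layouts: four clusters with four banks each, then the same chip
# with some banks shut down to save power.
homogeneous = [4, 4, 4, 4]
heterogeneous = [4, 2, 4, 1]

# Synthetic trace: a working set of 6 lines accessed cyclically 100 times.
trace = list(range(6)) * 100

full_cluster = cluster_capacities(homogeneous)[1]       # 8 lines
shrunk_cluster = cluster_capacities(heterogeneous)[1]   # 4 lines

full_rate = lru_miss_rate(trace, full_cluster)
shrunk_rate = lru_miss_rate(trace, shrunk_cluster)
print(f"full cluster miss rate:   {full_rate:.2f}")
print(f"shrunk cluster miss rate: {shrunk_rate:.2f}")
```

In this toy setup the working set fits in the full cluster, so only the first pass misses. Once the cluster is halved, the cyclic trace evicts each line before it is reused. Real workloads fall between these extremes, which is why the effect has to be measured by benchmarking.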

Metadata
Title
Last level cache size heterogeneity in embedded systems
Authors
Mario D. Marino
Kuan-Ching Li
Publication date
1 February 2016
Publisher
Springer US
Published in
The Journal of Supercomputing, Issue 2/2016
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-015-1576-8
