Published in: International Journal of Parallel Programming 1/2016

1 February 2016

A Phase Behavior Aware Dynamic Cache Partitioning Scheme for CMPs

Authors: Xiaofei Liao, Rentong Guo, Danping Yu, Hai Jin, Li Lin

Abstract

In multi-program environments, cache contention among processors can significantly degrade system performance. Cache partitioning, and dynamic cache partitioning in particular, has been widely studied as an effective countermeasure. However, in a dynamic cache partitioning scheme it is difficult to decide how much cache each co-scheduled program should be allocated and when the allocation should be adjusted. This paper presents a novel dynamic cache partitioning mechanism based on the phase behavior of programs. It uses the performance monitoring units of modern processors to detect the phase behavior of programs and to guide cache partitioning at run-time. Because programs exhibit recurring phase behavior throughout their execution, we can adjust the cache quota when a phase change occurs, and by classifying phases we can make cache partitioning decisions with higher accuracy and lower overhead. The proposed method is validated by measurements on applications from the SPEC CPU 2006 benchmark suite. Compared with a shared cache scheme, our method achieves a speedup of up to 1.214 for co-scheduled applications.
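The idea of repartitioning only on phase changes, and reusing decisions for recurring phases, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which counters, thresholds, or quota policy the paper uses. The per-interval metric (misses per kilo-instruction), the relative `threshold`, and the names `PhaseTable`, `plan_repartitions`, and `choose_quota` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class PhaseTable:
    """Hypothetical phase classifier: one representative signature per phase."""
    threshold: float = 0.2  # assumed relative tolerance, not taken from the paper
    signatures: List[float] = field(default_factory=list)

    def classify(self, sample: float) -> int:
        """Return the id of the first matching phase, or register a new phase."""
        for phase_id, sig in enumerate(self.signatures):
            if abs(sample - sig) <= self.threshold * max(sig, 1e-9):
                return phase_id
        self.signatures.append(sample)
        return len(self.signatures) - 1


def plan_repartitions(
    samples: List[float],
    table: PhaseTable,
    choose_quota: Callable[[float], int],
) -> List[Tuple[int, int, int]]:
    """Return (interval, phase_id, quota) for each interval where the phase changes.

    choose_quota stands in for an expensive quota search. It is called only
    the first time a phase is seen; recurring phases reuse the stored quota.
    """
    quota_by_phase = {}
    plan = []
    current = None
    for interval, sample in enumerate(samples):
        phase = table.classify(sample)
        if phase == current:
            continue  # same phase as before: leave the partition untouched
        current = phase
        if phase not in quota_by_phase:
            quota_by_phase[phase] = choose_quota(sample)
        plan.append((interval, phase, quota_by_phase[phase]))
    return plan


if __name__ == "__main__":
    # Assumed per-interval miss counts that alternate between two phases.
    mpki = [10.0, 10.5, 9.8, 30.0, 31.0, 10.2, 29.5]
    # Placeholder policy: give more cache ways to the miss-heavy phase.
    for step in plan_repartitions(mpki, PhaseTable(), lambda m: 4 if m < 20 else 12):
        print(step)
```

In this sketch the quota search runs once per distinct phase rather than once per interval, which is one way to read the abstract's claim that classifying phases lowers overhead.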

Metadata
Title
A Phase Behavior Aware Dynamic Cache Partitioning Scheme for CMPs
Authors
Xiaofei Liao
Rentong Guo
Danping Yu
Hai Jin
Li Lin
Publication date
1 February 2016
Publisher
Springer US
Published in
International Journal of Parallel Programming / Issue 1/2016
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-014-0334-5
