
2011 | OriginalPaper | Book chapter

5. WCET-Aware Assembly Level Optimizations

Written by: Paul Lokuciejewski, Peter Marwedel

Published in: Worst-Case Execution Time Aware Compilation Techniques for Real-Time Systems

Publisher: Springer Netherlands

Abstract

The major shortcoming of source code optimizations is their lack of knowledge about the underlying architecture. Hence, developing transformations that exploit processor-specific features is limited or even infeasible at this level. As a result, the full optimization potential cannot be exploited. In contrast, assembly level optimizations operate on a code representation that reflects the code that is finally executed. Thus, the compiler is aware of numerous critical details about the resources utilized during execution. This chapter discusses novel WCET-aware assembly level optimizations, namely WCET-aware procedure positioning and WCET-aware trace scheduling.

Metadata
Title
WCET-Aware Assembly Level Optimizations
Written by
Paul Lokuciejewski
Peter Marwedel
Copyright year
2011
Publisher
Springer Netherlands
DOI
https://doi.org/10.1007/978-90-481-9929-7_5
