2013 | OriginalPaper | Book chapter

6. Dynamic Optimization Techniques

Written by: Antonio Carlos Schneider Beck

Published in: Adaptable Embedded Systems

Publisher: Springer New York

Abstract

As emphasized throughout this book, a high level of adaptability is necessary to cope with the highly heterogeneous behavior of recent applications. At the same time, binary code compatibility is mandatory, so that the large body of existing software can be reused without modification. In this scenario, this chapter discusses dynamic optimization techniques: how they can be used to improve performance, how they maintain binary compatibility, and some case studies. The chapter starts by presenting binary translation. Its main concepts are clarified, as well as the main challenges that a binary translation mechanism must handle to work properly. The section ends with a detailed view of some examples of binary translation machines. Then reuse is discussed, and several types of it are covered: instruction reuse, value prediction, basic block reuse, trace reuse, and dynamic trace memoization. Furthermore, as discussed in Chap. 3, even though reconfigurable systems offer great potential in terms of performance and energy, they alone can neither deal with the highly heterogeneous behavior of recent applications nor maintain binary compatibility. Therefore, the chapter ends by presenting approaches that use reconfigurable architectures together with mechanisms that in some way resemble the behavior of dynamic optimization techniques.

Metadata
Title
Dynamic Optimization Techniques
Written by
Antonio Carlos Schneider Beck
Copyright year
2013
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-1746-0_6
