2017 | OriginalPaper | Book chapter

7. Parallelizing Compiler for Single and Multicore Computing

Written by: Abderazek Ben Abdallah

Published in: Advanced Multicore Systems-On-Chip

Publisher: Springer Singapore

Abstract

To overcome the challenges posed by high power densities and thermal hot spots in microprocessors, multicore platforms have become ubiquitous, from servers to embedded systems. However, providing multiple cores does not directly translate into higher performance for most applications. The burden falls on software developers to find and exploit coarse-grain parallelism so that the abundant computing resources are used effectively. With the rise of multicore and many-core processors, concurrency has become a major issue in a programmer's daily work. Compilers and software development tools will therefore be critical in helping programmers create high-performance software. This chapter covers the software issues of a parallelizing queue compiler targeted at future single- and multicore embedded systems.

Metadata
Title
Parallelizing Compiler for Single and Multicore Computing
Written by
Abderazek Ben Abdallah
Copyright year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-6092-2_7