2017 | OriginalPaper | Book chapter

7. Parallelizing Compiler for Single and Multicore Computing

Author: Abderazek Ben Abdallah

Published in: Advanced Multicore Systems-On-Chip

Publisher: Springer Singapore

Abstract

To overcome the challenges posed by high power densities and thermal hot spots in microprocessors, multicore platforms have emerged as the ubiquitous computing platform, from servers to embedded systems. However, providing multiple cores does not directly translate into increased performance for most applications. The burden falls on software developers to find and exploit coarse-grain parallelism so that the abundant computing resources are used effectively. With the rise of multicore systems and many-core processors, concurrency has become a major issue in a programmer's daily work. Compilers and software development tools will therefore be critical in helping programmers create high-performance software. This chapter covers software issues of a so-called parallelizing queue compiler targeted at future single- and multicore embedded systems.

Title: Parallelizing Compiler for Single and Multicore Computing
Author: Abderazek Ben Abdallah
Publisher: Springer Singapore
Book: Advanced Multicore Systems-On-Chip
Print ISBN: 978-981-10-6091-5
Electronic ISBN: 978-981-10-6092-2
Copyright year: 2017
DOI: https://doi.org/10.1007/978-981-10-6092-2_7
