This chapter explains the basics of speeding up programs, using simple examples from numerical computing. Parallel processing and trends in computer hardware are explained in detail to show how high-performance computing works. Several key code-tuning technologies, such as pipelining, cache optimization, and numerical libraries, are also explained using sample programs drawn from basic numerical computations.
However, when the problem size n is large, the data no longer fits in cache, so performance degrades even with the middle-product form. To avoid this situation, cache blocking, which will be explained later, should be implemented.
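To illustrate the idea, here is a minimal sketch of cache blocking applied to matrix–matrix multiplication. The size N, the block size B, and the function names are illustrative assumptions; in practice B is tuned to the cache capacity of the target machine.

```c
#include <stddef.h>

#define N 64   /* problem size (illustrative) */
#define B 16   /* block size; in practice tuned to the cache capacity */

/* Naive triple loop: for large N, the rows of a and columns of b
   are evicted from cache between reuses. */
void matmul_naive(const double a[N][N], const double b[N][N], double c[N][N]) {
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++) {
            double s = 0.0;
            for (size_t k = 0; k < N; k++)
                s += a[i][k] * b[k][j];
            c[i][j] = s;
        }
}

/* Cache-blocked version: work on B x B tiles so that each tile of
   a, b, and c stays resident in cache while it is being reused. */
void matmul_blocked(const double a[N][N], const double b[N][N], double c[N][N]) {
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            c[i][j] = 0.0;
    for (size_t ib = 0; ib < N; ib += B)
        for (size_t jb = 0; jb < N; jb += B)
            for (size_t kb = 0; kb < N; kb += B)
                for (size_t i = ib; i < ib + B; i++)
                    for (size_t k = kb; k < kb + B; k++) {
                        double aik = a[i][k];
                        for (size_t j = jb; j < jb + B; j++)
                            c[i][j] += aik * b[k][j];
                    }
}
```

Both versions perform the same arithmetic; only the order of the memory accesses changes, which is what improves cache reuse.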
Several computer architectures provide software-controlled mechanisms, such as prefetch instructions, for moving data into cache.
Another benefit of loop unrolling is that it reduces the branch overhead of the loop control.
However, the effectiveness of loop unrolling depends on the code optimizations performed by the compiler. To implement loop unrolling effectively, programmers need to know in advance how the compiler optimizes the code.
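A minimal sketch of 4-way unrolling for a dot product follows; the function names and the unrolling factor are illustrative choices, and a good compiler may generate similar code from the plain loop on its own.

```c
#include <stddef.h>

/* Baseline: one multiply-add and one loop-branch test per iteration. */
double dot(const double *x, const double *y, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i] * y[i];
    return s;
}

/* Unrolled by 4: the loop branch is tested once per four multiply-adds,
   and the independent partial sums s0..s3 expose instruction-level
   parallelism. A remainder loop handles n not divisible by 4. */
double dot_unroll4(const double *x, const double *y, size_t n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i;
    for (i = 0; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }
    double s = s0 + s1 + s2 + s3;
    for (; i < n; i++)
        s += x[i] * y[i];
    return s;
}
```

Note that the partial sums change the order of the floating-point additions, so the result can differ from the baseline by rounding error.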
In some numerical libraries, the best blocking size for matrix–matrix multiplication is fixed in advance by the library developers before release, so library users do not tune it. In addition, some numerical libraries perform test executions at installation time to tune the blocking size automatically, taking into account the cache capacity and the computations of the target loops. Such automation is called "software auto-tuning" [3–7]. Research on software auto-tuning is one of the hot topics in high-performance computing. In Chap. 5, we will explain the details of the research on software auto-tuning.
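The install-time tuning idea can be sketched as a simple search over candidate block sizes: run the target kernel once per candidate and keep the fastest. Everything here (`autotune_block`, `demo_kernel`, the candidate list) is a hypothetical illustration, not a real library's interface; production auto-tuners use more careful timing and search strategies.

```c
#include <stddef.h>
#include <time.h>

typedef void (*kernel_fn)(size_t block);  /* kernel parameterized by block size */

/* Try each candidate block size once and return the fastest one. */
size_t autotune_block(kernel_fn kernel, const size_t *candidates, size_t ncand) {
    size_t best = candidates[0];
    double best_time = 1e30;
    for (size_t i = 0; i < ncand; i++) {
        clock_t t0 = clock();
        kernel(candidates[i]);
        double elapsed = (double)(clock() - t0) / CLOCKS_PER_SEC;
        if (elapsed < best_time) {
            best_time = elapsed;
            best = candidates[i];
        }
    }
    return best;
}

/* Dummy kernel for demonstration: its cost grows with the block size,
   so the smallest candidate should win. A real kernel would be the
   blocked matrix multiplication itself. */
static volatile double sink;
void demo_kernel(size_t block) {
    double s = 0.0;
    for (size_t i = 0; i < block * 100000; i++)
        s += (double)i;
    sink = s;
}
```

A real auto-tuner would also average over repeated runs to suppress timing noise before committing to a block size.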
T. Katagiri, K. Kise, H. Honda, T. Yuba, in Proceedings of the ACM International Symposium on High Performance Computing (2003), pp. 146–159
T. Katagiri, K. Kise, H. Honda, T. Yuba, in Proceedings of the ACM Computing Frontiers (2004), pp. 12–25
T. Katagiri, K. Kise, H. Honda, T. Yuba, Parallel Comput. 32(1), 92–112 (2006)
K. Naono, K. Teranishi, J. Cavazos, R. Suda, Software Automatic Tuning (Springer, New York, 2010)
T. Katagiri, S. Ohshima, M. Matsumoto, in Proceedings of IEEE MCSoC 2014 (2014), pp. 91–98
- High-Performance Computing Basics, Chapter 1 (Springer Singapore)