ABSTRACT
High-performance computing on thousands of cores relies on distributed memory for reasons of memory consistency. Resource management on such systems usually assigns resources statically at the start of each application. Such static scheduling cannot start an application while the resources it requires are in use by others, because the resources assigned to a running application cannot be reduced without stopping it. This lack of dynamic, adaptive scheduling leaves resources idle until the remainder of the requested resources becomes available. In addition, applications whose resource requirements change over time leave resources idle or inefficiently used. The invasive computing paradigm proposes dynamic resource scheduling together with applications that adapt dynamically to changing resource requirements.
As a case study, we developed an invasive resource manager and a multigrid solver with dynamically changing resource demands. The scalability of such a multigrid solver changes during its execution, and because memory is distributed, its data must be migrated whenever resources are reallocated.
To counteract the complexity introduced by the additional interfaces, e.g. for data migration, we use the X10 programming language for its improved programmability. Our results demonstrate improved application throughput and dynamic adaptivity. We also present our extension of X10's distributed arrays to support data migration.