2012 | OriginalPaper | Chapter
End-to-End Modeling and Simulation of High-Performance Computing Systems
Authors: Cyriel Minkenberg, Wolfgang Denzel, German Rodriguez, Robert Birke
Published in: Use Cases of Discrete Event Simulation
Publisher: Springer Berlin Heidelberg
Designing large-scale High-Performance Computing (HPC) systems, including architecture design space exploration and performance prediction, is a daunting task that can benefit enormously from discrete event simulation techniques, as the interactions between the various components of such a system generally render analytic approaches intractable. The work described in this chapter specifically deals with end-to-end, full-system simulation, as opposed to simulation of individual components or nodes. The tools described here can be used in the design phase of a new HPC system to optimize system design for a given set of workloads, or to create performance forecasts for new workloads on existing systems.
We have taken a network-centric approach: current high-end HPC systems comprise hundreds of thousands of processing cores, so communication among those cores is a key factor in determining overall system performance. To this end, we developed an Omnest-based simulation environment that enables studying the impact of an HPC machine’s communication subsystem on the overall system’s performance for specific workloads.
Full-system simulation at an abstraction level that still maintains a reasonably high level of detail is infeasible without resorting to parallel simulation, the main limiting factors being simulation run time and memory footprint. By applying Parallel Discrete Event Simulation techniques, the power of modern parallel computers can be exploited to perform these kinds of simulations at large scales.
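As a rough illustration of the discrete event simulation idea underlying this work, the following minimal sketch models nodes exchanging messages over a network with a sequential event loop. It is not the chapter's Omnest-based environment and does not use any Omnest API. The function name `simulate_ring`, the ring communication pattern, and the latency-plus-bandwidth link model are all assumptions made purely for illustration.

```python
import heapq
from itertools import count


def simulate_ring(num_nodes, msg_bytes, link_latency_s, link_bandwidth_Bps):
    """Hypothetical toy model: every node sends one message to its right
    neighbor at time 0. Returns (finish_time_s, messages_delivered)."""
    events = []    # future event list, ordered by (timestamp, sequence number)
    seq = count()  # tie-breaker so events with equal timestamps stay ordered

    # Assumed link model: fixed latency plus serialization time.
    transfer_s = link_latency_s + msg_bytes / link_bandwidth_Bps

    # Schedule the initial "send" event on every node.
    for node in range(num_nodes):
        heapq.heappush(events, (0.0, next(seq), "send", node))

    now = 0.0
    delivered = 0
    # Core loop: always process the event with the smallest timestamp.
    while events:
        now, _, kind, node = heapq.heappop(events)
        if kind == "send":
            dest = (node + 1) % num_nodes
            # A send causes a receive at the destination after the transfer time.
            heapq.heappush(events, (now + transfer_s, next(seq), "recv", dest))
        else:  # "recv"
            delivered += 1
    return now, delivered


if __name__ == "__main__":
    finish, count_delivered = simulate_ring(4, 1000, 1e-6, 1e9)
    print(f"finished at {finish:.2e} s, {count_delivered} messages delivered")
```

A single event list like this one has to hold every pending event for the whole system, which hints at why run time and memory footprint become the limiting factors at large scale. Parallel Discrete Event Simulation partitions the model across several such loops that must then synchronize their simulated clocks.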