Skip to main content
Top

2020 | Book

Parallel Algorithms in Computational Science and Engineering

insite
SEARCH

About this book

This contributed volume highlights two areas of fundamental interest in high-performance computing: core algorithms for important kernels and computationally demanding applications. The first few chapters explore algorithms, numerical techniques, and their parallel formulations for a variety of kernels that arise in applications. The rest of the volume focuses on state-of-the-art applications from diverse domains. By structuring the volume around these two areas, it presents a comprehensive view of the application landscape for high-performance computing, while also enabling readers to develop new applications using the kernels. Readers will learn how to choose the most suitable parallel algorithms for any given application, ensuring that theory and practicality are clearly connected. Applications using these techniques are illustrated in detail, including:Computational materials science and engineeringComputational cardiovascular analysisMultiscale analysis of wind turbines and turbomachineryWeather forecastingMachine learning techniquesParallel Algorithms in Computational Science and Engineering will be an ideal reference for applied mathematicians, engineers, computer scientists, and other researchers who utilize high-performance computing in their work.

Table of Contents

Frontmatter

High Performance Algorithms

Frontmatter
State-of-the-Art Sparse Direct Solvers
Abstract
In this chapter we will give an insight into the development of modern sparse elimination methods. These are driven by a preprocessing phase based on combinatorial algorithms which improve diagonal dominance, reduce fill-in, and improve concurrency to allow for parallel treatment. Moreover, these methods detect dense submatrices which can be handled by dense matrix kernels based on multithreaded level-3 BLAS. We will demonstrate how recent improvements in developing advanced direct solution methods have enabled speeding up parallel circuit simulation without sacrificing accuracy.
Matthias Bollhöfer, Olaf Schenk, Radim Janalik, Steve Hamm, Kiran Gullapalli
The Effect of Various Sparsity Structures on Parallelism and Algorithms to Reveal Those Structures
Abstract
Structured sparse matrices can greatly benefit parallel numerical methods in terms of parallel performance and convergence. In this chapter, we present combinatorial models for obtaining several different sparse matrix forms. There are four basic forms we focus on: singly-bordered block-diagonal form, doubly-bordered block-diagonal form, nonempty off-diagonal block minimization, and block diagonal with overlap form. For each of these forms, we first present the form in detail and describe what goals are sought within the form, and then examine the combinatorial models that attain the respective form while targeting the sought goals, and finally explain in which aspects the forms benefit certain parallel numerical methods and their relationship with the models. Our work focuses especially on graph and hypergraph partitioning models in obtaining the mentioned forms. Despite their relatively high preprocessing overhead compared to other heuristics, they have proven to model the given problem more accurately and this overhead can be often amortized due the fact that matrix structure does not change much during a typical numerical simulation. This chapter presents a number of models and their relationship with parallel numerical methods.
Oguz Selvitopi, Seher Acer, Murat Manguoğlu, Cevdet Aykanat
Structure-Exploiting Interior Point Methods
Abstract
Interior point methods are among the most popular techniques for large-scale nonlinear optimization, owing to their intrinsic ability of scaling to arbitrarily large problem sizes. Their efficiency has attracted, in recent years, a lot of attention due to increasing demand for large-scale optimization in industry and engineering. General purpose nonlinear programming solvers implementing interior point methods provide a generic framework that embraces almost every optimal control problem provided that the nonlinear functions are smooth and that their first and possibly the second derivatives are also available. However, a large class of industrial and engineering problems possesses a particular structure motivating the development of structure-exploiting interior point methods. We present an interior point framework that exploits the intrinsic structure of large-scale nonlinear optimization problems, with the purpose of accelerating the solution both for single-core or multicore execution and massively parallel high-performance computing infrastructures. Since the overall performance of interior point methods relies heavily on scalable sparse linear algebra solvers, particular emphasis is given to the underlying algorithms for the distributed solution of the associated sparse linear systems obtained at each iteration from the linearization of the optimality conditions. The interior point algorithm is implemented in an object-oriented parallel solver and applied for the solution of large-scale optimal control problems solved on a daily basis for the secure transmission and distribution of electricity in modern power grids.
Juraj Kardoš, Drosos Kourounis, Olaf Schenk
Parallel Hybrid Sparse Linear System Solvers
Abstract
In this chapter, we present the SPIKE family of algorithms for solving banded linear systems and its multithreaded implementation as well as direct-iterative hybrid variants for solving general sparse linear system of equations.
Murat Manguoğlu, Eric Polizzi, Ahmed H. Sameh

High Performance Computational Science and Engineering Applications

Frontmatter
Computational Materials Science and Engineering
Abstract
Computational methods in materials science have made huge strides in recent years and parallel computing methodologies have played a major role in enabling such a progress. The goal of this chapter is to discuss the current state of the art in computational materials science as it stands today, illustrating advances in the development of parallel algorithms and the impact such algorithms have had in the area. The paper is intended to be accessible to a diverse scientific computing audience. The focus of the paper will be the Density Functional Theory methodology and the solution of the eigenvalue problems that are encountered in solving the resulting equations.
Eric Polizzi, Yousef Saad
Computational Cardiovascular Analysis with the Variational Multiscale Methods and Isogeometric Discretization
Abstract
Computational cardiovascular analysis can provide valuable information to cardiologists and cardiovascular surgeons on a patient-specific basis. There are many computational challenges that need to be faced in this class of flow analyses. They include highly unsteady flows, complex cardiovascular geometries, moving boundaries and interfaces, such as the motion of the heart valve leaflets, contact between moving solid surfaces, such as the contact between the leaflets, and the fluid–structure interaction between blood and cardiovascular structure. Many of these challenges have been or are being addressed by the Space–Time Variational Multiscale (ST-VMS) method, the Arbitrary Lagrangian–Eulerian VMS (ALE-VMS) method, and VMS-based immersogeometric analysis (IMGA-VMS), which serve as the core computational methods, and other special methods used in combination with them. We provide an overview of these methods and present examples of challenging computations carried out with them, including aortic and heart valve flow analyses. We also point out that these methods are general computational fluid dynamics techniques and have broad applicability to a wide range of other areas of science and engineering.
Thomas J. R. Hughes, Kenji Takizawa, Yuri Bazilevs, Tayfun E. Tezduyar, Ming-Chen Hsu
ALE and Space–Time Variational Multiscale Isogeometric Analysis of Wind Turbines and Turbomachinery
Abstract
Many of the challenges encountered in computational analysis of wind turbines and turbomachinery are being addressed by the Arbitrary Lagrangian–Eulerian (ALE) and Space–Time (ST) Variational Multiscale (VMS) methods and isogeometric discretization. The computational challenges include turbulent rotational flows, complex geometries, moving boundaries and interfaces, such as the rotor motion, and the fluid–structure interaction (FSI), such as the FSI between the wind turbine blade and the air. The core computational methods are the ALE-VMS and ST-VMS methods. These are supplemented with special methods like the Slip Interface (SI) method and ST Isogeometric Analysis with NURBS basis functions in time. We describe the core and special methods and present, as examples of challenging computations performed, computational analysis of horizontal- and vertical-axis wind turbines and flow-driven string dynamics in pumps.
Yuri Bazilevs, Kenji Takizawa, Tayfun E. Tezduyar, Ming-Chen Hsu, Yuto Otoguro, Hiroki Mochizuki, Michael C.  H. Wu
Variational Multiscale Flow Analysis in Aerospace, Energy and Transportation Technologies
Abstract
Computational flow analysis is now playing a key role in aerospace, energy and transportation technologies, bringing solution in challenging problems such as aerodynamics of parachutes, thermo-fluid analysis of ground vehicles and tires, and fluid–structure interaction (FSI) analysis of wind turbines. The computational challenges include complex geometries, moving boundaries and interfaces, FSI, turbulent flows, rotational flows, and large problem sizes. The Residual-Based Variational Multiscale (RBVMS), ALE-VMS and Space–Time VMS (ST-VMS) methods have been quite successful serving as core methods in addressing the computational challenges. The core methods are supplemented with special methods targeting specific classes of problems, such as the Slip Interface (SI) method, Multi-Domain Method, and the “ST-C” data compression method. We describe the core and special methods. We present, as examples of challenging computations performed with these methods, aerodynamic analysis of a ram-air parachute, thermo-fluid analysis of a freight truck and its rear set of tires, and aerodynamic and FSI analysis of two back-to-back wind turbines in atmospheric boundary layer flow.
Kenji Takizawa, Yuri Bazilevs, Tayfun E. Tezduyar, Artem Korobenko
Multiscale Crowd Dynamics Modeling and Safety Problems Towards Parallel Computing
Abstract
This chapter is devoted to the study of possible developments of parallel computing methods in crowd dynamics. The content is presented as scientific speculations on three key topics focused on the modeling of human crowds related to safety and the development of computational tools that include a possible vision of parallel computing. The chapter proposes a dialogue between applied mathematicians, crisis managers, and scientific computing. Research perspectives are brought to the attention of the reader with the aim of addressing research activity on crowd dynamics towards new challenging paths which look at crowd dynamics from the viewpoint of end-users. It is shown how parallel computing can contribute to a multiscale vision of crowd dynamics and to artificial intelligence methods to support decision-making by crisis managers.
Bouchra Aylaj, Nicola Bellomo
HPC for Weather Forecasting
Abstract
Numerical weather prediction (NWP) is one of the first applications of scientific computing and remains an insatiable consumer of high-performance computing today. In the face of a half-century’s exponential and sometimes disruptive growth in HPC capability, major weather services around the world continuously develop and incorporate new meteorological research into large expensive operational forecasting software suites. At the heart is the weather model itself: a computational fluid dynamics core on a spherical domain with physics. The mapping of a planar grid to the geometry of the earth’s atmosphere presents one of a number of challenges for stability, accuracy, and computational efficiency that lead to development of the major numerical formulations and grid systems in use today: grid point, globally spectral, and finite/spectral element. Significant challenges await in the exascale era: large memory working sets, overall low computational intensity, load imbalance, and a fundamental lack of weak scalability in the face of critical real-time forecasting speed requirements. This chapter provides a history of weather and climate modeling on high-performance computing systems, a discussion of each of the major types of model dynamics formulations, grids, and model physics, and directions going forward on emerging HPC architectures.
John Michalakes
A Simple Study of Pleasing Parallelism on Multicore Computers
Abstract
Pleasingly parallel computations are those that involve completely independent work. We investigate these in the context of a problem we call AllPageRank. The AllPageRank problem involves computing a subset of accurate PageRank entries for each possible seeded PageRank vector. AllPageRank is representative of a wider class of possible computational procedures that will run a large number of experiments on a single graph structure. Our study involves computing the AllPageRank vectors for a multi-million node graph within a reasonable timeframe on a modern shared memory, high-core count computer. For this setting, we parallelize over all of the seeded PageRank vector computations, which are all independent. The experiments demonstrate that there are non-trivial complexities in obtaining performance even in this ideal situation. For instance, threading computational environments gave scaling problems with a shared graph structure in memory. Also sparse matrix ordering techniques and multivector, or SIMD, optimizations were required to get a total runtime of a few days. We also show how different algorithms for PageRank that have different algorithmic advances and memory access patterns behave to guide future investigation of similar problems.
Yanfei Ren, David F. Gleich
Parallel Fast Time-Domain Integral-Equation Methods for Transient Electromagnetic Analysis
Abstract
Transient and broadband electromagnetic scattering and radiation phenomena involving perfect electrically conducting (PEC) and dielectric objects can be efficiently simulated using marching-on-in-time (MOT)-based time-domain (TD) integral-equation (IE) methods. This chapter reviews recent advances in the development of highly parallel, fast, accurate, stable, and rapidly converging TDIE solvers. These solvers apply to very large-scale problems within the realm of scattering analysis, antenna/RF device modeling, electromagnetic compatibility modeling, optical imaging, and biomedical applications, to name but a few.
Yang Liu, Eric Michielssen
Parallel Optimization Techniques for Machine Learning
Abstract
In this chapter we discuss higher-order methods for optimization problems in machine learning applications. We also present underlying theoretical background as well as detailed experimental results for each of these higher order methods and also provide their in-depth comparison with respect to competing methods in the context of real-world datasets. We show that higher-order methods, contrary to popular understanding, can achieve significantly superior results compared to state-of-the-art competing methods in shorter wall-clock times yielding orders of magnitude of relative speedup for typical real-world datasets.
Sudhir Kylasa, Chih-Hao Fang, Fred Roosta, Ananth Grama
Metadata
Title
Parallel Algorithms in Computational Science and Engineering
Editors
Prof. Ananth Grama
Prof. Ahmed H. Sameh
Copyright Year
2020
Electronic ISBN
978-3-030-43736-7
Print ISBN
978-3-030-43735-0
DOI
https://doi.org/10.1007/978-3-030-43736-7

Premium Partner