About this book

Supercomputing is an important science and technology that enables the scientist or the engineer to simulate numerically very complex physical phenomena related to large-scale scientific, industrial and military applications. It has made considerable progress since the first NATO Workshop on High-Speed Computation in 1983 (Vol. 7 of the same series). This book is a collection of papers presented at the NATO Advanced Research Workshop held in Trondheim, Norway, in June 1989. It presents key research issues related to:
  • hardware systems, architecture and performance;
  • compilers and programming tools;
  • user environments and visualization;
  • algorithms and applications.
Contributions include critical evaluations of the state-of-the-art and many original research results.

Table of contents

Frontmatter

Introduction

Frontmatter

Supercomputing: Key Issues and Challenges

Abstract
In this treatise, research issues related to high-speed computing will be explored. Important elements of this topic include hardware, algorithms, software and their interrelationships. One can quickly observe that, as in the past, the nature of computing will change dramatically with technological developments. The continued growth in computer power/capacity, workstation capabilities, animation, and a host of new technologies, such as machine vision and natural language interfaces, can all have a great impact on research requirements and thrusts in scientific computing. To better appreciate what might be needed in future systems, one must explore the likely “drivers” of computation — important science that cannot be done or understood today because of limited computing technology. In this brief introduction, drivers of computation will be explored, and in subsequent sections, the research challenges they suggest will be elaborated.
Kenneth W. Neves, Janusz S. Kowalik

Supercomputing Centers

Frontmatter

Supercomputing Facilities for the 1990s

Abstract
Historically, the rate of change within computing technology has always been high relative to other technological areas, and over the next five years change within supercomputing technology will be particularly fast paced. For example, by the mid-’90s:
  • We will see an increase in computational processing and memory capacity by an order of magnitude,
  • Visualization will become an integral part of state-of-the-art supercomputer centers,
  • The transition to UNIX and related developments will result in a common and rich software environment, and
  • Very high-speed networks will be available.
Bill Buzbee

Research and Development in the Numerical Aerodynamic Simulation Program

Abstract
NASA’s Numerical Aerodynamic Simulation (NAS) Program provides a leading-edge computational capability supporting aerospace research and development applications. The NAS facility, located at NASA’s Ames Research Center (ARC), provides supercomputing services to over 1000 researchers from NASA, the Department of Defense, the aerospace industry, and university sites across the United States. In addition to providing advanced computational resources, the NAS Program acts as a pathfinder in the development, integration and testing of new, highly advanced computer systems for computational fluid dynamics and related disciplines. In fulfilling this role, the Program acquired early serial number supercomputers and integrated them into a new system paradigm that stresses networking, interactive processing, and supercomputer-workstation connectivity within a single uniform UNIX® software environment. The NAS system is described, and its pioneering advances in supercomputing operating system and communication software, local and remote networking, scientific workstations, interactive graphics processing and user interfaces are highlighted. Finally, some results of the newly initiated applied research effort are presented.
F. Ron Bailey

Supercomputing at KFA Jülich: Experiences and Applications on CRAY X-MP

Abstract
KFA Jülich is one of the largest big-science research centers in Europe. At KFA, computational science based on supercomputer techniques has received high priority for many years. Primarily, CRAY supercomputer power has been exploited in single-processor mode by applying vectorization and optimization techniques to numerical kernels and large applications. However, on the multiprocessor vector supercomputers CRAY X-MP and Y-MP, parallelism — beyond vectorization — can be exploited by multitasking. With an increasing number of powerful processors, multitasking strategies like macrotasking, microtasking, and autotasking will become more important and more profitable for utilizing these multiprocessor systems efficiently. Multitasking results and experience have been gained by applying these modes to linear-algebra and non-numerical algorithms as well as to large codes. Comparing the results and the concepts of multitasking, the problems, benefits, and perspectives of multitasking programming will be discussed.
F. Hossfeld

Supercomputing: Experience With Operational Parallel Processing

Abstract
Operational numerical weather prediction requires that all the computing power available is harnessed to the one task, otherwise the weather will have changed before it has been predicted. Therefore, in 1983 after the advent of multiprocessor systems as the most powerful computers, the European Centre for Medium-Range Weather Forecasts (ECMWF) undertook to modify its main algorithms and to run its forecast programs in multitasking mode. The decisions for achieving this goal, taken with regard to program structure and synchronization methods, will be discussed. The constraints imposed by the operational nature of the work will be highlighted. A set of requirements will be developed which will satisfy the needs of ECMWF and similar centres.
Geerd-R. Hoffmann

The Evolving Supercomputer Environment at the San Diego Supercomputer Center

Abstract
In the early 1980s, recognition was growing in the United States that computational simulation using supercomputers was becoming an important aspect of the scientific research process. However, academic researchers in the U.S. had limited access to supercomputers. At that time more European universities had either purchased or had access to American supercomputers than did U.S. universities. A number of studies (the Bardon-Curtis Report [1], the Lax Report [2], the Press Report [3], the FCCSET Report [4]) revealed that the United States must take steps to ensure improved supercomputing access for American researchers and scientists.
Sidney Karin

Supercomputer Architecture

Frontmatter

SUPRENUM: Architecture and Applications

Abstract
In the last decade the numerical simulation of physical or chemical processes has gained increasing importance in many fields of science and engineering. Computer experiments, as the numerical simulations are often called, offer the advantage that they are in principle free of limitations, whereas physical experiments are either not always accurate (e.g. wind tunnels) or not possible at all (e.g. re-entry of space vehicles, astrophysics, elementary particle physics, etc.).
Karl Solchenbach

Advanced Architecture and Technology of the NEC SX-3 Supercomputer

Abstract
Building on the expertise gained with the SX-2 Supercomputer, announced in 1983 as the first supercomputer to break the 1 GFLOPS performance barrier, the NEC SX-3 Supercomputer has been developed to meet the growing demand for large, high-speed computations, offering a maximum performance of 22 GFLOPS. The SX-3 comprises seven models, distinguished by the number of processors and the number of vector pipelines per processor. The models range from a 1.4 GFLOPS uniprocessor entry model to a 22 GFLOPS quad-processor top-end model.
Tadashi Watanabe

Programming Tools and Scheduling

Frontmatter

PROCESS SCHEDULING: Parallel Languages Versus Operating Systems

Abstract
This paper deals with multiple processes performing a parallel program, how they are scheduled on physical processors, and their effectiveness in completing the total work to which the individual processes contribute. The current ideas in process scheduling have, to a large extent, grown out of operating systems research on scheduling relatively independent processes on one or a few processors, with the goal being maximal utilization of available resources, especially the processors. Multiprocessor parallel programs may trade processor utilization efficiency for decreased completion time and may employ very fine grained interaction among processes. This yields a different set of requirements on process scheduling than those arising from optimal use of shared resources among weakly dependent or independent tasks. It is argued that coscheduling specifications are a good way to coordinate the possibly conflicting requirements of parallel program completion and multiple job resource utilization. Coscheduling requirements specify the simultaneous availability of a set of resources, processors being the foremost example. Coscheduling requirements may be met by simultaneous assignment of the required number of units or by fine grained time multiplexing of fewer units.
Harry F. Jordan

SARA: A Cray Assembly Language Speedup Tool

Abstract
SARA (Single Assignment Register Assembler) is an extended form of CAL (Cray Assembly Language) meant for obtaining near-optimal performance from relatively short (hundreds of instructions) Cray X-MP basic-block code sequences. The SARA Optimizing Preprocessor (also informally referred to as “SARA”) converts SARA source files into a form that is acceptable as input to standard Cray Research Inc. CAL assemblers. The SARA Optimizing Preprocessor can greatly speed up the job of CAL coding by automating the difficult, tedious, and error-prone tasks of assigning registers and ordering instruction sequences to take maximum advantage of the Cray X-MP architecture.
Robert G. Babb

Modeling the Memory of the Cray2 for Compile Time Optimization

Abstract
In previous work [3], a cyclic scheduling method was shown to be efficient for generating vector code for the Cray-2 architecture and was compared to existing compilers. That method used the framework of microcode compaction through a simplified model of the Cray-2 vector instruction stream. In this paper, we further elaborate on how to model the machine architecture within the underlying cyclic scheduling method. The impact of the choice of model on the generated code is analyzed and performance results are presented.
C. Eisenbeis, W. Jalby, A. Lichnewsky

On the Parallelization of Sequential Programs

Abstract
Parallelization of sequential programs for MIMD computers is considered. In general, the major steps included in the parallelization process are: program partitioning into a task system, derivation of a parallel task system, scheduling and execution of this task system on a multiprocessor system. We present a general framework for an automatic maximally parallel task system generator which may be useful as a component of the code parallelization process. The framework is based on the concept of maximally parallel task systems, a concept which has been mainly used for designing operating systems.
Swarn P. Kumar, Janusz S. Kowalik
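As a concrete illustration of how parallelism in a task system can be detected, the sketch below applies Bernstein's classical independence conditions to a small, invented set of tasks with read/write sets. It is only a baseline test, not the paper's maximally parallel task system generator; the task names and variable sets are hypothetical.

```python
def independent(t1, t2):
    """Bernstein's conditions: two tasks may run in parallel if neither
    writes a variable the other reads or writes.  Classical test, shown
    only to illustrate how independence in a task system is detected."""
    return (not (t1["write"] & (t2["read"] | t2["write"])) and
            not (t2["write"] & (t1["read"] | t1["write"])))

# Hypothetical task system with read/write sets extracted from a sequential program.
tasks = {
    "T1": {"read": {"a"}, "write": {"b"}},
    "T2": {"read": {"a"}, "write": {"c"}},
    "T3": {"read": {"b", "c"}, "write": {"d"}},
}

# T1 and T2 are mutually independent and may execute in parallel; T3 must follow both.
for u in sorted(tasks):
    for v in sorted(tasks):
        if u < v:
            relation = "independent" if independent(tasks[u], tasks[v]) else "dependent"
            print(u, v, relation)
```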

User Environments and Visualization

Frontmatter

The Role of Graphics Super-Workstations in a Supercomputing Environment

Abstract
A new class of very powerful workstations has recently become available which integrates near-supercomputer computational performance with very powerful and high-quality graphics capability. These “graphics super-workstations” are expected to play an increasingly important role in providing an enhanced environment for supercomputer users. Their potential uses include: off-loading the supercomputer (by serving as stand-alone processors, by post-processing the output of supercomputer calculations, and by distributed or shared processing), scientific visualization (understanding and communication of results), and real-time interaction with the supercomputer (to “steer” an iterative computation, to abort a bad run, or to explore and develop new algorithms).
E. Levin

Issues in Scientific Visualization

Abstract
Scientific visualization, used in partnership with today’s computational tools, whether supercomputer or medical scanner, provides the foremost communications medium in the scientific and engineering world for analyzing and describing phenomena from the atomic, through the anatomic, to the astrophysical.
Several new issues have arisen with the advent of scientific visualization:
  • Visually steering computation
  • Volume visualization and modeling
  • Assembly vs growth of complex structures
  • Personal, peer or presentation graphics
  • Televisualization: scientific visualization over networks
A framework for scientific visualization environments, responsive to these issues, is presented.
Bruce H. McCormick

Requirements and Performance

Frontmatter

Requirements for Multidisciplinary Design of Aerospace Vehicles on High Performance Computers

Abstract
The design of aerospace vehicles is becoming increasingly complex as the various contributing disciplines and physical components become more tightly coupled. This coupling leads to computational problems that will be tractable only if significant advances in high performance computing systems are made. In this paper we discuss some of the modeling, algorithmic and software requirements generated by the design problem.
Robert G. Voigt

Supercomputer Performance Evaluation: The PERFECT Benchmarks

Abstract
Supercomputers, designed to solve problems that would otherwise be intractable, have received considerable attention in recent years. It has been claimed that they are critical to national defense, that the economic well-being of companies and nations depends on their exploitation, and that the grand challenges facing today’s scientists — global change phenomena and human genome definition, for example — would remain open challenges without supercomputer access [1].
Joanne L. Martin

Measurements of Problem-Related Performance Parameters

Abstract
Measurements are presented of the problem-related performance parameters $r_\infty$, $n_{1/2}$, $s_{1/2}$ and $f_{1/2}$ on the Cray-2, ETA-10, IBM 3090/VF and the INMOS T800 Transputer.
Roger W. Hockney
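For orientation, and assuming these symbols carry Hockney's usual meanings (the abstract does not restate them), $r_\infty$ is the asymptotic performance and $n_{1/2}$ the vector length at which half of it is reached; $s_{1/2}$ and $f_{1/2}$ play the analogous half-performance roles for grain size and computational intensity. Under those definitions the familiar two-parameter timing model for a vector operation of length $n$ reads

$$ t(n) = \frac{n + n_{1/2}}{r_\infty}, \qquad r(n) = \frac{n}{t(n)} = \frac{r_\infty}{1 + n_{1/2}/n}. $$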

Methods and Algorithms

Frontmatter

A Parallel Iterative Method for Solving 3-D Elliptic Partial Differential Equations

Abstract
In this paper, we will consider the parallel solution of the Dirichlet problem for a second-order uniformly elliptic equation in two and three dimensions. Specifically, we shall consider the problem
$$ L(u) = f \quad \text{in } \Omega, \qquad u = g \quad \text{on } \Gamma $$
(1)
where
$$ L(u) = -\sum_{i=1}^{3} \frac{\partial}{\partial x_i}\left( a_i \frac{\partial u}{\partial x_i} \right) $$
with $a_i$ positive, bounded, and piecewise smooth on a bounded domain Ω with boundary Γ. For the sake of exposition we will assume the equation is 2-dimensional; however, the extension of the numerical methods to be described to 3 dimensions will be obvious.
Anne Greenbaum, Edna Reiter, Garry Rodrigue
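For readers who want a concrete baseline, the sketch below runs plain Jacobi sweeps on the constant-coefficient special case of problem (1) on the unit square. It is a minimal illustration only, not the parallel iterative method developed in the chapter; the grid size, sweep count and test data are arbitrary choices.

```python
import numpy as np

def jacobi_poisson_2d(f, g, n=64, sweeps=5000):
    """Jacobi iteration for -Laplace(u) = f on the unit square with Dirichlet
    data g: the constant-coefficient special case of problem (1).  Every
    interior point is updated independently in a sweep, which is what makes
    iterations of this kind so easy to parallelize."""
    h = 1.0 / (n + 1)
    x = np.linspace(0.0, 1.0, n + 2)
    X, Y = np.meshgrid(x, x, indexing="ij")
    u = g(X, Y)                      # Dirichlet values on the boundary
    u[1:-1, 1:-1] = 0.0              # zero initial guess in the interior
    F = f(X, Y)
    for _ in range(sweeps):
        u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                u[1:-1, :-2] + u[1:-1, 2:] +
                                h * h * F[1:-1, 1:-1])
    return u

# Test with the harmonic function u = x*y, so f = 0 and g = x*y on the boundary.
u = jacobi_poisson_2d(lambda X, Y: 0.0 * X, lambda X, Y: X * Y)
exact = np.multiply.outer(np.linspace(0, 1, 66), np.linspace(0, 1, 66))
print("max error after 5000 Jacobi sweeps:", np.abs(u - exact).max())
```

Even on this small grid Jacobi needs thousands of sweeps, which is precisely why better, and better-parallelizable, iterative methods such as the one in this chapter are of interest.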

Solving Linear Equations by Extrapolation

Abstract
This is a survey paper on extrapolation methods for vector sequences. We have simplified some derivations and we give some numerical results which illustrate the theory.
Walter Gander, Gene H. Golub, Dominik Gruntz
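As a minimal, hedged illustration of the idea: componentwise Aitken Δ² is only the simplest relative of the vector extrapolation methods surveyed here, and the 2×2 fixed-point iteration below is an invented test case, but it shows how three iterates can be combined into a much better approximation of the limit.

```python
import numpy as np

def aitken_delta2(x0, x1, x2):
    """Componentwise Aitken Delta^2 extrapolation from three successive
    iterates of a vector sequence.  Illustrative only; the paper surveys
    more powerful vector extrapolation methods."""
    d1, d2 = x1 - x0, x2 - x1
    denom = d2 - d1
    safe = np.where(np.abs(denom) > 1e-14, denom, 1.0)
    return np.where(np.abs(denom) > 1e-14, x2 - d2 * d2 / safe, x2)

# Linearly convergent iteration x <- A x + b with limit (I - A)^{-1} b.
A = np.array([[0.5, 0.2], [0.1, 0.4]])
b = np.array([1.0, 1.0])
exact = np.linalg.solve(np.eye(2) - A, b)

iterates, x = [], np.zeros(2)
for _ in range(3):
    x = A @ x + b
    iterates.append(x)

print("error of last iterate:", np.linalg.norm(iterates[-1] - exact))
print("error after Aitken   :", np.linalg.norm(aitken_delta2(*iterates) - exact))
```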

Super Parallel Algorithms

Abstract
Serial algorithmic design concentrates on how to solve a single instance of a given task on a single processor. The natural extension for parallel processing usually concentrates on how to solve the given single problem on a set of processors. When algorithms for massively parallel systems are designed it often occurs that the algorithm for parallel solution of a given problem turns out to be more general and automatically solves multiple instances of the task simultaneously. These algorithms we call Super Parallel algorithms. This paper discusses some examples of Super Parallel Algorithms.
D. Parkinson
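A toy example of the observation (the array below is made up): a data-parallel reduction written with one problem instance per row solves every instance of the "find the maximum" task in the same step, with no change to the algorithm.

```python
import numpy as np

# One row per problem instance; the same parallel reduction solves them all at once.
batch = np.array([[3.0, 9.0, 1.0],
                  [7.0, 2.0, 5.0],
                  [4.0, 8.0, 6.0]])
print(batch.max(axis=1))   # maxima of all three instances, found simultaneously
```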

Vectorization and Parallelization of Transport Monte Carlo Simulation Codes

Abstract
In recent years, the demand for solving large scale scientific and engineering problems has grown enormously. Since many programs for solving these problems inherently contain a very high degree of parallelism, they can be processed very efficiently if algorithms employed therein expose the parallelism to the architecture of a supercomputer.
Kenichi Miura
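A rough sketch of what "exposing the parallelism" means for transport Monte Carlo, using a deliberately simplified 1-D slab problem with invented parameters rather than the codes discussed in the chapter: instead of following one particle history at a time, a whole batch of histories is advanced with array operations.

```python
import numpy as np

def vectorized_slab_transport(n_particles=100_000, sigma_t=1.0, p_absorb=0.3, width=2.0, seed=0):
    """Toy vectorized Monte Carlo: every surviving particle in the batch takes
    its next free flight, boundary test and collision in one array operation.
    The physics is deliberately simplistic and purely illustrative."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_particles)               # positions in the slab [0, width]
    direction = np.ones(n_particles)        # travelling +1 or -1
    alive = np.ones(n_particles, dtype=bool)
    absorbed = 0
    while alive.any():
        step = rng.exponential(1.0 / sigma_t, size=alive.sum())   # free-flight lengths
        x[alive] += direction[alive] * step
        alive &= (x >= 0.0) & (x <= width)                        # particles leaving the slab escape
        absorb = alive & (rng.random(n_particles) < p_absorb)     # collision outcomes, batch-wide
        absorbed += absorb.sum()
        alive &= ~absorb
        direction[alive] = rng.choice([-1.0, 1.0], size=alive.sum())   # isotropic rescatter
    return absorbed / n_particles

print("absorbed fraction (toy model):", vectorized_slab_transport())
```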

Very Large Database Applications of the Connection Machine System

Abstract
The architecture of the Connection Machine system is particularly appropriate for large database applications. The Connection Machine system consists of 65,536 processors, each with its own memory, coupled by a high speed communications network. In large database applications, individual data elements are stored in separate processors and are operated on simultaneously. This paper examines two applications of this technology. The first, which will be examined in the greatest detail, is the use of the Connection Machine System for document retrieval. The second topic is the application of the Connection Machine to associative memory or content addressable memory tasks. This ability has been put to use in a method called “memory-based reasoning” which can produce expert system-like behavior from a database of records of earlier decisions. These classes of applications both scale well; the application of supercomputer power to such problems offers unprecedented functionality and opportunities.
David Waltz, Craig Stanfill
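In miniature, and with an invented vocabulary and term-count matrix, the document-retrieval idea looks like this: one "processor" per document (here, one row of an array), and the whole collection is scored against a query in a single data-parallel step.

```python
import numpy as np

# Invented toy collection: one row per document, one column per vocabulary term.
vocab = {"parallel": 0, "memory": 1, "network": 2, "retrieval": 3}
doc_term = np.array([[3, 0, 1, 0],        # term counts for document 0
                     [0, 2, 0, 4],        # document 1
                     [1, 1, 2, 1]],       # document 2
                    dtype=float)

query = np.zeros(len(vocab))
for word in ("parallel", "retrieval"):
    query[vocab[word]] = 1.0

scores = doc_term @ query                 # every document scored simultaneously
print("scores:", scores, "-> best document:", int(np.argmax(scores)))
```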

Data Structures for Parallel Computation on Shared-Memory Machines

Abstract
One of the major bottlenecks in developing parallel algorithms is the lack of “parallel data structures,” particularly for nonnumerical and seminumerical problems. Many of the traditional data structures that have served so well in sequential computing do not easily lend themselves to parallel processing. In this paper we develop three practical paradigms for designing parallel data structures in a shared-memory environment and illustrate them through parallelization of a queue, a stack, and a heap. Other related structures are also discussed.
Narsingh Deo
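For contrast with the paper's paradigms, here is the most straightforward shared-memory starting point: a queue protected by a single lock. This is a minimal sketch only; the chapter's queue, stack and heap designs allow far more concurrency than this.

```python
import threading
from collections import deque

class SharedQueue:
    """A minimal lock-based FIFO queue for shared-memory threads.  The single
    lock serializes all operations, which is exactly the bottleneck that more
    refined parallel data structures try to avoid."""
    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()

    def enqueue(self, item):
        with self._lock:                 # one thread inside at a time
            self._items.append(item)

    def dequeue(self):
        with self._lock:                 # atomic test-and-remove
            return self._items.popleft() if self._items else None

# Several producer threads feeding one shared work queue.
q = SharedQueue()
threads = [threading.Thread(target=q.enqueue, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(q.dequeue() for _ in range(4)))
```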

Parallel Evaluation of Some Recurrence Relations by Recursive Doubling

Abstract
The second-order linear recurrence formulae which result from the Fourier series coefficients of the Jacobian elliptic functions $sn^m(u,k)$, $cn^m(u,k)$ and $dn^m(u,k)$ with $m \geq 1$ are evaluated by the method of recursive doubling on a parallel computer, and performance results are discussed.
A. Kiper, D. J. Evans
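To make the technique concrete for a generic second-order recurrence (rather than the elliptic-function coefficients treated in the chapter), the sketch below evaluates y[k] = a[k]·y[k-1] + b[k]·y[k-2] by forming 2×2 companion matrices and combining them with recursive doubling; each doubling sweep is fully data-parallel.

```python
import numpy as np

def recurrence_by_recursive_doubling(a, b, y0, y1):
    """Evaluate y[k] = a[k]*y[k-1] + b[k]*y[k-2] via recursive doubling on the
    2x2 companion matrices.  Generic illustration of the method; the paper
    applies it to the Fourier-coefficient recurrences of sn, cn and dn."""
    n = len(a)
    # M[j] maps (y[j+1], y[j]) to (y[j+2], y[j+1]).
    M = np.array([[[a[j], b[j]], [1.0, 0.0]] for j in range(n)])
    P = M.copy()
    step = 1
    while step < n:
        # One doubling sweep: every product P[j] <- P[j] @ P[j-step] is independent.
        for j in range(n - 1, step - 1, -1):
            P[j] = P[j] @ P[j - step]
        step *= 2
    # After about log2(n) sweeps, P[j] = M[j] @ ... @ M[0], so one
    # matrix-vector product per index yields the whole sequence.
    v = np.array([y1, y0])
    return [y0, y1] + [float((P[j] @ v)[0]) for j in range(n)]

# Sanity check with the Fibonacci recurrence y[k] = y[k-1] + y[k-2]:
print(recurrence_by_recursive_doubling([1.0] * 8, [1.0] * 8, 0.0, 1.0))
# -> [0.0, 1.0, 1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0, 34.0]
```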

An Asynchronous Newton-Raphson Method

Abstract
We consider a parallel variant of the Newton-Raphson method for unconstrained optimization, which uses as many finite differences of gradients as possible to approximate rows and columns of the inverse Hessian matrix. The method is based on the Gauss-Seidel type of updating for quasi-Newton methods originally proposed by Straeter (1973). It incorporates the finite-difference approximations via the Barnes-Rosen corrections analysed by Van Laarhoven (1985). At the end of the paper we discuss the potential of the method for on-line, real-time optimization.
Freerk A. Lootsma
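As background, a plain synchronous Newton-Raphson step with a finite-difference Hessian is sketched below (the test function and parameters are invented). Each Hessian column costs one extra gradient evaluation, and it is exactly these independent gradient differences that the paper's asynchronous variant farms out to processors and folds in as quasi-Newton corrections.

```python
import numpy as np

def fd_newton(grad, x0, h=1e-6, tol=1e-8, max_iter=50):
    """Newton-Raphson for unconstrained minimisation with a finite-difference
    Hessian.  Synchronous baseline only; the chapter's method applies the
    gradient differences asynchronously via Barnes-Rosen-type corrections."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        # One extra gradient evaluation per Hessian column -- the independent
        # work that could be distributed over idle processors.
        H = np.column_stack([(grad(x + h * e) - g) / h for e in np.eye(n)])
        H = 0.5 * (H + H.T)                      # symmetrise the approximation
        x = x - np.linalg.solve(H, g)
    return x

# Example: minimise f(x) = x0**4 + x1**2 starting from (2, 3); the iterates tend to (0, 0).
print(fd_newton(lambda x: np.array([4 * x[0]**3, 2 * x[1]]), [2.0, 3.0]))
```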

Networking

Frontmatter

Interconnection Networks for Highly Parallel Supercomputing Architectures

Abstract
This paper treats (Multiple) Interconnection Networks (MINs), dealing in particular with their adoption in supercomputing architectures. The main properties and characteristics of MINs are recalled and some typical parallel computer architectures adopting MINs are summarized. Two main application classes of MINs are considered: parallel computer systems implemented by connecting together powerful processors and large shared memories, and dedicated supercomputing structures directly implementing highly parallel algorithms. For both application classes, the adoption of fault-tolerance methods is discussed. Fault tolerance can usefully be adopted both to overcome production defects and to tolerate faults arising during the system’s working life. Classic approaches to fault tolerance in MINs for parallel computer systems, and some recent results in the less well-known field of fault tolerance in dedicated supercomputing structures, are surveyed.
A. Antola, R. Negrini, M. G. Sami, R. Stefanelli

Computer Networking for Interactive Supercomputing

Abstract
Supercomputers are now integral parts of scientific research. Resources and researchers are scattered over a wide geographic area. The distances involved are sufficiently large that wide area data network access is and will continue to be absolutely necessary.
It is our responsibility to provide an infrastructure such that, wherever someone happens to be, he or she can communicate with whatever people and computational resources they desire.
The distribution of resources across such wide areas can be and is being accomplished with data networks by NASA, DARPA, NSF, and many other US government agencies. Universities and many corporations are also so connected. Widespread cooperation occurs, with exceptional results. Information can and does move between people across continents, on demand.
Here we describe the networking paradigms in use now and planned for the future at the NASA Numerical Aerodynamic Simulation (NAS) Division. These methods are used very effectively by researchers across the United States. The key ideas which drive these paradigms are that interactive access to resources is essential and that standardized, effective operating systems and networking protocols are essential. The NAS systems all use UNIX and TCP/IP to fulfill these ideas.
John Lekashman

Storage

Frontmatter

You’re Not Waiting, You’re Doing I/O

Abstract
We discuss some of the growing problems in computer usage related to storage. As computers continue to get faster and support more memory and larger, more sophisticated applications, we are seeing that I/O and storage services are inadequate. Two rather bad omens are that the disparities are growing and that vendors seem to be doing nothing about them.
George A. Michael

Backmatter
