About this book

Distributed and Parallel Systems: Cluster and Grid Computing is the proceedings of the fourth Austrian-Hungarian Workshop on Distributed and Parallel Systems organized jointly by Johannes Kepler University, Linz, Austria and the MTA SZTAKI Computer and Automation Research Institute.

The papers in this volume cover a broad range of research topics presented in four groups. The first one introduces cluster tools and techniques, especially the issues of load balancing and migration. Another six papers deal with grid and global computing including grid infrastructure, tools, applications and mobile computing. The next nine papers present general questions of distributed development and applications. The last four papers address a crucial issue in distributed computing: fault tolerance and dependable systems.

This volume will be useful to researchers and scholars interested in all areas related to parallel and distributed computing systems.



Cluster Computing: Tools and Techniques


Toward a Cluster Operating System that Offers a Single System Image

The lack of a Single System Image (SSI) is a major obstacle to non-dedicated clusters being routinely employed by ordinary engineers, managers, and accountants, and to supporting next-generation application software on non-dedicated clusters. We show that SSI clusters could be achieved through SSI operating systems.
Andrzej M. Goscinski

Load Balancing for P-GRADE Parallel Applications

A centralized, process-based load balancer for P-GRADE has been designed and implemented. The program estimates the computation and communication demands of the application processes, and executes the diffusion or simulated annealing algorithm to get a nearly optimal process-host mapping. Both the estimation and the optimization contain several parameters which are subject to empirical tuning.
Márton László Tóth, Norbert Podhorszki, Peter Kacsuk
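
The optimization step described in this abstract can be illustrated with a minimal sketch of simulated annealing over a process-host mapping. It is not the P-GRADE load balancer: the cost function (load of the busiest host plus traffic between processes on different hosts), the cooling schedule, and all names such as `mapping_cost` and `anneal` are assumptions chosen for illustration.

```python
import math
import random

def mapping_cost(mapping, compute, comm, n_hosts):
    """Assumed cost: load of the busiest host plus traffic crossing host boundaries."""
    load = [0.0] * n_hosts
    for proc, host in enumerate(mapping):
        load[host] += compute[proc]
    remote = sum(vol for (a, b), vol in comm.items() if mapping[a] != mapping[b])
    return max(load) + remote

def anneal(compute, comm, n_hosts, steps=5000, t_start=10.0, t_end=0.01, seed=0):
    """Search for a low-cost process-to-host mapping by simulated annealing."""
    rng = random.Random(seed)
    mapping = [rng.randrange(n_hosts) for _ in compute]
    cost = mapping_cost(mapping, compute, comm, n_hosts)
    best, best_cost = mapping[:], cost
    for step in range(steps):
        # Geometric cooling from t_start towards t_end (hypothetical schedule).
        temp = t_start * (t_end / t_start) ** (step / steps)
        proc = rng.randrange(len(mapping))
        old_host = mapping[proc]
        mapping[proc] = rng.randrange(n_hosts)  # propose moving one process
        new_cost = mapping_cost(mapping, compute, comm, n_hosts)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = mapping[:], cost
        else:
            mapping[proc] = old_host  # reject the move
    return best, best_cost

# Four processes with estimated compute demands; processes 0 and 1 communicate.
compute = [4.0, 4.0, 1.0, 1.0]
comm = {(0, 1): 0.5}
best, best_cost = anneal(compute, comm, n_hosts=2)
```

The parameters `steps`, `t_start`, and `t_end` stand in for the kind of empirically tuned settings the abstract mentions; their values here are arbitrary.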

Exploiting Runtime Information in Load Balancing Strategies

Adaptability to irregularities in the execution of an application and to changes in resource availability are two features that must be addressed to achieve good application performance. In ADAJ, efficiency is obtained through a load balancing mechanism that relies on an observation tool monitoring the application during execution. The evaluation of the load balancing mechanism for a concrete application shows gains of 32% to 48% in execution time.
Violeta Felea

Server Based Migration of Parallel Applications

In this paper we present a solution for checkpointing and migrating parallel applications using a message passing layer. The checkpoint system is based on a user-level checkpointer, where a central server coordinates the parallel checkpoint and migration. In order to migrate the whole application, including the server itself, the server is able to checkpoint itself. A first version of the migration system has been implemented for parallel applications generated by P-GRADE.
Jozsef Kovacs, Peter Kacsuk

Predicting Performance of SMP Clusters

The paper reports on simulation-based performance prediction of parallel applications running on SMP clusters. First, typical message-passing programs are investigated, which demonstrate different computation/communication patterns: process farm, pipeline and data parallel computing on a linear processor chain. The usefulness of various SMP cluster configurations is discussed with regard to the obtained results. Then the message-passing model of shared memory and SMP clusters is used to simulate the execution of hybrid message-passing/shared objects programs. A large system of linear equations is used as an example. A single SMP, an SMP cluster and a workstation cluster are compared in performance with a limited L1-cache size.
Rudolf Čejka, Jiří Staroba, Václav Dvořák

A Data Intensive Computation on a Cluster

Parallel Elementwise Processing
We have investigated the applicability of PC clusters with terabyte disk-servers for data-intensive parallel computing. We used parallel element-wise processing as our test case. We have searched for the optimal parameter values of the algorithm running on our hardware environment. The performance of several communication frameworks has been tested, such as C/PVM, C/MPI, Distributed Haskell and the socket interface in C. On large inputs with heavy operations our implementations showed considerable speedups.
Zoltán Horváth, Zoltán Hernyák, Tamás Kozsik, Máté Tejfel, Attila Ulbert

Global and Grid Computing


Information System Architecture for Brokering in Large Scale Grids

Information systems are inevitable parts of grid architectures. Existing systems however have weaknesses in supporting every requirement of a resource broker. In this paper a new grid information system architecture is presented that aims to overcome the limits of other systems in large scale grid applications. This architecture offers more efficient query processing, greater flexibility, better scalability and fault tolerance. It also has the advantage that it is built on already existing and proven technologies.
Zoltán Balaton, Gábor Gombás, Zsolt Németh

Experiments in Load Balancing Across the Grid Via a Code Transformation

We propose a code transformation to adapt a parallel MPI application to the grid. It aims at balancing the computational load across the processors in order to reduce the global execution time. This transformation may be applied to a rather wide range of parallel codes. It was originally designed for a Vlasov equation solver, which is particularly challenging due to the dependency scheme it involves. Experimental results show the advantage of our code transformation compared with other system support approaches. This work is part of the TAG project.
Eric Violard, Romaric David, Benjamin Schwarz

Towards a Robust and Fault-Tolerant Multicast Discovery Architecture for Global Computing Grids

Global grid systems with potentially millions of services require a very effective and efficient service discovery/location mechanism. Current grid environments, due to their smaller size, rely mainly on centralised service directories. Large-scale systems need a decentralised service discovery system that operates reliably in a dynamic and error-prone environment. Work has been done in studying flat, decentralised discovery architectures. In this paper we argue that a hierarchical discovery architecture provides a more scalable and efficient approach. We describe our design rationale for a k-ary tree-based fault-tolerant discovery architecture that also acts as an intelligent routing network for client requests. The network is self-configuring as each node performs discovery itself. We describe the architecture of the system, demonstrate its operation and report on our first experiments with the system.
Zoltan Juhasz, Arpad Andics, Szabolcs Pota

C3 Power Tools

The Next Generation...
The 3rd generation of the Cluster Command and Control (C3) Power Tools is an even more powerful and secure command line cluster interface than its predecessors. Furthermore, C3 continues to extend the Single System Illusion (SSi) concept from single clusters to federated clusters (multiple clusters viewed as one computing entity) by providing secure remote access to the C3 Power Tools for both users and administrators in a manner that is as easy to use and as transparent as command line operations issued on a single workstation. This paper will discuss the evolution of C3 and expand on its new capabilities.
Brian Luethke, Thomas Naughton, Stephen L. Scott

Interactive Virtual Reality Volume Visualization on the Grid

Grid computing evolves into a standard method for processing large datasets. Consequently most available grid applications focus on high performance computing and high-throughput computing. The interactive visualization of the acquired simulation results can be performed directly on the grid using the Grid Visualization Kernel GVK, which is a grid middleware extension built on top of the Globus Toolkit. An example is the visualization of volume data within Virtual Reality environments, where the data for visualization is generated somewhere on the grid, while the user explores the visual representation at some other place on the grid.
P. Heinzlreiter, A. Wasserbauer, H. Baumgartner, D. Kranzlmüller, G. Kurka, J. Volkert

Ubiquitous Context Sensing in Wireless Environments

The immanent and pervasive use of mobile devices, especially in wireless environments, raises issues about the context awareness and sensitivity of applications. As the number of embedded mobile devices in use grows rapidly, so does the need for the efficient gathering, representation and delivery of so-called ‘context information’. With regard to this lack of context oriented computing methods, this work describes issues related to context sensing, representation and delivery, and proposes a new approach for context based computing: time and event triggered context sensing for mobile devices and an abstract (application and platform independent) representation of context information are introduced. The paper presents different showcases of time and event triggered context sensing in wireless environments.
Alois Ferscha, Simon Vogl, Wolfgang Beer

Parallel and Distributed Software Development and Applications


Application of P-GRADE Development Environment in Meteorology

The main objective of a meteorological nowcasting system is to analyse and predict, in the ultra-short range, those weather phenomena that might be dangerous for life and property. The Hungarian Meteorological Service developed a nowcasting system called MEANDER, and its most computationally intensive calculations have been parallelised with the help of the P-GRADE graphical programming environment. In order to demonstrate the efficient application of P-GRADE to real-size problems, we give an overview of the parallelisation of the MEANDER system using the P-GRADE environment at the different stages of parallel program development: design, debugging and performance analysis.
Róbert Lovas, Péter Kacsuk, Ákos Horváth, András Horányi

MRT — An Approach to Minimize the Replay Time During Debugging Message Passing Programs

Cyclic debugging, where a program is executed over and over again, is a popular methodology for tracking down and eliminating bugs. Applying it to parallel programs requires additional techniques, such as record & replay mechanisms, because of their nondeterministic behavior. A problem is the cost associated with restarting the program’s execution from the beginning every time. A corresponding solution is offered by combining checkpointing and debugging, which allows executions to be restarted at an intermediate state. However, minimizing the replay time is still a challenge. Previous methods cannot ensure that the replay time has an upper bound. This property is important for developing a debugging tool, in which some degree of interactivity for the user’s investigations is required. This problem is discussed in this paper and an approach to minimize the replay time, the MRT method, is described. It ensures a small upper bound for the replay time with low overhead. The resulting technique is able to reduce the waiting time and the costs of cyclic debugging.
Nam Thoai, Jens Volkert

Ant — A Testing Environment for Nondeterministic Parallel Programs

Testing nondeterministic programs is one of the most difficult activities of parallel software engineering. The code under consideration may exhibit different program behavior during successive executions, even if the same input data is provided. This increases the number of required testing cycles, since correctness must be investigated for all possible program executions. Although exhaustive testing is practically not feasible in most cases, the Automatic Nondeterminism Tester ANT offers the theoretical capabilities to perform complete testing of nondeterministic parallel programs. Control mechanisms allow the user to balance the amount of testing between a sufficient assessment of quality and the usual constraints of software production.
Dieter Kranzlmüller, Martin Maurer, Markus Löberbauer, Christian Schaubschläger, Jens Volkert

Semantic Elements for Parallel Computing in ORB(M)

The behaviour of distributed object frameworks usually cannot be influenced by their users, thus they can hardly be adapted to the actual application being developed and to the actual hardware environment. In this paper we present our flexible approach of extensible Object Request Broker middlewares. Our model of Pluggable Semantic Elements (PSE) allows the user to implement and arbitrarily combine the well-defined functional components of invocation semantics. The PSE model is implemented by our ORB(M) framework, which allows the user to exploit the special characteristics of the application and the actual computing environment.
Attila Ulbert

A Mobile Agent-Based Infrastructure for an Adaptive Multimedia Server

This paper introduces a mobile agent-based infrastructure for an adaptive multimedia server enabling a dynamic migration or replication of certain multimedia applications among a set of available server nodes. It discusses the requirements that the server and the middleware place on each other and comes up with a specification and implementation of a CORBA-based interface between them.
Balázs Goldschmidt, Roland Tusch, László Böszörményi

An Adaptive MPEG-4 Proxy Cache

Multimedia is gaining ever more importance on the Internet. This increases the need for intelligent and efficient video caches. A promising approach to improve caching efficiency is to adapt videos. With the availability of MPEG-4 it is possible to develop a standard compliant proxy that allows fast and efficient adaptation.
We propose a modular design for an adaptive MPEG-4 video proxy that supports efficient full and partial video caching in combination with filtering options that are driven by the terminal capabilities of the client. We use the native scalability operations provided by MPEG-4 and use the emerging MPEG-7 standard to describe the scalability options for a video. In this paper, we will restrict ourselves to full video caching.
The combination of adaptation with MPEG-4, MPEG-7 and client terminal capabilities is to the best of our knowledge unique and will increase the quality of service for end users.
Peter Schojer, Laszlo Böszörményi, Hermann Hellwagner

Distributed Training of Universal Multi-Nested Neurons

Training universal multi-nested neurons to represent arbitrary Boolean functions can be performed using a greedy-type stochastic optimization algorithm. However, the optimization search requires a long execution time, so the distribution of the computations on multiprocessors or distributed systems has to be considered in order to speed up the execution. This paper presents a distributed version of the stochastic optimization algorithm based on multiple search paths executed as separate processes on different hosts in a network of workstations. When a process reaches the global optimum, it stops all other processes, which abandon their search paths, and a new optimization search (for a different Boolean function) can be started. Since the distributed processes interact only when a solution is found, the communication overhead is very low, which leads to high speedup and efficiency. The software support of the design is CORBA middleware, which allows efficient development of applications in heterogeneous distributed systems.
Felicia Ionescu, Radu Dogaru, Doina Profeta
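
The coordination pattern in this abstract (independent search paths that interact only when one of them finds the solution) can be sketched as follows. This is a toy stand-in under stated assumptions: threads replace the paper's processes on separate hosts, a `threading.Event` replaces the CORBA-based stop notification, and the objective is a simple Hamming distance to a target bit vector rather than the training of a multi-nested neuron. All names are hypothetical.

```python
import random
import threading

def search_path(target, seed, stop, result, lock, max_iters=100_000):
    """One greedy stochastic search path; stops early if another path has succeeded."""
    rng = random.Random(seed)
    candidate = [rng.randint(0, 1) for _ in target]
    cost = sum(c != t for c, t in zip(candidate, target))
    for _ in range(max_iters):
        if stop.is_set():
            return  # another path reached the optimum: abandon this one
        if cost == 0:
            break
        i = rng.randrange(len(candidate))
        candidate[i] ^= 1  # propose flipping one bit
        new_cost = sum(c != t for c, t in zip(candidate, target))
        if new_cost <= cost:
            cost = new_cost  # greedy: keep moves that do not worsen the cost
        else:
            candidate[i] ^= 1  # undo the flip
    if cost == 0:
        with lock:
            if not stop.is_set():
                result["solution"] = candidate[:]
                result["seed"] = seed
                stop.set()  # the only interaction between paths

def distributed_search(target, n_paths=4):
    """Run n_paths independent search paths and return the first solution found."""
    stop = threading.Event()
    lock = threading.Lock()
    result = {}
    workers = [
        threading.Thread(target=search_path, args=(target, seed, stop, result, lock))
        for seed in range(n_paths)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return result

target = [1, 0, 1, 1, 0, 0, 1, 0]
result = distributed_search(target)
```

Because the paths share nothing except the stop signal, the communication cost in this sketch is a single event per solved problem, which mirrors the low-overhead argument made in the abstract.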

Parallel Traffic Simulation in Spider Programming Environment

In this paper, we present the implementation of parallel road traffic simulation using the concept of Lane Cut Points (LCPs) in the Spider programming environment. LCPs are storage buffers that are inserted at the end of lanes, which cut across two partitions. Vehicles for other partitions enter the LCPs at the end of the lanes and are removed from the LCP buffers at the beginning of every simulation step. Spider, a programming environment, which runs on PVM, coordinates the execution of the parallel traffic simulation.
Damian Igbe, N. Kalantery, S. E. Ijaha, S. C. Winter
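
The Lane Cut Point mechanism described in this abstract can be sketched in a few lines. The sketch below is sequential and does not use Spider or PVM: the `Lane` class, the one-cell-per-step vehicle movement, and the `simulation_step` coordinator are assumptions made for illustration only.

```python
from collections import deque

class Lane:
    """A lane inside one partition, with an LCP buffer at its end."""
    def __init__(self, length):
        self.length = length
        self.positions = {}    # vehicle id -> cell index within the lane
        self.lcp = deque()     # Lane Cut Point buffer at the end of the lane
        self.next_lane = None  # continuation of this lane in the neighbouring partition

    def advance(self):
        # Move every vehicle one cell forward (assumed movement rule).
        for vid in sorted(self.positions):
            self.positions[vid] += 1
            if self.positions[vid] >= self.length:
                # The vehicle leaves this partition: park it in the LCP buffer.
                del self.positions[vid]
                self.lcp.append(vid)

def simulation_step(lanes):
    # Beginning of the step: empty every LCP buffer into the neighbouring partition.
    for lane in lanes:
        while lane.lcp:
            vid = lane.lcp.popleft()
            if lane.next_lane is not None:
                lane.next_lane.positions[vid] = 0
            # With no neighbouring lane, the vehicle leaves the simulated network.
    # Then every partition advances its own vehicles independently.
    for lane in lanes:
        lane.advance()

a, b = Lane(2), Lane(3)
a.next_lane = b
a.positions["car"] = 0

simulation_step([a, b])
simulation_step([a, b])
after_two = (dict(a.positions), list(a.lcp), dict(b.positions))
simulation_step([a, b])
after_three = (dict(a.positions), list(a.lcp), dict(b.positions))
```

After two steps the vehicle sits in the LCP buffer of lane `a`; the third step hands it to lane `b` before any vehicle moves, so within a step each partition only touches its own lanes.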

Conformance Testing of Parallel Languages

This paper reports on an ongoing project aimed at developing a formalized approach to the generation, execution and evaluation of conformance tests for parallel programming languages and libraries. This issue is gaining importance today, as modern distributed applications rely more and more on heterogeneous software products that must conform to common standards regardless of their point of origin.
Lukasz Garstecki, Pawel Kaczmarek, Henryk Krawczyk, Bogdan Wiszniewski

Dependable and Fault Tolerant Systems


A Dependable High Performance Serial Storage Architecture

We introduce a novel serial storage architecture that offers both high availability and high performance. High availability is achieved by the introduction of redundant loop based interconnections. Our design combines reliability with high performance by introducing a multiple data path that enables optimistic routing and spatial reuse of the interconnecting strings. By comparing different loop topologies, we demonstrate the benefits arising from the proposed smart serial storage architecture and its effective usage in conjunction with data striping and disk mirroring techniques. The paper also presents a ring mirroring approach that outperforms the conservative RAID 5 architecture when applied to I/O intensive workloads.
G. Rotondi, S. Losco, S. Serbassi

Modeling Uncertainty in System-Level Fault Diagnosis Using Process Graphs

This paper presents a novel approach to solving the probabilistic diagnosis problem in multiprocessor systems. The main idea of the algorithm is based on the reformulation of the error propagation model as a Process Network Synthesis (PNS) problem. This idea is illustrated by deriving a maximum likelihood decision procedure. The diagnostic accuracy of the solution is considered on the basis of simulation measurements, and a method of constructing a general framework for different aspects of a complex problem with the use of P-Graph models is demonstrated.
Balázs Polgár, Endre Selényi, Tamás Bartha

Tolerating Stop Failures in Distributed Maple

Earlier we introduced some fault tolerance mechanisms to the parallel computer algebra system Distributed Maple such that a session may tolerate the failure of nodes and connections without overall failure. We have extended this fault tolerance by some advanced mechanisms. The first is the reconnection of a node after a connection failure such that a session does not deadlock. The second mechanism is the restarting of a node after a failure such that the session does not fail. The third mechanism is the change of the root node such that a session may also tolerate the failure of the root without overall failure.
Károly Bósa, Wolfgang Schreiner

A Mechanism to Detect Wireless Network Failures for MPI Programs

Recent advances in wireless communication technology are making Wireless Local Area Networks (WLAN) an appealing vehicle for parallel and distributed network based computing. The features of the wireless physical level lead to new challenges when designing parallel applications. One important challenge is to guarantee the successful completion of the parallel program in the presence of wireless link failures. A great concern in wireless communications is the efficient management of spurious disconnections. In this paper we propose an extension of our library to provide transparent network failure detection for Master-Worker MPI parallel programs, with or without dependencies among iterations, and their execution in a LAN-WLAN infrastructure.
E. M. Macías, A. Suárez

