EU Funded Grid Development in Europe

Abstract

Several Grid projects have been established that deploy a “first generation Grid”. In order to categorise existing projects in Europe, we have developed a taxonomy and applied it to 20 European Grid projects funded by the European Commission through the Framework 5 IST programme. We briefly describe the projects and thus provide an overview of current Grid activities in Europe. Next, we suggest future trends based on both the European Grid activities and the progress of the world-wide Grid community. The work we present here is a source of information that aims to help promote European Grid development.

Paul Graham, Matti Heikkurinen, Jarek Nabrzyski, Ariel Oleksiak, Mark Parsons, Heinz Stockinger, Kurt Stockinger, Maciej Stroiński, Jan Węglarz

Pegasus: Mapping Scientific Workflows onto the Grid

Abstract

In this paper we describe the Pegasus system that can map complex workflows onto the Grid. Pegasus takes an abstract description of a workflow and finds the appropriate data and Grid resources to execute the workflow. Pegasus is being released as part of the GriPhyN Virtual Data Toolkit and has been used in a variety of applications ranging from astronomy and biology to gravitational-wave science and high-energy physics. A deferred planning mode of Pegasus is also introduced.

Ewa Deelman, James Blythe, Yolanda Gil, Carl Kesselman, Gaurang Mehta, Sonal Patil, Mei-Hui Su, Karan Vahi, Miron Livny

A Low-Cost Rescheduling Policy for Dependent Tasks on Grid Computing Systems

Abstract

A simple model that can be used for the representation of certain workflows is a directed acyclic graph. Although many heuristics have been proposed to schedule such graphs on heterogeneous environments, most of them assume accurate prediction of computation and communication costs; this limits their direct applicability to a dynamically changing environment, such as the Grid. To deal with this, run-time rescheduling may be needed to improve application performance. This paper presents a low-cost rescheduling policy, which considers rescheduling at a few carefully selected points in the execution. Yet this policy achieves performance results that are comparable with those achieved by a policy that dynamically attempts to reschedule before the execution of every task.

Henan Zhao, Rizos Sakellariou
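
The directed-acyclic-graph model named in the abstract above can be illustrated with a minimal sketch. It is not the authors' rescheduling policy: it assumes unlimited processors and zero communication cost, and the task names, cost values, and the helper `earliest_finish_times` are hypothetical. It only shows how a finish-time estimate built from predicted costs drifts once one observed runtime differs from its prediction, which is the situation that motivates rescheduling.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def earliest_finish_times(deps, cost):
    """Return the earliest finish time of each task in a DAG.

    deps maps each task to the set of tasks it depends on;
    cost maps each task to its (estimated or observed) runtime.
    Assumes unlimited processors and no communication cost.
    """
    finish = {}
    for task in TopologicalSorter(deps).static_order():
        # A task can start once all of its predecessors have finished.
        start = max((finish[p] for p in deps[task]), default=0.0)
        finish[task] = start + cost[task]
    return finish


# Hypothetical four-task workflow: A feeds B and C, which both feed D.
deps = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
estimated = {"A": 2.0, "B": 3.0, "C": 1.0, "D": 2.0}

makespan_planned = earliest_finish_times(deps, estimated)["D"]

# Suppose task C turns out to take 5.0 time units instead of 1.0.
observed = dict(estimated, C=5.0)
makespan_actual = earliest_finish_times(deps, observed)["D"]

drift = makespan_actual - makespan_planned
```

In this toy workflow the critical path moves from A, B, D to A, C, D once the observed cost of C is used, so a schedule computed from the original estimates no longer reflects the real makespan.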

An Advanced Architecture for a Commercial Grid Infrastructure

Abstract

Grid infrastructures have been used to solve large-scale scientific problems that do not have special requirements on QoS. However, the introduction and success of Grids in commercial applications also entails the provision of QoS mechanisms that allow the special requirements of users (customers) to be met. In this paper we present an advanced Grid architecture which incorporates appropriate mechanisms to allow guarantees for users’ diverse and contradictory QoS requirements. We present a runtime estimation model, which is the heart of any scheduling and resource allocation algorithm, and we propose a scheme able to predict the runtime of submitted jobs for any given application on any computer by introducing a general prediction model. Experimental results are presented which indicate the robustness and reliability of the proposed architecture. The scheme has been implemented in the framework of the GRIA IST project (Grid Resources for Industrial Applications).

Antonios Litke, Athanasios Panagakis, Anastasios Doulamis, Nikolaos Doulamis, Theodora Varvarigou, Emmanuel Varvarigos

Managing MPI Applications in Grid Environments

Abstract

One of the goals of the EU CrossGrid project is to provide a basis for supporting the efficient execution of parallel and interactive applications on Grid environments. CrossGrid jobs typically consist of computationally intensive simulations that are often programmed using a parallel programming model and a parallel programming library (MPI). This paper describes the key components that we have included in our resource management system in order to provide effective and reliable execution of parallel applications on a Grid environment. The general architecture of our resource management system is briefly introduced first and we focus afterwards on the description of the main components of our system. We provide support for executing parallel applications written in MPI either in a single cluster or over multiple clusters.

Elisa Heymann, Miquel A. Senar, Enol Fernández, Alvaro Fernández, José Salt

Flood Forecasting in CrossGrid Project

Abstract

This paper presents a prototype of a flood forecasting system based on Grid technologies. The system consists of a workflow system for executing a simulation cascade of meteorological, hydrological and hydraulic models, a data management system for storing and accessing different computed and measured data, and web portals as user interfaces. The whole system is tied together by Grid technology and is used to support a virtual organization of experts, developers and users.

L. Hluchy, V. D. Tran, O. Habala, B. Simo, E. Gatial, J. Astalos, M. Dobrucky

MPICH-G2 Implementation of an Interactive Artificial Neural Network Training

Abstract

Distributed training of an Artificial Neural Network (ANN) has been implemented using MPICH-G2, and deployed on the testbed of the European CrossGrid project. Load balancing, including adaptive techniques, has been used to cope with the heterogeneous setup of computing resources. First results show the feasibility of this approach, and the opportunity for a Quality of Service framework. To give an example, a reduction in the training time from 20 minutes using a single local node down to less than 3 minutes using 10 nodes distributed across Spain, Poland, and Portugal has been obtained.

D. Rodríguez, J. Gomes, J. Marco, R. Marco, C. Martínez-Rivero
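
The timing figures quoted in the abstract above (20 minutes on one node, less than 3 minutes on 10 nodes) can be turned into the usual speedup and parallel-efficiency measures. The short sketch below does that arithmetic; the function names are illustrative, and because the abstract only gives an upper bound on the parallel time, the results are lower bounds rather than measured values.

```python
def speedup(t_single, t_parallel):
    """Ratio of single-node time to parallel time."""
    return t_single / t_parallel


def efficiency(t_single, t_parallel, nodes):
    """Speedup divided by the number of nodes used."""
    return speedup(t_single, t_parallel) / nodes


# Figures from the abstract: 20 minutes on one node,
# less than 3 minutes on 10 nodes (so 3.0 is an upper bound).
T_SINGLE_MIN = 20.0
T_PARALLEL_MAX_MIN = 3.0
NODES = 10

speedup_lower_bound = speedup(T_SINGLE_MIN, T_PARALLEL_MAX_MIN)
efficiency_lower_bound = efficiency(T_SINGLE_MIN, T_PARALLEL_MAX_MIN, NODES)
```

With these inputs the speedup is at least 20/3 (about 6.67) and the efficiency at least about 0.67, which is consistent with the abstract's remark that load balancing had to cope with heterogeneous, geographically distributed resources.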

OpenMolGRID, a GRID Based System for Solving Large-Scale Drug Design Problems

Abstract

Pharmaceutical companies are screening millions of molecules in silico. These processes require fast and accurate predictive QSAR models. Unfortunately, these models currently do not include information-rich quantum-chemical descriptors, because of their time-consuming calculation procedure. Collection of experimental data is also difficult, because the sources are usually located in disparate resources. These challenges make the use of GRID systems indispensable. OpenMolGRID (Open Computing GRID for Molecular Science and Engineering) is one of the first realizations of GRID technology in drug design. The system is designed to build QSAR models based on thousands of different types of descriptors, and to apply these models to find novel structures with targeted properties. An implemented data warehouse technology makes it possible to collect data from geographically distributed, heterogeneous resources. The system will be tested in real-life situations: predictive models will be built on in vitro human toxicity values determined for 30,000 novel and diverse chemical structures.

Ferenc Darvas, Ákos Papp, István Bágyi, Géza Ambrus, László Ürge

Integration of Blood Flow Visualization on the Grid: The FlowFish/GVK Approach

Abstract

We have developed the FlowFish package for blood flow visualization of vascular disorder simulations, such as aneurysms and stenosis. We use a Lattice-Boltzmann solver for flow process simulation to test the efficiency of the visualization classes, and experiment with the combination of grid applications and corresponding visualization clients on the European CrossGrid testbed, to assess grid accessibility and visualization data transfer performance.

Alfredo Tirado-Ramos, Hans Ragas, Denis Shamonin, Herbert Rosmanith, Dieter Kranzmueller

A Migration Framework for Executing Parallel Programs in the Grid

Abstract

The paper describes a parallel program checkpointing mechanism and its potential application in Grid systems in order to migrate applications among Grid sites. The checkpointing mechanism can automatically (without user interaction) support generic PVM programs created by the PGRADE Grid programming environment. The developed checkpointing mechanism is general enough to be used by any Grid job manager, but the current implementation is connected to Condor. As a result, the integrated Condor/PGRADE system can guarantee the execution of any PVM program in the Grid. Note that the Condor system can only guarantee the execution of sequential jobs. Integration of the Grid migration framework and the Mercury Grid monitor results in an observable Grid execution environment where the performance monitoring and visualization of PVM applications are supported even when the PVM application migrates in the Grid.

József Kovács, Péter Kacsuk

Implementations of a Service-Oriented Architecture on Top of Jini, JXTA and OGSI

Abstract

This paper presents the design of an implementation-independent, Service-Oriented Architecture (SOA), which is the main basis of the ICENI Grid middleware. Three implementations of this architecture have been provided on top of Jini, JXTA and the Open Grid Services Infrastructure (OGSI). The main goal of this paper is to discuss these different implementations and provide an analysis of their advantages and disadvantages.

Nathalie Furmento, Jeffrey Hau, William Lee, Steven Newhouse, John Darlington

Dependable Global Computing with JaWS++

Abstract

In this paper we propose a computational grid platform called JaWS++ that seeks to harvest the power of idle pools of workstations connected through the Internet and integrate them in a grid computing platform for the execution of embarrassingly parallel computations. The computations are developed in the portable Java programming language and an API is provided for application development. JaWS++ is a compromise between scavenging and reservation-based computational grids. Its service layer is composed of pools of workstations that are autonomously administered by different organizations. Each pool participates in JaWS++ under a well-defined timetable to reduce unforeseen availability problems, increase dependability and favor batch work allocation and offline execution.

George Kakarontzas, Spyros Lalis

Connecting Condor Pools into Computational Grids by Jini

Abstract

The paper describes how Condor pools could be joined together to form a large computational cluster-grid. In the architecture, Jini provides the infrastructure for resource lookup, while Condor manages the job execution on the individual clusters. Semi on-line application monitoring is also available in this structure, and it works even through firewalls. Besides Condor, the presented Jini-based Grid can support other local job manager implementations, so various types of sequential or parallel jobs could be executed with the same framework.

Gergely Sipos, Péter Kacsuk

Overview of an Architecture Enabling Grid Based Application Service Provision

Abstract

In this short paper we examine the integration of three emerging trends in Information Technology (Utility Computing, Grid Computing, and Web Services) into a new Computing paradigm (Grid-based Application Service Provision) that is taking place in the context of the European research project GRASP. In the first part of the paper, we explain how the integration of emerging trends can support enterprises in creating competitive advantage. In the second part, we summarise an architecture blueprint of Grid-based Application Service Provision (GRASP), which enables a new technology-driven business paradigm on top of such integration.

S. Wesner, B. Serhan, T. Dimitrakos, D. Mac Randal, P. Ritrovato, G. Laria

A Grid-Enabled Adaptive Problem Solving Environment

Abstract

The complexity of computational applications and their environments has increased because of the heterogeneity of resources, the complexity of and continuous changes in both applications and resource states, and the large number of resources involved; as a result, problem solving environments (PSEs) have become more important. The Adaptive Distributed Computing Environment (ADViCE), a PSE for metacomputing environments, was developed before the emergence of Grid computing services. Current runtime systems for computing mainly focus on executing applications with a static resource configuration and do not adequately change the configuration of application execution environments dynamically to optimize application performance. In this paper, we present an architectural overview of ADViCE and discuss how it is evolving to incorporate Grid computing services to extend its range of services and decrease the cost of development, deployment, execution and maintenance for an application. We show that ADViCE adaptively optimizes application execution at runtime, based on application requirements, in both non-Grid and Grid environments with optimal execution options. We have implemented the ADViCE prototype and are currently evaluating the prototype and its adaptive services for a larger set of Grid applications.

Yoonhee Kim, Ilkyun Ra, Salim Hariri, Yangwoo Kim

Workflow Support for Complex Grid Applications: Integrated and Portal Solutions

Abstract

In this paper we present a workflow solution to support graphically the design, execution, monitoring, and performance visualisation of complex grid applications. The described workflow concept can provide interoperability among different types of legacy applications on heterogeneous computational platforms, such as Condor or Globus based grids. The major design and implementation issues concerning the integration of Condor tools, Mercury grid monitoring infrastructure, PROVE performance visualisation tool, and the new workflow layer of P-GRADE are discussed in two scenarios. The integrated version of P-GRADE represents the thick client concept, while the portal version needs only a thin client and can be accessed by a standard web browser. To illustrate the application of our approach in the grid, an ultra-short range weather prediction system is presented that can be executed in a grid testbed and visualised not only at workflow level but at the level of individual parallel jobs, too.

Róbert Lovas, Gábor Dózsa, Péter Kacsuk, Norbert Podhorszki, Dániel Drótos

Debugging MPI Grid Applications Using Net-dbx

Abstract

Application development in Grid environments is a challenging process, so there is a need for grid-enabled development tools. In our work we describe the development of a Grid interface for the Net-dbx parallel debugger that can be used to debug MPI grid applications. Net-dbx is a web-based debugger that users can access for debugging from anywhere on the Internet. The proposed debugging architecture is platform independent, because it uses Java, and it is accessible from anywhere, anytime, because it is web based. Our architecture provides an abstraction layer between the debugger and the grid middleware and MPI implementation used. This makes the debugger easily adaptable to different middleware. The grid-enabled architecture of our debugger carries the portability and usability advantages of Net-dbx, on which we have based our design. A prototype has been developed and tested.

Panayiotis Neophytou, Neophytos Neophytou, Paraskevas Evripidou

Towards an UML Based Graphical Representation of Grid Workflow Applications

Abstract

Grid workflow applications are emerging as one of the most interesting programming models for the Grid. In this paper we present a novel approach for graphically modeling and describing Grid workflow applications based on the Unified Modeling Language (UML). Our approach provides a graphic representation of Grid applications based on a widely accepted standard (UML) that is more amenable than pure textual-oriented specifications (such as XML). We describe some of the most important elements for modeling control flow, data flow, synchronization, notification, and constraints. We also introduce new features that have not been included in other Grid workflow specification languages, including broadcast and parallel loops. Our UML-based graphical editor Teuta provides the corresponding tool support. We demonstrate our approach by describing a UML-based Grid workflow model for an advanced 3D medical image reconstruction application.

Sabri Pllana, Thomas Fahringer, Johannes Testori, Siegfried Benkner, Ivona Brandic

Support for User-Defined Metrics in the Online Performance Analysis Tool G-PM

Abstract

This paper presents the support for user-defined metrics in the G-PM performance analysis tool. G-PM addresses the demand for aggressive optimisation of Grid applications by using a new approach to performance monitoring. The tool provides developers, integrators, and end users with the ability to analyse the performance characteristics of an application at a high level of abstraction. In particular, it allows an application’s performance to be related to the Grid’s performance, and also supports application-specific metrics. This is achieved by introducing a language for the specification of performance metrics (PMSL) and the concept of probes for providing application-specific events and data. PMSL enables easy specification of performance metrics while still allowing an efficient implementation, which is required to reduce monitoring overhead.

Roland Wismüller, Marian Bubak, Włodzimierz Funika, Tomasz Arodź, Marcin Kurdziel

Software Engineering in the EU CrossGrid Project

Abstract

This paper details the software engineering process utilized by the CrossGrid project, which is a major European undertaking, involving nearly two dozen separate organizations from 11 EU member and candidate countries. A scientific project of this magnitude requires the creation of custom-tailored procedures for ensuring uniformity of purpose and means throughout the Project Consortium.

Marian Bubak, Maciej Malawski, Grzegorz Młynarczyk, Piotr Nowakowski, Robert Pająk, Katarzyna Rycerz, Michał Turała

Monitoring Message-Passing Parallel Applications in the Grid with GRM and Mercury Monitor

Abstract

The combination of the GRM application monitoring tool and the Mercury resource and job monitoring infrastructure provides an on-line grid performance monitoring tool-set for message-passing parallel applications.

Norbert Podhorszki, Zoltán Balaton, Gábor Gombás

Lhcmaster – A System for Storage and Analysis of Data Coming from the ATLAS Simulations

Abstract

This paper presents the Lhcmaster system designed to aid physicists in organizing and managing the large number of files that are produced by simulations of High Energy Physics experiments. The implemented system stores and manages data files produced by the simulations of the ATLAS detector, making them available to physicists. We will also present an outline of Lhcmaster-G, a Grid version of the system, that may be implemented in the future in order to replace Lhcmaster with a more effective and powerful tool.

Maciej Malawski, Marek Wieczorek, Marian Bubak, Elżbieta Richter-Wąs

Using Global Snapshots to Access Data Streams on the Grid

Abstract

Data streams are a prevalent and growing source of timely data. As streams become more prevalent, richer interrogation of the contents of the streams is required. The value of the content increases dramatically when streams are aggregated and distributed global behavior can be interrogated. In this paper, we demonstrate that the problem of access to multiple data streams should be viewed as one of deriving meaning from a distributed global snapshot. We define an architecture for a data streams resource based on the Data Access and Integration [2] model proposed in the Global Grid Forum. We demonstrate that access to streams by means of database queries can be intuitive. Finally, we discuss key research issues in realizing the data streams model.

Beth Plale

SCALEA-G: A Unified Monitoring and Performance Analysis System for the Grid

Abstract

This paper describes SCALEA-G, a unified monitoring and performance analysis system for the Grid. SCALEA-G is implemented as a set of grid services based on the Open Grid Services Architecture (OGSA). SCALEA-G provides an infrastructure for conducting online monitoring and performance analysis of a variety of Grid services including computational and network resources, and Grid applications. Both push and pull models are supported, providing flexible and scalable monitoring and performance analysis. Source code and dynamic instrumentation are exploited to perform profiling and monitoring of Grid applications. A novel instrumentation request language has been developed to facilitate the interaction between client and instrumentation services.

Hong-Linh Truong, Thomas Fahringer

Application Monitoring in CrossGrid and Other Grid Projects

Abstract

Monitoring of applications is important for performance analysis, visualization, and other tools for parallel application development. While current Grid research is focused mainly on batch-oriented processing, there is a growing interest in interactive applications, where the user’s interactions are an important element of the execution. This paper presents the OMIS/OCM-G approach to monitoring interactive applications, developed in the framework of the CrossGrid project. We also overview the currently existing application monitoring approaches in other Grid projects.

Bartosz Baliś, Marian Bubak, Marcin Radecki, Tomasz Szepieniec, Roland Wismüller

Grid Infrastructure Monitoring as Reliable Information Service

Abstract

A short overview of Grid infrastructure status monitoring is given, followed by a discussion of key concepts for advanced status monitoring systems: passive information gathering based on direct application instrumentation, indirect information gathering based on service and middleware instrumentation, multidimensional matrix testing, and on-demand active testing using non-dedicated user identities. We also propose augmenting the information traditionally provided by Grid information services with information from infrastructure status monitoring, which gives only verified and thus valid information. The approach is demonstrated using a Testbed Status Monitoring Tool prototype developed for the GridLab project.

Petr Holub, Martin Kuba, Luděk Matyska, Miroslav Ruda

Towards a Protocol for the Attachment of Semantic Descriptions to Grid Services

Abstract

Service discovery in large-scale, open distributed systems is difficult because of the need to filter out services suitable to the task at hand from a potentially huge pool of possibilities. Semantic descriptions have been advocated as the key to expressive service discovery, but the most commonly used service descriptions and registry protocols do not support such descriptions in a general manner. In this paper, we present a protocol, its implementation and an API for registering semantic service descriptions and other task/user-specific metadata, and for discovering services according to these. Our approach is based on a mechanism for attaching structured and unstructured metadata, which we show to be applicable to multiple registry technologies. The result is an extremely flexible service registry that can be the basis of a sophisticated semantically-enhanced service discovery engine, an essential component of a Semantic Grid.

Simon Miles, Juri Papay, Terry Payne, Keith Decker, Luc Moreau
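
The general idea in the abstract above, attaching metadata to registered services and then discovering services by that metadata, can be sketched with a small in-memory registry. This is an illustration only and not the protocol or API presented in the paper: the class `MetadataRegistry`, its method names, the service identifiers, and the example.org endpoints are all hypothetical.

```python
class MetadataRegistry:
    """Toy service registry that allows metadata to be attached after registration."""

    def __init__(self):
        self._services = {}   # service id -> endpoint
        self._metadata = {}   # service id -> {key: value}

    def register(self, service_id, endpoint):
        """Register a service under an identifier."""
        self._services[service_id] = endpoint
        self._metadata.setdefault(service_id, {})

    def attach(self, service_id, key, value):
        """Attach one metadata item to an already registered service."""
        if service_id not in self._services:
            raise KeyError(f"unknown service: {service_id}")
        self._metadata[service_id][key] = value

    def discover(self, **criteria):
        """Return sorted ids of services whose metadata matches all criteria."""
        return sorted(
            sid
            for sid, meta in self._metadata.items()
            if all(meta.get(k) == v for k, v in criteria.items())
        )


# Hypothetical usage: two services, annotated separately from registration.
registry = MetadataRegistry()
registry.register("align-1", "http://example.org/align")
registry.register("render-1", "http://example.org/render")
registry.attach("align-1", "task", "sequence-alignment")
registry.attach("align-1", "note", "free-text annotation from a user")
registry.attach("render-1", "task", "visualisation")

matches = registry.discover(task="sequence-alignment")
everything = registry.discover()
```

Keeping metadata separate from the registration record is what lets third parties annotate a service after the fact; a real registry would also need persistence, access control, and richer semantic matching than the exact-equality test used here.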

Semantic Matching of Grid Resource Descriptions

Abstract

The ability to describe the Grid resources needed by applications is essential for developing seamless access to resources on the Grid. We consider the problem of resource description in the context of a resource broker being developed in the Grid Interoperability Project (GRIP), which is able to broker for resources described by several Grid middleware systems: GT2, GT3 and Unicore. We consider it necessary to utilise a semantic matching of these resource descriptions, firstly because there is currently no common standard, but more fundamentally because we wish to make the Grid transparent at the application level. We show how the semantic approach to resource description facilitates both these aims and present the GRIP broker as a working prototype of this approach.

John Brooke, Donal Fellows, Kevin Garwood, Carole Goble

Enabling Knowledge Discovery Services on Grids

Abstract

The Grid is mainly used today for supporting high-performance compute intensive applications. However, it is going to be effectively exploited for deploying data-driven and knowledge discovery applications. To support these classes of applications, high-level tools and services are vital. The Knowledge Grid is a high-level system for providing Grid-based knowledge discovery services. These services allow professionals and scientists to create and manage complex knowledge discovery applications composed as workflows that integrate data sets and mining tools provided as distributed services on a Grid. This paper presents and discusses how knowledge discovery applications can be designed and deployed on Grids. The contribution of novel technologies and models such as OGSA, P2P, and ontologies is also discussed.

Antonio Congiusta, Carlo Mastroianni, Andrea Pugliese, Domenico Talia, Paolo Trunfio

A Grid Service Framework for Metadata Management in Self-e-Learning Networks

Abstract

Metadata management is critical for Grid systems. More specifically, semantically meaningful resource descriptions constitute a highly beneficial extension to Grid environments that has started to gain significant attention. In this work we contribute to the effort of enhancing current Grid technologies to support semantic descriptors for resources (also termed the Semantic Grid). We use a Self e-Learning Network (SeLeNe) as the testbed application and propose a set of services that are applicable in such a case, in alignment with the Open Grid Services Architecture (OGSA). We concentrate on providing services for the utilization of Learning Objects’ (LO) metadata, the basic ones of which, however, are generic enough to be utilized by other Grid-based systems that need to make use of semantic descriptions. Different service placement scenarios produce a number of possible architectural alternatives.

George Samaras, Kyriakos Karenos, Eleni Christodoulou

Grid Computing

Second European AcrossGrids Conference, AxGrids 2004, Nicosia, Cyprus, January 28-30, 2004. Revised Papers
