
2012 | Book

Large-Scale Scientific Computing

8th International Conference, LSSC 2011, Sozopol, Bulgaria, June 6-10, 2011, Revised Selected Papers

Editors: Ivan Lirkov, Svetozar Margenov, Jerzy Waśniewski

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science


About this book

This book constitutes the thoroughly refereed post-conference proceedings of the 8th International Conference on Large-Scale Scientific Computations, LSSC 2011, held in Sozopol, Bulgaria, in June 2011. The 74 revised full papers presented together with 3 plenary and invited papers were carefully reviewed and selected from numerous submissions. The papers are organized in topical sections on robust multigrid, multilevel and multiscale, deterministic and stochastic methods for modeling highly heterogeneous media, advanced methods for transport, control and uncertain systems, applications of metaheuristics to large-scale problems, environmental modelling, large-scale computing on many-core architectures, multiscale industrial, environmental and biomedical problems, efficient algorithms of computational geometry, high-performance Monte Carlo simulations, voxel-based computations and contributed papers.

Table of Contents

Frontmatter

Plenary and Invited Papers

Frontmatter
Smoothed Aggregation Spectral Element Agglomeration AMG: SA-ρAMGe

A two-level smoothed aggregation (or SA) scheme with a tentative coarse space constructed by a spectral element agglomeration method is shown to provide the weak-approximation property in a weighted L_2-norm. The resulting method, utilizing efficient (e.g., polynomial) smoothers, is shown to have a convergence factor independent of both the coarse and fine-grid mesh sizes, as well as of the contrast (i.e., possible large jumps in the PDE coefficient) for second-order elliptic problems discretized on general unstructured meshes. The method allows for multilevel extensions. The presented numerical experiments exhibit behavior in agreement with the developed theory.

Marian Brezina, Panayot S. Vassilevski
Approximation of Sparse Controls in Semilinear Elliptic Equations

Semilinear elliptic optimal control problems involving the L^1 norm of the control in the objective are considered. Necessary and sufficient second-order optimality conditions are derived. A priori finite element error estimates for three different discretizations of the control problem are given. These discretizations differ in the use of piecewise constant, piecewise linear and continuous, or non-discretized controls, respectively. Numerical results and implementation details are provided.

Eduardo Casas, Roland Herzog, Gerd Wachsmuth
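
As a point of reference for the abstract above, a typical sparse-control objective of this type couples a tracking term with a smooth quadratic control cost and a sparsity-promoting L^1 term; the symbols y_u, y_d, ν and κ below are illustrative notation rather than the paper's own:

$$ \min_{u \in U_{ad}} \; J(u) = \tfrac{1}{2}\,\| y_u - y_d \|_{L^2(\Omega)}^2 + \tfrac{\nu}{2}\,\| u \|_{L^2(\Omega)}^2 + \kappa\,\| u \|_{L^1(\Omega)}, $$

where y_u denotes the state solving the semilinear elliptic equation for the control u.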
A Non-standard Finite Element Method Based on Boundary Integral Operators

This paper provides an overview of our results on the construction and analysis of a non-standard finite element method that is based on the use of boundary integral operators for constructing the element stiffness matrices. This approach permits polyhedral element shapes as well as meshes with hanging nodes. We consider the diffusion equation and convection-diffusion-reaction problems as our model problems, but the method can also be generalized to more general problems such as systems of partial differential equations. We provide a rigorous H^1- and L_2-error analysis of the method for smooth and non-smooth solutions. This a priori discretization error analysis is carried out only for the diffusion equation. However, our numerical results also show good performance of our method for convection-dominated diffusion problems.

Clemens Hofreither, Ulrich Langer, Clemens Pechstein

Robust Multigrid, Multilevel and Multiscale, Deterministic and Stochastic Methods for Modeling Highly Heterogeneous Media

Frontmatter
Robust Solvers for Symmetric Positive Definite Operators and Weighted Poincaré Inequalities

An abstract setting for robustly preconditioning symmetric positive definite (SPD) operators is presented. The term “robust” refers to the property of the condition numbers of the preconditioned systems being independent of mesh parameters and problem parameters. Important instances of such problem parameters are in particular (highly varying) coefficients. The method belongs to the class of additive Schwarz preconditioners. The paper gives an overview of the results obtained in a recent paper by the authors. It, furthermore, focuses on the importance of weighted Poincaré inequalities, whose notion is extended to general SPD operators, for the analysis of stable decompositions. To demonstrate the applicability of the abstract preconditioner the scalar elliptic equation and the stream function formulation of Brinkman’s equations in two spatial dimensions are considered. Several numerical examples are presented.

Yalchin Efendiev, Juan Galvis, Raytcho Lazarov, Joerg Willems
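
For orientation, the generic additive Schwarz preconditioner referred to in the abstract above has the standard form below (restriction operators R_i and local blocks A_i, with i = 0 usually denoting the coarse space); this is textbook notation, not the paper's specific construction:

$$ B^{-1} = \sum_{i=0}^{N} R_i^{T} A_i^{-1} R_i, \qquad A_i = R_i A R_i^{T}. $$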
Additive Schur Complement Approximation for Elliptic Problems with Oscillatory Coefficients

We introduce an algorithm for Additive Schur Complement Approximation (ASCA) that can be used in various iterative methods for solving systems of linear algebraic equations arising from finite element discretization of Partial Differential Equations (PDE). Here we consider a model problem of a scalar elliptic PDE with highly oscillatory (piecewise constant) diffusion coefficient. The main ideas are illustrated by three different examples that reveal the key point of constructing a robust sparse ASCA. We also demonstrate how the quality of the ASCA can be improved and how its sparsity is controlled.

J. Kraus

Advanced Methods for Transport

Frontmatter
Optimization–Based Modeling with Applications to Transport: Part 1. Abstract Formulation

This paper is the first of three related articles, which develop and demonstrate a new, optimization–based framework for computational modeling. The framework uses optimization and control ideas to assemble and decompose multiphysics operators and to preserve their fundamental physical properties in the discretization process. An optimization–based monotone, linearity preserving algorithm for transport (OBT) demonstrates the scope of the framework. The second and the third parts of this work focus on the formulation of efficient optimization algorithms for the solution of the OBT problem, and computational studies of its accuracy and efficacy.

Pavel Bochev, Denis Ridzal, Joseph Young
Optimization-Based Modeling with Applications to Transport: Part 2. The Optimization Algorithm

This paper is the second of three related articles that develop and demonstrate a new optimization-based framework for computational modeling. The framework uses optimization and control ideas to assemble and decompose multiphysics operators and to preserve their fundamental physical properties in the discretization process. One application of the framework is in the formulation of robust algorithms for optimization-based transport (OBT). Based on the theoretical foundations established in Part 1, this paper focuses on the development of an efficient optimization algorithm for the solution of the remap subproblem that is at the heart of OBT.

Joseph Young, Denis Ridzal, Pavel Bochev
Optimization-Based Modeling with Applications to Transport: Part 3. Computational Studies

This paper is the last of three related articles that develop and demonstrate a new optimization-based framework for computational modeling. The framework uses optimization and control ideas to assemble and decompose multiphysics operators and to preserve their fundamental physical properties in the discretization process. One application of the framework is in the formulation of robust algorithms for optimization-based transport (OBT). Based on the theoretical foundations established in Part 1 and the optimization algorithm for the solution of the remap subproblem derived in Part 2, this paper focuses on the application of OBT to a set of benchmark transport problems. Numerical comparisons with two other transport schemes based on incremental remapping, featuring flux-corrected remap and linear reconstruction with van Leer limiting, respectively, demonstrate that OBT is a competitive transport algorithm.

Denis Ridzal, Joseph Young, Pavel Bochev, Kara Peterson

Control and Uncertain Systems

Frontmatter
Newton’s Method and Secant Method for Set-Valued Mappings

For finding zeros or fixed points of set-valued maps, the fact that the space of convex, compact, nonempty sets of ℝ^n is not a vector space presents a major disadvantage. Therefore, fixed point iterations or variants of Newton's method, in which the derivative is applied only to a smooth single-valued part of the set-valued map, are often applied for calculations. We embed the set-valued map with convex, compact images (i.e., by embedding its images) and shift the problem to the Banach space of directed sets. This Banach space extends the arithmetic operations of convex sets and allows us to consider the Fréchet derivative or divided differences of maps that have embedded convex images. For the transformed problem, Newton's method and the secant method in Banach spaces are applied via directed sets. The results can be visualized as usual nonconvex sets in ℝ^n.

Robert Baier, Mirko Hessel-von Molo
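
As background for the abstract above, the classical Newton and secant iterations in a Banach space, which the paper transfers to the embedded problem in the space of directed sets (standard textbook form; [x_{k-1}, x_k; F] denotes a divided difference of F):

$$ x_{k+1} = x_k - F'(x_k)^{-1} F(x_k), \qquad x_{k+1} = x_k - [x_{k-1}, x_k; F]^{-1} F(x_k). $$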
Optimal Control of Multibody Systems in Resistive Media

Locomotion of a mechanical system consisting of a main body and one or two links attached to it by cylindrical joints is considered. The system moves in a resistive medium and is controlled by periodic angular oscillations of the links relative to the main body. The resistance force acting upon each body is a quadratic function of its velocity. Under certain assumptions, a nonlinear equation of motion is derived and simplified. The optimal control of oscillations is found that corresponds to the maximal average locomotion speed.

F. L. Chernousko
Classical and Relaxed Progressively Refining Discretization-Optimization Methods for Optimal Control Problems Defined by Ordinary Differential Equations

An optimal control problem is considered for systems defined by nonlinear ordinary differential equations, with control and pointwise state constraints. Since the problem may have no classical solutions, it is also formulated in relaxed form. Various necessary/sufficient conditions for optimality are first given for both formulations. In order to solve these problems numerically, we then propose a discrete penalized gradient projection method generating classical controls, and a discrete penalized conditional descent method generating relaxed controls. In both methods, the discretization procedure is progressively refined in order to achieve efficiency at reduced computational cost. Results are given concerning the limit behaviour of these methods. Finally, numerical examples are provided.

I. Chryssoverghi, J. Coletsos, B. Kokkinis
On the Asymptotic Stabilization of an Uncertain Bioprocess Model

We study a nonlinear model of a biological digestion process, involving two microbial populations and two substrates and producing biogas (methane). A feedback control law for asymptotic stabilization of the closed-loop system is proposed. An extremum seeking algorithm is applied to stabilize the system towards the maximum methane flow rate.

Neli S. Dimitrova, Mikhail I. Krastanov
Reachable Sets of Impulsive Control System with Cone Constraint on the Control and Their Estimates

The problem of estimating reachable sets of a linear measure-driven (impulsive) dynamical control system with uncertainty in the initial data is considered. It is assumed that the impulsive controls in the dynamical system belong to the intersection of a special cone with a generalized ellipsoid, both taken in the space of functions of bounded variation. Algorithms for constructing external ellipsoidal estimates of reachable sets for such control systems are given. Numerical simulation results relating to the proposed procedures are also discussed.

Tatiana F. Filippova, Oksana G. Matviychuk
Optimal Mass Transportation-Based Models for Neuronal Fibers

Diffusion Magnetic Resonance Imaging (MRI) is used to (non-invasively) study neuronal fibers in the brain white matter. Reconstructing fiber paths from such data (tractography problem) is relevant in particular to study the connectivity between two given cerebral regions. By considering the fiber paths between two given areas as geodesics of a suitable well-posed optimal control problem (related to optimal mass transportation), we are able to provide a quantitative criterion to estimate the connectivity between two given cerebral regions, and to recover the actual distribution of neuronal fibers between them.

Antonio Marigonda, Giandomenico Orlandi
Optimal Controls in Models of Economic Growth and the Environment

We investigate the impact of environmental quality standards on the accumulation of two types of capital ("brown" vs. "green") and the corresponding R&D investments in an endogenous growth model. We show that environmental regulation as an economic policy instrument tends to repress economic growth in the long run but fosters green R&D and the accumulation of green capital in the short run. In addition, we show that subsidies may support a shift to greener production and can be used to counteract the repressed accumulation of green capital.

Elke Moser, Alexia Prskawetz, Gernot Tragler
On the Minimum Time Problem for Dodgem Car–Like Bang–Singular Extremals

In this paper we analyse second order conditions for a bang-singular extremal in a class of problems including the Dodgem car one. Moreover, we state some results on optimality and structural stability whose proofs will appear elsewhere.

L. Poggiolini, G. Stefani
Perturbation Bounds for the Nonlinear Matrix Equation

In this paper we make a complete perturbation analysis of a nonlinear matrix equation in which A and B are square complex matrices, A appears through its complex conjugate transpose, and I is the identity matrix. We obtain local (first-order) perturbation bounds and a non-local perturbation bound for the solution to the equation. The perturbation bounds allow us to derive condition and accuracy estimates for the computed solution when a stable numerical algorithm is used to solve the equation.

Ivan Popchev, Petko Petkov, Mihail Konstantinov, Vera Angelova

Applications of Metaheuristics to Large-Scale Problems

Frontmatter
Sensitivity Analysis for the Purposes of Parameter Identification of a S. cerevisiae Fed-Batch Cultivation

The application of sensitivity analysis to the parameter identification of an S. cerevisiae fed-batch cultivation is presented. Parameter identification of a cultivation described by a fifth-order nonlinear mathematical model is a difficult task because of the high number of parameters to be estimated. The aim of the study is to apply sensitivity analysis in order to determine the most significant parameters and to answer the question of which parameters are most easily estimated. For that purpose a sensitivity model of an S. cerevisiae fed-batch cultivation is developed and, as a result, a stepwise parameter identification procedure is proposed.

Maria Angelova, Tania Pencheva
A Matheuristic Algorithm for a Large-Scale Energy Management Problem

The demand for electrical energy is growing very quickly worldwide. For this reason, the optimization of power plant production and power plant maintenance scheduling have become important research topics. A Large Scale Energy Management (LSEM) problem is studied in this paper. Two types of power plants are considered: power plants of type 1 can be refueled while still operating; power plants of type 2 need to be shut down from time to time for refueling and ordinary maintenance (these are typically nuclear plants). Given these two types of power plants, LSEM is the problem of optimizing production plans and scheduling the maintenance of type 2 plants, with the objective of keeping the production cost as low as possible while fulfilling the customers' demand. Uncertainty about the customers' demand is taken into account in the model considered. In this article, a matheuristic optimization approach based on problem decomposition is proposed. The approach combines mixed integer linear programming and simulated annealing optimization methods. Computational results on some realistic instances are presented.

D. Anghinolfi, L. M. Gambardella, R. Montemanni, C. Nattero, M. Paolucci, N. E. Toklu
On a Game-Method for Modelling with Intuitionistic Fuzzy Estimations: Part 1

A new extension of Conway's Game of Life is introduced. It is based on a previous extension of Conway's game given by the authors. Now we use elements of intuitionistic fuzziness that give more detailed estimations of the degrees of existence and of non-existence of the objects occurring in the cells of the game plane. Rules for the motions and rules for the interactions among the objects are discussed.

Lilija Atanassova, Krassimir Atanassov
A Generalized Net with an ACO-Algorithm Optimization Component

In the paper we describe a generalized net G_ACOA realizing an arbitrary algorithm for ant colony optimization. In this sense, this net is universal for all standard ant colony optimization algorithms, since it describes the way they function and the results of their work. Then we discuss the way of constructing a GN that includes G_ACOA as a subnet. In this way, we ensure the optimal transfer of the generalized net tokens with regard to the results of G_ACOA. Thus, we construct a generalized net featuring an optimization component and hence functioning optimally.

Vassia Atanassova, Stefka Fidanova, Panagiotis Chountas, Krassimir Atanassov
Time Series Prediction by Artificial Neural Networks and Differential Evolution in Distributed Environment

This work presents a model for time series prediction by means of Artificial Neural Networks (ANN) trained with Differential Evolution (DE) in a distributed computational environment. Time series prediction is a complex task and demands the development of more effective and faster algorithms. An ANN is used as a base and is trained with historical data. One of the main problems is how to select an accurate ANN training algorithm. There are two general possibilities: exact numerical optimization methods and heuristic optimization methods. When the right heuristic is applied, the training can be done in a distributed computational environment. In this case the output is obtained much faster and is more realistic, which helps to achieve better prediction.

Todor Balabanov, Iliyan Zankinski, Nina Dobrinkova
Differential Evolution Applied to Large Scale Parametric Interval Linear Systems

Differential evolution (DE) is regarded as a very effective optimisation method for continuous problems, in terms of both good approximation of the optimal solution and short computation time. The authors applied the DE method to the problem of solving large scale interval linear systems. Different variants of DE were compared, and different strategies were used to ensure that candidate solutions generated in the recombination process were always feasible. For the large scale problems the method turned out to be very sensitive to the constraint handling strategy used, so finding an appropriate strategy was very important for achieving good solutions in a reasonable time. Real-world large optimisation problems coming from structural engineering were used as the test problems. Additionally, DE performance was compared with the evolutionary optimisation method presented in [10].

Jerzy Duda, Iwona Skalna
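
A minimal Python sketch of the classical DE/rand/1/bin generation step with a simple clipping repair, to illustrate the kind of recombination and constraint-handling strategy discussed above; the parameter values, objective and repair rule are placeholders, not the strategies used by the authors.

import numpy as np

def de_step(pop, fitness, func, lower, upper, F=0.8, CR=0.9, rng=None):
    """One generation of DE/rand/1/bin with clipping to the feasible box.

    All parameters and the repair rule are illustrative, not the paper's.
    """
    rng = rng or np.random.default_rng()
    n, d = pop.shape
    new_pop, new_fit = pop.copy(), fitness.copy()
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        a, b, c = rng.choice(idx, size=3, replace=False)
        mutant = pop[a] + F * (pop[b] - pop[c])        # DE/rand/1 mutation
        mutant = np.clip(mutant, lower, upper)         # simple repair: keep the candidate feasible
        cross = rng.random(d) < CR
        cross[rng.integers(d)] = True                  # binomial crossover, force at least one gene
        trial = np.where(cross, mutant, pop[i])
        f_trial = func(trial)
        if f_trial < fitness[i]:                       # greedy one-to-one selection
            new_pop[i], new_fit[i] = trial, f_trial
    return new_pop, new_fit

In a full run this step is simply iterated until a budget of objective evaluations is exhausted.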
User-Centric Optimization with Evolutionary and Memetic Systems

One of the lessons learned in recent years in the metaheuristics community, and most prominently in the area of evolutionary computation (EC), is the need to exploit problem knowledge in order to come up with effective optimization tools. This problem knowledge can be provided in a variety of ways, but there are situations in which endowing the optimization algorithm with this knowledge is a very elusive task. This may be the case when the problem-awareness is hard to encapsulate within a specific algorithmic description, e.g., when it belongs more to the space of the human expert's intuition than anywhere else. An extreme case of this situation arises when the evaluation of solutions is itself not algorithmic, but needs a human in the loop to critically assess the quality of solutions. The combined use of a human user and an evolutionary algorithm is commonly termed interactive EC. The term user-centric EC is, however, more appropriate, since it hints at possibilities for the system to be proactive rather than merely interactive, i.e., to anticipate some of the user's behavior and/or exhibit some degree of creativity. Such features constitute ambitious goals that require a good grasp of the basic issues underlying interactive optimization. An overview of these is presented in this paper, along with some hints on what the future may bring to this area. An application example is provided in the context of the search for Optimal Golomb Rulers, a very hard combinatorial problem.

Javier Espinar, Carlos Cotta, Antonio J. Fernández-Leiva
Intuitionistic Fuzzy Estimation of the Ant Colony Optimization Starting Points

The ability of ant colonies to form paths for carrying food is rather fascinating. The problem is solved collectively by the whole colony. This ability is explained by the fact that ants communicate in an indirect way by laying trails of pheromone. The higher the pheromone trail in a particular direction, the higher the probability of choosing this direction. The collective problem-solving mechanism has given rise to a metaheuristic referred to as Ant Colony Optimization. In this work we use intuitionistic fuzzy estimation of start nodes with respect to the quality of the solution. Various start strategies are offered. A sensitivity analysis of the algorithm's behavior with respect to the estimation parameters is performed. The Multidimensional (Multiple) Knapsack Problem is used as a test problem.

Stefka Fidanova, Krassimir Atanassov, Pencho Marinov
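
The pheromone mechanism described above is usually formalized by the standard ACO transition rule below (τ_ij pheromone level, η_ij heuristic information, α and β weights); the paper's intuitionistic fuzzy estimation of starting nodes is layered on top of this rule and is not shown here:

$$ p_{ij} = \frac{\tau_{ij}^{\alpha}\, \eta_{ij}^{\beta}}{\sum_{l \in \mathrm{allowed}} \tau_{il}^{\alpha}\, \eta_{il}^{\beta}}. $$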
Variable Neighborhood Search for Robust Optimization and Applications to Aerodynamics

Many real-life applications lead to the definition of robust optimization problems where the objective function is a black box. This may be due, for example, to the fact that the objective function is evaluated through computer simulations, and that some parameters are uncertain. When this is the case, existing optimization algorithms are in general not able to provide good-quality solutions. We propose a heuristic algorithm for solving black box robust optimization problems, which is based on a bilevel Variable Neighborhood Search to solve the minimax formulation of the problem. We also apply this algorithm to a wing shape optimization problem where the objective function is a computationally expensive black box. Preliminary computational experiments are reported.

A. Mucherino, M. Fuchs, X. Vasseur, S. Gratton
Processor Array Design with the Use of Genetic Algorithm

In this paper a method for the design of processor arrays dedicated to the realization of specimen linear algebra algorithms in FPGA devices is presented. Within the allocation mapping process a genetic algorithm for the projection of the information dependency graph is used, and the runtime of the given algorithm is optimized. For larger input matrices, graph decomposition is used, which allows the projection results to be obtained. The projection results obtained with and without graph decomposition for a specimen linear algebra algorithm are compared. Additionally, a parallel realization of the evolutionary algorithm for multicore processors is presented, which allows projection results to be obtained for larger input matrix sizes.

Piotr Ratuszniak
A Hybrid Genetic Algorithm for Parameter Identification of Bioprocess Models

In this paper a hybrid scheme combining a GA and an SQP method is introduced. In the hybrid GA-SQP, the role of the GA is to explore the search space in order to isolate the most promising region. The role of the SQP is to exploit the information gathered by the GA. To demonstrate the usefulness of the presented approach, two cases of parameter identification of different complexity are considered. The hybrid scheme is applied to the modeling of an E. coli MC4110 fed-batch cultivation process. The results show that GA-SQP takes advantage of both the GA's global search ability and the SQP's local search ability, and hence enhances the overall search ability and computational efficiency.

Olympia Roeva
A General Frame for Building Optimal Multiple SVM Kernels

The aim of this paper is to define a general frame for building optimal multiple SVM kernels. Our scheme follows five steps: formal representation of the multiple kernels, structural representation, choice of genetic algorithm, SVM algorithm, and model evaluation. The computation of the optimal parameter values of the SVM kernels is performed using an evolutionary method based on the SVM algorithm for evaluating the quality of chromosomes. After the multiple kernel is found by the genetic algorithm, we apply a cross-validation method for estimating the performance of our predictive model. We implemented and compared many hybrid methods derived from this scheme. Improved co-mutation operators are used, and a comparative study of their effect on the predictive model performance is made. We tested our multiple kernels on classification tasks, but they can also be used for other types of tasks.

Dana Simian, Florin Stoica

Environmental Modeling

Frontmatter
Modeling of Toxic Substances in the Atmosphere – Risk Analysis and Emergency Forecast

The present paper describes the activities and results achieved in the development of a modelling system for operational response to accidental releases of harmful gases in the atmosphere. The main envisaged functions of the system are:

1. Perform highly accurate and reliable risk analysis and assessment for selected “hot spots”;

2. Provide the national authorities and the international community with operational short-term local-to-regional scale forecasts of the propagation of harmful gases;

3. Perform, in an off-line mode, a more detailed and comprehensive analysis of the possible longer-term impacts on the environment and human health.

The system is based on the following models: WRF, used as the meteorological pre-processor; SMOKE, the emission pre-processor; and CMAQ, the Chemical Transport Model (CTM) of the system.

For the needs of the emergency response preparedness mode, the risk is defined as the probability that the national regulatory threshold values for toxic gases will be exceeded. Maps of the risk around potential sources of emergency toxic gas releases are constructed and demonstrated in the current paper.

Some examples of the results of the system’s “operational mode” are demonstrated as well.

A. Brandiyska, K. Ganev, D. Syrakov, M. Prodanova, N. Miloshev
Some Aspects of Impact in the Potential Climate Change on Ozone Pollution Levels over Bulgaria from High Resolution Simulations

According to the European Environment Agency (EEA), ground-level ozone is one of the most serious air pollutants in Europe today. High levels of ozone can affect the respiratory system and increase morbidity and mortality, particularly in sensitive groups of the population. Ozone also damages vegetation, reduces crop yields and corrodes technological materials. Ozone pollution is pronounced in regions with strong photochemical activity, such as the Mediterranean basin and the Balkan Peninsula. Due to its central location in the latter region, Bulgaria may be considered a hot-spot for ozone and representative of ozone effects on Balkan ecosystems. Ozone concentrations are highly dependent on environmental conditions, including temperature. It is thought likely that long-term changes in climate will affect future levels of ozone pollution. Based on the data calculated and accumulated at NIMH - Bulgaria during the CECILIA WP7 program, the scope of the presented work is to investigate the changes in ozone pollution levels, expressed in terms of some exposure indices, due to climate change in Southeastern Europe.

Hristo Chervenkov
New Parallel Implementation of an Air Pollution Computer Model – Performance Study on an IBM Blue Gene/P Computer

A new parallel version of the Danish Eulerian model for long-range transport of air pollutants over the territory of Europe (UNI-DEM) is presented. It is based on partitioning of the space domain in both the Ox and Oy directions. This new approach makes it possible to use a large number of processors (or cores) of the IBM Blue Gene/P computer. The new version of the parallel UNI-DEM code is created using the standard MPI library; it appears to be highly portable and shows good efficiency and scalability. The performance, speed-ups and efficiency achieved in the first test runs of the new parallel code on an IBM Blue Gene/P computer are discussed.

Krassimir Georgiev, Tzvetan Ostromsky, Zahari Zlatev
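
A minimal mpi4py sketch of a two-dimensional Cartesian partitioning in both the Ox and Oy directions, of the kind described above; the grid sizes are hypothetical and this is not the UNI-DEM code itself.

from mpi4py import MPI

comm = MPI.COMM_WORLD
# Factor the available processes into a 2D grid (Ox x Oy directions).
dims = MPI.Compute_dims(comm.Get_size(), 2)
cart = comm.Create_cart(dims, periods=[False, False], reorder=True)
px, py = cart.Get_coords(cart.Get_rank())

NX, NY = 480, 480                      # hypothetical global grid dimensions
nx, ny = NX // dims[0], NY // dims[1]  # local subdomain owned by this process

# Neighbour ranks for halo exchange (MPI.PROC_NULL on the outer boundary).
west, east = cart.Shift(0, 1)
south, north = cart.Shift(1, 1)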
Simulation of the 2009 Harmanli Fire (Bulgaria)

We use a coupled atmosphere-fire model to simulate a fire that occurred on August 14-17, 2009, in the Harmanli region, Bulgaria. Data was obtained from GIS and satellite imagery, and from standard atmospheric data sources. Fuel data was classified into the 13 Anderson categories. For correct fire behavior, the spatial resolution of the models needed to be fine enough to resolve the essential micrometeorological effects. The simulation results are compared to available incident data. The code runs faster than real time on a cluster. The model is available from openwfm.org and extends WRF-Fire from the WRF 3.3 release.

Georgi Jordanov, Jonathan D. Beezley, Nina Dobrinkova, Adam K. Kochanski, Jan Mandel, Bedřich Sousedík
A Computational Approach for Remediation Procedures in Horizontal Subsurface Flow Constructed Wetlands

A large-scale computational approach for groundwater flow and contaminant transport and removal in porous media is presented. Emphasis is given to remediation procedures in horizontal subsurface flow constructed wetlands. For the numerical procedure, the MODFLOW family of computer codes is used. An application is made to the simulation of pilot-scale horizontal subsurface flow wetland units constructed and operated at Democritus University of Thrace, Xanthi, Greece. The effects of the inlet and outlet recharge positions on the optimum contaminant removal are also numerically investigated.

Konstantinos Liolios, Vassilios Tsihrintzis, Konstantinos Moutsopoulos, Ivan Georgiev, Krassimir Georgiev
Parallel Computation of Sensitivity Analysis Data for the Danish Eulerian Model

Sensitivity analysis of the Danish Eulerian Model requires an extensive amount of output data from computationally expensive numerical experiments with a version of the model specially adapted for this purpose, called SA-DEM. It has been successfully implemented and run on the most powerful parallel supercomputer in Bulgaria, an IBM Blue Gene/P. A new enhanced version, capable of efficiently using the full capacity of the machine, has recently been developed. It is described in this paper together with some performance analysis and numerical results. The output results are used to construct mesh-functions of ozone concentration ratios to be used further in sensitivity analysis of the model by means of Monte Carlo algorithms.

Tzvetan Ostromsky, Ivan Dimov, Rayna Georgieva, Zahari Zlatev
Implementation of Two Different Shadow Models into EULAG Model: Madrid Case Study

Two shadow models are compared with a focus on the energy fluxes. The first one uses a simplified geometry of the building; this approach is included in the Urban Canopy Model (UCM). The second one is a new three-dimensional urban solar radiation model (SHAMO), developed by the authors of the present contribution to calculate short-wave radiation over high-resolution urban grids.

The data produced by the urban solar radiation model have been used in large-scale numerical experiments to simulate turbulent fluxes for urban areas, in this contribution over the city of Madrid (Spain). We have applied a modified version of the EULAG (UCAR) micro-scale model (CFD) which includes an energy balance equation to obtain the urban atmosphere/biosphere energy exchange. Results of the micro-scale simulations and a sensitivity analysis related to the solar radiation approach are presented in this paper.

R. San Jose, J. L. Perez, R. M. Gonzalez
Model Simulation of Air Pollution Due to April 2010 Iceland Volcano Eruption

The eruption of the Icelandic volcano Eyjafjallajokull in April 2010 caused enormous disruption to air transport over Europe for a long period of time. The losses and inconvenience for airlines, businesses and passengers are difficult to estimate but are in any case considerable. There are few centres devoted to the observation and forecasting of such events. The ENSEMBLE consortium, led by the European JRC in Ispra, Italy, which aims at elaborating ensemble forecasts on the basis of the individual forecasts of almost all European Emergency Response Systems (ERS) in case of a nuclear accident, decided to launch a series of exercises devoted to simulating the dilution of the air pollution caused by the Iceland volcano eruption during its first week. The Bulgarian ERS (BERS) was upgraded to be able to take part in these exercises. This work presents the results of BERS's participation in the exercises and a comparison with other model results.

D. Syrakov, M. Prodanova
Automatic Data Quality Control of Environmental Data

Modern physics needs large amounts of uninterrupted data for advanced study. The research data must be reliable, precise, adequate and available on time. This is possible only with advanced, working information systems. Such systems must implement the most advanced technologies, algorithms and knowledge from informatics, programming and mathematics. This article describes the automatic data quality control of a large amount of real-time uninterrupted data, implemented in the Institute for Nuclear Research and Nuclear Energy (INRNE) at the BEO Observatory at Moussala.

A. Tchorbadjieff

Large-Scale Computing on Many-Core Architectures

Frontmatter
Computing Boundary Element Method’s Matrices on GPU

Matrices resulting from standard boundary element methods are dense and computationally expensive to assemble. To reduce the computation time, the matrix computation is done on a GPU. The parallel processing capability of the Graphics Processing Unit (GPU) allows us to divide complex computing tasks into several thousands of smaller tasks that can be run concurrently. We achieved a speed-up of 31-36 in comparison to a serial computation performed on the CPU.

Gundolf Haase, Martin Schanz, Samar Vafai
A Parallel Algorithm with Improved Performance of Finite Volume Method (SIMPLE-TS)

In this paper a parallel version of the finite volume method SIMPLE-TS for the calculation of two-dimensional compressible, viscous gas flows with improved performance is presented. It is demonstrated on a problem related to micro gas flows taking place in Micro-Electro-Mechanical Systems (MEMS). The reorganisation of the parallel algorithm improves its performance when more cores are used for calculations on computational grids with a relatively small number of nodes or cells. The reorganisation is two-fold: first, to reduce the number of communications between the processes, and second, to reorder the calculation of some variables in such a way that the number of calculations performed during the communications between the processes increases. The comparison of the speed-up between the previous and the new parallel versions of SIMPLE-TS was performed on two types of clusters differing in their communication hardware: the first uses specialised low-latency interconnect cards between the computers, while the other uses conventional interconnect cards. The clusters are part of the GRID infrastructure of the European Research Area (ERA).

Kiril S. Shterev, Stefan K. Stefanov, Emanouil I. Atanassov
Towards Distributed Heterogenous High-Performance Computing with ViennaCL

One of the major drawbacks of computing with graphics adapters is the limited memory available for relevant problem sizes. To overcome this limitation for the ViennaCL library, we investigate a partitioning approach for one of the standard benchmark problems in High-Performance Computing (HPC), namely the dense matrix-matrix product. We apply this partitioning approach to problems exceeding the available memory on graphics adapters. Moreover, we investigate the applicability to distributed memory systems by employing the Message Passing Interface (MPI). Our approach is presented in detail and benchmark results are given.

Josef Weinbub, Karl Rupp, Siegfried Selberherr
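
A small NumPy sketch of the partitioning idea described above: the dense product is assembled tile by tile, so that only a few tiles need to be resident in (GPU) memory at any time; this is plain CPU NumPy for illustration, not ViennaCL, and the tile size is arbitrary.

import numpy as np

def tiled_matmul(A, B, tile=1024):
    """Compute C = A @ B block-wise; only O(tile^2) entries of each operand are touched at once."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((m, n), dtype=A.dtype)
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                # On a GPU each pair of tiles would be transferred, multiplied there,
                # and the partial result accumulated back, keeping the footprint bounded.
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C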
High-Throughput-Screening of Medical Image Data on Heterogeneous Clusters

Non-invasive medical imaging by means of computed tomography (CT) and fMRI helps clinicians to improve the diagnostics and, hopefully, the treatment of patients. Due to better image resolutions as well as ever increasing numbers of patients who undergo these procedures, the amount of data that has to be analyzed puts great strain on radiologists. In an ongoing development with SALK (Salzburger Landeskrankenhaus) we propose a system for automated screening of CT data for cysts in the patient's kidney area. The proper detection of kidneys is non-trivial, due to the high variance in possible size, location, level of contrast and possible pathological anomalies a human kidney can exhibit in a CT slice. We employ large-scale, semi-automatically generated dictionaries (based on 10^7 training images) to be used in conjunction with principal component analysis (PCA). Heterogeneous clusters of CPU, GPGPU and Cell BE processors are used for high-throughput screening of the CT data. For data-parallel programming, CUDA, OpenCL and the IBM Cell SDK have been used. Task parallelism is based on OpenMPI and a dynamic load-balancing scheme, which demonstrates very low latencies by means of double-buffered, multi-threaded queues.

Peter Zinterhof

Multiscale Industrial, Environmental and Biomedical Problems

Frontmatter
Preconditioning of Linear Systems Arising in Finite Element Discretizations of the Brinkman Equation

In this work we present a preconditioner for the pressure Schur complement of the linear systems resulting from finite element discretizations of the Stokes-Brinkman equation. The work is motivated by the need to solve the Stokes-Brinkman system numerically. The particular focus is on two specific applications: industrial filtration problems and vuggy subsurface flows. The first problem features complex filtering media coupled to a free flow (Stokes) domain. In vuggy subsurface flows one has free flow inclusions of various connectivity, embedded in highly heterogeneous diffusive media. The Brinkman equation provides a new modeling path to both problems, which warrants the search for efficient methods for solving the resulting linear systems. We consider a block-preconditioning approach for the pressure Schur complement. The starting point is an Incomplete Cholesky factorization of the velocity block. Based on it, an approximate pressure Schur complement is constructed and applied using the Preconditioned Conjugate Gradient (PCG) method. The key in this scheme is an efficient preconditioning of this approximate Schur complement. This is achieved by introducing a second approximation of the pressure Schur complement based on an incomplete back-substitution scheme, followed by a second IC factorization. Numerical examples illustrating the efficiency of this approach are also presented.

P. Popov

Efficient Algorithms of Computational Geometry

Frontmatter
Blending Functions for Hermite Interpolation by Beta-Function B-Splines on Triangulations

In the present paper we compute for the first time Beta-function B-splines (BFBS) achieving Hermite interpolation up to third partial derivatives at the vertices of the triangulation. We consider examples of BFBS with uniform and variable order of Hermite interpolation at the vertices of the triangulation, for possibly non-convex star-1 neighbourhoods of these vertices. We also discuss the conversion of the local functions from Taylor monomial bases to appropriately shifted and scaled Bernstein bases, thereby converting the Hermite interpolatory form of the linear combination of BFBS to a new, Bezier-type, form. This conversion is fully parallelized with respect to the vertices of the triangulation and, for Hermite interpolation of uniform order, the computational load for each vertex is readily balanced.

Børre Bang, Lubomir T. Dechevsky, Arne Lakså, Peter Zanaty
Index Mapping between Tensor-Product Wavelet Bases of Different Number of Variables, and Computing Multivariate Orthogonal Discrete Wavelet Transforms on Graphics Processing Units

An algorithm for the computation of multivariate wavelet transforms on graphics processing units (GPUs) was proposed in [1]. This algorithm was based on the so-called isometric conversion between dimension and resolution (see [2] and the references therein), achieved by mapping the indices of orthonormal tensor-product wavelet bases of different numbers of variables and a tradeoff between the number of variables and the resolution level, so that the resulting wavelet bases of different numbers of variables have different resolutions, but the overall dimension of the bases is the same.

In [1] we developed the algorithm only up to the mapping of the indices of blocks of wavelet basis functions. This was sufficient to prove the consistency of the algorithm, but not enough for the mapping of the individual basis functions needed for a programming implementation of the algorithm. In the present paper we elaborate the full details of this ‘book-keeping’ construction by passing from block-matrix index mapping to the detailed index mapping of the individual basis functions. We also consider some examples computed using the new detailed index mapping.

Lubomir T. Dechevsky, Jostein Bratlie, Joakim Gundersen
Interpolation of Curvature and Torsion Using Expo-Rational B-Splines

Expo-rational B-splines (ERBS) are well adapted for Hermite interpolation of any prescribed order [1]. This property of ERBS can be used to interpolate and approximate a variety of differential geometric structures of smooth manifolds, such as length, curvature, torsion, area, volume, etc. In this paper we solve the simplest of these problems: finding the explicit formulae for ERBS-based interpolation of the curvature and torsion of unit-speed 3D space curves and the order of the rate of approximation it provides.

Lubomir T. Dechevsky, Georgi H. Georgiev
Hermite Interpolation Using ERBS with Trigonometric Polynomial Local Functions

Given a sequence of knots t_0 < t_1 < ... < t_{n+1}, an expo-rational B-spline (ERBS) function f(t) is defined by

$$ f(t)=\sum_{k=1}^n \ell_k(t)B_k(t), \quad t\in[t_1,t_n], $$

where B_k(t) are the ERBS and ℓ_k(t) are local functions defined on (t_{k-1}, t_{k+1}). Consider the Hermite interpolation problem at the knots 0 ≤ t_1 < t_2 < ... < t_n < 2π of arbitrary multiplicities. In [3] a formula was suggested for a Hermite-interpolating ERBS function with ℓ_k(t) being algebraic polynomials. Here we construct Hermite interpolation by an ERBS function with trigonometric polynomial local functions. We also provide numerical results on the performance of the new trigonometric ERBS (TERBS) interpolant in graphical comparison with the interpolant from [3]. Potential applications and some topics for further research on TERBS are briefly outlined.

Lubomir T. Dechevsky, Rumen Uluchev
Triangular Beta-Function B-Spline Finite Elements: Evaluation and Graphical Comparisons

This work is dedicated to the computation of Euler Beta-function B-spline (BFBS) finite elements (FE) on triangulations, and to the comparative visualization of their graphs. BFBS are a particular type of generalized expo-rational B-splines (GERBS) [2] and provide a piecewise polynomial modification of the true expo-rational B-splines (ERBS) [3]. The organization of the exposition is as follows. First, we derive new formulae for triangular BFBS FE having C^r smoothness at the vertices, r ∈ ℕ. Second, we provide visualization of their graphs. Third, we compare the interpolatory and fitting properties of the new triangular BFBS FE of different polynomial degrees on two model surfaces used as a benchmark manifold.

Lubomir T. Dechevsky, Peter Zanaty

High-Performance Monte Carlo Simulations

Frontmatter
Sensitivity Study of Heston Stochastic Volatility Model Using GPGPU

The focus of this paper is on effective parallel implementation of Heston Stochastic Volatility Model using GPGPU. This model is one of the most widely used stochastic volatility (SV) models. The method of Andersen provides efficient simulation of the stock price and variance under the Heston model. In our implementation of this method we tested the usage of both pseudo-random and quasi-random sequences in order to evaluate the performance and accuracy of the method.

We used it for computing Sobol’ sensitivity indices of the model with respect to input parameters. Since this method is computationally intensive, we implemented a parallel GPGPU-based version of the algorithm, which decreases substantially the computational time. In this paper we describe in detail our implementation and discuss numerical and timing results.

Emanouil I. Atanassov, Sofiya Ivanovska
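
For reference, the Heston dynamics underlying the sensitivity study above, in the usual notation (risk-free rate r, mean-reversion rate κ, long-run variance θ, volatility of variance ξ, and correlation ρ between the two Brownian motions):

$$ dS_t = r S_t\, dt + \sqrt{v_t}\, S_t\, dW_t^S, \qquad dv_t = \kappa(\theta - v_t)\, dt + \xi \sqrt{v_t}\, dW_t^v, \qquad d\langle W^S, W^v \rangle_t = \rho\, dt. $$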
A Monte Carlo Simulator for Non-contact Mode Atomic Force Microscopy

Nanolithography using Non-Contact Mode Atomic Force Microscopy (NCM-AFM) is a promising method for the manufacture of nanometer sized devices. Compact models which suggest nanopatterned oxide dots with Gaussian or Lorentzian profiles are implemented in a Monte Carlo simulator in a level set environment. An alternative to compact models is explored with a physics based Monte Carlo model, where the AFM tip is treated as a point charge and the silicon wafer as an infinite conducting plane. The strength of the generated electric field creates oxyions which accelerate towards the silicon surface and cause oxide growth and surface deformations. A physics based model is presented, generating an oxide dot based on the induced surface charge density. Comparisons to empirical models suggest that a Lorentzian profile is better suited to describe surface deformations when compared to the Gaussian profile.

Lado Filipovic, Siegfried Selberherr
Numerical Integration Using Sequences Generating Permutations

In this paper we propose a new class of pseudorandom number generators based on special linear recursions modulo m. These generators produce sequences which are permutations of the elements of ℤ_m. The sequences were originally developed for other applications, but the analysis of their statistical properties and the experiments described in this paper show that they are appropriate for multiple integration. Here we present some results from numerical tests comparing the performance of the two proposed generators with the Mersenne Twister.

Sofiya Ivanovska, A. Karaivanova, N. Manev
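
The authors' recursions are not reproduced here, but a familiar example of a linear recursion modulo m whose output is a permutation of ℤ_m is a full-period linear congruential generator satisfying the Hull-Dobell conditions; a minimal Python sketch with illustrative parameters:

def lcg_permutation(m=2**16, a=261, c=1, seed=0):
    """Linear recursion x_{k+1} = (a*x_k + c) mod m with full period.

    Illustrative parameters, not the recursions from the paper. By the
    Hull-Dobell theorem the orbit visits every element of Z_m exactly once
    when c is coprime to m, a - 1 is divisible by every prime factor of m,
    and a - 1 is divisible by 4 whenever m is.
    """
    x = seed
    for _ in range(m):
        yield x
        x = (a * x + c) % m

perm = list(lcg_permutation())
assert sorted(perm) == list(range(2**16))  # the sequence is a permutation of Z_m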
Optimization of Intermolecular Interaction Potential Energy Parameters for Monte-Carlo and Molecular Dynamics Simulations

Derivation of high-quality intermolecular potentials for molecular dynamics (MD) and Monte Carlo (MC) simulations is crucial for efficient modeling of molecular systems. Despite their overall complexity, the interaction potentials have often been derived in a semiempirical manner, though in certain cases ab initio techniques have also been involved in their construction. In the present study, we aim to construct optimized intermolecular interaction potentials to be used for MD and MC simulations of pure molecular liquids and their mixtures. We have focused on one of the simplest forms of the potentials, namely the Lennard-Jones (LJ) + Coulomb electrostatic terms. The interaction between each pair of atoms in the molecular liquids has thus been characterized by the LJ parameters + atomic charges. The optimization has been performed by genetic algorithms. An in-depth analysis of the performance of both the standard, widely used (i.e. non-optimized) and the optimized interaction potentials was carried out, from various aspects related to their performance.

Dragan Sahpaski, Ljupčo Pejov, Anastas Misev
Phonon-Induced Decoherence in Electron Evolution

A Monte Carlo analysis of the evolution of an electron interacting with phonons is presented in terms of a Wigner function. The initial electron state is constructed by a superposition of two wave packets and a pronounced interference term. The results show that phonons effectively destroy the interference term. The initial coherence in wave vector distribution is pushed towards the equilibrium distribution. Phonons hinder the natural spread of the density with time and advance it towards a classical localization. The decoherence effect due to phonons, which brings about the transition from a quantum to a classical state, is demonstrated by the purity of the state, which decreases from its initial value of 1, with a rate depending on the lattice temperature.

Philipp Schwaha, Mihail Nedjalkov, Siegfried Selberherr, Ivan Dimov
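
The purity measure quoted above is standard; for a one-dimensional state with density matrix ρ and Wigner function f_W it reads

$$ P = \operatorname{Tr} \rho^2 = 2\pi\hbar \int\!\!\int f_W(x,p)^2 \, dx\, dp, $$

which equals 1 for a pure state and decreases as phonon scattering mixes the state.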
Study of Human Influenza’s Spreading Phenomenon

The paper starts from the spreading phenomenon of human influenza, as a complex of observable occurrences, and develops a stochastic process, defined as a set of procedures that convert its initial state into a sequence of different states during the phenomenon's lifespan. The Monte Carlo simulation method for a stochastic discrete event system is used. This system is completely described in terms of: entities, sequential states, transition tables of states, sets of input/output events, internal/external transition functions, an events/time advance function, and input/output parameters. The simulation can encompass several contagion schemas and health policy responses. Finally, some information about the software in the field is given.

Romică Trandafir, Cornel Resteanu

Voxel-Based Computations

Frontmatter
Multilevel Solvers with Aggregations for Voxel Based Analysis of Geomaterials

Our motivation for voxel based analysis comes from the investigation of geomaterials (geocomposites) arising from rock grouting or sealing. We use finite element analysis based on voxel data from tomography. The arising finite element systems are large scale, which motivates the use of multilevel iterative solvers or preconditioners. Among others we concentrate on multilevel Schwarz preconditioners with aggregations. The aggregations are efficient even in the case of problems with heterogeneity, coefficient oscillations and large coefficient jumps if the aggregations include a proper handling of the strong couplings.

Radim Blaheta, Vojtěch Sokol
A Highly Scalable Matrix-Free Multigrid Solver for μFE Analysis Based on a Pointer-Less Octree

The state-of-the-art method to predict bone stiffness is micro finite element (μFE) analysis based on high-resolution computed tomography (CT). Modern parallel solvers enable simulations with billions of degrees of freedom. In this paper we present a conjugate gradient solver that works directly on the CT image and exploits the geometric properties of the regular grid and the basic element shapes given by the 3D pixels. The data is stored in a pointer-less octree. The tree data structure provides different resolutions of the image that are used to construct a geometric multigrid preconditioner. It enables the use of a matrix-free representation of all matrices on all levels. The new solver reduces the memory footprint by more than a factor of 10 compared to our previous solver ParFE. It allows us to solve much bigger problems than before and scales excellently on a Cray XT-5 supercomputer.

Cyril Flaig, Peter Arbenz
Aliasing Properties of Voxels in Three-Dimensional Sampling Lattices

It is well known that band-limited functions can be represented more efficiently on the body-centered cubic (bcc) and face-centered cubic (fcc) lattices than on the standard Cartesian (cubic) lattice. We examine the sampling properties of these lattices for non-band-limited functions and see that, with respect to preserving the energy of the sampled function, the bcc and fcc lattices perform better than the cubic lattice for a given sampling density, indicating that sampling on these lattices will produce fewer aliasing errors. We study the aliasing errors within a cross-section of a three-dimensional, rotationally invariant pattern through the origin, and how these errors depend on the normal of the cross-section. The results are in line with the theory, showing that the bcc and fcc lattices have superior sampling qualities with respect to aliasing errors.

E. Linnér, R. Strand
Analysis of a Fast Fourier Transform Based Method for Modeling of Heterogeneous Materials

The focus of this paper is on the analysis of the Conjugate Gradient method applied to a non-symmetric system of linear equations, arising from a Fast Fourier Transform-based homogenization method due to Moulinec and Suquet [1]. Convergence of the method is proven by exploiting a certain projection operator reflecting physics of the underlying problem. These results are supported by a numerical example, demonstrating significant improvement of the Conjugate Gradient-based scheme over the original Moulinec-Suquet algorithm.

Jaroslav Vondřejc, Jan Zeman, Ivo Marek
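
For context, the basic Moulinec-Suquet fixed-point scheme to which the Conjugate Gradient approach above is an alternative: with a reference medium C^0, the associated Green operator Γ^0 applied in Fourier space, and the prescribed macroscopic strain enforced through the zero frequency, one iterates (standard form from the literature, not the paper's reformulation)

$$ \varepsilon^{(k+1)} = \varepsilon^{(k)} - \Gamma^{0} \ast \sigma\big(\varepsilon^{(k)}\big), \qquad \sigma(\varepsilon) = C(x) : \varepsilon. $$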

Contributed Papers

Frontmatter
Properties and Estimates of an Integral Type Nonconforming Finite Element

This paper is intended to provide an investigation into the application of an extended Crouzeix-Raviart (EC-R) nonconforming finite element. Integral degrees of freedom are used, which yields some superclose properties. The considered finite element basis contains an integral-type bubble function. The approximate eigenvalues obtained by means of this nonconforming method give asymptotically lower bounds of the exact eigenvalues. It is considerably easier to obtain upper bounds for eigenvalues using variational numerical methods; that is why approximations from below are very valuable and useful. Finally, computational aspects are discussed and numerical examples are presented.

A. B. Andreev, M. R. Racheva
Quadratic Finite Element Approximation of a Contact Eigenvalue Problem

In this paper we present a finite element method (FEM) to a nonstandard second-order elliptic eigenvalue problem defined on a two-component domain consisting of two intervals with a contact point. This vector problem involves a nonlocal (integral type) coupling condition between the solution components. By introducing suitable degrees of freedom for the quadratic finite element and a corresponding vector Lagrange interpolant we derive optimal order finite element approximation.

Some numerical aspects concerning the implementation of the method are considered. An illustrative example is given which shows the efficiency of the proposed method.

A. B. Andreev, M. R. Racheva
Optimization Methods for Calibration of Heat Conduction Models

The paper provides a summary of techniques suitable for the calibration of models such as stationary and nonstationary heat conduction. We assume that the PDE-based models are discretized by finite elements and that the PDE coefficients are piecewise constant on a priori given macroelements (subdomains). Special attention is given to Gauss-Newton methods, the evaluation of the derivatives, and the application of these methods to a heat evolution problem which arose in geoengineering.

Radim Blaheta, Rostislav Hrtus, Roman Kohut, Ondřej Jakl
Block-Preconditioners for Conforming and Non-conforming FEM Discretizations of the Cahn-Hilliard Equation

We consider preconditioned iterative solution methods to solve the algebraic systems of equations arising from finite element discretizations of multiphase flow problems, based on the phase-field model.

The aim is to solve coupled physics problems, where both diffusive and convective processes take place simultaneously in time and space. To model the above, a coupled system of partial differential equations has to be solved, consisting of the Cahn-Hilliard equation to describe the diffusive interface and the time-dependent Navier-Stokes equation, to follow the evolution of the convection field in time.

We focus on the construction and efficiency of preconditioned iterative solution methods for the linear systems arising after conforming and non-conforming finite element discretizations of the Cahn-Hilliard equation in space and implicit discretization schemes in time. The non-linearity of the phase-separation process is treated by Newton’s method. The resulting matrices admit a two-by-two block structure, which is utilized by the preconditioning techniques proposed in this work. We discuss approximation estimates of the preconditioners and include numerical experiments to illustrate their behaviour.
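
The sketch below illustrates the general idea of exploiting a two-by-two block structure in a preconditioner: a block lower-triangular solve with an approximate Schur complement, applied inside a Krylov iteration. The random blocks and the diagonal-based Schur approximation are ad-hoc stand-ins, not the preconditioners constructed and analyzed in the paper.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Toy 2x2 block system [[A, B], [C, D]] with an illustrative block
# lower-triangular preconditioner built from an approximate Schur complement.
n = 200
A = sp.diags([2.0] * n) + sp.random(n, n, density=0.01, random_state=0)
B = sp.random(n, n, density=0.01, random_state=1)
C = sp.random(n, n, density=0.01, random_state=2)
D = sp.diags([3.0] * n) + sp.random(n, n, density=0.01, random_state=3)
K = sp.bmat([[A, B], [C, D]]).tocsc()

A_lu = spla.splu(A.tocsc())
S = (D - C @ sp.diags(1.0 / A.diagonal()) @ B).tocsc()   # diag(A)-based Schur approx.
S_lu = spla.splu(S)

def apply_prec(r):
    r1, r2 = r[:n], r[n:]
    y1 = A_lu.solve(r1)
    y2 = S_lu.solve(r2 - C @ y1)
    return np.concatenate([y1, y2])

M = spla.LinearOperator(K.shape, matvec=apply_prec)
x, info = spla.gmres(K, np.ones(2 * n), M=M)
print(info)                                              # 0 on convergence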

P. Boyanova, M. Do-Quang, M. Neytcheva
Comparison of Two Numerical Methods for Computation of American Type of the Floating Strike Asian Option

We present a numerical approach for solving the free boundary problem for the Black-Scholes equation for pricing American-style floating strike Asian options. A fixed domain transformation of the free boundary problem into a parabolic equation defined on a fixed spatial domain is performed. As a result, a nonlinear time-dependent term appears in the transformed equation. Two new numerical algorithms are proposed. In the first algorithm a predictor-corrector scheme is used; the second is based on the Newton method. Computational experiments confirming the accuracy of the algorithms are presented and discussed.

J. D. Kandilarov, D. Ševčovič
A Kernel-Based Algorithm for Numerical Solution of Nonlinear PDEs in Finance

We present an algorithm for the approximate solution of certain nonlinear model equations from financial mathematics, using kernel techniques (fundamental solution, Green’s function) for the linear Black-Scholes operator as the basis of the computation. Numerical experiments comparing the accuracy of the algorithm with other known numerical schemes are discussed. Finally, concluding observations are given.
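
To illustrate the kernel idea in the linear case, the sketch below prices a European call by integrating the payoff against the lognormal transition density, i.e. the fundamental solution associated with the linear Black-Scholes operator, and checks the result against the closed-form formula; the parameters are arbitrary, and the nonlinear models treated in the paper are not addressed here.

import numpy as np
from scipy.stats import norm

S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0

y = np.linspace(-10.0, 10.0, 20001)                 # grid in log-price
dy = y[1] - y[0]
mu = np.log(S0) + (r - 0.5 * sigma**2) * T
density = norm.pdf(y, loc=mu, scale=sigma * np.sqrt(T))
payoff = np.maximum(np.exp(y) - K, 0.0)
price_kernel = np.exp(-r * T) * np.sum(payoff * density) * dy

d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
d2 = d1 - sigma * np.sqrt(T)
price_bs = S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)
print(price_kernel, price_bs)                       # the two prices agree closely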

Miglena N. Koleva, Lubin G. Vulkov
Improving the Efficiency of Parallel FEM Simulations on Voxel Domains

In this work, we consider large-scale finite element modeling on voxel grids. We are targeting the IBM Blue Gene/P computer, which features a 3D torus interconnect. Our previous parallelization approach was to divide the domain in one spatial direction only, which led to limited parallelism. Here, we extend it to all three spatial directions in order to match the interconnect topology.

As a sample problem, we consider the simulation of the thermal and electrical processes involved in the radio-frequency (RF) ablation procedure. RF ablation is a minimally invasive technique for the treatment of hepatic tumors, utilizing AC current to destroy the tumor cells by heating. A 3D voxel approach is used for the finite element method (FEM) approximation of the involved partial differential equations. After the space discretization, the backward Euler scheme is used for the time stepping.

We study the impact of the domain partitioning on the performance of a parallel preconditioned conjugate gradient (PCG) solver for the arising large linear systems. As a preconditioner, we use BoomerAMG – a parallel algebraic multigrid implementation from the Hypre package, developed at Lawrence Livermore National Laboratory (LLNL). The implementation is tested on the IBM Blue Gene/P massively parallel computer.
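
A minimal sketch of the 3D decomposition idea: split a voxel grid into a 3D grid of sub-boxes, one per process, so that the process grid can be mapped onto the 3D torus. The function below is purely illustrative and does not reproduce the partitioning used in the actual implementation.

import numpy as np

def partition_3d(shape, procs):
    """Half-open index ranges owned by each process coordinate (px, py, pz)."""
    def ranges(n, p):
        cuts = np.linspace(0, n, p + 1).astype(int)
        return [(cuts[i], cuts[i + 1]) for i in range(p)]
    rx, ry, rz = (ranges(n, p) for n, p in zip(shape, procs))
    return {(i, j, k): (rx[i], ry[j], rz[k])
            for i in range(procs[0])
            for j in range(procs[1])
            for k in range(procs[2])}

# Example: a 256^3 voxel domain on an 8x8x8 process grid.
boxes = partition_3d((256, 256, 256), (8, 8, 8))
print(boxes[(0, 0, 0)])       # ((0, 32), (0, 32), (0, 32))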

N. Kosturski, S. Margenov, Y. Vutov
On the Robustness of Two-Level Preconditioners for Quadratic FE Orthotropic Elliptic Problems

We study the construction of subspaces for quadratic FEM orthotropic elliptic problems with a focus on the robustness with respect to mesh and coefficient anisotropy. In the general setting of an arbitrary elliptic operator it is known that standard hierarchical basis (HB) techniques do not result in splittings in which the angle between the coarse space and its (hierarchical) complement is uniformly bounded with respect to the ratio of anisotropy. In this paper we present a robust splitting of the finite element space of continuous piecewise quadratic functions for the orthotropic problem. As a consequence of this result we also obtain a uniform condition number bound for a special sparse Schur complement approximation. Further, we construct a uniform preconditioner for the pivot block with optimal order of computational complexity.

J. Kraus, M. Lymbery, S. Margenov
A Computational Approach for the Earthquake Response of Cable-Braced Reinforced Concrete Structures under Environmental Actions

A numerical treatment of the seismic response of reinforced concrete structures containing cable elements is presented. The cable behaviour is considered to be nonconvex and nonmonotone and is described by generalized subdifferential relations accounting for loosening, elastoplastic and fracturing effects. The problem is treated incrementally by a double discretization: in space by finite elements and a piecewise linearization of the cable behaviour, and in time by the Newmark method. Thus, in each time step an incremental linear complementarity problem with a reduced number of unknowns is solved.
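
For reference, the sketch below shows the plain Newmark update for a linear system M*a + C*v + K*u = f(t); the incremental piecewise linearization and the linear complementarity solve performed in each step of the paper’s method are not included, and the average-acceleration parameters are the usual default choice.

import numpy as np

def newmark(M, C, K, f, u0, v0, dt, n_steps, beta=0.25, gamma=0.5):
    """Newmark time integration of M*a + C*v + K*u = f(t) (linear sketch)."""
    u, v = u0.copy(), v0.copy()
    a = np.linalg.solve(M, f(0.0) - C @ v - K @ u)
    Keff = K + gamma / (beta * dt) * C + M / (beta * dt**2)
    for n in range(1, n_steps + 1):
        rhs = (f(n * dt)
               + M @ (u / (beta * dt**2) + v / (beta * dt)
                      + (0.5 / beta - 1.0) * a)
               + C @ (gamma / (beta * dt) * u + (gamma / beta - 1.0) * v
                      + dt * (0.5 * gamma / beta - 1.0) * a))
        u_new = np.linalg.solve(Keff, rhs)
        a_new = ((u_new - u) / (beta * dt**2) - v / (beta * dt)
                 - (0.5 / beta - 1.0) * a)
        v = v + dt * ((1.0 - gamma) * a + gamma * a_new)
        u, a = u_new, a_new
    return u, v, a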

Angelos Liolios, Konstantinos Chalioris, Asterios Liolios, Stefan Radev, Konstantinos Liolios
An Inverse Problem for the Stationary Kirchhoff Equation

This work is concerned with the development of numerical methods and algorithms for solving the inverse problem of parameter identification from over-determined data in Kirchhoff plate equations. A technique called the Method of Variational Imbedding is used for solving the inverse problem. The original inverse problem is replaced by a minimization problem. The Euler-Lagrange equations comprise a higher-order system of equations for the solution of the original equation and for the coefficients. In the present work, a difference scheme and a numerical algorithm for solving the Euler-Lagrange system are proposed. Results for different values of the governing parameters, together with their physical relevance, are presented.

Tchavdar T. Marinov, Rossitza Marinova
An Improved Sparse Matrix-Vector Multiply Based on Recursive Sparse Blocks Layout

The Recursive Sparse Blocks (RSB) format is a sparse matrix layout designed for coarse-grained parallelism and reduced cache misses when operating with matrices that are larger than a computer’s cache. By laying out the matrix in sparse, non-overlapping blocks, we allow for the shared-memory parallel execution of the transposed SParse Matrix-Vector multiply (SpMV) with higher efficiency than the traditional Compressed Sparse Rows (CSR) format. In this note we cover two issues. First, we propose two improvements to our original approach. Second, we look at the performance of standard and transposed shared-memory parallel SpMV for unsymmetric matrices, using the proposed approach. We find that our implementation’s performance is competitive with that of both the highly optimized, proprietary Intel MKL Sparse BLAS library’s CSR routines and the Compressed Sparse Blocks (CSB) research prototype.
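
The toy sketch below conveys the flavour of a recursive sparse-blocks layout: the matrix is quadrisected into non-overlapping leaf blocks and the multiply loops over the leaves (each of which could be a parallel task). It is only an illustration; the actual RSB format, its cache-aware blocking and its transposed kernels are far more elaborate.

import numpy as np
from scipy.sparse import coo_matrix, random as sprandom

def split_blocks(A, max_nnz=1024):
    """Recursively quadrisect a sparse matrix into non-overlapping COO leaves.

    Returns a list of (row_offset, col_offset, coo_block) triples.
    """
    A = A.tocoo()
    def rec(rows, cols, vals, r0, c0, m, n):
        if vals.size <= max_nnz or min(m, n) <= 1:
            return [(r0, c0, coo_matrix((vals, (rows, cols)), shape=(m, n)))]
        mh, nh = m // 2, n // 2
        out = []
        for rlo, rhi, clo, chi in [(0, mh, 0, nh), (0, mh, nh, n),
                                   (mh, m, 0, nh), (mh, m, nh, n)]:
            sel = (rows >= rlo) & (rows < rhi) & (cols >= clo) & (cols < chi)
            out += rec(rows[sel] - rlo, cols[sel] - clo, vals[sel],
                       r0 + rlo, c0 + clo, rhi - rlo, chi - clo)
        return out
    return rec(A.row, A.col, A.data, 0, 0, A.shape[0], A.shape[1])

def spmv(blocks, x, m):
    """y = A @ x accumulated over the leaf blocks."""
    y = np.zeros(m)
    for r0, c0, blk in blocks:
        y[r0:r0 + blk.shape[0]] += blk @ x[c0:c0 + blk.shape[1]]
    return y

A = sprandom(2000, 2000, density=0.01, format='coo', random_state=0)
x = np.random.default_rng(0).standard_normal(2000)
print(np.allclose(spmv(split_blocks(A, 2000), x, 2000), A @ x))   # True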

Michele Martone, Marcin Paprzycki, Salvatore Filippone
On the Differences of the Discrete Weak and Strong Maximum Principles for Elliptic Operators

When choosing a numerical method to approximate the solution of a continuous mathematical problem, we need a method whose approximation is not only close to the solution of the original problem but also possesses its important qualitative properties. For linear elliptic problems the main qualitative properties are the various maximum principles. The preservation of the weak maximum principle has been investigated extensively in recent decades, but the preservation of the strong maximum principle has not. In this paper we focus on the latter property, giving necessary and sufficient conditions for it, investigating the relation between the preservation of the strong and weak maximum principles, and illustrating the differences between them with numerous examples.

Miklós E. Mincsovics, Tamás L. Horváth
Adaptive FEM Package with Decentralized Parallel Adaptation of Tetrahedral Meshes

Our parallel FEM package NuscaS allows us to solve adaptive FEM problems with 3D unstructured meshes on distributed-memory parallel computers such as PC-clusters. For solving sparse systems of equations, NuscaS uses the message-passing paradigm to implement the PCG method with geometric multigrid as a preconditioner.

For the mesh adaptation, the 8-tetrahedra longest-edge partition is used as the mesh refinement algorithm. In this paper, a new method for parallelizing this algorithm is presented. It was developed for the message-passing model and implemented using the MPI standard. The new solution is based on a decentralized approach and is therefore more scalable than previous implementations, which require a centralized synchronizing node (coordinator processor or gateway node).

Both the sequential and parallel versions of the mesh adaptation are carefully optimized to maximize performance. One of the key solutions is the use of suitable data structures, such as hash tables, which allow for high performance while keeping memory requirements modest.
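
A small sketch of the hash-table technique mentioned above: during refinement, keying new midpoint nodes by the sorted endpoint pair of an edge guarantees that an edge shared by several tetrahedra produces exactly one new node. The data layout is illustrative and not that of the NuscaS package.

def edge_midpoints(tetrahedra, coords):
    """Create one midpoint node per edge of the given tetrahedra.

    tetrahedra: list of 4-tuples of vertex indices; coords: list of [x, y, z].
    Returns the edge-to-node hash table and the extended coordinate list.
    """
    midpoint_of = {}                     # (v0, v1) with v0 < v1 -> node index
    new_coords = list(coords)
    for tet in tetrahedra:
        for i in range(4):
            for j in range(i + 1, 4):
                key = tuple(sorted((tet[i], tet[j])))
                if key not in midpoint_of:
                    a, b = key
                    mid = [(coords[a][d] + coords[b][d]) / 2.0 for d in range(3)]
                    midpoint_of[key] = len(new_coords)
                    new_coords.append(mid)
    return midpoint_of, new_coords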

Tomasz Olas, Roman Wyrzykowski
Efficient Simulations of the Transport Properties of Spin Field-Effect Transistors Built on Silicon Fins

Significant progress in integrated circuit performance has been supported by the miniaturization of the transistor feature size. With transistor scalability gradually slowing down, new concepts have to be introduced in order to maintain the increase in computational speed at reduced power consumption for future micro- and nanoelectronic devices. A promising alternative to the charge degree of freedom currently used in MOSFET switches is to exploit the spin degree of freedom. We computationally investigate the transport properties of ballistic spin field-effect transistors (SpinFETs). These simulations require a significant amount of computational resources. To achieve the best computational performance, we parallelize the code for a shared-memory multi-CPU system. As a result of optimizing the whole model, a significant speed-up is achieved. We demonstrate that [100]-oriented silicon fins are best suited for practical realizations of a SpinFET.

D. Osintsev, A. Makarov, V. Sverdlov, S. Selberherr
Large-Scale Simulation of Uniform Load Traffic for Modeling of Throughput on a Crossbar Switch Node

In the present paper we propose a family of patterns for simulating uniform traffic. The results of computer simulations of the throughput on a switch node with these patterns are presented. The necessary computations have been performed on grid clusters of IICT-BAS and CERN. Our simulations utilize the PIM algorithm for computing conflict-free schedules, specified by the apparatus of Generalized Nets. It is shown that the use of the suggested family of patterns enables us to evaluate the influence of large-scale changes of the input buffer on the throughput.
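
For readers unfamiliar with PIM, the sketch below performs one classical request-grant-accept round, producing a conflict-free partial matching of inputs to outputs; it is only a generic illustration and does not reproduce the Generalized Nets specification or the traffic patterns proposed in the paper.

import random

def pim_iteration(requests, rng=None):
    """One request-grant-accept round of Parallel Iterative Matching (PIM).

    requests[i] is the set of output ports input port i has cells for;
    the result is a conflict-free partial matching {input: output}.
    """
    rng = rng or random.Random(0)
    # Grant phase: every requested output grants to one random requesting input.
    requesting = {}
    for inp, outs in requests.items():
        for out in outs:
            requesting.setdefault(out, []).append(inp)
    granted = {out: rng.choice(inps) for out, inps in requesting.items()}
    # Accept phase: every input accepts one of the outputs that granted to it.
    offers = {}
    for out, inp in granted.items():
        offers.setdefault(inp, []).append(out)
    return {inp: rng.choice(outs) for inp, outs in offers.items()}

print(pim_iteration({0: {0, 1}, 1: {1}, 2: {0, 2}}))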

Tasho Tashev, Vladimir Monov
Two-Phase Porous Media Flow Simulation on a Hybrid Cluster

A kinetically based model is developed to describe the flow of a slightly compressible two-phase fluid in a porous medium. The continuity equations for the phases are modified to take into account the minimal scales of averaging in space and time; as a result, regularizing terms and a second-order time derivative with small parameters appear in the equations. They are approximated by a three-level explicit difference scheme with a sufficiently mild stability condition. The proposed algorithm is easily adapted to modern hybrid supercomputers. The problem of contaminant infiltration into soil is solved on a cluster with graphics accelerators. A high speed-up of GPU computations in comparison with the CPU is demonstrated.
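
The sketch below illustrates, in one space dimension and for a single scalar unknown, the kind of three-level explicit scheme described above, applied to a hyperbolically regularized diffusion equation tau*u_tt + u_t = k*u_xx; the equation, coefficients, initial data and boundary conditions are illustrative only and do not reproduce the two-phase model of the paper.

import numpy as np

nx, k, tau = 200, 1.0, 1e-3
h = 1.0 / nx
dt = 0.5 * h * np.sqrt(tau / k)          # mild, CFL-like stability condition

x = np.linspace(0.0, 1.0, nx + 1)
u_old = np.exp(-200.0 * (x - 0.3) ** 2)  # two previous time levels
u = u_old.copy()

for _ in range(2000):
    lap = np.zeros_like(u)
    lap[1:-1] = k * (u[2:] - 2.0 * u[1:-1] + u[:-2]) / h**2
    coef = tau / dt**2 + 0.5 / dt
    u_new = (lap + tau * (2.0 * u - u_old) / dt**2 + 0.5 * u_old / dt) / coef
    u_new[0] = u_new[-1] = 0.0           # homogeneous Dirichlet boundaries
    u_old, u = u, u_new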

Marina Trapeznikova, Boris Chetverushkin, Natalia Churbanova, Dmitrii Morozov
Petrov-Galerkin Analysis for a Degenerate Parabolic Equation in Zero-Coupon Bond Pricing

A degenerate parabolic equation in zero-coupon bond pricing (ZCBP) is studied. First, we analyze the time discretization of the equation. Involving weighted Sobolev spaces, we develop a variational analysis to describe qualitative properties of the solution. On each time level we formulate a Petrov-Galerkin FEM, in which each of the basis functions of the trial space is determined by the finite volume difference scheme in [2, 3]. Using this formulation, we establish the stability of the method with respect to a discrete energy norm and show that the error of the numerical solution in the energy norm is O(h), where h denotes the mesh parameter.

R. L. Valkov
Agents in Grid System — Design and Implementation

We are developing an agent-based intelligent middleware for the Grid, based on agent teams acting as resource brokers and managers. Our earlier work resulted in a prototype implementation, but our recent research has led to a complete redesign of the system. Here, we discuss the new design and the main technical issues encountered during its implementation.

Katarzyna Wasielewska, Michał Drozdowicz, Paweł Szmeja, Maria Ganzha, Marcin Paprzycki, Ivan Lirkov, Dana Petcu, Costin Badica
Using Blue Gene/P and GPUs to Accelerate Computations in the EULAG Model

EULAG (Eulerian/semi-Lagrangian fluid solver) is an established computational model, developed by the group headed by Piotr K. Smolarkiewicz, for simulating thermo-fluid flows across a wide range of scales and physical scenarios. This paper presents perspectives on the parallelization of EULAG based on the MPI, OpenMP, and OpenCL standards. We focus on the development of the computational kernels of the EULAG model, which comprise its most time-consuming calculations: the Laplacian algorithm (laplc) and the multidimensional positive definite advection transport algorithm (MPDATA).

The first challenge of our work was the parallelization of the laplc subroutine, using MPI across nodes and OpenMP within nodes, on the Blue Gene/P supercomputer located in the Bulgarian Supercomputing Center. The second challenge was to accelerate computations of the EULAG model using modern GPUs. We discuss the scalability of the OpenCL implementation of the linear part of MPDATA on an ATI Radeon HD 5870 GPU with an AMD Phenom II X4 CPU, and on an NVIDIA Tesla C1060 GPU with an AMD Phenom II X4 CPU.
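
As a rough indication of what such a kernel computes, the sketch below evaluates a 7-point Laplacian stencil on the interior of a 3D array with numpy; the actual laplc routine in EULAG (generalized coordinates, MPI halo exchanges, OpenMP threading, OpenCL ports) is considerably more involved.

import numpy as np

def laplacian_3d(u, dx, dy, dz):
    """7-point Laplacian stencil on the interior points of a 3D array."""
    lap = np.zeros_like(u)
    c = u[1:-1, 1:-1, 1:-1]
    lap[1:-1, 1:-1, 1:-1] = (
        (u[2:, 1:-1, 1:-1] - 2.0 * c + u[:-2, 1:-1, 1:-1]) / dx**2
        + (u[1:-1, 2:, 1:-1] - 2.0 * c + u[1:-1, :-2, 1:-1]) / dy**2
        + (u[1:-1, 1:-1, 2:] - 2.0 * c + u[1:-1, 1:-1, :-2]) / dz**2)
    return lap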

Roman Wyrzykowski, Krzysztof Rojek, Łukasz Szustak
Backmatter
Metadata
Title: Large-Scale Scientific Computing
Editors: Ivan Lirkov, Svetozar Margenov, Jerzy Waśniewski
Copyright Year: 2012
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-29843-1
Print ISBN: 978-3-642-29842-4
DOI: https://doi.org/10.1007/978-3-642-29843-1
