
2015 | Book

Large-Scale Scientific Computing

10th International Conference, LSSC 2015, Sozopol, Bulgaria, June 8-12, 2015. Revised Selected Papers

About this Book

This book constitutes the thoroughly refereed post-conference proceedings of the 10th International Conference on Large-Scale Scientific Computations, LSSC 2015, held in Sozopol, Bulgaria, in June 2015.

The 49 revised full papers presented were carefully reviewed and selected from 64 submissions. The general theme of LSSC 2015 was Large-Scale Scientific Computing, with a particular focus on the organized special sessions: enabling exascale computation; control and uncertain systems; computational microelectronics - from Monte Carlo to deterministic approaches; numerical methods for multiphysics problems; large-scale models: numerical methods, parallel computations and applications; mathematical modeling and analysis of PDEs describing physical problems; a posteriori error control and iterative methods for Maxwell-type problems; efficient algorithms for hybrid HPC systems; multilevel methods on graphs; and applications of metaheuristics to large-scale problems.

Table of Contents

Frontmatter

Invited Papers

Frontmatter
Preconditioners for Mixed FEM Solution of Stationary and Nonstationary Porous Media Flow Problems

The paper concerns porous media flow in a rigid or deformable matrix. It starts with stationary Darcy flow, but the main interest is in extending the Darcy problem to time-dependent behaviour and deformation of the matrix. The considered problems are discretized by mixed FEM in space and by stable time discretization methods such as the backward Euler and second-order Radau methods. The discretization leads to time-stepping methods that require the solution of a linear system within each time step. The main focus of the paper is the construction of suitable preconditioners for these Euler and Radau systems. The paper also presents numerical experiments illustrating the efficiency of the suggested numerical algorithms.

Owe Axelsson, Radim Blaheta, Tomáš Luber
Fast Constrained Image Segmentation Using Optimal Spanning Trees

We propose a graph-theoretical algorithm for image segmentation which preserves both the volume and the connectivity of the solid (non-void) phase of the image. The approach uses three stages. Each stage optimizes the approximation error between the image intensity vector and the piecewise constant (indicator) vector characterizing the segmentation of the underlying image. The different norms in which this approximation can be measured give rise to different methods. The running time of our algorithm is $$\mathcal{O}(N\log N)$$ for an image with N voxels.

Stanislav Harizanov, Svetozar Margenov, Ludmil Zikatanov
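The spanning-tree machinery underlying the abstract can be illustrated with a generic Kruskal-style sketch. This is not the authors' volume- and connectivity-preserving algorithm, only the common MST segmentation idea it builds on; the function name and the stopping criterion are illustrative.

```python
# Illustrative Kruskal-style segmentation on a 1-D intensity signal:
# build a minimum spanning tree over neighbouring pixels (edge weight =
# intensity difference), then stop merging so that k components remain.
# Sorting the edges dominates the cost, giving O(N log N) for N pixels.

def mst_segments(intensity, k):
    n = len(intensity)
    parent = list(range(n))

    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Edges between neighbouring pixels, sorted by weight.
    edges = sorted((abs(intensity[i + 1] - intensity[i]), i)
                   for i in range(n - 1))
    merges = 0
    for w, i in edges:
        a, b = find(i), find(i + 1)
        if a != b:
            parent[a] = b
            merges += 1
        if merges == n - k:           # k components left: stop
            break
    # Relabel components as 0, 1, 2, ... in order of first appearance.
    labels = [find(i) for i in range(n)]
    remap = {}
    return [remap.setdefault(l, len(remap)) for l in labels]

print(mst_segments([0, 1, 1, 9, 9, 8, 2, 2], 3))  # [0, 0, 0, 1, 1, 1, 2, 2]
```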
On Computer Simulation of Fluid-Porous Structure Interaction Problems for a Class of Filtration Problems

The Fluid-Porous Structure Interaction (FPSI) problem is first formulated and discussed in 3D in connection with the modeling of flow processes in pleated filters. Solving the 3D problem is computationally expensive; therefore, for a subclass of problems, a reduced model is considered, namely, the 3D poroelasticity problem is approximated by a poroelastic shell. Because resolving the geometry of a pleat is very important for obtaining an accurate solution, an interface-fitted general quadrilateral grid is introduced. It is difficult to generate a good-quality grid in such complicated domains; therefore a discretization approach which is robust on rough grids is selected, namely the multipoint flux approximation method. The coupled FPSI problem is solved with a sequential approach, which allows reuse of an existing flow solver. Results from numerical simulations are presented and discussed.

Oleg Iliev, Dimitar Iliev, Ralf Kirsch
Spin-Based CMOS-Compatible Devices

With CMOS feature size rapidly approaching scaling limits, the electron spin attracts attention as an alternative degree of freedom for low-power non-volatile devices. Silicon is perfectly suited for spin-driven applications, because it is mostly composed of nuclei without spin and is characterized by weak spin-orbit interaction. Elliott-Yafet spin relaxation due to phonon-mediated scattering is the main mechanism in bulk silicon at room temperature. Uniaxial stress dramatically reduces the spin relaxation, particularly in thin silicon films. Lifting the valley degeneracy completely in a controllable way by means of standard stress techniques represents a major breakthrough for spin-based devices. Despite impressive progress regarding spin injection, the larger-than-predicted signal amplitude is still heavily debated. In addition, the absence of a viable concept for spin manipulation in the channel by electrical means makes a practical realization of a device working similarly to a MOSFET difficult. An experimental demonstration of such a spin field-effect transistor (SpinFET) has been pending for 25 years now, which at present is a strong motivation for researchers to look into the subject. Commercially available CMOS-compatible spin-transfer torque magnetic random access memory (MRAM) built on magnetic tunnel junctions possesses all the properties characteristic of a universal memory: fast operation, high density, and non-volatility. The critical current for magnetization switching and the thermal stability are the main issues to be addressed. A substantial reduction of the critical current density and a considerable increase of the thermal stability are achieved in structures with a recording layer between two vertically sandwiched layers, where the recording layer is composed of two parts in the same plane next to each other. MRAM can be used to build logic-in-memory architectures with non-volatile storage elements on top of CMOS logic circuits. Non-volatility and reduced interconnect losses guarantee low power consumption. A novel concept for non-volatile logic-in-memory circuits utilizing the same MRAM cells to store and process information simultaneously is proposed.

Viktor Sverdlov, Siegfried Selberherr

Multilevel Methods on Graphs

Frontmatter
Shortest-Path Queries in Planar Graphs on GPU-Accelerated Architectures

We develop an efficient parallel algorithm for answering shortest-path queries in planar graphs and implement it on multi-node CPU-GPU clusters. The algorithm uses a divide-and-conquer approach for decomposing the input graph into small and roughly equal subgraphs and constructs a distributed data structure containing shortest distances within each of those subgraphs and between their boundary vertices. For a planar graph with n vertices, that data structure needs O(n) storage per processor and allows queries to be answered in $$O(n^{1/4})$$ time.

Guillaume Chapuis, Hristo Djidjev

Mathematical Modeling and Analysis of PDEs Describing Physical Problems

Frontmatter
A Numerical Approach to Price Path Dependent Asian Options

In this paper we develop a parabolic-hyperbolic splitting method for resolving the degeneracy of order $$\gamma$$, $$0 < \gamma \le 2$$, in the ultra-parabolic equation of path-dependent Asian options. For the space discretization of the parabolic subproblem we have used two approximations. The first one is the finite volume difference scheme of S. Wang [11], while the second one is the monotone difference scheme of A. A. Samarskii [9]. Some computational results and a comparison between the two methods are presented.

Tatiana Chernogorova, Lubin Vulkov
Operator-Difference Scheme with a Factorized Operator

In the study of difference schemes for time-dependent problems of mathematical physics, the general theory of stability (well-posedness) for operator-difference schemes is in common use. At the present time, exact (matching necessary and sufficient) conditions for stability are available for a wide class of two- and three-level difference schemes considered in finite-dimensional Hilbert spaces. The main results of the stability theory for operator-difference schemes are obtained for problems with self-adjoint operators. In this work, we consider difference schemes for the numerical solution of the Cauchy problem for a first-order evolution equation, where a non-self-adjoint operator is represented as a product of two non-commuting self-adjoint operators. We construct unconditionally stable regularized schemes based on the solution of a grid problem with a single operator multiplier on the new time level.

Petr N. Vabishchevich
Computational Identification of the Right Hand Side of the Parabolic Equations in Problems of Filtration

In this paper, we consider the identification of the right-hand side of a parabolic equation in a multidimensional domain, where the right-hand side depends only on time. For the numerical solution of the initial-boundary value problem, a homogeneous implicit difference scheme is used. The problem at a particular time level is solved on the basis of a special decomposition into two standard elliptic boundary value problems. We discuss the results of numerical experiments for a model problem of filtration theory.

V. I. Vasil’ev, M. V. Vasil’eva, A. M. Kardashevsky, D. Ya. Nikiforov

Numerical Methods for Multiphysics Problems

Frontmatter
Algebraic Multigrid Based Preconditioners for Fluid-Structure Interaction and Its Related Sub-problems

This work is devoted to the development and testing of algebraic multigrid based preconditioners for the linearized coupled fluid-structure interaction problem using low order finite element basis functions, and the compressible and nearly incompressible elasticity sub-problems in mixed displacement-pressure form using higher-order finite element basis functions. The preconditioners prove to be robust with respect to the mesh size, time step size, and other material parameters.

Ulrich Langer, Huidong Yang

Control and Uncertain Systems

Frontmatter
Functional Differential Model of an Anaerobic Biodegradation Process

In this paper we study a nonlinear functional differential model of a biological digestion process, involving two microbial populations and two substrates. We establish the global asymptotic stability of the model solutions towards a previously chosen equilibrium point and in the presence of two different discrete delays. Numerical simulation results are also included.

Milen K. Borisov, Neli S. Dimitrova, Mikhail I. Krastanov
Time-Optimal Control Problem in the Space of Probability Measures

We define a time-optimal control problem in the space of probability measures. Our aim is to model situations in which the initial position of a particle is not exactly known, even if the evolution is assumed to be deterministic. We study some natural generalizations of objects commonly used in control theory, proving some interesting properties. In particular, we focus on a comparison result between the classical minimum time function and its natural generalization to the probability measures setting.

Giulia Cavagnari, Antonio Marigonda
Sufficient Conditions for Small Time Local Attainability for a Class of Control Systems

We deal with the problem of small time local attainability (STLA) for nonlinear finite-dimensional time-continuous control systems. More precisely, given a nonlinear system $$\dot{x}(t)=f(t,x(t),u(t))$$, $$u(t)\in U$$, possibly subjected to state constraints $$x(t)\in \varOmega$$, and a closed set S, our aim is to provide sufficient conditions to steer to S every point of a suitable neighborhood of S along admissible trajectories of the system, respecting the constraints, and giving also an upper estimate of the minimum time needed for each point near S to reach S.

Antonio Marigonda, Thuy Thi Le
Financing the Reduction of Emissions from Deforestation: A Differential Game Approach

This paper analyzes and compares two versions of a mechanism that aims at mitigating climate change through REDD (Reduced Emissions from Deforestation and Forest Degradation). In this mechanism industrialized countries compensate countries with rainforests if they reduce their deforestation, because this is more cost efficient than restricting carbon emissions from domestic production. The central question is which funding option yields the best environmental results and is most beneficial for the parties involved. For this purpose, differential games are developed in which the industrialized countries and the countries with rainforests denote the two players. Solutions are obtained by applying Pontryagin's Maximum Principle and the concepts of Nash and Stackelberg equilibria. Due to the model assumptions, analytical solutions can be found. It turns out that both versions of the mechanism can be a valuable contribution in the battle against climate change. Moreover, most advantages and disadvantages of the two variants turn out to be robust w.r.t. parameter changes and small modifications of the model.

Bernadette Riesner, Gernot Tragler
Relaxation of Euler-Type Discrete-Time Control System

This paper investigates the Hausdorff distance between the set of Euler curves of a Lipschitz continuous differential inclusion and the set of Euler curves of the corresponding convexified differential inclusion. It is known that this distance can be estimated by $$O(\sqrt{h})$$, where h is the Euler discretization step. It has been conjectured that, in fact, an estimate O(h) holds. The paper presents results in favor of the conjecture, which cover most of the practically relevant cases. However, the conjecture remains unproven in general.

Vladimir M. Veliov
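The two families of curves being compared can be made concrete with a toy example (illustrative only; the inclusion and step size are chosen for simplicity, and the O(h) gap shown is specific to this construction):

```python
# Toy illustration of the distance discussed above: for the inclusion
# x'(t) in {-1, +1}, the convexified inclusion is x'(t) in [-1, +1].
# The constant curve x = 0 is an Euler curve of the convexified problem;
# an Euler curve of the original inclusion must use slopes +-1, and the
# alternating choice oscillates within a band of width h, so the gap at
# the grid points is exactly h here.

def euler_curve(velocities, h, x0=0.0):
    """Euler polygon at the grid points: x_{k+1} = x_k + h * v_k."""
    xs = [x0]
    for v in velocities:
        xs.append(xs[-1] + h * v)
    return xs

h, steps = 0.1, 20
convexified = euler_curve([0.0] * steps, h)                       # v = 0
alternating = euler_curve([(-1) ** k for k in range(steps)], h)   # v = +-1

gap = max(abs(a - b) for a, b in zip(convexified, alternating))
print(gap)  # equals h = 0.1 for this pair of curves
```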

Enabling Exascale Computation

Frontmatter
Uncertainty Quantification for Porous Media Flow Using Multilevel Monte Carlo

Uncertainty quantification (UQ) for porous media flow is of great importance for many societal, environmental and industrial problems. An obstacle to progress in this area is the extreme computational effort needed for solving realistic problems. It is expected that exascale computers will open the door to significant progress in this area. We demonstrate how new features of the Distributed and Unified Numerics Environment DUNE [1] address these challenges. Within the DFG-funded project EXA-DUNE the software has been extended by multiscale finite element methods (MsFEM) and by a parallel framework for the multilevel Monte Carlo (MLMC) approach. This is a general concept for computing expected values of simulation results depending on random fields, e.g. the permeability of porous media. It belongs to the class of variance reduction methods and overcomes the slow convergence of classical Monte Carlo by combining cheap/inexact and expensive/accurate solutions in an optimal ratio.

Jan Mohring, René Milk, Adrian Ngo, Ole Klein, Oleg Iliev, Mario Ohlberger, Peter Bastian
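The MLMC idea of combining cheap and expensive solutions can be sketched generically (illustrative only, not the DUNE/EXA-DUNE implementation; the `payoff` surrogate and all sample counts are made up):

```python
import random

# Minimal multilevel Monte Carlo sketch: estimate E[P] via the telescoping sum
#   E[P_L] = E[P_0] + sum_{l=1..L} E[P_l - P_{l-1}],
# with many cheap samples on coarse levels and few on the expensive fine ones.
# Coarse and fine evaluations of one correction term share the same random
# input, which keeps the variance of the differences small.

def mlmc(payoff, levels, n_samples, seed=0):
    """payoff(l, u): level-l approximation evaluated on random input u."""
    rng = random.Random(seed)
    estimate = 0.0
    for l in range(levels + 1):
        n = n_samples[l]
        total = 0.0
        for _ in range(n):
            u = rng.random()                       # shared random input
            total += payoff(l, u) if l == 0 else payoff(l, u) - payoff(l - 1, u)
        estimate += total / n
    return estimate

# Toy stand-in for a PDE solve: the level-l "solver" evaluates u^2 with u
# rounded to a grid of spacing 2^-(l+2), so the estimate approaches
# E[u^2] = 1/3 for u uniform on [0, 1].
def payoff(l, u):
    h = 2.0 ** -(l + 2)
    return (round(u / h) * h) ** 2

print(mlmc(payoff, levels=3, n_samples=[4000, 1000, 250, 60]))  # close to 1/3
```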
Task-Based Parallel Sparse Matrix-Vector Multiplication (SpMVM) with GPI-2

We present a task-based implementation of SpMVM with the PGAS communication library GPI-2. This computational kernel is essential for the overall performance of Krylov subspace solvers, but its proper hybrid parallel design is still a challenge on hierarchical architectures consisting of multi- and many-core sockets and nodes. The GPI-2 library allows, by default and in a natural way, a task-based parallelization. Thus, our implementation is fully asynchronous and differs considerably from the standard hybrid approaches combining MPI and threads/OpenMP. Here we briefly describe the GPI-2 library and our implementation of the SpMVM routine, and then compare the performance of our Jacobi-preconditioned Richardson solver against the PETSc Richardson solver using a Poisson BVP in a unit cube as a benchmark test. The comparison employs two types of domain decomposition and demonstrates the superior performance and better scalability of our task-based implementation.

Dimitar Stoyanov, Rui Machado, Franz-Josef Pfreundt
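The kernel at the centre of the paper fits in a few lines. The sketch below is a plain serial sparse matrix-vector product in CSR format; the GPI-2 task-based, asynchronous communication machinery itself is beyond a sketch, but note that each row's dot product is independent, which is what makes a task-based decomposition natural.

```python
# Minimal serial sparse matrix-vector product, y = A @ x, with A stored
# in CSR (compressed sparse row) format. Each row is independent, so rows
# (or blocks of rows) can be dispatched as parallel tasks.

def spmv_csr(values, col_idx, row_ptr, x):
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):                       # independent per row
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y[i] = s
    return y

# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
values  = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [5.0, 3.0, 7.0]
```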

Efficient Algorithms for Hybrid HPC Systems

Frontmatter
On the Preconditioned Quasi-Monte Carlo Algorithm for Matrix Computations

In this paper we present a quasi-Monte Carlo Sparse Approximate Inverse (SPAI) preconditioner. In contrast to the standard deterministic SPAI preconditioners that use the Frobenius norm, Monte Carlo and quasi-Monte Carlo preconditioners rely on stochastic and hybrid algorithms to compute a rough matrix inverse (MI). The behaviour of the proposed algorithm is studied. Its performance is measured and compared with the standard deterministic SPAI and MSPAI (parallel SPAI) approaches and with the Monte Carlo approach. An analysis of the results is also provided.

V. Alexandrov, O. Esquivel-Flores, S. Ivanovska, A. Karaivanova
Energy Performance Evaluation of Quasi-Monte Carlo Algorithms on Hybrid HPC

The increasing demands of scientific applications and the increasing capacity of modern computing systems lead to the need of evaluating energy consumption and, consequently, to the development of energy efficient algorithms. In this paper we study the energy performance of a class of quasi-Monte Carlo algorithms on hybrid HPC systems. These algorithms are applied to solve quantum kinetic integral equations using Sobol and Halton sequences. The energy performance results are compared on a CPU-based computer platform and computer platforms with accelerators like GPU cards and Intel Xeon Phi coprocessors with respect to several metrics. Directions for future work are also given.

E. Atanassov, T. Gurov, A. Karaivanova
Towards RBF Interpolation on Heterogeneous HPC Systems

We present a general approach for the parallelization of the interpolation with radial basis functions (RBF) on distributed memory systems, which might use various shared memory hardware as accelerator for the local subtasks involved. The calculation of an interpolant in general requires a global dense system to be solved. Iterative methods need appropriate preconditioning to achieve reasonable iteration counts. For the shared memory approach we use a special Krylov subspace method, namely the FGP algorithm. Addressing the distributed task we start with a simple block-Jacobi iteration with each block solved in parallel. Adding a coarse representation leads to a two-level block-Jacobi iteration with much better iteration counts and a wider applicability.

Gundolf Haase, Dirk Martin, Günter Offner
On the Relation Between Matrices and the Greatest Common Divisor of Polynomials

Following Barnett's approach to gcd(a(x), b(x)) based on the use of the companion matrix, we develop an extended algorithm that effectively gives $$d(x)$$, $$u(x)$$, $$v(x)$$, $$a_1(x)$$ and $$b_1(x)$$, where $$a_1(x)=a(x)/d(x)$$, $$b_1(x)=b(x)/d(x)$$ and $$d(x)=u(x)a(x)+v(x)b(x)$$. The algorithm is suitable for parallel realization on GPUs, FPGAs, and smart cards.

Nikolai L. Manev
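The quantities named in the abstract can be reproduced with the classical extended Euclidean algorithm over polynomials; the sketch below takes this textbook route rather than the paper's companion-matrix algorithm. Coefficients are exact rationals, listed lowest degree first.

```python
from fractions import Fraction

def trim(p):                         # drop trailing zero coefficients
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def poly_sub(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) - (b[i] if i < len(b) else 0)
                 for i in range(n)])

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return trim(out)

def poly_divmod(a, b):
    q = [Fraction(0)] * max(1, len(a) - len(b) + 1)
    r = [Fraction(c) for c in a]
    while len(r) >= len(b) and r != [0]:
        deg, c = len(r) - len(b), r[-1] / b[-1]
        q[deg] = c
        r = poly_sub(r, [Fraction(0)] * deg + [c * bi for bi in b])
    return trim(q), r

def ext_gcd(a, b):
    """Return (d, u, v) with u*a + v*b = d = gcd(a, b), d monic;
    a1 = a/d and b1 = b/d then follow from poly_divmod."""
    r0, r1 = [Fraction(c) for c in a], [Fraction(c) for c in b]
    u0, u1 = [Fraction(1)], [Fraction(0)]
    v0, v1 = [Fraction(0)], [Fraction(1)]
    while r1 != [0]:
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        u0, u1 = u1, poly_sub(u0, poly_mul(q, u1))
        v0, v1 = v1, poly_sub(v0, poly_mul(q, v1))
    lead = r0[-1]                    # normalize so that d is monic
    return ([c / lead for c in r0], [c / lead for c in u0],
            [c / lead for c in v0])

# a(x) = (x - 1)(x - 2), b(x) = (x - 1)(x - 3); the gcd is x - 1.
d, u, v = ext_gcd([2, -3, 1], [3, -4, 1])
a1, _ = poly_divmod([Fraction(2), Fraction(-3), Fraction(1)], d)
print(d, u, v)   # d = x - 1, with u = 1, v = -1
print(a1)        # a1 = x - 2
```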

Applications of Metaheuristics to Large-Scale Problems

Frontmatter
Distributed Evolutionary Computing Migration Strategy by Incident Node Participation

Distributed evolutionary algorithms are implemented on heterogeneous computing nodes, which in a distributed environment usually differ in operating system and hardware. Such an environment has major problems related to network latency. Some evolutionary optimization algorithms are very suitable for distributed implementation because of their high degree of parallel scalability. In most cases only the fitness function calculation is distributed, synchronously or asynchronously; the population is then present only on the master node. In the island model, each node holds a part of the distributed population. The last common model is based on shared memory, where each computing node has access to the whole population (the fine-grained model). All other models are hybridizations of these. In the island model a common parameter is related to the migration strategy. The most often used node topology is the ring topology: on a regular basis each node sends its best individual to the next node in the ring. In this paper, a hybrid model based on incident node participation in a star topology is proposed.

Todor Balabanov, Iliyan Zankinski, Maria Barova
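The ring-topology migration described above can be sketched as follows. This is illustrative only: a simple hill climber stands in for the evolutionary algorithm, and all parameter values and names are made up.

```python
import random

# Toy island model with ring migration: every `migration_interval`
# generations, island k sends its best individual to island (k + 1) % n.

def ring_island_search(n_islands=4, pop_size=10, generations=60,
                       migration_interval=10, seed=1):
    rng = random.Random(seed)
    fitness = lambda x: -(x - 3.14) ** 2            # maximise; peak at 3.14
    islands = [[rng.uniform(-10, 10) for _ in range(pop_size)]
               for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        for pop in islands:
            # One mutation per island per generation; the child replaces
            # the worst individual if it improves on it.
            child = rng.choice(pop) + rng.gauss(0, 0.5)
            worst = min(range(pop_size), key=lambda i: fitness(pop[i]))
            if fitness(child) > fitness(pop[worst]):
                pop[worst] = child
        if gen % migration_interval == 0:
            bests = [max(pop, key=fitness) for pop in islands]
            for k, pop in enumerate(islands):
                worst = min(range(pop_size), key=lambda i: fitness(pop[i]))
                pop[worst] = bests[(k - 1) % n_islands]   # ring neighbour
    return max((max(pop, key=fitness) for pop in islands), key=fitness)

print(ring_island_search())  # typically lands near the optimum 3.14
```

A star topology, as proposed in the paper, would instead route migrants through a hub node rather than around the ring.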
Slot Machine RTP Optimization and Symbols Wins Equalization with Discrete Differential Evolution

The slot machine RTP optimization problem can be solved using evolutionary algorithms. In practice this optimization is done by hand adjustment of the symbols placed on the game reels; by arranging symbol positions it is possible to achieve an optimal return-to-player (RTP) percentage. The distribution of prizes generated by different win combinations can also be equalized. In this paper a DE-based RTP optimization and prize equalization is proposed. DE is used in its discrete variation, because the problem of optimal symbol distribution on the reels lies in the discrete domain. DE is selected as an alternative to genetic algorithms (GA) because of its faster convergence. Convergence is a key factor in such optimizations, because each fitness value is calculated by intensive Monte Carlo simulations. The paper focuses on distributing the symbols on the machine reels so that two common goals are satisfied: the desired RTP, and relatively equal prize levels (prizes expressed as the amount of money won from combinations with each particular symbol), with relatively good symbol diversity on the reels.

Todor Balabanov, Iliyan Zankinski, Bozhidar Shumanov
Application of Ants Ideas on Image Edge Detection

The aim of image edge detection is to find the points in a digital image at which the brightness level changes sharply; normally these form curved lines called edges. Edge detection is a fundamental tool in image processing, machine vision and computer vision, particularly in the areas of feature detection and feature extraction, and it may lead to finding the boundaries of objects. It is one of the fundamental steps in image analysis and is a hard computational problem. In this paper we apply a multiagent system whose idea comes from ant colony optimization: we use the swarm intelligence of ants to search for the image edges.

Stefka Fidanova, Zlatolilya Ilcheva
ACD with ESN for Tuning of MEMS Kalman Filter

In the present work we designed a neuro-fuzzy approach for on-line optimal tuning of a Kalman filter of a gyroscope within a Micro ElectroMechanical Sensor (MEMS) device. It consists of an Adaptive Critic Design (ACD) scheme in which the controller (a Fuzzy Rule Base (FRB) designed to adapt the measurement noise covariance matrix of the Kalman filter) is tuned using only information about the direction in which the estimation error changes (increase or decrease). A novel fast-training dynamic neural network structure, the Echo State Network (ESN), was used in the role of the critic element. Application to data collected from a real MEMS demonstrated the ability of the proposed approach to tune the Kalman filter and improve the quality of its estimates under changing working conditions of the MEMS in real time.

Petia Koprinkova-Hristova, Kiril Alexiev
Optimal Discretization Orders for Distance Geometry: A Theoretical Standpoint

Distance geometry consists in embedding a simple weighted undirected graph $$G=(V,E,d)$$ in a K-dimensional space so that all distances $$d_{uv}$$, which are the weights on the edges of G, are satisfied by the positions assigned to its vertices. The search domain of this problem is generally continuous, but it can be discretized under certain assumptions that are strongly related to the order given to the vertices of G. This paper formalizes the concept of an optimal partial discretization order, and adapts a previously proposed algorithm with the aim of finding discretization orders that are also able to optimize a given set of objectives. The objectives are conceived for improving the structure of the discrete search domain, so that its exploration becomes more efficient.

Antonio Mucherino
Sensitivity Analysis of Checkpointing Strategies for Multimemetic Algorithms on Unstable Complex Networks

The use of volatile decentralized computational platforms such as, e.g., peer-to-peer networks, is becoming an increasingly popular option to gain access to vast computing resources. Making an effective use of these resources requires algorithms adapted to such a changing environment, being resilient to resource volatility. We consider the use of a variant of evolutionary algorithms endowed with a classical fault-tolerance technique, namely the creation of checkpoints in a safe external storage. We analyze the sensitivity of this approach on different kind of networks (scale-free and small-world) and under different volatility scenarios. We observe that while this strategy is robust under low volatility conditions, in cases of severe volatility performance degrades sharply unless a high checkpoint frequency is used. This suggest that other fault-tolerance strategies are required in these situations.

Rafael Nogueras, Carlos Cotta
Free Search in Multidimensional Space III

Various scientific and technological fields, such as design, engineering, physics, chemistry, economics, business, and finance, often face multidimensional optimisation problems. Although substantial research efforts have been directed at this area, key questions still await answers, such as: What limits computer-aided design systems on optimisation tasks with a high number of variables? How can the capabilities of modern search methods applied to multidimensional problems be improved? What are the software and hardware constraints? Approaching multidimensional optimisation problems also raises new research questions, which cannot be seen or identified on low-dimensional tasks, such as: What time is required to resolve a multidimensional task with an acceptable level of precision? How does dimensionality reflect on the search space complexity? How can the orientation of the search process within a multidimensional space be established? How do task-specific landscapes hinder orientation? This article presents an investigation on 300-dimensional heterogeneous real-value numerical tests. The study aims to evaluate the relation between the number of task dimensions and the number of objective function evaluations required to achieve an acceptable solution with non-zero probability. Experimental results are presented, analysed and compared to other publications.

Kalin Penev
Speeding up Parallel Combinatorial Optimization Algorithms with Las Vegas Method

In this paper we introduce a new method for speeding up the parallel run times of discrete optimization problems, which can be used for different problems. We propose that a variant of the Monte Carlo method, the Las Vegas method, can be used for overcoming some special barriers that can occur in the course of dividing such problems. In particular, the problems of maximum clique and k-clique are examined, and the new algorithm with the relevant measurements is presented.

Bogdan Zavalnij
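A Las Vegas algorithm always returns a correct answer while its run time is random, so several independent randomized attempts, run in parallel or as restarts, effectively take the minimum of their run times. A toy sketch on the k-clique problem (the graph, names, and greedy strategy are illustrative, not the authors' algorithm):

```python
import random

# Las Vegas sketch: a randomized greedy k-clique search, restarted until it
# succeeds. Each attempt either returns a verified clique or fails; the
# answer is always correct, only the number of attempts is random.

def random_greedy_clique(adj, k, rng):
    """One randomized attempt: grow a clique along a random vertex order."""
    clique = []
    for v in rng.sample(range(len(adj)), len(adj)):
        if all(v in adj[u] for u in clique):     # v adjacent to all so far
            clique.append(v)
            if len(clique) == k:
                return clique
    return None

def las_vegas_clique(adj, k, seed=0):
    rng = random.Random(seed)
    attempts = 0
    while True:                                  # restart on failure
        attempts += 1
        clique = random_greedy_clique(adj, k, rng)
        if clique is not None:
            return sorted(clique), attempts

# 5-vertex graph whose only triangle is {0, 1, 2}.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
adj = [set() for _ in range(5)]
for a, b in edges:
    adj[a].add(b); adj[b].add(a)
print(las_vegas_clique(adj, 3))  # the triangle {0, 1, 2}, after some restarts
```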

Computational Microelectronics — From Monte Carlo to Deterministic Approaches

Frontmatter
Optimization of the Deterministic Solution of the Discrete Wigner Equation

The development of novel nanoelectronic devices requires methods capable of simulating quantum-mechanical effects in the carrier transport processes. We present a deterministic method based on an integral formulation of the Wigner equation, which considers the evolution of an initial condition as the superposition of the propagation of particular fundamental contributions. Major considerations are necessary to overcome the memory and time demands typical of any quantum transport method. An advantage of our method is that it is perfectly suited for parallelization due to the independence of each fundamental contribution. Furthermore, a dramatic speed-up of the simulations has been achieved due to a preconditioning of the resulting equation system. To evaluate this deterministic approach, the simulation of a Resonant Tunneling Diode will be shown.

Johann Cervenka, Paul Ellinghaus, Mihail Nedjalkov, Erasmus Langer
The Influence of Electrostatic Lenses on Wave Packet Dynamics

The control of coherent electrons is becoming relevant in emerging devices as (semi-)ballistic transport is observed within nanometer semiconductor structures at room temperature. The evolution of a wave packet – representing an electron in a semiconductor – can be manipulated using specially shaped potential profiles with convex or concave features, similar to refractive lenses used in optics. Such electrostatic lenses offer the possibility, for instance, to concentrate a single wave packet which has been invoked by a laser pulse, or split it up into several wave packets. Moreover, the shape of the potential profile can be dynamically changed by an externally applied potential, depending on the desired behaviour. The evolution of a wave packet under the influence of a two-dimensional potential – the electrostatic lens – is investigated by computing the physical densities using the Wigner function. The latter is obtained by using the signed-particle Wigner Monte Carlo method.

Paul Ellinghaus, Mihail Nedjalkov, Siegfried Selberherr
Evaluation of Spin Lifetime in Thin-Body FETs: A High Performance Computing Approach

Silicon, the prominent material of microelectronics, is perfectly suited for spin-driven applications because of the weak spin-orbit interaction resulting in a long spin lifetime. However, additional spin relaxation at rough interfaces and by acoustic phonons may strongly decrease the spin lifetime in modern silicon-on-insulator and trigate transistors. Because of the need to perform numerical calculations and appropriate averaging of the spin relaxation rates, which depend strongly on the scattering momenta, an evaluation of the spin lifetime in thin silicon films becomes prohibitively computationally expensive. We use a highly parallelized approach to calculate the spin lifetime in silicon films. Our scheme is based on a hybrid parallelization approach using the message passing interface MPI and OpenMP. The algorithm precalculates wave functions and energies, and temporarily stores the results in a file-based cache to reduce memory consumption. Using the precalculated data for the spin relaxation rate calculations drastically reduces the demand on computational time. We show that our approach offers an excellent parallel speedup, and we demonstrate that the spin lifetime in strained silicon films is enhanced by several orders of magnitude.

Joydeep Ghosh, Dmitry Osintsev, Viktor Sverdlov, Josef Weinbub, Siegfried Selberherr
Free Open Source Mesh Healing for TCAD Device Simulations

Device geometries in technology computer-aided design processes are often generated using parametric solid modeling computer-aided design tools. However, geometries generated with these tools often lack geometric properties, like being intersection-free, which are required for volumetric mesh generation as well as for discretization methods. Contributing to this problem is the fact that device geometries often have multiple regions, used, e.g., for assigning different material parameters. Therefore, a healing process of the geometry is required, which detects the errors and repairs them. In this paper, we identify errors in multi-region device geometries created using computer-aided design tools. A robust algorithm pipeline for healing these errors is presented, which has been implemented in ViennaMesh. This algorithm pipeline is applied to complex device geometries. We show that our approach robustly heals device geometries created with computer-aided design tools and is even able to handle certain modeling inaccuracies.

Florian Rudolf, Josef Weinbub, Karl Rupp, Peter Resutik, Andreas Morhammer, Siegfried Selberherr
A Non-Equilibrium Green Functions Study of Energy-Filtering Thermoelectrics Including Scattering

Thermoelectric materials can convert waste heat into usable power and thus have great potential as an energy technology. However, the thermoelectric efficiency of a material is quantified by its figure of merit, which has historically remained stubbornly low. One possible avenue towards increasing the figure of merit is through the use of low-dimensional nanograined materials. In such a system, scattering, tunnelling through barriers, and other low-dimensional effects all play a crucial role, and thus a quantum mechanical treatment of transport is essential. This work presents a one-dimensional exploration of the physics of this system using the Non-Equilibrium Green’s Function (NEGF) numerical method and includes carrier scattering from both acoustic and optical phonons. This entirely quantum mechanical treatment of scattering greatly increases the computational burden but provides important insights into the physics of the system. Thus, we explore the relative importance of nanograin size, shape and asymmetry in maximizing thermoelectric efficiency.

Mischa Thesberg, Mahdi Pourfath, Neophytos Neophytou, Hans Kosina
Parallelization of the Two-Dimensional Wigner Monte Carlo Method

A parallelization approach for two-dimensional Wigner Monte Carlo quantum simulations using the signed particle method is introduced. The approach is based on a domain decomposition technique, effectively reducing the memory requirements of each parallel computational unit. We describe design and implementation specifics for a message passing interface-based implementation, used in the Wigner Ensemble Monte Carlo simulator, part of the free open source ViennaWD simulation package. Benchmark and simulation results are presented for a time-dependent, two-dimensional problem using five randomly placed point charges. Although additional communication is required, our method offers excellent parallel efficiency for large-scale high-performance computing platforms. Our approach significantly increases the feasibility of computationally highly intricate two-dimensional Wigner Monte Carlo investigations of quantum electron transport in nanostructures.
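The domain decomposition idea, i.e., assigning each parallel computational unit only the particles inside its subdomain so that no unit must hold the whole ensemble, can be illustrated with a small Python sketch (hypothetical partitioning logic, not the ViennaWD implementation):

```python
def decompose(particles, ny, n_workers):
    # Partition signed particles on a 2D grid among workers by horizontal
    # slabs: each worker owns a contiguous band of rows, and a particle
    # belongs to the worker whose band contains its y-index.
    rows_per = -(-ny // n_workers)  # ceiling division
    domains = [[] for _ in range(n_workers)]
    for (ix, iy, sign) in particles:
        domains[min(iy // rows_per, n_workers - 1)].append((ix, iy, sign))
    return domains

# Four signed particles (x-index, y-index, sign) on a 10-row grid.
particles = [(0, 0, +1), (3, 5, -1), (7, 9, +1), (2, 4, -1)]
domains = decompose(particles, ny=10, n_workers=2)
print([len(d) for d in domains])
```

In the actual method, particles crossing a slab boundary must be communicated to the neighbouring unit each time step, which is the extra communication the abstract refers to.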

Josef Weinbub, Paul Ellinghaus, Siegfried Selberherr

Large-Scale Models: Numerical Methods, Parallel Computations and Applications

Frontmatter
A Splitting Numerical Method for Primary and Secondary Pollutant Models

A splitting numerical method is proposed to study the effect of gravitational settling velocity and chemical reaction on pollutants emitted from a point source on the boundary of an urban area. The governing ultra-parabolic equation degenerates on part of the boundary, and we apply a fitted finite volume scheme in order to resolve the degeneration and to preserve the positivity of the solution (the concentration of the pollutant). Computational experiments illustrate the efficiency of our numerical method.
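The splitting idea, advancing the solution within each time step by solving the sub-processes one after another, can be illustrated on a toy linear model. This is a hedged sketch only; the paper's actual scheme applies a fitted finite volume discretization to a degenerate ultra-parabolic equation:

```python
import math

def lie_splitting_step(u, dt, settle, react):
    # Substep 1: settling part du/dt = -settle * u, solved exactly.
    u = u * math.exp(-settle * dt)
    # Substep 2: chemical-reaction part du/dt = -react * u, solved exactly.
    u = u * math.exp(-react * dt)
    return u

u, dt = 1.0, 0.1
for _ in range(10):
    u = lie_splitting_step(u, dt, settle=0.5, react=0.3)
# For commuting linear sub-operators the splitting is exact, so after
# t = 1 the result should match exp(-(0.5 + 0.3) * 1.0).
print(abs(u - math.exp(-0.8)) < 1e-12)
```

For non-commuting operators (as in the transport-reaction problem of the paper), Lie splitting introduces an O(dt) splitting error, which is the price paid for solving each sub-problem with a scheme tailored to it.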

Tatiana Chernogorova, Ivan Dimov, Lubin Vulkov
Snow Cover Assessment with Regional Climate Model - Problems and Results

Climate modelling, either global or regional, is usually treated as a typical large-scale scientific computational problem. The regional climate model RegCM, well-known within the meteorological community, is applied in this study to estimate quantitatively the snow water equivalent, which is the most consistent snow cover parameter. Multiple runs for a time window of 14 consecutive winters with different model configurations, in particular with various initial and boundary conditions, have been performed in an attempt to obtain the most adequate representation of the real snow cover. The results are compared with station measurements from the network of the National Institute of Meteorology and Hydrology. Generally, all runs yield similar results: the overall biases (i.e. over the whole time span) are acceptable, but there are large discrepancies in the day-by-day comparisons, which is typical for climate modelling studies.

Hristo Chervenkov, Todor Todorov, Kiril Slavov
Input Data Preparation for Fire Behavior Fuel Modeling of Bulgarian Test Cases
(Main Focus on Zlatograd Test Case)

Since the year 2000, Bulgaria has faced a progressive increase in wildland fire occurrence, caused mainly by human error, such as poorly managed camp fires or the burning of agricultural land after crop harvesting. At the moment Bulgaria has no working mechanism to spot such fires before they become a threat. However, a team from the Bulgarian Academy of Sciences has been working on fire behaviour modelling since 2007, and in this work we present the first attempts to classify Bulgarian forestry data according to the existing 53 Fire Behavior Fuel Models (FBFMs), along with estimates of where custom fuel models have to be prepared for a better representation of the potential fire spread. Calibration runs with FARSITE (Fire Area Simulator) have been performed for the area of the Zlatograd Forestry Department (Bulgaria), and the results are compared with WRF-Fire/S-Fire simulations for Harmanli (Bulgaria). The differences in the fire behaviour fuel model estimates are reflected in the final simulated burned area, which is presented in the conclusions. Fire behaviour fuel modeling based on both simulation approaches in the Zlatograd and Harmanli areas enables future applications of the presented work to Bulgarian test cases.

Georgi Dobrinkov, Nina Dobrinkova
Supervised 2-Phase Segmentation of Porous Media with Known Porosity

Porous media segmentation is a nontrivial and often quite inaccurate process, due to the highly irregular structure of the segmentation phases and the strong interaction among them. In this paper we perform a 2-class segmentation of a gray-scale 3D image under the restriction that the number of voxels within each phase is a priori fixed. Two parallel algorithms, based on the graph 2-Laplacian model [1], are proposed, implemented, and numerically tested.
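The constraint that the number of voxels per phase is fixed a priori admits a simple baseline: label exactly the k brightest voxels as one phase. The following Python sketch shows that naive baseline, not the graph 2-Laplacian algorithms of the paper, which additionally enforce spatial regularity:

```python
def segment_fixed_count(voxels, k):
    # Label the k brightest voxels as phase 1 (solid) and the rest as
    # phase 0 (pore), so the prescribed porosity is matched exactly.
    order = sorted(range(len(voxels)), key=lambda i: voxels[i], reverse=True)
    labels = [0] * len(voxels)
    for i in order[:k]:
        labels[i] = 1
    return labels

# A flattened toy "image"; in practice this would be a 3D voxel array.
gray = [0.9, 0.1, 0.4, 0.8, 0.2, 0.7]
labels = segment_fixed_count(gray, k=3)
print(labels, sum(labels))
```

Pure intensity thresholding like this ignores voxel neighbourhoods, which is exactly why regularized models such as the graph 2-Laplacian are needed for realistic porous media images.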

Ivan Georgiev, Stanislav Harizanov, Yavor Vutov
Image Processing Methods in Analysis of Component Composition and Distribution of Dust Emissions for Environmental Quality Management

In this article we consider a novel fast approach for image processing in research on dust emissions from industrial enterprises, based on the wavelet transform for edge detection and on a simplified variant of the active contour method (active primitives). The objective of the dust emission analysis is to determine the component composition and the fine-particle size distribution (PM10 and PM2.5). A high-resolution scanning electron microscope was used to obtain large-scale images of dust particles. We use images of particles at different scales as the input imagery data.

Andrew Kokoulin, Irina May, Anastasija Kokoulina
Fully Implicit Time-Stepping Schemes for a Parabolic-ODE System of European Options with Liquidity Shocks

We consider the numerical valuation of European options in a market subject to liquidity shocks. Natural boundary conditions are derived on the truncated boundary. We study the fully implicit scheme for this market model using different algorithms, based on Newton and Picard iterations at each time step. To validate the efficiency of the time-stepping scheme and the theoretical results, various appropriate numerical experiments are performed.
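The difference between the two inner solvers can be illustrated on a scalar model problem: an implicit Euler step for u' = -u³ requires solving a nonlinear equation for the new value, either by Picard (fixed-point) or by Newton iteration. This is a hedged sketch, not the paper's parabolic-ODE system:

```python
def implicit_euler_step(u_n, dt, method, tol=1e-12, max_iter=100):
    # Solve u = u_n - dt*u**3 for the new value u at each time step.
    u = u_n
    for _ in range(max_iter):
        if method == "picard":
            # Fixed-point update: plug the current guess into the RHS.
            u_new = u_n - dt * u**3
        else:
            # Newton update on F(u) = u + dt*u**3 - u_n, F'(u) = 1 + 3*dt*u**2.
            u_new = u - (u + dt * u**3 - u_n) / (1.0 + 3.0 * dt * u**2)
        if abs(u_new - u) < tol:
            return u_new
        u = u_new
    return u

u_p = u_nw = 1.0
for _ in range(5):
    u_p = implicit_euler_step(u_p, 0.1, "picard")
    u_nw = implicit_euler_step(u_nw, 0.1, "newton")
print(abs(u_p - u_nw) < 1e-9)
```

Both iterations converge to the same root here; the trade-off, as in the paper, is that Picard steps are cheaper while Newton converges in far fewer iterations when a good Jacobian is available.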

Miglena N. Koleva, Lubin G. Vulkov
Thermoelectrical Tick Removal Process Modeling

Ticks are widespread ectoparasites. They feed on the blood of animals such as birds and mammals, including humans. They are carriers and transmitters of pathogens which cause many diseases, including tick-borne meningoencephalitis, Lyme borreliosis, and typhus, to name a few. The best way to prevent infection is to remove the tick from the host as soon as possible. The removal is usually performed mechanically by pulling the tick. This, however, is a risky process: irritating or injuring the tick may result in it vomiting infective fluids. On a quest to create a portable device which utilizes radio-frequency alternating current for contact-less tick removal, we simulate the thermo-electrical processes during the device's application. We use the finite element method to obtain both the current density inside the host and the tick, and the resulting temperature field. The computational domain consists of the host's skin, the tick, the electrodes, and the surrounding air. Experiments on nested grids are performed to ensure the numerical correctness of the obtained solutions. Various electrode configurations are investigated. The goal is to find suitable working parameters for the procedure: applied power, duration, and position.

Nikola Kosturski, Ivan Lirkov, Svetozar Margenov, Yavor Vutov
Performance Analysis of Block AMG Preconditioning of Poroelasticity Equations

The goal of this study is to develop, analyze, and implement efficient numerical algorithms for the equations of linear poroelasticity, a macroscopically diphasic description of coupled flow and mechanics. We suppose that the solid phase is governed by the linearized constitutive relationship of Hooke’s law. Assuming in addition a quasi-steady regime of the fluid-structure interaction, the medium is described by Biot’s system of equations for the unknown displacements and pressure ($$\mathbf{u},p$$). A mixed Finite Element Method (FEM) is applied for the discretization. Linear conforming elements are used for the displacements. Following the approach of Arnold-Brezzi, a non-conforming FEM approximation is applied for the pressure, where bubble terms are added to guarantee local mass conservation. Block-diagonal preconditioners are used for the iterative solution of the arising saddle-point linear algebraic system. BiCGStab and GMRES are the basic iterative schemes, while algebraic multigrid (AMG) is utilized for the approximation of the diagonal blocks. The HYPRE implementations of BiCGStab, GMRES, and AMG (BoomerAMG, [6]) are used in the presented numerical tests. The aim of the performance analysis is to improve both (i) the convergence rate of the solvers, measured by the iteration counts, and (ii) the CPU time to solve the problem. The reported results demonstrate some advantages of GMRES for the considered real-life, large-scale, and strongly heterogeneous test problems. A significant improvement is observed due to tuning of the BoomerAMG settings.

Nikola Kosturski, Svetozar Margenov, Peter Popov, Nikola Simeonov, Yavor Vutov
Surface Constructions on Irregular Grids

“Big” surfaces defined on domains that cannot be modeled on a single regular grid are typically made by joining several surfaces together with the aid of fillet surfaces, or by intersecting the surfaces and joining them after trimming. In computations on geometry, and in geometric modeling in general, surface modeling is a key issue. The most important type of surfaces are tensor product spline surfaces. They are in general based on regular gridding, i.e. the knot vectors are the same for all “lines” or “columns”. Examples are B-spline surfaces, Hermite-spline surfaces and Expo-rational B-spline surfaces. Surface constructions that in some way handle “irregular grids” have been developed; we find them, for example, in T-splines, LR B-splines, Truncated Hierarchical B-splines and PHT-splines. In general, surfaces based on irregular grids can be regarded as a collection of surfaces on regular grids that are connected at the edges and the corners in a smooth, but irregular way. This involves T-junctions and star-junctions. Investigating a surface construction based on blending local “small” patches into a “big” surface with arbitrary topology also requires that we can deal with T-junctions and star-junctions. Here we investigate the use of the blending technique at T- and star-junctions. We look at special blending surfaces between regular patches, and at re-parametrization to obtain a correct orientation and a better mapping in the parameter plane. The focus is on the smoothness of the resulting surface.

Arne Lakså, Børre Bang
Spline Representation of Connected Surfaces with Custom-Shaped Holes

Compact surfaces possessing a finite number of boundaries are important to isogeometric analysis (IGA). Generalized expo-rational B-splines (GERBS) form a blending-type spline construction where local functions associated with each knot are blended by $$C^k$$-smooth basis functions. Modeling of surfaces with custom-shaped boundaries, or holes, can be achieved by using certain features and properties of the blending-type spline construction, including local refinement and the insertion of multiple inner knots. In this paper we investigate the representation of arbitrary inner boundaries on parametric surfaces using the above-mentioned blending-type spline construction.

Aleksander Pedersen, Jostein Bratlie, Rune Dalmo
Scalability of Shooting Method for Nonlinear Dynamical Systems

The computation of periodic solutions of nonlinear dynamical systems is an essential step in their analysis. The variation of the steady-state periodic responses of elastic structures with the frequency of vibration or with the excitation frequency provides valuable information about the dynamical behavior of the structure. The shooting method computes the periodic solutions of dynamical systems iteratively. In the current paper a parallel implementation of the shooting method is presented. The nonlinear equation of motion of a Bernoulli-Euler beam is used as a model equation. A large-scale system of ordinary differential equations is generated by applying the finite element method. The speedup and efficiency of the method are studied and presented.
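The core of the shooting method, Newton iteration on the periodicity residual r(s) = φ_T(s) − s of the time-T flow map, can be sketched for a two-dimensional toy system. This is a serial illustration with hypothetical parameters, not the parallel beam model of the paper:

```python
import math

def rhs(t, y, c=0.2):
    # Damped, harmonically forced oscillator: x'' + c*x' + x = cos(2t).
    x, v = y
    return [v, math.cos(2.0 * t) - c * v - x]

def flow(y0, T, n=2000):
    # Classical RK4 time stepper integrating over one forcing period.
    y = list(y0); h = T / n; t = 0.0
    for _ in range(n):
        k1 = rhs(t, y)
        k2 = rhs(t + h/2, [y[i] + h/2*k1[i] for i in range(2)])
        k3 = rhs(t + h/2, [y[i] + h/2*k2[i] for i in range(2)])
        k4 = rhs(t + h, [y[i] + h*k3[i] for i in range(2)])
        y = [y[i] + h/6*(k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) for i in range(2)]
        t += h
    return y

def shoot(y0, T, iters=20, eps=1e-6):
    # Newton iteration on r(s) = flow(s, T) - s, with the 2x2 Jacobian
    # approximated by finite differences and the step solved by Cramer's rule.
    s = list(y0)
    for _ in range(iters):
        r = [flow(s, T)[i] - s[i] for i in range(2)]
        J = [[0.0, 0.0], [0.0, 0.0]]
        for j in range(2):
            sp = list(s); sp[j] += eps
            rp = [flow(sp, T)[i] - sp[i] for i in range(2)]
            for i in range(2):
                J[i][j] = (rp[i] - r[i]) / eps
        det = J[0][0]*J[1][1] - J[0][1]*J[1][0]
        ds = [(-r[0]*J[1][1] + r[1]*J[0][1]) / det,
              (-r[1]*J[0][0] + r[0]*J[1][0]) / det]
        s = [s[i] + ds[i] for i in range(2)]
        if max(abs(v) for v in r) < 1e-10:
            break
    return s

T = math.pi  # period of the cos(2t) forcing
s = shoot([0.0, 0.0], T)
res = max(abs(flow(s, T)[i] - s[i]) for i in range(2))
print(res < 1e-8)
```

In the large-scale FEM setting of the paper, each residual evaluation requires integrating a large ODE system over one period, which is exactly the expensive part targeted by parallelization.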

Stanislav Stoykov, Svetozar Margenov
Selecting Explicit Runge-Kutta Methods with Improved Stability Properties

Explicit Runge-Kutta methods can efficiently be used in the numerical integration of initial value problems for non-stiff systems of ordinary differential equations (ODEs). Let m and p be the number of stages and the order of a given explicit Runge-Kutta method. We proved in a previous paper [8] that the combination of any explicit Runge-Kutta method with $$m=p$$ and the Richardson extrapolation always leads to a considerable improvement of the absolute stability properties. We showed in [7] (talk presented at the NM&A14 conference in Borovets, Bulgaria, August 2014) that the absolute stability regions can be further increased when $$p<m$$ is assumed. For two particular cases, $$p=3 \wedge m=4$$ and $$p=4 \wedge m=6$$, it is demonstrated that (a) the absolute stability regions of the new methods are larger than those of the corresponding explicit Runge-Kutta methods with $$p=m$$, and (b) these regions become much bigger when the Richardson extrapolation is additionally applied. The explicit Runge-Kutta methods which have optimal absolute stability regions form two large classes of numerical algorithms (each member of any of these classes having the same absolute stability region as all the others). Rather complicated order conditions have to be derived and used in the efforts to obtain some special methods within each of the two classes. We selected two particular methods within these two classes and tested them by using appropriate numerical examples.
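The Richardson extrapolation referred to above combines one step of size h with two steps of size h/2 to cancel the leading error term and gain one order of accuracy. A minimal sketch with the classical RK4 method ($$m=p=4$$; this is not one of the optimized methods selected in the paper):

```python
import math

def rk4_step(f, t, y, h):
    # One classical 4th-order explicit Runge-Kutta step (scalar y).
    k1 = f(t, y)
    k2 = f(t + h/2, y + h/2*k1)
    k3 = f(t + h/2, y + h/2*k2)
    k4 = f(t + h, y + h*k3)
    return y + h/6*(k1 + 2*k2 + 2*k3 + k4)

def rk4_richardson_step(f, t, y, h, p=4):
    # Richardson extrapolation: combine one full step with two half steps,
    # y_R = y_half + (y_half - y_full) / (2**p - 1), raising the order to p+1.
    full = rk4_step(f, t, y, h)
    half = rk4_step(f, t + h/2, rk4_step(f, t, y, h/2), h/2)
    return half + (half - full) / (2**p - 1)

f = lambda t, y: -y   # test problem y' = -y, y(0) = 1, exact y(t) = exp(-t)
h, y_plain, y_rich = 0.1, 1.0, 1.0
for i in range(10):
    y_plain = rk4_step(f, i*h, y_plain, h)
    y_rich = rk4_richardson_step(f, i*h, y_rich, h)
exact = math.exp(-1.0)
print(abs(y_rich - exact) < abs(y_plain - exact))
```

Beyond the accuracy gain demonstrated here, the papers cited in the abstract show that the combined method also has a substantially larger absolute stability region than the underlying Runge-Kutta method alone.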

Zahari Zlatev, Krassimir Georgiev, Ivan Dimov

Contributed Papers

Frontmatter
Schur Complement Matrix and Its (Elementwise) Approximation: A Spectral Analysis Based on GLT Sequences

Using the notion of the so-called spectral symbol in the Generalized Locally Toeplitz (GLT) setting, we derive the GLT symbol of the sequence of matrices $$\{A_n\}$$ approximating the elasticity equations. Further, as the GLT class defines an algebra of matrix sequences and Schur complements are obtained via elementary algebraic operations on the blocks of $$A_n$$, we derive the symbol $$f^\mathcal {S}$$ of the associated sequence of Schur complements $$\{S_n\}$$ and that of its element-wise approximation.
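For a 2-by-2 block matrix [[A, B], [C, D]], the Schur complement obtained by elementary block operations is $$S = D - C A^{-1} B$$. A small self-contained Python illustration (generic dense linear algebra, not the GLT machinery of the paper):

```python
def matmul(X, Y):
    # Naive dense matrix product of row-major nested lists.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(A):
    # Closed-form inverse of a 2x2 matrix.
    det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
    return [[ A[1][1]/det, -A[0][1]/det],
            [-A[1][0]/det,  A[0][0]/det]]

def schur(A, B, C, D):
    # Schur complement S = D - C * A^{-1} * B of the block matrix [[A, B], [C, D]].
    CAinvB = matmul(matmul(C, inv2(A)), B)
    return [[D[i][j] - CAinvB[i][j] for j in range(len(D[0]))]
            for i in range(len(D))]

A = [[2.0, 0.0], [0.0, 2.0]]   # 2x2 leading block
B = [[1.0], [1.0]]             # 2x1 coupling block
C = [[1.0, 1.0]]               # 1x2 coupling block
D = [[3.0]]                    # 1x1 trailing block
S = schur(A, B, C, D)
print(S)
```

Because S is built from sums, products, and inverses of the blocks, it stays inside the GLT algebra whenever the block sequences do, which is the algebraic fact the symbol derivation in the paper relies on.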

Ali Dorostkar, Maya Neytcheva, Stefano Serra-Capizzano
An Iterative Process for the Solution of Semi-Linear Elliptic Equations with Discontinuous Coefficients and Solution

We consider and investigate boundary value problems (BVPs) for semi-linear elliptic equations with discontinuous coefficients and solutions (with imperfect contact matching conditions). Finite difference approximations of these problems are constructed. An iterative method is constructed and validated for solving the difference contact BVPs for semi-linear elliptic equations, with iterations on the inner boundary where the coefficients and solutions are discontinuous. The convergence rate of the iterations (with calculated constants) is estimated.

Aigul Manapova
Extremal Interpolation of Convex Scattered Data in $$\mathbb {R}^3$$ Using Tensor Product Bézier Surfaces

We consider the problem of extremal interpolation of convex scattered data in $$\mathbb {R}^3$$ and propose a feasible solution. Using our previous work on edge convex minimum $$L_p$$-norm interpolation curve networks, $$1<p\le \infty $$, we construct a bivariate interpolant F with the following properties: (i) F is $$G^1$$-continuous; (ii) F consists of tensor product Bézier surfaces (patches) of degree (n, n), where $$n\in \mathbb {N}$$, $$n\ge 4$$, is chosen a priori; (iii) the boundary curves of each patch are convex; (iv) each Bézier patch satisfies the tetra-harmonic equation $$\Delta ^4 F=0$$. Hence F is an extremum of the corresponding energy functional.

Krassimira Vlachkova
Backmatter
Metadaten
Titel
Large-Scale Scientific Computing
herausgegeben von
Ivan Lirkov
Svetozar D. Margenov
Jerzy Waśniewski
Copyright-Jahr
2015
Electronic ISBN
978-3-319-26520-9
Print ISBN
978-3-319-26519-3
DOI
https://doi.org/10.1007/978-3-319-26520-9
