About this Book

The two-volume set LNCS 12043 and 12044 constitutes revised selected papers from the 13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019, held in Bialystok, Poland, in September 2019.

The 91 regular papers presented in these volumes were selected from 161 submissions. For regular tracks of the conference, 41 papers were selected from 89 submissions.

The papers were organized in topical sections named as follows:

Part I: numerical algorithms and parallel scientific computing; emerging HPC architectures; performance analysis and scheduling in HPC systems; environments and frameworks for parallel/distributed/cloud computing; applications of parallel computing; parallel non-numerical algorithms; soft computing with applications; special session on GPU computing; special session on parallel matrix factorizations.

Part II: workshop on language-based parallel programming models (WLPP 2019); workshop on models, algorithms and methodologies for hybrid parallelism in new HPC systems; workshop on power and energy aspects of computations (PEAC 2019); special session on tools for energy efficient computing; workshop on scheduling for parallel computing (SPC 2019); workshop on applied high performance numerical algorithms for PDEs; minisymposium on HPC applications in physical sciences; minisymposium on high performance computing interval methods; workshop on complex collective systems.

Chapters "Parallel adaptive cross approximation for the multi-trace formulation of scattering problems" and "A High-Order Discontinuous Galerkin Solver with Dynamic Adaptive Mesh Refinement to Simulate Cloud Formation Processes" of LNCS 12043 are available open access under a Creative Commons Attribution 4.0 International License via link.springer.com.

Table of Contents

Frontmatter

Workshop on Language-Based Parallel Programming Models (WLPP 2019)

Frontmatter

Parallel Fully Vectorized Marsa-LFIB4: Algorithmic and Language-Based Optimization of Recursive Computations

The aim of this paper is to present a new high-performance implementation of Marsa-LFIB4, an example of a high-quality multiple recursive pseudorandom number generator. We propose a new algorithmic approach that combines language-based vectorization techniques with a new divide-and-conquer method exploiting the special sparse structure of the matrix obtained from the recursive formula that defines the generator. We also show how the use of intrinsics for the Intel AVX2 and AVX512 vector extensions can improve performance. Our new implementation achieves good performance on several multicore architectures and is much more energy-efficient than simple SIMD-optimized implementations.

Przemysław Stpiczyński
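
For reference, Marsaglia's LFIB4 recurrence underlying the generator is x(n) = x(n-55) + x(n-119) + x(n-179) + x(n-256) mod 2^32. A scalar C sketch using a circular buffer (the paper's vectorized divide-and-conquer formulation is far more elaborate):

    #include <stdint.h>

    static uint32_t t[256];  /* last 256 outputs; must be seeded before use */
    static uint8_t  c = 0;   /* write index; uint8_t wraps mod 256 itself   */

    /* One step of x(n) = x(n-55)+x(n-119)+x(n-179)+x(n-256) mod 2^32.
       Before the write, slot c holds x(n-256), and slot (c+256-L) mod 256
       holds x(n-L); uint32_t addition is automatically mod 2^32. */
    uint32_t lfib4_next(void) {
        t[c] = t[c]
             + t[(uint8_t)(c + 201)]   /* x(n-55)  */
             + t[(uint8_t)(c + 137)]   /* x(n-119) */
             + t[(uint8_t)(c + 77)];   /* x(n-179) */
        return t[c++];                 /* emit x(n), advance the index */
    }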

Studying the Performance of Vector-Based Quicksort Algorithm

The performance of parallel algorithms is often inconsistent with their preliminary theoretical analyses. Indeed, the gap is widening between the ability to theoretically predict the performance of a parallel algorithm and the results measured in practice. This is mainly due to the accelerated development of advanced parallel architectures, while there is still no agreed model of parallel computation, which has implications for the design of parallel algorithms. In this study, we examined the practical performance of Cormen's parallel Quicksort algorithm. We determined the performance of the algorithm with different parallel programming approaches and examined how well theoretical performance analyses of the algorithm predict its actual performance.

Ami Marowka

Parallel Tiled Cache and Energy Efficient Code for Zuker’s RNA Folding

In this paper, we consider Zuker's RNA folding algorithm, a challenging dynamic programming task to optimize because it is resource-intensive and has a large number of non-uniform dependences. We apply a previously published approach of ours, based on the polyhedral model, to automatically tile and parallelize each loop in the Zuker RNA folding loop nest. First, for each loop nest statement, rectangular tiles are formed within the iteration space of the Zuker loop nest. Then, those tiles are corrected to honor all dependences exposed for the original loop nest. The correction is based on applying the exact transitive closure of a dependence graph. We implemented our approach as a part of the source-to-source TRACO compiler. We compare code performance and energy consumption with those obtained with the state-of-the-art PluTo compiler, based on the affine transformation framework, as well as with those generated by means of the cache-efficient manual method Transpose. Experiments carried out on a modern multi-core processor demonstrate significant locality improvement and energy savings for the generated code.

Marek Palkowski, Wlodzimierz Bielecki
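
Plain rectangular tiling, before the dependence-correction step that is the paper's contribution, can be pictured with a generic sketch (the tile size B, the MIN macro and the placeholder statement are illustrative, not taken from the paper):

    /* Rectangular tiling of a generic 2D loop nest with tile size B.
       For non-uniform dependences, as in Zuker's recurrence, such tiles
       must then be corrected via the exact transitive closure of the
       dependence graph (not shown here). */
    #define B 64
    #define MIN(a, b) ((a) < (b) ? (a) : (b))

    void tiled(int n, double a[n][n]) {
        for (int ii = 0; ii < n; ii += B)
            for (int jj = 0; jj < n; jj += B)
                for (int i = ii; i < MIN(ii + B, n); i++)
                    for (int j = jj; j < MIN(jj + B, n); j++)
                        a[i][j] += 1.0;   /* placeholder statement */
    }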

Examining Performance Portability with Kokkos for an Ewald Sum Coulomb Solver

We have implemented the computation of Coulomb interactions in particle systems using the performance portable C++ framework Kokkos. Coulomb interactions are evaluated with an Ewald-sum-based solver, where the interactions are split into long- and short-range contributions. The short-range contributions are calculated using pair-wise contributions of particles while long-range interactions are calculated using Fourier sums. We evaluate the performance portability of the implementation on Intel CPUs, including Intel Xeon Phi, and Nvidia GPUs.

Rene Halver, Jan H. Meinke, Godehard Sutmann

Efficient cuDNN-Compatible Convolution-Pooling on the GPU

The main contribution of this paper is to show efficient implementations of convolution-pooling on the GPU, in which the pooling follows multiple convolutions. Since multiple convolution and pooling operations are performed alternately in the earlier stages of many Convolutional Neural Networks (CNNs), it is very important to accelerate the convolution-pooling. Our new GPU implementation uses two techniques: (1) convolution interchange with direct sum, and (2) conversion to matrix multiplication. These techniques reduce the computational and memory access costs. Further, the interchanged convolution is converted to matrix multiplication, which cuBLAS can compute very efficiently. Experimental results using a Tesla V100 GPU show that our new cuDNN-compatible GPU implementation of the convolution-pooling is at least 1.34 times faster than performing the multiple convolution and then the pooling with cuDNN, the most popular library of primitives for implementing CNNs on the GPU.

Shunsuke Suita, Takahiro Nishimura, Hiroki Tokura, Koji Nakano, Yasuaki Ito, Akihiko Kasagi, Tsuguchika Tabaru
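
Technique (2), conversion to matrix multiplication, is commonly realized im2col-style: each K x K input patch becomes one column of a matrix, so the whole convolution collapses into a single GEMM that cuBLAS can execute. A CPU-side sketch of the layout transform (the paper's fused convolution-pooling kernel is more involved):

    /* im2col for a single-channel HxW image and a KxK kernel, stride 1:
       col has shape (K*K) x (oh*ow); the convolution then becomes the
       GEMM (1 x K*K) weight row times col, giving a 1 x (oh*ow) output. */
    void im2col(const float *in, int H, int W, int K, float *col) {
        int oh = H - K + 1, ow = W - K + 1;      /* output height/width */
        for (int y = 0; y < oh; y++)
            for (int x = 0; x < ow; x++)
                for (int ky = 0; ky < K; ky++)
                    for (int kx = 0; kx < K; kx++)
                        /* row = patch element, column = output pixel */
                        col[(ky * K + kx) * (oh * ow) + (y * ow + x)] =
                            in[(y + ky) * W + (x + kx)];
    }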

Reactive Task Migration for Hybrid MPI+OpenMP Applications

Many applications in high performance computing are designed based on underlying performance and execution models. While these models could successfully be employed in the past for balancing load within and between compute nodes, modern software and hardware increasingly make performance predictability difficult, if not impossible. Consequently, balancing computational load becomes much more difficult. Aiming to tackle these challenges in search of a general solution, we present a novel library for fine-granular task-based reactive load balancing in distributed memory, based on MPI and OpenMP. With our approach, individual migratable tasks can be executed on any MPI rank; the actual executing rank is determined at run time based on online performance data. We evaluate our approach under an enforced power cap and under enforced clock frequency changes for a synthetic benchmark, and show its robustness to work-induced imbalances for a realistic application. Our experiments demonstrate speedups of up to 1.31x.

Jannis Klinkenberg, Philipp Samfass, Michael Bader, Christian Terboven, Matthias S. Müller

Workshop on Models, Algorithms and Methodologies for Hybrid Parallelism in New HPC Systems

Frontmatter

Ab-initio Functional Decomposition of Kalman Filter: A Feasibility Analysis on Constrained Least Squares Problems

The standard formulation of the Kalman Filter (KF) becomes computationally intractable for large scale state estimation problems, such as those in ocean/weather forecasting, due to matrix storage and inversion requirements. We introduce an innovative mathematical/numerical formulation of KF using a Domain Decomposition (DD) approach. The proposed DD approach partitions ab initio the whole KF computational method, giving rise to local KF methods that can be solved independently. We present a feasibility analysis using the constrained least squares model underlying variational Data Assimilation problems. Results confirm that the accuracy of the solutions of the local KF methods is not impaired by the DD approach.

Luisa D’Amore, Rosalba Cacciapuoti, Valeria Mele

Performance Analysis of a Parallel Denoising Algorithm on Intel Xeon Computer System

This paper presents an experimental performance study of a parallel implementation of a Poissonian image restoration algorithm. Hybrid parallelization based on the MPI and OpenMP standards is investigated. The implementation is tested on high-resolution radiographic images on a supercomputer using Intel Xeon processors as well as Intel Xeon Phi coprocessors. The experimental results show substantial improvement across a variety of problem sizes and numbers of threads.

Ivan Lirkov

An Adaptive Strategy for Dynamic Data Clustering with the K-Means Algorithm

The K-means algorithm is one of the most widely used methods in data mining and statistical data analysis for partitioning a set of objects into K distinct groups, called clusters, on the basis of their similarity. The main problem of this algorithm is that it requires the number of clusters as input, but in real life it is very difficult to fix such a value in advance. For this reason, several modified K-means algorithms have been proposed in which the number of clusters is defined at run time, increasing iteratively until a given cluster-quality metric is satisfied. To curb the high computational cost of this approach, we propose an adaptive procedure in which, at each iteration, two new clusters are created by splitting only the cluster with the worst value of the quality metric.

Marco Lapegna, Valeria Mele, Diego Romano
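
As an illustration of the splitting rule (assuming, for concreteness, the within-cluster sum of squares as the quality metric, which the abstract leaves unspecified), the cluster chosen for splitting at each iteration would be

$$ w = \arg\max_{k} \sum_{x \in C_k} \lVert x - \mu_k \rVert^2 $$

after which the centroid $$\mu_w$$ is replaced by two new centroids and only the points of $$C_w$$ are re-partitioned, instead of re-clustering the whole data set from scratch.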

Security and Storage Issues in Internet of Floating Things Edge-Cloud Data Movement

Sensors and actuators have become first-class citizens in technologically pervasive urban environments. However, the full potential of data crowdsourcing is still unexploited in marine coastal areas due to challenging operational conditions, extremely unstable network connectivity, and security issues in data movement. In this paper, we present the latest specification of our DYNAMO Transfer Protocol (DTP), a platform-independent data mover framework specifically designed for Internet of Floating Things applications, where data collected on board professional or leisure vessels are stored locally and then moved from the edge to the cloud. We evaluate the performance achieved by the DTP in data movement in a controlled environment.

Raffaele Montella, Diana Di Luccio, Sokol Kosta, Aniello Castiglione, Antonio Maratea

Workshop on Power and Energy Aspects of Computations (PEAC 2019)

Frontmatter

Performance/Energy Aware Optimization of Parallel Applications on GPUs Under Power Capping

In the paper we present an approach and results from applying the modern power capping mechanism available on NVIDIA GPUs to benchmarks such as the NAS Parallel Benchmarks BT, SP and LU, as well as cublasgemm-benchmark, which are widely used to assess the performance of high performance computing systems. Depending on the benchmark, different power cap configurations are best for a desired trade-off between performance and energy consumption. We present two metrics: energy savings and performance drops for the same power caps, as well as a normalized performance-energy consumption product. Importantly, optimal configurations are often non-trivial, i.e. they are obtained for power caps smaller than the default and larger than the minimal allowed limits. Tests have been performed on two modern GPUs of the Pascal and Turing generations, i.e. the NVIDIA GTX 1070 and NVIDIA RTX 2080 respectively, so the results can be useful for many applications with profiles similar to the benchmarks, executed on modern GPU-based systems.

Adam Krzywaniak, Paweł Czarnul
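
The power capping mechanism used above is exposed programmatically through NVML; a minimal sketch of querying the allowed range and setting a cap (values are in milliwatts, the 150 W cap is only an example, and setting limits requires administrative privileges):

    #include <stdio.h>
    #include <nvml.h>

    int main(void) {
        nvmlDevice_t dev;
        unsigned int lo, hi;
        nvmlInit();
        nvmlDeviceGetHandleByIndex(0, &dev);
        nvmlDeviceGetPowerManagementLimitConstraints(dev, &lo, &hi);
        printf("allowed cap range: %u..%u mW\n", lo, hi);
        /* optimal caps are often strictly between the minimal and default limits */
        nvmlDeviceSetPowerManagementLimit(dev, 150000);   /* e.g. 150 W */
        nvmlShutdown();
        return 0;
    }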

Improving Energy Consumption in Iterative Problems Using Machine Learning

To reach the next milestone in High Performance Computing, energy and power constraints have to be considered. Optimal workload distributions are necessary in heterogeneous architectures to avoid inefficient usage of computational resources. Static load balancing techniques are not able to provide optimal workload distributions for problems of an irregular nature. On the other hand, dynamic load balancing algorithms are constrained by energy metrics that are usually slow and difficult to obtain. We present a methodology based on Machine Learning to perform dynamic load balancing in iterative problems. Machine Learning models are trained using data acquired during previous executions. We compare this new approach to two dynamic workload balancing techniques already proven in the literature. Inference times for the Machine Learning models are fast enough for them to be applied in this environment. These new predictive models further improve the workload distribution, reducing the energy consumption of iterative problems.

Alberto Cabrera, Francisco Almeida, Vicente Blanco, Dagoberto Castellanos–Nieves

Automatic Software Tuning of Parallel Programs for Energy-Aware Executions

For large-scale systems such as data centers, energy efficiency has proven key to reducing capital and operational expenses as well as environmental impact. The power drainage of a system is closely related to the type and characteristics of the workload the device is running. For this reason, this paper presents an automatic software tuning method for parallel program generation, able to adapt to and exploit the hardware features of a target computing system, such as an HPC facility or a cloud system, better than traditional compiler infrastructures. We propose a search-based approach combining exact methods and approximate heuristics that evolves programs in order to find optimized configurations over an ever-increasing number of tunable knobs, i.e., code transformations and execution options (such as the number of OpenMP threads and/or the CPU frequency settings). The main objective is to outperform the configurations generated by traditional compiling infrastructures for selected KPIs, i.e., performance, energy and power usage (for both the CPU and DRAM), as well as runtime. First experimental results, tied to the local optimization phase of the proposed framework, are encouraging, demonstrating between 8% and 41% improvement for all considered metrics on a reference benchmarking application (Linpack). This brings novel perspectives to the global optimization step currently under investigation within the presented framework, with the ambition of paving the way toward automatic tuning of energy-aware applications beyond the performance of current state-of-the-art compiler infrastructures.

Sébastien Varrette, Frédéric Pinel, Emmanuel Kieffer, Grégoire Danoy, Pascal Bouvry

Special Session on Tools for Energy Efficient Computing

Frontmatter

Overview of Application Instrumentation for Performance Analysis and Tuning

Profiling and tuning of parallel applications is an essential part of HPC. Analysis and improvement of application hot spots can be done with one of many available tools that measure resource consumption for each instrumented part of the code. Since complex applications behave differently in each part of the code, instrumentation must be inserted to separate these parts. Besides manual instrumentation, some profiling libraries provide other instrumentation mechanisms. Of these, binary patching is the most universal; it greatly improves the user-friendliness and robustness of a tool. We provide an overview of the most commonly used binary patching tools and show a workflow for using them to implement a binary instrumentation tool for any profiler or autotuner. We have also evaluated the minimum overhead of manual and binary instrumentation.

Ondrej Vysocky, Lubomir Riha, Andrea Bartolini
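
Manual instrumentation, the baseline the overview compares binary patching against, typically means bracketing each hot region with begin/end probes. A self-contained wall-clock sketch (real profilers record hardware counters or energy rather than just time):

    #include <stdio.h>
    #include <time.h>

    static struct timespec t0;

    static void prof_begin(void) { clock_gettime(CLOCK_MONOTONIC, &t0); }

    static void prof_end(const char *region) {
        struct timespec t1;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double ms = (t1.tv_sec - t0.tv_sec) * 1e3
                  + (t1.tv_nsec - t0.tv_nsec) / 1e6;
        fprintf(stderr, "[prof] %s: %.3f ms\n", region, ms);
    }

    /* usage: prof_begin(); hot_loop(); prof_end("hot_loop");
       binary patching inserts equivalent probes without source changes */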

Energy-Efficiency Tuning of a Lattice Boltzmann Simulation Using MERIC

Energy efficiency is already of paramount importance for the operation of High Performance Computing (HPC) systems, and tools to monitor power usage and tune relevant hardware parameters are already available and in use at major supercomputing centres. On the other hand, HPC application developers and users still usually focus on performance alone, even though they will probably soon be required to look also at the energy efficiency of their jobs. Only a few software tools allow energy-profiling a generic application, and even fewer are able to tune energy-related hardware parameters from within the application itself. In this work we use the MERIC library and the RADAR analyzer, developed within the EU READEX project, to profile and tune for efficiency the execution parameters of a real-life Lattice Boltzmann code. The profiling methodology and details are described, and results are presented and compared with those measured in a previous work using different methodologies and tools.

Enrico Calore, Alessandro Gabbana, Sebastiano Fabio Schifano, Raffaele Tripiccione

Evaluating the Advantage of Reactive MPI-aware Power Control Policies

Power consumption is a critical factor affecting the performance and cost of today's and future supercomputer installations. In state-of-the-art work, several approaches have been proposed to reduce the energy consumption of scientific applications by lowering the operating frequency of the computational elements during MPI communication regions. State-of-the-art algorithms rely on the ability to predict, at execution time, the duration of these communication regions before they execute. The COUNTDOWN approach tries to achieve the same by means of a purely reactive timer-based policy. In this paper, we compare the COUNTDOWN algorithm with a state-of-the-art prediction-based algorithm, showing that timer-based policies are more effective at extracting power-saving opportunities and reducing energy waste, with lower overhead. When running on a Tier-1 system, COUNTDOWN achieves 5% more energy savings with lower overhead than the state-of-the-art proactive policy. This suggests that reactive policies are better suited than proactive approaches for communication-aware power management.

Daniele Cesarini, Carlo Cavazzoni, Andrea Bartolini
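
The flavor of a purely reactive, timer-based policy can be sketched as follows (an illustration under assumed details, not the actual COUNTDOWN code; the 500 us timeout, the sysfs DVFS path and the frequency values are examples): arm a timer on entering an MPI wait, lower the frequency only if the wait outlasts the timer, and restore the frequency on exit.

    #include <signal.h>
    #include <stdio.h>
    #include <sys/time.h>

    static void set_max_khz(long khz) {   /* DVFS via Linux cpufreq sysfs */
        FILE *f = fopen("/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq", "w");
        if (f) { fprintf(f, "%ld", khz); fclose(f); }
    }

    static void on_timer(int sig) { (void)sig; set_max_khz(1200000); } /* slow down */

    void before_mpi_wait(void) {          /* entering MPI_Wait and the like */
        struct itimerval t = { {0, 0}, {0, 500} };   /* one-shot, 500 us */
        signal(SIGALRM, on_timer);
        setitimer(ITIMER_REAL, &t, NULL);
    }

    void after_mpi_wait(void) {           /* communication completed */
        struct itimerval off = { {0, 0}, {0, 0} };
        setitimer(ITIMER_REAL, &off, NULL);          /* disarm if pending */
        set_max_khz(3600000);                        /* restore nominal freq */
    }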

Application-Aware Power Capping Using Nornir

Power consumption of IT infrastructure is a major concern for data centre operators. Since data centre power supplies are usually dimensioned for an average-case scenario, uncorrelated and simultaneous power spikes in multiple servers could lead to catastrophic effects such as power outages. To avoid such situations, data centre operators usually put power capping solutions in place to control the power consumption of individual servers and keep the data centre within safe operational limits. However, most power capping solutions rely on Dynamic Voltage and Frequency Scaling (DVFS), which is not always able to guarantee the power cap specified by the user, especially for low power budgets. In this work, we propose a power capping algorithm that uses a combination of DVFS and thread packing. We implement this algorithm in the Nornir framework and validate it on real applications, comparing it to the Intel RAPL power capping mechanism and another state-of-the-art power capping algorithm.

Daniele De Sensi, Marco Danelutto
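
Thread packing, the second knob combined with DVFS above, confines the application's threads to a subset of cores. A Linux sketch of the mechanism (not Nornir's API):

    #define _GNU_SOURCE
    #include <sched.h>

    /* Pack the calling thread onto cores [0, ncores): with fewer active
       cores the package draws less power, which complements DVFS when
       the power budget is too low for frequency scaling alone. */
    void pack_on_cores(int ncores) {
        cpu_set_t set;
        CPU_ZERO(&set);
        for (int c = 0; c < ncores; c++) CPU_SET(c, &set);
        sched_setaffinity(0, sizeof(set), &set);   /* 0 = calling thread */
    }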

Workshop on Scheduling for Parallel Computing (SPC 2019)

Frontmatter

A New Hardware Counters Based Thread Migration Strategy for NUMA Systems

Multicore NUMA systems present on-board memory hierarchies and communication networks that influence performance when executing shared memory parallel codes. Characterising this influence is complex, and understanding the effect of particular hardware configurations on different codes is of paramount importance. In this paper, monitoring information extracted from hardware counters at runtime is used to characterise the behaviour of each thread of the processes running in the system. This characterisation is given in terms of the number of instructions per second, the operational intensity, and the latency of memory accesses. We propose to use all this information to guide a thread migration strategy that improves execution efficiency by increasing locality and affinity. Different configurations of the NAS Parallel OpenMP benchmarks running concurrently on multicore systems were used to validate the benefits of the proposed thread migration strategy. Our proposal yields up to a 25% improvement over the OS for heterogeneous workloads, under different and realistic locality and affinity scenarios.

Oscar García Lorenzo, Rubén Laso Rodríguez, Tomás Fernández Pena, Jose Carlos Cabaleiro Domínguez, Francisco Fernández Rivera, Juan Ángel Lorenzo del Castillo
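
The first two characterisation metrics named above can be computed from raw counter readings roughly as follows (a sketch; the exact counter events vary by architecture, and operational intensity is used here in the roofline-model sense):

$$ \mathrm{IPS} = \frac{\text{instructions retired}}{\Delta t}, \qquad \mathrm{OI} = \frac{\text{floating-point operations}}{\text{bytes transferred to/from memory}} $$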

Alea – Complex Job Scheduling Simulator

Using large computer systems such as HPC clusters to their full potential can be hard. Many problems and inefficiencies relate to the interaction of user workloads with system-level policies, which govern the setup of the resource management system (RMS) as well as the applied scheduling policy. While expert assessment and well-known best practices do their job when tuning performance, there is usually plenty of room for further improvement, e.g., by considering more efficient system setups or even radically new scheduling policies. For such potentially damaging modifications it is advisable to first use some form of simulator, which allows repeated evaluation of various setups in a fully controlled manner. This paper presents the latest improvements and advanced simulation capabilities of the Alea job scheduling simulator, which has now been actively developed for over 10 years. We present recently added advanced simulation capabilities as well as a set of real-life-based case studies where Alea has been used to evaluate major modifications of real HPC and HTC systems.

Dalibor Klusáček, Mehmet Soysal, Frédéric Suter

Makespan Minimization in Data Gathering Networks with Dataset Release Times

In this work, we analyze scheduling in a star data gathering network. Each worker node produces a dataset of known size at a possibly different time. The datasets have to be passed to the base station for further processing. A dataset can be transferred in many separate pieces, but each sent message incurs additional time overhead. The scheduling problem is to organize the communication in the network so that the total time of data gathering and processing is as short as possible. We show that this problem is strongly NP-hard, and propose a polynomial-time 2-approximation algorithm for solving it. Computational experiments show that the algorithm delivers high quality solutions.

Joanna Berlińska

Workshop on Applied High Performance Numerical Algorithms for PDEs

Frontmatter

Overlapping Schwarz Preconditioner for Fourth Order Multiscale Elliptic Problems

In this paper, a domain decomposition parallel preconditioner is constructed for the fourth order multiscale elliptic problem in 2D with highly heterogeneous coefficients. The problem is discretized by the conforming $$C^1$$ reduced Hsieh-Clough-Tocher (HCT) macro element. The proposed preconditioner is based on the classical overlapping Schwarz method and is constructed using the abstract framework of the Additive Schwarz Method. The coarse space consists of multiscale finite element functions associated with the edges and is enriched with functions based on solving carefully constructed generalized eigenvalue problems locally on each edge. The convergence rate of the Preconditioned Conjugate Gradient method with our preconditioner is independent of the variations in the coefficients for a sufficient number of eigenfunctions in the coarse space.

Leszek Marcinkowski, Talal Rahman
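
In the abstract framework of the Additive Schwarz Method referenced above, the preconditioner takes the standard form

$$ M^{-1} = R_0^T A_0^{-1} R_0 + \sum_{i=1}^{N} R_i^T A_i^{-1} R_i, $$

where $$R_i$$ restricts to the i-th overlapping subdomain, $$A_i = R_i A R_i^T$$ are the local problems, and $$A_0$$ is the coarse problem assembled on the enriched multiscale coarse space (the notation here is generic, not taken from the paper).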

MATLAB Implementation of C1 Finite Elements: Bogner-Fox-Schmit Rectangle

Rahman and Valdman (2013) introduced a new vectorized way to assemble finite element matrices. We utilize the underlying vectorization concepts and extend the MATLAB codes to the implementation of Bogner-Fox-Schmit C1 rectangular elements in 2D. Our focus is on the detailed construction of the elements and simple computer demonstrations, including energy evaluations and their visualization.

Jan Valdman

Simple Preconditioner for a Thin Membrane Diffusion Problem

A problem of diffusion through a thin membrane, discussed in [13], is discretized with a variant of the composite h-p discontinuous Galerkin method. A preconditioner based on the additive Schwarz method is introduced, and its convergence properties are investigated in numerical experiments.

Piotr Krzyżanowski

A Numerical Scheme for Evacuation Dynamics

We give a stability condition for a semi-implicit numerical scheme and prove unconditional stability of an implicit scheme for a nonlinear advection-diffusion equation, meant as a model of crowd dynamics. Numerical stability is established for a wider class of equations and schemes.

Maria Gokieli, Andrzej Szczepańczyk

Additive Average Schwarz with Adaptive Coarse Space for Morley FE

We propose an additive average Schwarz preconditioner with two adaptively enriched coarse spaces for the nonconforming Morley finite element method for the fourth order biharmonic equation with highly varying and discontinuous coefficients. In this paper, we extend the work of [9, 10] (additive average Schwarz with adaptive coarse spaces: scalable algorithms for multiscale problems). Our analysis shows that the condition number of the preconditioned problem is bounded independently of the jump of the coefficient, and depends only on the ratio of the coarse to the fine mesh size.

Salah Alrabeei, Mahmood Jokar, Leszek Marcinkowski

Minisymposium on HPC Applications in Physical Sciences

Frontmatter

Application of Multiscale Computational Techniques to the Study of Magnetic Nanoparticle Systems

We have employed a multiscale modeling approach that combines ab-initio electronic structure calculations with atomic- and mesoscopic-scale modeling to describe the magnetic behavior of assemblies of magnetic nanoparticles (MNPs) with core/surface morphology. Our modeling is based on calculated atomistic parameters, which we rescale after reducing the simulated number of atomic spins per NP to the minimum necessary to represent the magnetic structure of the assemblies. Monte Carlo simulations are then performed to study their macroscopic magnetic behavior. We apply our model to (a) $$\text{CoFe}_2\text{O}_4$$ NPs coated with two different surfactants and (b) bovine serum albumin-coated $$\text{MnFe}_2\text{O}_4$$ MNP clusters. Our approach overcomes current computational limitations. The numerical results are in excellent agreement with the experimental findings, illustrating the potential of our strategy to simulate the magnetic behavior of complex magnetic nanoparticle systems and to optimize their magnetic properties for advanced energy and biotechnology nanomaterial applications.

Marianna Vasilakaki, Nikolaos Ntallis, Kalliopi N. Trohidou

clique: A Parallel Tool for the Molecular Nanomagnets Simulation and Modelling

A powerful program for modelling molecular nanomagnets is presented. The exact diagonalization method is used, which gives numerically accurate results. Its main bottleneck is the diagonalization time of large matrices; we remove it by exploiting the symmetry of the compounds and implementing the method in a parallel computing environment. A diagonalization scheduling algorithm is implemented to improve the load balance of the parallel processes. The running times of two different diagonalization procedures are compared.

Michał Antkowiak, Łukasz Kucharski, Monika Haglauer

Modelling of Limitations of BHJ Architecture in Organic Solar Cells

Polymer solar cells are considered very promising candidates for the photovoltaics of the future. They are cheap and easy to fabricate; however, up to now, they possess a fundamental drawback: low efficiency. One may ask how fundamental this limitation is. We propose a simple model which examines the limits on efficiency through an analysis of geometrical aspects of the bulk heterojunction (BHJ) architecture. We calculate the effective area of the donor-acceptor border in a random mixture of donor and acceptor nanocrystals and compare it with an ideal "brush" architecture. It turns out that in the BHJ architecture this effective area is very close to the value obtained for the "brush" one. Implications of this fact are discussed: we consider other factors which could limit the efficiency of the BHJ architecture, try to estimate their scale, and speculate on the possibility of realizing other architectures and materials in the construction of solar cells.

Jacek Wojtkiewicz, Marek Pilch

Monte Carlo Study of Spherical and Cylindrical Micelles in Multiblock Copolymer Solutions

Solutions of multiblock copolymer chains in implicit selective solvents are studied by an off-lattice Monte Carlo method that employs a parallel tempering algorithm. The aggregation of block copolymers into micellar structures of spherical and cylindrical shapes is observed. A feedback-optimized variant of the parallel tempering method is used.

Krzysztof Lewandowski, Karolina Gębicka, Anna Kotlarska, Agata Krzywicka, Aneta Łasoń, Michał Banaszak

Electronic and Optical Properties of Carbon Nanotubes Directed to Their Applications in Solar Cells

We calculate the electronic and optical properties of a series of finite carbon nanotubes. Where experimental data are available, our calculations exhibit good consistency with them. Our study is directed towards the potential application of carbon nanotubes in solar cells constructed in a layer architecture.

Jacek Wojtkiewicz, Bartosz Brzostowski, Marek Pilch

Minisymposium on High Performance Computing Interval Methods

Frontmatter

The MPFI Library: Towards IEEE 1788–2015 Compliance

(In Memoriam Dmitry Nadezhin)

The IEEE 1788–2015 standard has standardized interval arithmetic. However, few libraries for interval arithmetic are compliant with this standard. In the first part of this paper, the main features of the IEEE 1788–2015 standard are detailed: the structure into four levels, the possibility to accommodate a new mathematical theory of interval arithmetic through the notion of flavor, and the mechanism of decorations for handling exceptions. These features were not present in libraries developed prior to the elaboration of the standard. MPFI is such a library: a C library, based on MPFR, for arbitrary-precision interval arithmetic. MPFI is not (yet) compliant with the IEEE 1788–2015 standard; the planned modifications are presented. Some considerations about performance and HPC for interval computations based on this standard, or on MPFI, conclude the paper.

Nathalie Revol
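
For orientation, basic use of MPFI's existing C interface looks as follows (a minimal sketch, independent of the planned IEEE 1788–2015 additions):

    #include <stdio.h>
    #include <mpfi.h>
    #include <mpfi_io.h>

    int main(void) {
        mpfi_t x, y, z;
        mpfi_init2(x, 128); mpfi_init2(y, 128); mpfi_init2(z, 128); /* 128-bit endpoints */
        mpfi_interv_d(x, 1.0, 2.0);    /* x = [1, 2]  */
        mpfi_interv_d(y, -1.0, 1.0);   /* y = [-1, 1] */
        mpfi_mul(z, x, y);             /* z rigorously encloses [-2, 2] */
        mpfi_out_str(stdout, 10, 0, z);
        putchar('\n');
        mpfi_clear(x); mpfi_clear(y); mpfi_clear(z);
        return 0;
    }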

Softmax and McFadden’s Discrete Choice Under Interval (and Other) Uncertainty

One of the important parts of deep learning is the use of the softmax formula, which enables us to select one of the alternatives with a probability depending on its expected gain. A similar formula describes human decision making: somewhat surprisingly, when presented with several choices with different expected equivalent monetary gains, we do not simply select the alternative with the largest gain; instead, we make a random choice in which alternatives with smaller gains are selected with smaller but nonzero probability, so that we may end up selecting the second-highest or even the third-highest value. Both formulas assume that we know the exact value of the expected gain for each alternative. In practice, we usually know this gain only with some uncertainty. For example, often we only know the lower bound $$\underline{f}$$ and the upper bound $$\overline{f}$$ on the expected gain, i.e., we only know that the actual gain f lies somewhere in the interval $$[\underline{f},\overline{f}]$$. In this paper, we show how to extend the softmax and discrete choice formulas to interval uncertainty.

Bartlomiej Jacek Kubica, Laxman Bokati, Olga Kosheleva, Vladik Kreinovich
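
Since softmax $$p_i = \exp(f_i)/\sum_j \exp(f_j)$$ is increasing in $$f_i$$ and decreasing in every $$f_j$$, $$j \ne i$$, one natural interval extension (a standard monotonicity argument; the paper's exact formulas may differ) attains its bounds endpoint-wise:

$$ \underline{p_i} = \frac{\exp(\underline{f_i})}{\exp(\underline{f_i}) + \sum_{j \ne i} \exp(\overline{f_j})}, \qquad \overline{p_i} = \frac{\exp(\overline{f_i})}{\exp(\overline{f_i}) + \sum_{j \ne i} \exp(\underline{f_j})} $$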

Improvements of Monotonicity Approach to Solve Interval Parametric Linear Systems

Recently, we proposed several improvements of the standard monotonicity approach to solving parametric interval linear systems. The results obtained turned out to be very promising; i.e., we achieved narrower bounds while generally preserving the computational time. In this paper we propose further improvements, which aim to decrease the widths of the interval bounds even more.

Iwona Skalna, Marcin Pietroń, Milan Hladík

The First Approach to the Interval Generalized Finite Differences

The paper concerns a first approach to interval generalized finite differences. Conventional generalized finite differences are of special interest because they can be applied to irregular grids (clouds) of points. Based on these finite differences, we can compute approximate values of some derivatives (ordinary or partial). Furthermore, one can formulate the generalized finite difference method for solving boundary value problems on domains with a complicated boundary. The aim of this paper is to propose interval counterparts of generalized finite differences. Under appropriate assumptions, the exact values of the derivatives are included in the interval values obtained.

Malgorzata A. Jankowska, Andrzej Marciniak

An Interval Calculus Based Approach to Determining the Area of Integration of the Entropy Density Function

This paper considers the problem of numerically computing a definite integral of the entropy density function over a wide, potentially unbounded, domain. Although there are efficient approaches to computing the quadrature over a finite (hyper)rectangle, it may be challenging to bound the whole domain outside of which the function is negligible. An approach based on interval analysis is proposed in this paper. Preliminary numerical results are also discussed.

Bartłomiej Jacek Kubica, Arkadiusz Janusz Orłowski

An Interval Difference Method of Second Order for Solving an Elliptical BVP

In this article we present an interval difference scheme for solving a general elliptic boundary value problem with Dirichlet boundary conditions. The obtained interval enclosure of the solution contains all possible numerical errors. The numerical example we present confirms that the exact solution belongs to the resulting interval enclosure.

Andrzej Marciniak, Malgorzata A. Jankowska, Tomasz Hoffmann

A Parallel Method of Verifying Solutions for Systems of Two Nonlinear Equations

The paper describes a new algorithm for verifying solutions of nonlinear systems of equations. Interval methods provide a few tools for such verification; some of them are based on topological theorems. Our new test is likewise based on checking the extendability of a function from a subspace of the boundary of the box to its interior. For a system of two equations, we provide an efficient implementation. Generalization to a higher number of equations is theoretically possible, yet cumbersome. Some numerical results are presented.

Bartłomiej Jacek Kubica, Jarosław Kurek

Workshop on Complex Collective Systems

Frontmatter

Experiments with Heterogenous Automata-Based Multi-agent Systems

We present a theoretical framework and an experimental tool to study the behavior of heterogeneous multi-agent systems composed of two classes of automata-based agents: Cellular Automata (CA) and Learning Automata (LA). Our general aim is to use this framework to solve global optimization problems in a distributed way using the collective behavior of agents. The common feature of CA and LA systems is the ability to exhibit collective behavior, which, however, is understood differently in each. It comes naturally to LA-based agents, which are able to learn and adapt, but CA-based agents require extra features such as second-order CA. We build the theoretical framework of the system on a spatial Prisoner's Dilemma (PD) game in which both classes of players may participate. We introduce into the game mechanisms such as local profit sharing, mutation, and competition, which stimulate the evolutionary process of developing collective behavior among players. We present results of an experimental study showing the emergence of collective behavior in such systems.

Franciszek Seredyński, Jakub Gąsior, Rolf Hoffmann, Dominique Désérable

Cellular Automata Model for Crowd Behavior Management in Airports

At an airport, everything must work with remarkable precision and coordination, especially since operational processes involve managing large numbers of moving human groups, in order to minimize the waiting and service times of individuals and to eliminate phenomena caused by the interaction of large crowds, such as crowding and congestion around points of interest. The aim of this study is the development of an integrated automated simulation model for human behavior and traffic in the spaces of an airport. The model simulates the behavior of human crowds in the different operational areas of an airport, whose area is divided into levels characterized by differences in how people move within them. A fully analytical model based on the computational tool of Cellular Automata (CA) was realised, together with an obstacle avoidance algorithm based on the A star (A*) algorithm. By its structure, the model is microscopic and discrete in space and time, while its inherent parallelism boosts performance. Its prominent feature is that the crowd consists of separate, autonomous or non-autonomous entities rather than a single mass. During the simulation, each entity is assigned unique features that affect the person's behavior in the different areas of the airport terminal.

Martha Mitsopoulou, Nikolaos Dourvas, Ioakeim G. Georgoudas, Georgios Ch. Sirakoulis

A Conjunction of the Discrete-Continuous Pedestrian Dynamics Model SigmaEva with Fundamental Diagrams

This article focuses on the dynamics of the computer simulation module SigmaEva for a unidirectional flow under periodic boundary conditions. The SigmaEva module realizes the discrete-continuous stochastic pedestrian dynamics model that is briefly presented here. A fundamental diagram (speed-density dependence) is an input for the model. Simulated specific flow rates are compared with input ones for different diagrams. The sensitivity of the model to input diagrams is shown and discussed.

Ekaterina Kirik, Tat'yana Vitova, Andrey Malyshev, Egor Popel

Simulating Pedestrians’ Motion in Different Scenarios with Modified Social Force Model

A model created by Helbing, Molnar, Farkas and Vicsek [1] at the beginning of the 21st century considers each agent in pedestrian movement as a separate individual who obeys Newton's laws. The model has been implemented and simulated by a number of different authors, who proved its reliability through the realism of the agents' behaviour. To describe the motion as accurately as possible, many of them modified it by presenting their own formulations and parameters. In this work, the authors consider a combination of various model modifications and present adequate factor values, which allow one to observe correct, consistent simulations of different evacuation scenarios and to track changes in crowd pressure in subsequent stages of the visualization, depending on the exit design used.

Karolina Tytko, Maria Mamica, Agnieszka Pękala, Jarosław Wąs
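
For reference, the social force model of Helbing et al. [1] that these modifications build on drives each pedestrian i according to

$$ m_i \frac{d\mathbf{v}_i}{dt} = m_i \frac{v_i^0 \mathbf{e}_i^0 - \mathbf{v}_i}{\tau_i} + \sum_{j \ne i} \mathbf{f}_{ij} + \sum_{W} \mathbf{f}_{iW}, $$

where the first term relaxes the actual velocity $$\mathbf{v}_i$$ toward the desired speed $$v_i^0$$ in direction $$\mathbf{e}_i^0$$ over time $$\tau_i$$, and $$\mathbf{f}_{ij}$$, $$\mathbf{f}_{iW}$$ are the interaction forces with other pedestrians and with walls.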

Traffic Prediction Based on Modified Nagel-Schreckenberg Model. Case Study for Traffic in the City of Darmstadt

In this study the authors present a model for traffic flow prediction based on the Nagel-Schreckenberg Cellular Automata model. The basic model was extended to allow simulation of multi-lane roads and to represent different types of cars and drivers' psychological profiles found in urban traffic. Real traffic data from sensors in the city of Darmstadt (Germany) were used to perform a set of simulations with the presented model.

Łukasz Gosek, Fryderyk Muras, Przemysław Michałek, Jarosław Wąs
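
The basic single-lane Nagel-Schreckenberg update that the presented model extends applies four rules per time step; a compact sketch (multi-lane movement and driver profiles are the paper's additions):

    #include <stdlib.h>

    /* One parallel update of the basic Nagel-Schreckenberg CA.
       v[i]: speed of car i (cells/step); gap[i]: empty cells to the car
       ahead; vmax: speed limit; p: random slowdown probability. */
    void nasch_step(int *v, const int *gap, int n, int vmax, double p) {
        for (int i = 0; i < n; i++) {
            if (v[i] < vmax) v[i]++;                        /* 1. accelerate */
            if (v[i] > gap[i]) v[i] = gap[i];               /* 2. brake      */
            if (v[i] > 0 && (double)rand() / RAND_MAX < p)  /* 3. dawdle     */
                v[i]--;
            /* 4. move: each car then advances v[i] cells (position update
               not shown here) */
        }
    }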

HPC Large-Scale Pedestrian Simulation Based on Proxemics Rules

Efficient pedestrian simulation in large-scale environments poses a great challenge. When the simulation model size exceeds the capabilities of a single computing node, or when results are expected quickly, the simulation algorithm has to use many cores and nodes. The problem considered in the presented work is the task of splitting data-intensive computations with a common data structure into separate computational domains while preserving the crucial features of the simulation model. We propose a new model, created on the basis of some popular pedestrian models, that can be applied in parallel processing. We describe its implementation in a highly scalable simulation framework. Preliminary results are presented and the outcomes discussed.

Paweł Renc, Maciej Bielech, Tomasz Pęcak, Piotr Morawiecki, Mateusz Paciorek, Wojciech Turek, Aleksander Byrski, Jarosław Wąs

Backmatter
