About this book

The six-volume set LNCS 12742, 12743, 12744, 12745, 12746, and 12747 constitutes the proceedings of the 21st International Conference on Computational Science, ICCS 2021, held in Krakow, Poland, in June 2021.*

The total of 260 full papers and 57 short papers presented in this book set were carefully reviewed and selected from 635 submissions. 48 full and 14 short papers were accepted to the main track from 156 submissions; 212 full and 43 short papers were accepted to the workshops/thematic tracks from 479 submissions. The papers were organized in topical sections named:

Part I: ICCS Main Track

Part II: Advances in High-Performance Computational Earth Sciences: Applications and Frameworks; Applications of Computational Methods in Artificial Intelligence and Machine Learning; Artificial Intelligence and High-Performance Computing for Advanced Simulations; Biomedical and Bioinformatics Challenges for Computer Science

Part III: Classifier Learning from Difficult Data; Computational Analysis of Complex Social Systems; Computational Collective Intelligence; Computational Health

Part IV: Computational Methods for Emerging Problems in (dis-)Information Analysis; Computational Methods in Smart Agriculture; Computational Optimization, Modelling and Simulation; Computational Science in IoT and Smart Systems

Part V: Computer Graphics, Image Processing and Artificial Intelligence; Data-Driven Computational Sciences; Machine Learning and Data Assimilation for Dynamical Systems; MeshFree Methods and Radial Basis Functions in Computational Sciences; Multiscale Modelling and Simulation

Part VI: Quantum Computing Workshop; Simulations of Flow and Transport: Modeling, Algorithms and Computation; Smart Systems: Bringing Together Computer Vision, Sensor Networks and Machine Learning; Software Engineering for Computational Science; Solving Problems with Uncertainty; Teaching Computational Science; Uncertainty Quantification for Computational Models

*The conference was held virtually.

Table of Contents

Frontmatter

Quantum Computing Workshop

Frontmatter

Implementing Quantum Finite Automata Algorithms on Noisy Devices

The literature on quantum finite automata (QFAs) offers an alternative mathematical model for studying quantum systems with finite memory. As a demonstration of the advantage of quantum computing, QFAs have been shown to be exponentially more succinct on certain problems, such as recognizing the language $$\mathtt{MOD}_\mathrm{p} = \{a^{j} \mid j \equiv 0 \mod p\}$$ with bounded error, where p is a prime number. In this paper we present improved circuit-based implementations for QFA algorithms recognizing the $$\mathtt{MOD}_\mathrm{p}$$ problem using the Qiskit framework. We focus on the case $$p=11$$ and provide a 3-qubit implementation for the $$\mathtt{MOD}_\mathrm{11}$$ problem, reducing the total number of required gates using alternative approaches. We run the circuits on real IBM quantum devices, but due to the limitations of real quantum devices in the NISQ era, the results are heavily affected by noise. This limitation reveals once again the need for algorithms using fewer resources. Consequently, we consider an alternative 3-qubit implementation which works better in practice and obtain promising results even for the problem $$\mathtt{MOD}_\mathrm{31}$$.

Utku Birkan, Özlem Salehi, Viktor Olejar, Cem Nurlu, Abuzer Yakaryılmaz
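
To make the $$\mathtt{MOD}_\mathrm{p}$$ idea concrete, the following is a minimal classical simulation of the textbook single-qubit rotation automaton, not the circuits from the paper. It assumes the qubit starts in |0>, each symbol a rotates it by 2πk/p, and the word is accepted when the final measurement yields |0>. The function names and the multiplier k are illustrative choices.

```python
import math


def accept_probability(j, p, k=1):
    """Probability that a single-qubit rotation automaton accepts a^j.

    Assumed model: start in |0>, rotate by 2*pi*k/p per symbol,
    accept if the final measurement yields |0>.
    """
    theta = 2.0 * math.pi * k / p
    amp0, amp1 = 1.0, 0.0  # real amplitudes of |0> and |1>
    for _ in range(j):
        amp0, amp1 = (
            math.cos(theta) * amp0 - math.sin(theta) * amp1,
            math.sin(theta) * amp0 + math.cos(theta) * amp1,
        )
    return amp0 ** 2


def combined_accept_probability(j, p, ks):
    """Average acceptance over several multipliers k (an illustrative choice)."""
    return sum(accept_probability(j, p, k) for k in ks) / len(ks)


if __name__ == "__main__":
    for j in (11, 22, 5):
        print(j, round(accept_probability(j, p=11), 4))
```

Words whose length is a multiple of p return the qubit to |0> and are accepted with certainty. For other lengths the acceptance probability is cos²(2πkj/p), which can be close to 1 for a single k; averaging over several values of k is the usual way to push the error down.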

OnCall Operator Scheduling for Satellites with Grover’s Algorithm

The application of quantum algorithms on some problems in NP promises a significant reduction of time complexity. This work uses Grover’s Algorithm, designed to search an unstructured database with quadratic speedup, to find a valid solution for an instance of the on-call operator scheduling problem at the German Space Operation Center. We explore new approaches in encoding the problem and construct the Grover oracle automatically from the given constraints and independent of the problem size. Our solution is not designed for currently available quantum chips but aims to scale with their growth in the next years.

Antonius Scherer, Tobias Guggemos, Sophia Grundner-Culemann, Nikolas Pomplun, Sven Prüfer, Andreas Spörl
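
As a rough illustration of how Grover's Algorithm amplifies valid assignments, the sketch below simulates the state vector classically for a toy constraint. The constraint (two operators, four shifts, nobody works two consecutive shifts) is a hypothetical stand-in and not the encoding from the paper; the oracle is an ordinary Python predicate.

```python
import math


def grover_success_probability(n_qubits, is_valid):
    """Simulate Grover iterations on real amplitudes.

    Returns (probability of measuring a valid assignment, iterations used).
    """
    size = 2 ** n_qubits
    marked = [bool(is_valid(x)) for x in range(size)]
    n_marked = sum(marked)
    if n_marked == 0:
        raise ValueError("the predicate marks no assignment")
    amps = [1.0 / math.sqrt(size)] * size  # uniform superposition
    iterations = int(math.floor(math.pi / 4.0 * math.sqrt(size / n_marked)))
    for _ in range(iterations):
        # Oracle: flip the sign of every valid assignment.
        amps = [-a if m else a for a, m in zip(amps, marked)]
        # Diffusion: reflect all amplitudes about their mean.
        mean = sum(amps) / size
        amps = [2.0 * mean - a for a in amps]
    prob = sum(a * a for a, m in zip(amps, marked) if m)
    return prob, iterations


def no_consecutive_shifts(x, n_shifts=4):
    """Hypothetical constraint: bit i names the operator covering shift i."""
    bits = [(x >> i) & 1 for i in range(n_shifts)]
    return all(bits[i] != bits[i + 1] for i in range(n_shifts - 1))


if __name__ == "__main__":
    print(grover_success_probability(4, no_consecutive_shifts))
```

The number of iterations grows with the square root of the search space size divided by the number of valid assignments, which is where the quadratic speedup over exhaustive search comes from.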

Multimodal Container Planning: A QUBO Formulation and Implementation on a Quantum Annealer

Quantum computing is developing fast. Real-world applications are within reach in the coming years. One of the most promising areas is combinatorial optimisation, where the Quadratic Unconstrained Binary Optimisation (QUBO) problem formulation is used to get good approximate solutions. Both the universal quantum computer and the quantum annealer can handle this kind of problem well. In this paper, we present an application on multimodal container planning. We show how to map this problem to a QUBO problem formulation and how the practical implementation can be done on the quantum annealer produced by D-Wave Systems.

F. Phillipson, I. Chiscop

Portfolio Optimisation Using the D-Wave Quantum Annealer

The first quantum computers are expected to perform well on quadratic optimisation problems. This paper considers a quadratic problem in finance, the Portfolio Optimisation problem. Here, a set of assets is chosen for investment, such that the total risk is minimised, a minimum return is realised and a budget constraint is met. This problem is solved for several instances in two main indices, the Nikkei225 and the S&P500 index, using the state-of-the-art implementation of D-Wave’s quantum annealer and its hybrid solvers. The results are benchmarked against conventional, state-of-the-art, commercially available tooling. Results show that for problems of the size of the used instances, the D-Wave solution, in its current, still limited size, already comes close to the performance of commercial solvers.

Frank Phillipson, Harshil Singh Bhatia
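
The sketch below shows one common way to cast asset selection as a QUBO; it is a simplified assumption, not the formulation used in the paper. Risk enters as a quadratic covariance term, return as a linear reward (instead of a minimum-return constraint), and the budget as a penalty forcing exactly k assets. All data are made up, and brute force stands in for the annealer.

```python
from itertools import product


def build_qubo(cov, returns, k, return_weight=1.0, penalty=10.0):
    """QUBO matrix for x^T cov x - return_weight * r^T x + penalty * (sum(x) - k)^2.

    The constant penalty * k^2 is dropped because it does not change the minimiser.
    """
    n = len(returns)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Risk term plus the x_i * x_j part of the expanded penalty.
            q[i][j] = cov[i][j] + penalty
        # Linear terms go on the diagonal because x_i^2 == x_i for binary x_i.
        q[i][i] += -return_weight * returns[i] - 2.0 * penalty * k
    return q


def energy(q, x):
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def brute_force_minimum(q):
    """Exhaustive search; only feasible for a handful of variables."""
    return min(product((0, 1), repeat=len(q)), key=lambda x: energy(q, x))


# Hypothetical toy data for four assets.
COV = [
    [0.10, 0.02, 0.01, 0.00],
    [0.02, 0.08, 0.03, 0.01],
    [0.01, 0.03, 0.12, 0.02],
    [0.00, 0.01, 0.02, 0.05],
]
RETURNS = [0.12, 0.10, 0.15, 0.07]

if __name__ == "__main__":
    best = brute_force_minimum(build_qubo(COV, RETURNS, k=2))
    print(best, sum(best))
```

The penalty weight has to dominate the risk and return terms; otherwise the minimiser may violate the budget. Choosing such weights well is a practical difficulty of constrained QUBO models.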

Cross Entropy Optimization of Constrained Problem Hamiltonians for Quantum Annealing

This paper proposes a Cross Entropy approach to shape constrained Hamiltonians by optimizing their energy penalty values. The results show a significantly improved solution quality when run on D-Wave’s quantum annealing hardware and the numerical computation of the eigenspectrum reveals that the solution quality is correlated with a larger minimum spectral gap. The experiments were conducted based on the Knapsack, Minimum Exact Cover and Set Packing problems. For all three constrained optimization problems we could show a remarkably better solution quality compared to the conventional approach, where the energy penalty values have to be guessed.

Christoph Roch, Alexander Impertro, Claudia Linnhoff-Popien

Classification Using a Two-Qubit Quantum Chip

Quantum computing has great potential for advancing machine learning algorithms beyond classical reach. Even though full-fledged universal quantum computers do not exist yet, their expected benefits for machine learning can already be shown using simulators and already available quantum hardware. In this work, we consider a distance-based classification algorithm and modify it to be run on actual early-stage quantum hardware. We extend earlier work and present a non-trivial reduction using only two qubits. The algorithm is consequently run on a two-qubit silicon spin quantum computer. We show that the results obtained using the two-qubit silicon spin quantum computer are similar to the theoretically expected results.

Niels M. P. Neumann

Performance Analysis of Support Vector Machine Implementations on the D-Wave Quantum Annealer

In this paper a classical classification model, Kernel-Support Vector machine, is implemented as a Quadratic Unconstrained Binary Optimisation problem. Here, data points are classified by a separating hyperplane while maximizing the function margin. The problem is solved for a public Banknote Authentication dataset and the well-known Iris Dataset using a classical approach, simulated annealing, direct embedding on the Quantum Processing Unit and a hybrid solver. The hybrid solver and Simulated Annealing algorithm outperform the classical implementation on various occasions but show high sensitivity to a small variation in training data.

Harshil Singh Bhatia, Frank Phillipson

Adiabatic Quantum Feature Selection for Sparse Linear Regression

Linear regression is a popular machine learning approach to learn and predict real-valued outputs or dependent variables from independent variables or features. In many real-world problems, it is beneficial to perform sparse linear regression to identify important features helpful in predicting the dependent variable. It not only helps in getting interpretable results but also avoids overfitting when the number of features is large and the amount of data is small. The most natural way to achieve this is by using ‘best subset selection’, which penalizes non-zero model parameters by adding the $$\ell_0$$ norm over parameters to the least squares loss. However, this makes the objective function non-convex and intractable even for a small number of features. This paper aims to address the intractability of sparse linear regression with the $$\ell_0$$ norm using adiabatic quantum computing, a quantum computing paradigm that is particularly useful for solving optimization problems faster. We formulate the $$\ell_0$$ optimization problem as a Quadratic Unconstrained Binary Optimization (QUBO) problem and solve it using the D-Wave adiabatic quantum computer. We study and compare the quality of the QUBO solution on synthetic and real-world datasets. The results demonstrate the effectiveness of the proposed adiabatic quantum computing approach in finding the optimal solution. The QUBO solution matches the optimal solution for a wide range of sparsity penalty values across the datasets.

Surya Sai Teja Desu, P. K. Srijith, M. V. Panduranga Rao, Naveen Sivadasan

EntDetector: Entanglement Detecting Toolbox for Bipartite Quantum States

Quantum entanglement is an extremely important phenomenon in the field of quantum computing. It is the basis of many communication protocols, cryptography and other quantum algorithms. On the other hand, however, it is still an unresolved problem, especially in the area of entanglement detection methods. In this article, we present a computational toolbox which offers a set of currently known methods for detecting entanglement, as well as proposals for new tools operating on bipartite quantum systems. We propose to use the concept of combined Schmidt and spectral decomposition as well as the concept of Gramian operators to examine the structure of analysed quantum states. The computational toolbox presented here was implemented in Python. Due to the popularity of Python and its ease of use, the proposed set of methods can be directly utilised with other packages devoted to quantum computing simulations. Our toolbox can also be easily extended.

Roman Gielerak, Marek Sawerwain, Joanna Wiśniewska, Marek Wróblewski

On Decision Support for Quantum Application Developers: Categorization, Comparison, and Analysis of Existing Technologies

Quantum computers have been significantly advanced in recent years. Offered as cloud services, quantum computers have become accessible to a broad range of users. Along with the physical advances, the landscape of technologies supporting quantum application development has also grown rapidly in recent years. However, there is a variety of tools, services, and techniques available for the development of quantum applications, and which ones are best suited for a particular use case depends, among other things, on the quantum algorithm and quantum hardware. Thus, their selection is a manual and cumbersome process. To tackle this challenge, we introduce a categorization and a taxonomy of available tools, services, and techniques for quantum application development to enable their analysis and comparison. Based on that we further present a comparison framework to support quantum application developers in their decision for certain technologies.

Daniel Vietz, Johanna Barzen, Frank Leymann, Karoline Wild

Quantum Asymmetric Encryption Based on Quantum Point Obfuscation

Quantum obfuscation means encrypting the functionality of circuits or functions by quantum mechanics. It works as a form of quantum computation to improve the security and confidentiality of quantum programs. Although some quantum encryption schemes have been discussed, no quantum asymmetric scheme based on quantum obfuscation has yet been proposed. In this paper, we construct an asymmetric encryption scheme based on the quantum point function, which applies the advantages of quantum obfuscation to quantum public-key encryption. As a start of the study on applications of quantum obfuscation to asymmetric encryption, our work will be helpful for future quantum obfuscation theory and will therefore promote the development of quantum computation.

Chuyue Pan, Tao Shang, Jianwei Liu

Index Calculus Method for Solving Elliptic Curve Discrete Logarithm Problem Using Quantum Annealing

This paper presents an index calculus method for elliptic curves over prime fields using quantum annealing. The relation searching step is transformed into a QUBO (Quadratic Unconstrained Boolean Optimization) problem, which may be efficiently solved using quantum annealing, for example, on a D-Wave computer. Unfortunately, it is hard to estimate the complexity of solving the given QUBO problem using quantum annealing. Using the Leap hybrid sampler on the D-Wave Leap cloud, we could break ECDLP for an 8-bit prime field. The most powerful general-purpose quantum computers nowadays would break ECDLP at most for a 6-bit prime using Shor’s algorithm. In the presented approach, the Semaev method of constructing the decomposition base is used, where the decomposition base has the form $$\mathcal{B}=\left\{ x: 0 \le x \le p^{\frac{1}{m}} \right\}$$, with m being a fixed integer.

Michał Wroński

Simulations of Flow and Transport: Modeling, Algorithms and Computation

Frontmatter

Multi-phase Compressible Compositional Simulations with Phase Equilibrium Computation in the VTN Specification

In this paper, we present a numerical solution of a multi-phase compressible Darcy’s flow of a multi-component mixture in a porous medium. The mathematical model consists of mass conservation equation of each component, extended Darcy’s law for each phase, and an appropriate set of the initial and boundary conditions. The phase split is computed using the phase equilibrium computation in the VTN-specification (known as VTN-flash). The transport equations are solved numerically using the mixed-hybrid finite element method and a novel iterative IMPEC scheme [1]. We provide two examples showing the performance of the numerical scheme.

Tomáš Smejkal, Jiří Mikyška

A Three-Level Linearized Time Integration Scheme for Tumor Simulations with Cahn-Hilliard Equations

The paper contains an analysis of a three-level linearized time integration scheme for Cahn-Hilliard equations. We start with a rigorous mixed strong/variational formulation of the appropriate initial boundary value problem taking into account the existence and uniqueness of its solution. Next we pass to the definition of two time integration schemes: the Crank-Nicolson scheme and a three-level linearized one. Both schemes are applied to the discrete version of the Cahn-Hilliard equation obtained through the Galerkin approximation in space. We prove that the sequence of solutions of the mixed three-level finite difference scheme combined with the Galerkin approximation converges when the time step length and the space approximation error decrease. We also recall the verification of the second order of this scheme and its unconditional stability with respect to the time variable. A comparative scalability analysis of parallel implementations of the schemes is also presented.

Maciej Smołka, Maciej Woźniak, Robert Schaefer
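
What "second order" means for a time integration scheme can be checked numerically. The sketch below applies Crank-Nicolson to the scalar test equation y' = -λy (a deliberately simple stand-in, not the Cahn-Hilliard system) and estimates the observed order by halving the step. The function names and default parameters are illustrative.

```python
import math


def crank_nicolson_decay(lam, t_end, n_steps):
    """Integrate y' = -lam * y, y(0) = 1, with the Crank-Nicolson scheme."""
    h = t_end / n_steps
    # One step solves (y_new - y_old) / h = -lam * (y_new + y_old) / 2 for y_new.
    growth = (1.0 - 0.5 * h * lam) / (1.0 + 0.5 * h * lam)
    y = 1.0
    for _ in range(n_steps):
        y *= growth
    return y


def observed_order(lam=1.0, t_end=1.0, n_steps=10):
    """Estimate the convergence order from the errors at step h and h/2."""
    exact = math.exp(-lam * t_end)
    err_coarse = abs(crank_nicolson_decay(lam, t_end, n_steps) - exact)
    err_fine = abs(crank_nicolson_decay(lam, t_end, 2 * n_steps) - exact)
    return math.log2(err_coarse / err_fine)


if __name__ == "__main__":
    print(observed_order())
```

Halving the step should reduce the error by a factor close to four, so the estimate lands near 2. Because the amplification factor has magnitude below one for every positive step size and positive λ, the scheme is also unconditionally stable on this test equation.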

Poroelasticity Modules in DarcyLite

This paper elaborates on the design and implementation of code modules for finite element solvers for poroelasticity in our Matlab package DarcyLite [15]. Biot’s model is adopted. Both linear and nonlinear cases are discussed. Numerical experiments are presented to demonstrate the accuracy and efficiency of these solvers.

Jiangguo Liu, Zhuoran Wang

Mathematical Modeling of the Single-Phase Multicomponent Flow in Porous Media

A numerical scheme of higher-order approximation in space for the single-phase multicomponent flow in porous media is presented. The mathematical model consists of Darcy velocity, transport equations for components of a mixture, a pressure equation and associated relations for physical quantities such as viscosity or density. The discrete problem is obtained via the discontinuous Galerkin method for the discretization of the transport equations in combination with the mixed-hybrid finite element method for the discretization of the Darcy velocity and pressure equation, both using higher-order approximation. The resulting problem is solved with the fully mass-conservative iterative IMPEC method. Numerical experiments of 2D flow are carried out.

Petr Gális, Jiří Mikyška

An Enhanced Finite Element Algorithm for Thermal Darcy Flows with Variable Viscosity

This paper deals with the development of a stable and efficient unified finite element method for the numerical solution of thermal Darcy flows with variable viscosity. The governing equations consist of coupling the Darcy equations for the pressure and velocity fields to a convection-diffusion equation for the heat transfer. The viscosity in the Darcy flows is assumed to be nonlinear, depending on the temperature of the medium. The proposed method is based on combining a semi-Lagrangian scheme with a Galerkin finite element discretization of the governing equations along with a robust iterative solver for the associated linear systems. The main feature of the enhanced finite element algorithm is that the same finite element space is used for all solutions to the problem, including the pressure, velocity and temperature. In addition, the convection terms are accurately dealt with using the semi-Lagrangian scheme, the standard Courant-Friedrichs-Lewy condition is relaxed, and the time truncation errors are reduced in the diffusion terms. Numerical results are presented for two examples to demonstrate the performance of the proposed finite element algorithm.

Loubna Salhi, Mofdi El-Amrani, Mohammed Seaid

Multilevel Adaptive Lagrange-Galerkin Methods for Unsteady Incompressible Viscous Flows

A highly efficient multilevel adaptive Lagrange-Galerkin finite element method for unsteady incompressible viscous flows is proposed in this work. The novel approach has several advantages including (i) the convective part is handled by the modified method of characteristics, (ii) the complex and irregular geometries are discretized using the quadratic finite elements, and (iii) for more accuracy and efficiency a multilevel adaptive $$\mathrm{L}^2$$-projection using quadrature rules is employed. An error indicator based on the gradient of the velocity field is used in the current study for the multilevel adaptation. Contrary to the h-adaptive, p-adaptive and hp-adaptive finite element methods for incompressible flows, the resulting linear system in our Lagrange-Galerkin finite element method keeps the same fixed structure and size at each refinement in the adaptation procedure. To evaluate the performance of the proposed approach, we first solve a coupled Burgers problem with a known analytical solution for error quantification, and then we solve an incompressible flow past two circular cylinders to illustrate the performance of the multilevel adaptive algorithm.

Abdelouahed Ouardghi, Mofdi El-Amrani, Mohammed Seaid

Numerical Investigation of Transport Processes in Porous Media Under Laminar, Transitional and Turbulent Flow Conditions with the Lattice-Boltzmann Method

In the present paper, the mass transfer in porous media under laminar, transitional and turbulent flow conditions was investigated using the lattice-Boltzmann method (LBM). While previous studies have applied the LBM to species transport in complex geometries under laminar conditions, the main objective of this study was to demonstrate its applicability to turbulent internal flows including the transport of a scalar quantity. Thus, besides the resolved scalar transport, an additional turbulent diffusion coefficient was introduced to account for the subgrid-scale turbulent transport. A packed bed of spheres and an adsorber geometry based on $$\mu$$CT scans were considered. While a two-relaxation-time (TRT) model was applied to the laminar and transitional cases, the Bhatnagar-Gross-Krook (BGK) collision operator in conjunction with the Smagorinsky turbulence model was used for the turbulent flow regime. To validate the LBM results, simulations under the same conditions were carried out with ANSYS Fluent v19.2. It was found that the pressure drop over the height of the packed bed was in close accordance with empirical correlations. Furthermore, the comparison of the calculated species concentrations for all flow regimes showed good agreement between the LBM and the results obtained with ANSYS Fluent. Consequently, the proposed extension of the Smagorinsky turbulence model seems to be able to predict the scalar transport under turbulent conditions.

Tobit Flatscher, René Prieler, Christoph Hochenauer

A Study on a Marine Reservoir and a Fluvial Reservoir History Matching Based on Ensemble Kalman Filter

In reservoir management, utilizing all the observed data to update the reservoir models is the key to making accurate forecasts of parameter changes and future production. The Ensemble Kalman Filter (EnKF) provides a practical way to continuously update petroleum reservoir models, but its reliability in different reservoir types and the proper design of the ensemble size remain unknown. In this paper, we present the Ensemble Kalman Filter method mathematically and discuss its advantages over the standard Kalman Filter and the Extended Kalman Filter (EKF) in reservoir history matching, as well as the limitations of EnKF. We also carried out two numerical experiments on a marine reservoir and a fluvial reservoir using the EnKF history matching method to update the static geological models by fitting bottom-hole pressure and well water cut, and found the optimal way of designing the ensemble size. A comparison of the two numerical experiments is also presented. Lastly, we suggest some adjustments of the EnKF for its application in fluvial reservoirs.

Zelong Wang, Xiangui Liu, Haifa Tang, Zhikai Lv, Qunming Liu
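
The core of the EnKF is the analysis step, which pulls each ensemble member toward the observation in proportion to the ensemble spread. The sketch below shows that step for a single scalar state observed directly, using the perturbed-observation variant. It is a generic textbook form under these simplifying assumptions, not the reservoir setup from the paper, and all numbers are made up.

```python
import random


def enkf_update(ensemble, observation, obs_variance, rng):
    """One EnKF analysis step for a scalar state that is observed directly.

    Each member is nudged toward its own noisy copy of the observation.
    """
    n = len(ensemble)
    mean = sum(ensemble) / n
    variance = sum((x - mean) ** 2 for x in ensemble) / (n - 1)
    gain = variance / (variance + obs_variance)  # scalar Kalman gain
    obs_std = obs_variance ** 0.5
    return [
        x + gain * (observation + rng.gauss(0.0, obs_std) - x)
        for x in ensemble
    ]


def mean_of(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical forecast ensemble centred on 1.0; the observation says 3.0.
    forecast = [rng.gauss(1.0, 1.0) for _ in range(500)]
    analysis = enkf_update(forecast, observation=3.0, obs_variance=0.25, rng=rng)
    print(mean_of(forecast), mean_of(analysis))
```

Because the gain is estimated from the sample variance, a small ensemble gives a noisy gain. That is one reason the choice of ensemble size matters in practice.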

Numerical Simulation of Free Surface Affected by Submarine with a Rotating Screw Moving Underwater

We conducted a numerical simulation of the free surface affected by the diving movement of an object such as a submarine. We have already proposed a computation method that combines the moving grid finite volume method and a surface height function method. In this case, the dive movement was expressed only as a traveling motion, not as a deformation. To express the deformation of a body underwater, the unstructured moving grid finite volume method and sliding mesh approach are combined. The calculation method is expected to be suitable for a computation with high versatility. After the scheme was validated, it was put to practical use. The free surface affected by a submarine with a rotating screw moving underwater was computed using the proposed method. Owing to the computation being for a relatively shallow depth, a remarkable deformation of the free surface occurred. In addition, the movement of the submarine body had a more dominant effect than a screw rotation on changing the shape of the free water surface.

Masashi Yamakawa, Kohei Yoshioka, Shinichi Asao, Seiichi Takeuchi, Atsuhide Kitagawa, Kyohei Tajiri

Modeling and Simulation of Atmospheric Water Generation Unit Using Anhydrous Salts

The atmosphere contains 3400 trillion gallons of water vapor, which would be enough to cover the entire earth in 1 inch of water. Air humidity is available everywhere, and it is a promising renewable reservoir of water, known as atmospheric water. The efficiency of an atmospheric water harvesting system depends on the sorption capacity of water based on the adsorption phenomenon. Using anhydrous salts is an efficient process for capturing and delivering water from ambient air, even at a relative humidity as low as 15%. Many water-scarce countries like Saudi Arabia have high annual solar radiation and relatively high humidity. This study focuses on modeling and simulating the water absorption and release of the anhydrous salt copper chloride ($${CuCl}_{2}$$) under different relative humidity levels to produce atmospheric drinking water in water-scarce regions.

Shereen K. Sibie, Mohamed F. El-Amin, Shuyu Sun
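
The opening figure of this abstract can be sanity-checked with a few lines of arithmetic. The sketch assumes US liquid gallons, the short-scale trillion (10^12), and an approximate Earth surface area of 5.1 × 10^14 m²; none of these assumptions is stated in the abstract.

```python
GALLON_M3 = 0.003785411784  # one US liquid gallon in cubic metres
EARTH_SURFACE_M2 = 5.1e14   # approximate total surface area of the Earth
INCH_M = 0.0254             # one inch in metres


def water_layer_depth_inches(gallons):
    """Depth of a uniform water layer over the whole Earth, in inches."""
    volume_m3 = gallons * GALLON_M3
    return volume_m3 / EARTH_SURFACE_M2 / INCH_M


if __name__ == "__main__":
    print(water_layer_depth_inches(3400e12))
```

Under these assumptions the layer comes out at roughly one inch, consistent with the statement in the abstract.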

Smart Systems: Bringing Together Computer Vision, Sensor Networks and Machine Learning

Frontmatter

Improving UWB Indoor Localization Accuracy Using Sparse Fingerprinting and Transfer Learning

Indoor localization systems are becoming more and more popular. Several technologies are intensively studied for high-precision object localization in such environments. Ultra-wideband (UWB) is one of the most promising, as it combines relatively low cost and high localization accuracy, especially compared to Beacon or WiFi. Nevertheless, we noticed that the accuracy of leading UWB systems is far below the values declared in the documentation. To improve it, we propose a transfer learning approach, which combines high localization accuracy with low fingerprinting complexity. We perform very precise fingerprinting in a controlled environment to train the neural network. When the system is deployed in a new location, full fingerprinting is not necessary. We demonstrate that, thanks to transfer learning, high localization accuracy can be maintained when only 7% of fingerprinting samples from a new location are used to update the neural network, which is very important in practical applications. It is also worth noting that our approach can be easily extended to other localization technologies.

Krzysztof Adamkiewicz, Piotr Koch, Barbara Morawska, Piotr Lipiński, Krzysztof Lichy, Marcin Leplawy

Effective Car Collision Detection with Mobile Phone Only

Despite fast progress in the automotive industry, the number of deaths in car accidents is constantly growing. One of the most important challenges in this area, besides crash prevention, is immediate and precise notification of rescue services. Automatic crash detection systems go a long way towards improving these notifications, and new cars currently sold in developed countries often come with such systems factory installed. However, the majority of life threatening accidents occur in low-income countries, where these novel and expensive solutions will not become common anytime soon. This paper presents a method for detecting car collisions, which requires a mobile phone only, and therefore can be used in any type of car. The method was developed and evaluated using data from real crash tests. It integrates data series from various sensors using an optimized decision tree. The evaluation results show that it can successfully detect even minor collisions while keeping the number of false positives at an acceptable level.

Mateusz Paciorek, Adrian Kłusek, Piotr Wawryka, Michał Kosowski, Andrzej Piechowicz, Julia Plewa, Marek Powroźnik, Wojciech Wach, Bartosz Rakoczy, Aleksander Byrski, Marcin Kurdziel, Wojciech Turek

Corrosion Detection on Aircraft Fuselage with Multi-teacher Knowledge Distillation

The procedures of non-destructive inspection (NDI) are employed by the aerospace industry to reduce operational costs and the risk of catastrophe. The success of deep learning (DL) in numerous engineering applications encouraged us to check the usefulness of autonomous DL models also in this field, particularly in the inspection of the fuselage surface and the search for corrosion defects. Herein, we present the tests of employing convolutional neural network (CNN) architectures in detecting small spots of corrosion on the fuselage surface and rivets. We use a unique and difficult dataset consisting of $$1.3\times 10^4$$ images ($$640\times 480$$) of various fuselage parts from aircraft of several types, brands, and service lives. The images come from the non-invasive DAIS (D-Sight Aircraft Inspection System), which can be treated as an analog image enhancement device. We demonstrate that our novel DL ensembling scheme, i.e., multi-teacher/single-student knowledge distillation architecture, allows for 100% detection of the images representing the “moderate corrosion” class on the test set. Simultaneously, we show that the proposed ensemble classifier, when used for the whole dataset with images representing various stages of corrosion, yields significant improvement in the classification accuracy in comparison to the baseline single ResNet50 neural network. Our work is a contribution to a relatively new discussion of deep learning applications in the fast inspection of the full surface of an aircraft fuselage rather than only its fragments.

K. Zuchniak, W. Dzwinel, E. Majerz, A. Pasternak, K. Dragan
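
The general idea behind multi-teacher knowledge distillation can be sketched as follows: soften each teacher's logits with a temperature, combine them into one target distribution, and penalise the student's divergence from that target. Uniform averaging of teachers and the KL-divergence loss are assumptions made for illustration; the abstract does not specify how the paper combines its teachers.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_targets(teacher_logits, temperature=2.0):
    """Soft targets: uniform average of the teachers' softened distributions."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the averaged teacher targets to the student."""
    target = multi_teacher_targets(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0.0)


if __name__ == "__main__":
    # Hypothetical logits for three corrosion classes from two teachers.
    teachers = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]
    print(distillation_loss([0.0, 0.0, 0.0], teachers))
```

In training, a loss of this kind is typically mixed with the ordinary cross-entropy on the true labels, so the student learns from both the data and the ensemble's softened predictions.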

Warm-Start Meta-Ensembles for Forecasting Energy Consumption in Service Buildings

Energy Management Systems are pieces of equipment that normally perform the individual supervision of power-controllable loads. With the objective of reducing energy costs, their management decisions result from algorithms that select how the different working periods of equipment should be combined, taking into account the usage of locally generated renewable energy, electricity tariffs, etc., while complying with the restrictions imposed by users and electric circuits. Forecasting energy usage, as described in this paper, makes it possible to optimize the management and is therefore a major asset. This paper proposes and compares three new meta-methods for forecasts associated with real-valued time series, applied to the case of energy consumption in buildings, namely: a meta-method which uses a single regressor (called Sliding Regressor – SR), an ensemble of regressors with no memory of previous fittings (called Bagging Sliding Regressor – BSR), and a warm-start bagging meta-method (called Warm-start Bagging Sliding Regressor – WsBSR). The novelty of this framework is the combination of meta-methods, warm-start ensembles and time series in a forecast framework for energy consumption in buildings. Experimental tests on data from a hotel show that the best accuracy is obtained using the second method, though the last one has comparable results with lower computational requirements.

Pedro J. S. Cardoso, Pedro M. M. Guerreiro, Jânio Monteiro, André S. Pedro, João M. F. Rodrigues

Supporting the Process of Sewer Pipes Inspection Using Machine Learning on Embedded Devices

We are currently seeing an increasing interest in using machine learning and image recognition methods to support routine human-made processes in various application domains. In the paper, the results of the conducted research on supporting the sewage network inspection process with the use of machine learning on embedded devices are presented. We analyze several image recognition algorithms on real-world data, and then we discuss the possibility of running these methods on embedded hardware accelerators.

Mieszko Klusek, Tomasz Szydlo

Explanation-Driven Model Stacking

With advances in artificial intelligence (AI), there is a growing need to provide transparency and accountability for AI systems. These properties can be achieved with eXplainable AI (XAI) methods, extensively developed over the last few years in relation to machine learning (ML) models. However, the practical usage of XAI is nowadays limited in most cases to the feature engineering phase of the data mining (DM) process. We argue that explainability as a property of a system should be used along with other quality metrics such as accuracy, precision and recall in order to deliver better AI models. In this paper we present a method that allows for weighted ML model stacking and demonstrate its practical use in an illustrative example.

Szymon Bobek, Maciej Mozolewski, Grzegorz J. Nalepa

Software Engineering for Computational Science

I/O Associations in Scientific Software: A Study of SWMM

Understanding which input and output variables are related to each other is important for metamorphic testing, a simple and effective approach for testing scientific software. We report in this paper a quantitative analysis of input/output (I/O) associations based on co-occurrence statistics of the user manual, as well as association rule mining of a user forum, of the Storm Water Management Model (SWMM). The results show a positive correlation of the identified I/O pairs, and further reveal the complementary aspects of the user manual and user forum in supporting scientific software engineering tasks.

Zedong Peng, Xuanyi Lin, Nan Niu, Omar I. Abdul-Aziz

Understanding Equity, Diversity and Inclusion Challenges Within the Research Software Community

Research software – specialist software used to support or undertake research – is of huge importance to researchers. It contributes to significant advances in the wider world and requires collaboration between people with diverse skills and backgrounds. Analysis of recent survey data provides evidence for a lack of diversity in the Research Software Engineer community. We identify interventions which could address challenges in the wider research software community and highlight areas where the community is becoming more diverse. There are also lessons that are applicable, more generally, to the field of software development around recruitment from other disciplines and the importance of welcoming communities.

Neil P. Chue Hong, Jeremy Cohen, Caroline Jay

Solving Problems with Uncertainty

The Necessity and Difficulty of Navigating Uncertainty to Develop an Individual-Level Computational Model

The design of an individual-level computational model requires modelers to deal with uncertainty by making assumptions on causal mechanisms (when they are insufficiently characterized in a problem domain) or feature values (when available data does not cover all features that need to be initialized in the model). The simplifications and judgments that modelers make to construct a model are not commonly reported or rely on evasive justifications such as ‘for the sake of simplicity’, which adds another layer of uncertainty. In this paper, we present the first framework to transparently and systematically investigate which factors should be included in a model, where assumptions will be needed, and what level of uncertainty will be produced. We demonstrate that it is computationally prohibitive (i.e. NP-Hard) to create a model that supports a set of interventions while minimizing uncertainty. Since heuristics are necessary, we formally specify and evaluate two common strategies that emphasize different aspects of a model, such as building the ‘simplest’ model in number of rules or actively avoiding uncertainty.

Alexander J. Freund, Philippe J. Giabbanelli

Predicting Soccer Results Through Sentiment Analysis: A Graph Theory Approach

More than four out of 10 sports fans consider themselves soccer fans, making the game the world’s most popular sport. Sports are season-based and constantly changing over time, and statistics vary according to the sport and league. Understanding sports communities in social networks and identifying fans’ expertise is a key indicator for soccer prediction. This research proposes a machine learning model using polarity on a dataset of 3,000 tweets taken during the last game week of the English Premier League season 19/20. The end goal is to achieve a flexible mechanism which automates the process of gathering the corpus of tweets before a match and classifies its sentiment to find the probability of a winning game by evaluating the network centrality.

Clarissa Miranda-Peña, Hector G. Ceballos, Laura Hervert-Escobar, Miguel Gonzalez-Mendoza

Advantages of Interval Modification of NURBS Curves in Modeling Uncertain Boundary Shape in Boundary Value Problems

In this paper, the advantages of interval modification of NURBS curves for modeling uncertainly defined boundary shapes in boundary value problems are presented. Different interval techniques for modeling the uncertainty of linear as well as curvilinear shapes are considered. The uncertainty of the boundary shape is defined using interval coordinates of control points. The knots and weights in the proposed interval modification of NURBS curves are defined exactly. Such a definition allows for modification of the uncertainly defined shape without any change of interval values. The interval NURBS curves are compared with other interval techniques. The correctness of modeling the shape uncertainty is confirmed by solving the problem using the interval parametric integral equations system method. Such solutions (obtained using a program implemented by the authors) confirm the advantages of using interval NURBS curves for modeling the boundary shape uncertainty. The shape approximation is improved using fewer interval input data, and the obtained solutions are correct and less over-estimated.

Marta Kapturczak, Eugeniusz Zieniuk, Andrzej Kużelewski

Introducing Uncertainty into Explainable AI Methods

Learning from uncertain or incomplete data is one of the major challenges in building artificial intelligence systems. However, the research in this area is more focused on the impact of uncertainty on the algorithms’ performance or robustness than on human understanding of the model and the explainability of the system. In this paper we present our work in the field of knowledge discovery from uncertain data and show its potential usage for the purpose of improving system interpretability by generating Local Uncertain Explanations (LUX) for machine learning models. We present a method that propagates the uncertainty of data into the explanation model, providing more insight into the certainty of the decision-making process and the certainty of explanations of these decisions. We demonstrate the method on a synthetic, reproducible dataset and compare it to the most popular explanation frameworks.

Szymon Bobek, Grzegorz J. Nalepa

New Rank-Reversal Free Approach to Handle Interval Data in MCDA Problems

In many real-life decision-making problems, decisions have to be based on partially incomplete or uncertain data. Since classical MCDA methods were created to be used with numerical data, they are often unable to process incomplete or uncertain data. There are several ways to handle uncertainty and incompleteness in the data, e.g., interval numbers, fuzzy numbers, and their generalizations. New methods are developed, and classical methods are modified to work with incomplete and uncertain data. In this paper, we propose an extension of the SPOTIS method, which is a new rank-reversal free MCDA method. Our extension allows for applying this method to decision problems with missing or uncertain data. The proposed approach is compared in two case studies with other MCDA methods: COMET and TOPSIS. The obtained rankings are analyzed using rank correlation coefficients.

Andrii Shekhovtsov, Bartłomiej Kizielewicz, Wojciech Sałabun

Vector and Triangular Representations of Project Estimation Uncertainty: Effect of Gender on Usability

The paper proposes a new visualisation, in the form of vectors, of not-fully-known quantitative features. The proposal is put in the context of project definition and planning and the importance of visualisation for decision making. The new approach is empirically compared with the already known visualisation utilizing membership functions of triangular fuzzy numbers. The designed and conducted experiment was aimed at evaluating the usability of the new approach according to ISO 9241-11. Overall, 76 subjects performed 72 experimental conditions designed to assess the effectiveness of uncertainty conveyance. Efficiency and satisfaction were examined through participants’ subjective assessment of appropriate statements. The experiment results show that the proposed visualisation may constitute a significant alternative to the known, triangle-based visualisation. The paper emphasizes potential advantages of the proposed representation for project management and in other areas.

Dorota Kuchta, Jerzy Grobelny, Rafał Michalski, Jan Schneider

The Use of Type-2 Fuzzy Sets to Assess Delays in the Implementation of the Daily Operation Plan for the Operating Theatre

In the paper we present a critical time analysis of a project in which there is a risk of delay in commencing project activities. We assume that activity times are type-2 fuzzy numbers. When experts estimate the shapes of the membership functions of activity times, they take into account both situations in which particular activities of the project start on time and situations in which they start with a delay. We also suggest a method for analysing the sensitivity of meeting the project deadline to these delays. We present a case study in which the critical time analysis was used to analyse processes implemented in the operating ward of a selected hospital in the South of Poland. Data for the empirical study was collected in the operating theatre of this hospital. This made it possible to identify non-procedural activities at the operating ward that have a significant impact on the duration of the entire operating process. In the hospital selected for testing, implementation of the daily plan of surgeries was at risk every day. The research shows that the expected delay in performing the typical daily plan (two surgeries in one operating room) could be about 1 h. That may result in significant overtime costs. Additionally, the consequences may include a longer queue of patients waiting for their surgeries. We show that eliminating surgery activity delays allows the typical daily plan of surgeries to be executed within a working day in the studied hospital.

Barbara Gładysz, Anna Skowrońska-Szmer, Wojciech Nowak

Linguistic Summaries Using Interval-Valued Fuzzy Representation of Imprecise Information - An Innovative Tool for Detecting Outliers

The practice of textual and numerical information processing often involves the need to analyze and test a database for the presence of items that differ substantially from other records. Such items, referred to as outliers, can be successfully detected using linguistic summaries. In this paper, we extend this approach by the use of non-monotonic quantifiers and interval-valued fuzzy sets. The results obtained by this innovative method confirm its usefulness for outlier detection, which is of significant practical relevance for database analysis applications.

Agnieszka Duraj, Piotr S. Szczepaniak

Combining Heterogeneous Indicators by Adopting Adaptive MCDA: Dealing with Uncertainty

Adaptive MCDA systematically supports the dynamic combination of heterogeneous indicators to assess overall performance. The method is completely generic and is currently adopted to undertake a number of studies in the area of sustainability. The intrinsic heterogeneity characterizing this kind of analysis leads to a number of biases, which need to be properly considered and understood to correctly interpret computational results in context. While on the one hand the method provides a comprehensive data-driven analysis framework, on the other hand it introduces a number of uncertainties that are the object of discussion in this paper. Uncertainty is approached holistically, meaning we address all uncertainty aspects introduced by the computational method to deal with the different biases. As extensively discussed in the paper, by identifying the uncertainty associated with the different phases of the process and by providing metrics to measure it, the interpretation of results can be considered more consistent, transparent and, therefore, reliable.

Salvatore F. Pileggi

Solutions and Challenges in Computing FBSDEs with Large Jumps for Dam and Reservoir System Operation

Optimal control of Lévy jump-driven stochastic differential equations plays a central role in resource and environmental management. Problems involving large Lévy jumps are still challenging due to their mathematical and computational complexities. We focus on numerical control of a real-scale dam and reservoir system from the viewpoint of forward-backward stochastic differential equations (FBSDEs): a new mathematical tool in this research area. The problem itself is simple but unique, and involves key challenges common to stochastic systems driven by large Lévy jumps. We first present an exactly solvable linear-quadratic problem and numerically analyze the convergence of different numerical schemes. Then, a more realistic problem with a hard constraint on state variables and a more complex objective function is analyzed, demonstrating that the relatively simple schemes perform well.

Hidekazu Yoshioka

Optimization of Resources Allocation in High Performance Computing Under Utilization Uncertainty

In this work, we study resources co-allocation approaches for a dependable execution of parallel jobs in high performance computing systems with heterogeneous hosts. Complex computing systems often operate under conditions of the resources availability uncertainty caused by job-flow execution features, local operations, and other static and dynamic utilization events. At the same time, there is a high demand for reliable computational services ensuring an adequate quality of service level. Thus, it is necessary to maintain a trade-off between the available scheduling services (for example, guaranteed resources reservations) and the overall resources usage efficiency. The proposed solution can optimize resources allocation and reservation procedure for parallel jobs’ execution considering static and dynamic features of the resources’ utilization by using the resources availability as a target criterion.

Victor Toporkov, Dmitry Yemelyanov, Maksim Grigorenko

A Comparison of the Richardson Extrapolation and the Approximation Error Estimation on the Ensemble of Numerical Solutions

The epistemic uncertainty quantification concerning the estimation of the approximation error using the differences between numerical solutions, treated in the Inverse Problem statement, is addressed and compared with the Richardson extrapolation. The Inverse Problem is posed in the variational statement with zero-order Tikhonov regularization. The ensemble of numerical results obtained by the OpenFOAM solvers for the inviscid compressible flow with a shock wave is analyzed. The approximation errors obtained by the Richardson extrapolation and the Inverse Problem are compared with the exact error, computed as the difference between the numerical solutions and the analytical solution. The Inverse Problem-based approach is demonstrated to be an inexpensive alternative to the Richardson extrapolation.

Aleksey K. Alekseev, Alexander E. Bondarev, Artem E. Kuvshinnikov
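
As a point of reference for the Richardson extrapolation mentioned in the abstract above, the following is a minimal sketch of the textbook two-grid form, not the authors' procedure. It assumes the numerical value behaves like f(h) = f_exact + C·h^p with a known convergence order p; the function name, the refinement ratio and the order are illustrative assumptions, and for flows with shock waves the effective order is generally not the nominal one.

```python
def richardson_extrapolate(f_coarse, f_fine, ratio=2.0, order=2.0):
    """Two-grid Richardson extrapolation (textbook form).

    Assumes f(h) = f_exact + C * h**order, with the coarse grid spacing
    equal to `ratio` times the fine grid spacing. Both `ratio` and `order`
    are inputs the user must supply; they are assumptions of this sketch.

    Returns (extrapolated_value, estimated_error_of_fine_solution).
    """
    if ratio <= 1.0:
        raise ValueError("ratio must be greater than 1")
    # f_coarse - f_fine = C * h**order * (ratio**order - 1), so the
    # error of the fine solution, C * h**order, is estimated as:
    fine_error = (f_coarse - f_fine) / (ratio ** order - 1.0)
    return f_fine - fine_error, fine_error


# Synthetic check: a quantity with exact value 1.0 and a second-order error term.
f = lambda h: 1.0 + 0.5 * h ** 2
value, err = richardson_extrapolate(f(0.2), f(0.1), ratio=2.0, order=2.0)
print(value, err)
```

The estimated error can then be compared against the true error wherever an analytical solution is available, which is the kind of comparison the abstract describes.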

Predicted Distribution Density Estimation for Streaming Data

Recent growth in interest concerning streaming data has been forced by the expansion of systems successively providing current measurements and information, which enables their ongoing, consecutive analysis. The subject of this research is the determination of a density function characterizing potentially changeable distribution of streaming data. Stationary and nonstationary conditions, as well as both appearing alternately, are allowed. Within the distribution-free procedure investigated here, when the data stream becomes nonstationary, the procedure begins to be supported by a forecasting apparatus. Atypical elements are also detected, after which the meaning of those connected with new tendencies strengthens, while diminishing elements weaken. The final result is an effective procedure, ready for use without studies and laborious research.

Piotr Kulczycki, Tomasz Rybotycki

LSTM Processing of Experimental Time Series with Varied Quality

Automatic processing and verification of data obtained in experiments play an essential role in modern science. In the paper, we discuss the assessment of data obtained in meteorological measurements conducted in Biebrza National Park in Poland. The data is essential for understanding complex environmental processes, such as global warming. The measurements of CO2 flux bring a vast amount of data but suffer from drawbacks such as high uncertainty. Part of the data has a high level of credibility, while other parts are not reliable. A method for the automatic evaluation of data with varied quality is proposed. We use LSTM networks with a weighted mean squared error loss function. This approach allows information on data reliability to be incorporated in the training process.

Krzysztof Podlaski, Michał Durka, Tomasz Gwizdałła, Alicja Miniak-Górecka, Krzysztof Fortuniak, Włodzimierz Pawlak
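
To illustrate the idea of a reliability-weighted loss referred to in the abstract above, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name, the choice of normalising by the sum of the weights, and the example weights are all assumptions made for illustration.

```python
def weighted_mse(y_true, y_pred, weights):
    """Mean squared error with a per-sample reliability weight.

    Samples with weight 0 do not influence the loss; samples with larger
    weights influence it more. Normalising by the sum of the weights is an
    assumption of this sketch.
    """
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("all inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sq = sum(w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights))
    return weighted_sq / total_weight


# The third measurement is treated as unreliable (weight 0), so its large
# residual does not contribute to the loss.
print(weighted_mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0], [1.0, 1.0, 0.0]))
```

In a training loop, the same expression would be written with the tensor operations of the chosen deep learning framework so that gradients can flow through it.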

Sampling Method for the Robust Single Machine Scheduling with Uncertain Parameters

Many real problems are defined in an uncertain environment where different parameters such as processing times, setup times, release dates or due dates are not known at the time of determining the solution. As using a deterministic approach very often provides solutions with poor performance, several approaches have been developed to embrace the uncertainty, and most of the methods are based on stochastic modeling using random variables, fuzzy modeling, or a bound form where values are taken from a specific interval. In the paper we consider a single machine scheduling problem with uncertain parameters modeled by random variables with normal distribution. We apply the sampling method, which we investigate as an extension to the tabu search algorithm. Sampling provides very promising results and it is also a very universal method which can be easily adapted to many other optimization algorithms, not only tabu search. The computational experiments conducted confirm that results obtained by the proposed method are much more robust than the ones obtained using the deterministic approach.

Paweł Rajba
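
The sampling idea in the abstract above can be sketched as follows: instead of scoring a job order with fixed processing times, the order is scored by averaging its cost over processing times drawn from normal distributions. This is only an illustration under stated assumptions; the objective (total tardiness), the clipping of negative samples to zero, and all names are hypothetical and are not taken from the paper.

```python
import random


def total_tardiness(order, proc_times, due_dates):
    """Total tardiness of a job order on a single machine (assumed objective)."""
    time, tardiness = 0.0, 0.0
    for job in order:
        time += proc_times[job]
        tardiness += max(0.0, time - due_dates[job])
    return tardiness


def sampled_cost(order, mean, std, due_dates, n_samples=200, seed=0):
    """Average total tardiness over sampled processing times.

    Processing times are drawn from N(mean[j], std[j]) and clipped at zero
    (the clipping is an assumption of this sketch).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        sample = [max(0.0, rng.gauss(m, s)) for m, s in zip(mean, std)]
        total += total_tardiness(order, sample, due_dates)
    return total / n_samples


mean = [2.0, 3.0, 1.0]
std = [0.5, 0.5, 0.5]
due = [2.0, 6.0, 4.0]
# Compare two candidate orders by their sampled (robust) cost.
print(sampled_cost([0, 1, 2], mean, std, due), sampled_cost([0, 2, 1], mean, std, due))
```

A search procedure such as tabu search could call a function like `sampled_cost` in place of the deterministic objective when comparing neighbouring solutions.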

Teaching Computational Science

Biophysical Modeling of Excitable Cells - A New Approach to Undergraduate Computational Biology Curriculum Development

As part of a broader effort of developing a comprehensive neuroscience curriculum, we implemented an interdisciplinary, one-semester, upper-level course called Biophysical Modeling of Excitable Cells (BMEC). The course exposes undergraduate students to broad areas of computational biology. It focuses on computational neuroscience (CNS), develops scientific literacy, and promotes teamwork between biology, psychology, physics, and mathematics-oriented undergraduate students. This course also provides pedagogical experience for senior Ph.D. students from the Neuroscience Department at the Medical University of South Carolina (MUSC). BMEC is a lecture-based course with three contact hours per week that includes a set of computer-based activities designed to gradually increase the undergraduates’ ability to apply mathematics and computational concepts to solving biologically relevant problems. The class brings together two different groups of students with very dissimilar and complementary backgrounds, i.e., biology or psychology and physics or mathematics oriented. The teamwork allows students with a more substantial biology or psychology background to explain the biological implications to physics or mathematics students and to instill realism into the computer modeling project they completed for this class. Simultaneously, students with substantial physics and mathematics backgrounds can apply techniques learned in specialized mathematics, physics, or computer science classes to generate mathematical hypotheses and implement them in computer codes.

Sorinel A. Oprisan

Increasing the Impact of Teacher Presence in Online Lectures

We present a freely available, easy-to-use system for promoting teacher presence during slide-supported online lectures, meant to aid effective learning and reduce students’ sense of isolation. The core idea is to overlay the teacher’s body directly onto the slide and to move and scale it dynamically according to the currently presented content. Our implementation runs entirely locally in the browser and uses machine learning and chroma keying techniques to segment and project only the instructor’s body onto the presentation. Students not only see the face of the teacher but also perceive how the teacher, with his/her gaze and hand gestures, directs their attention to the areas of the slides being analyzed. We include an evaluation of the system based on its use in online programming courses taught to 134 students from 10 different study programs. The gathered feedback, in terms of attention benefit, student satisfaction, and perceived learning, strongly endorses the usefulness and potential of enhanced teacher presence in general, and our web application in particular.

David Iclanzan, Zoltán Kátai

Model-Based Approach to Automated Provisioning of Collaborative Educational Services

The purpose of the presented work was to ease the creation of new educational environments to be used by consortia of educational institutions. The proposed approach allows teachers to take advantage of technological means and shorten the time it takes to create new remote collaboration environments for their students, even if the teachers are not adept at using cloud services. To achieve that, we decided to leverage the Model Driven Architecture and provide the teachers with convenient, high-level abstractions with which they are able to easily express their needs. The abstract models are used as inputs to an orchestrator, which takes care of provisioning the described services. We claim that such an approach both reduces the time of virtual laboratory setup and provides for more widespread use of cloud-based technologies in day-to-day teaching. The article discusses both the model-driven approach and the results obtained from implementing a working prototype, customized for IT training, deployed in the Małopolska Educational Cloud testbed.

Raul Llopis Gandia, Sławomir Zieliński, Marek Konieczny

A Collaborative Peer Review Process for Grading Coding Assignments

With software technology becoming one of the most important aspects of computational science, it is imperative that we train students in the use of software development tools and teach them to adhere to sustainable software development workflows. In this paper, we showcase how we employ a collaborative peer review workflow for the homework assignments of our course on Numerical Linear Algebra for High Performance Computing (HPC). In the workflow we employ, the students are required to operate with the git version control system, perform code reviews, realize unit tests, and plug into a continuous integration system. From the students’ performance and feedback, we are optimistic that this workflow encourages the acceptance and usage of software development tools in academic software development.

Pratik Nayak, Fritz Göbel, Hartwig Anzt

How Do Teams of Novice Modelers Choose an Approach? An Iterated, Repeated Experiment in a First-Year Modeling Course

There are a variety of factors that can influence the decision of which modeling technique to select for a problem being investigated, such as a modeler’s familiarity with a technique, or the characteristics of the problem. We present a study which controls for modeler familiarity by studying novice modelers choosing between the only modeling techniques they have been introduced to: in this case, cellular automata and agent-based models. Undergraduates in introductory modeling courses in 2018 and 2019 were asked to consider a set of modeling problems, first on their own, and then collaboratively with a partner. They completed a questionnaire in which they characterized their modeling method, rated the factors that influenced their decision, and characterized the problem according to contrasting adjectives. Applying a decision tree algorithm to the responses, we discovered that one question (Is the problem complex or simple?) explained 72.72% of their choices. When asked to resolve a conflicting choice with their partners, we observed the repeated themes of mobility and decision-making in their explanation of which problem characteristics influence their resolution. This study provides both qualitative and quantitative insights into factors driving modeling choice among novice modelers. These insights are valuable for instructors teaching computational modeling, by identifying key factors shaping how students resolve conflict with different preferences and negotiate a mutually agreeable choice in the decision process in a team project environment.

Philippe J. Giabbanelli, Piper J. Jackson

Uncertainty Quantification for Computational Models

Detection of Conditional Dependence Between Multiple Variables Using Multiinformation

We consider the problem of detecting the conditional dependence between multiple discrete variables. This is a generalization of the well-known and widely studied problem of testing the conditional independence between two variables given a third one. The issue is important in various applications. For example, in the context of supervised learning, such a test can be used to verify the model adequacy of the popular Naive Bayes classifier. In epidemiology, there is a need to verify whether the occurrences of multiple diseases are dependent. However, focusing solely on occurrences of diseases may be misleading, as one has to take into account the confounding variables (such as gender or age) and preferably consider the conditional dependencies between diseases given the confounding variables. To address the aforementioned problem, we propose to use conditional multiinformation (CMI), which is a measure derived from information theory. We prove some new properties of CMI. To account for the uncertainty associated with a given data sample, we propose a formal statistical test of conditional independence based on the empirical version of CMI. The main contribution of the work is the determination of the asymptotic distribution of empirical CMI, which leads to the construction of the asymptotic test for conditional independence. The asymptotic test is compared with the permutation test and the scaled chi-squared test. Simulation experiments indicate that the asymptotic test achieves larger power than the competing methods, thus leading to more frequent detection of conditional dependencies when they occur. We apply the method to detect dependencies in the MIMIC-III medical data set.

Jan Mielniczuk, Paweł Teisseyre
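
For readers unfamiliar with the quantity named in the abstract above, the sketch below computes a plug-in (empirical) estimate of conditional multiinformation using its usual information-theoretic form, the sum of the conditional entropies H(X_i | Z) minus the joint conditional entropy H(X_1, ..., X_p | Z). This form, the function names and the toy data are assumptions of the sketch; the paper's exact definition, its statistical test and its asymptotic distribution are not reproduced here.

```python
from collections import Counter
from math import log


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a list of hashable observations."""
    n = len(samples)
    return -sum(c / n * log(c / n) for c in Counter(samples).values())


def conditional_multiinformation(columns, z):
    """Empirical sum_i H(X_i | Z) - H(X_1, ..., X_p | Z).

    `columns` is a list of equally long lists of discrete values (one list
    per variable X_i); `z` holds the conditioning variable's values.
    """
    h_z = entropy(list(z))
    # H(X_i | Z) = H(X_i, Z) - H(Z)
    marginal_part = sum(entropy(list(zip(x, z))) - h_z for x in columns)
    # H(X_1, ..., X_p | Z) = H(X_1, ..., X_p, Z) - H(Z)
    joint_part = entropy(list(zip(*columns, z))) - h_z
    return marginal_part - joint_part


z = [0, 0, 0, 0]
independent = conditional_multiinformation([[0, 0, 1, 1], [0, 1, 0, 1]], z)
dependent = conditional_multiinformation([[0, 1, 0, 1], [0, 1, 0, 1]], z)
print(independent, dependent)
```

The estimate is zero when the variables are conditionally independent in the sample and positive otherwise; deciding whether a positive value is statistically significant is the role of the test discussed in the abstract.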

Uncertainty Quantification of Coupled 1D Arterial Blood Flow and 3D Tissue Perfusion Models Using the INSIST Framework

We perform uncertainty quantification on a one-dimensional arterial blood flow model and investigate the resulting uncertainty in a coupled tissue perfusion model of the brain. The application of interest for this study is acute ischemic stroke. The outcome of interest is infarct volume, estimated using the change in perfusion between the healthy and occluded state (assuming no treatment). Secondary outcomes are the uncertainty in blood flow at the outlets of the network, which provide the boundary conditions to the pial surface of the brain in the tissue perfusion model. Uncertainties in heart stroke volume, heart rate, blood density, and blood viscosity are considered. Results show that the uncertainty in blood flow at the network outlets is similar to the uncertainty included in the inputs; however, the resulting uncertainty in infarct volume is significantly smaller. These results provide evidence when assessing the credibility of the coupled models for use in in silico clinical trials.

Claire Miller, Max van der Kolk, Raymond Padmos, Tamás Józsa, Alfons Hoekstra

Second Order Moments of Multivariate Hermite Polynomials in Correlated Random Variables

Polynomial chaos methods can be used to estimate solutions of partial differential equations under uncertainty described by random variables. The stochastic solution is represented by a polynomial expansion, whose deterministic coefficient functions are recovered through Galerkin projections. In the presence of multiple uncertainties, the projection step introduces products (second order moments) of the basis polynomials. When the input random variables are correlated Gaussians, calculating the products of the corresponding multivariate basis polynomials is not straightforward and can become computationally expensive. We present a new expression for the products by introducing multiset notation for the polynomial indexing, which allows for simple and efficient evaluation of the second-order moments of correlated multivariate Hermite polynomials.

Laura Lyman, Gianluca Iaccarino
