Mark Asch: “A Toolbox for Digital Twins: From Model-Based to Data-Driven”
SIAM, 2022, xxiv+832 pp
- Open Access
- 27-10-2025
- Book Review
Digital twins have recently become very popular in various application fields, including engineering, medicine, economy, and Earth science. They combine computational models with data from an individual physical object or system in order to make predictions and improve decision-making. The book A Toolbox for Digital Twins: From Model-Based to Data-Driven by Mark Asch provides an extensive overview of the mathematical methods used in digital twins.
The interplay of mathematical models and available data is central to digital twins in general and therefore to the book in particular. Some of the major challenges addressed in the book are:
-
How can a suitable trade-off between the complexity and the efficiency of mathematical models be found?
-
How can relevant information be extracted from the available data?
-
How can the available data and mathematical models be combined so that the digital twin can use the best of both to support decision-making?
Deliberately, the book does not provide plug-and-play digital twins which can be used right away and require no expert knowledge from the user. It is argued that such digital twins are typically based on coupled ordinary differential equations which are too simple and lack important features such as dynamic self-updating based on measured data. Instead, the reader is given a basic understanding of the most important tools in order to be able to use them in a concrete application context. Nevertheless, the book includes many examples with source code, which are collected in a toolbox available on the author’s GitHub account. In addition, there are many references to available software packages as well as to publications which allow the reader to delve deeper into the various topics. One of the major goals of the book is to raise awareness of mathematical techniques which are currently not sufficiently used in digital twin applications and could potentially extend the applicability of digital twins significantly. The book also includes several guidelines to facilitate the decision on which mathematical methods to use depending on the application context.
The idea of combining simulation data with measured online data of the asset under consideration in order to improve decision-making is certainly older than the term “digital twin”. For instance, model-predictive control has been applied since the late 1970s and combines models and data in a similar fashion as digital twins. Another example is data assimilation, which has been used since the 1960s for weather forecasting and incorporates both mathematical models and measured data. The concept of a digital twin was probably first used in NASA’s Apollo program, although without using the term “digital twin”. For example, simulators and vehicle modeling were used after an oxygen tank explosion on board the Apollo 13 spacecraft in order to diagnose the problem and to work out solutions in real time. Moreover, the space vehicles of the Apollo program were equipped with on-board computers which used model-based and data-driven techniques like the Kalman filter to estimate the flight trajectory. In the early 2000s, the notion of dynamic data-driven application systems was introduced; it shares many similarities with that of digital twins, as it includes a feedback loop between simulations and measurements.
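The Kalman filter mentioned above is a prototypical example of fusing a model prediction with measured data. The following minimal scalar sketch is illustrative only: the random-walk model and the noise variances are my assumptions, not taken from the book or the Apollo software.

```python
import numpy as np

def kalman_filter(measurements, q=1e-3, r=0.5, x0=0.0, p0=1.0):
    """Estimate a slowly varying state from noisy measurements.

    q : process-noise variance (uncertainty of the model)
    r : measurement-noise variance (uncertainty of the sensor)
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: random-walk model, uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement via the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return np.array(estimates)

rng = np.random.default_rng(0)
true_state = 1.0
noisy = true_state + rng.normal(0.0, 0.7, size=200)
est = kalman_filter(noisy)
# The filtered estimate is far less noisy than the raw measurements,
# illustrating how model and data are combined into one state estimate.
```

The same predict-update structure underlies the data-assimilation methods treated later in the book, only with vector-valued states and model operators.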
The term “digital twin” was first used in the early 2010s. The general idea stems from product life-cycle management and was proposed as a conceptual ideal which combines a physical asset and its virtual counterpart via a bidirectional transfer of information throughout the entire lifetime of the asset. The concept of a digital twin was also presented in a couple of NASA reports and papers in the early 2010s and used, among others, for the development of future aircraft and spacecraft. Over the past fifteen years, digital twins have seen a tremendous increase in research efforts and industrial applications, as they pave the way for significantly improved capabilities in prediction, optimization, etc., and can be used in a wide range of application areas.
In the literature there exist various definitions of digital twins, and the book uses one proposed by the Aerospace Industries Association. According to this definition, a digital twin is a “set of virtual information constructs that mimics the structure, context, and behavior of an individual/unique physical asset, or a group of physical assets”. Moreover, the definition emphasizes that a digital twin is dynamically updated based on data from the corresponding physical twin and that it provides information which enables value-added decisions. Hence, an important feature of a digital twin is a bidirectional exchange of information with its physical twin: Data from the physical twin is used by the digital twin to improve its mathematical models and, on the other hand, the mathematical models are used to make predictions about the behavior of the physical twin, and the gained insights can be used for decision-making. For instance, such decisions may include suitable dynamic control strategies, e.g., speed control for trains, or the planning of maintenance measures to prevent impending damage. In order for the digital twin to be dynamically updated, its mathematical models must enable such updates, e.g., by allowing for dynamic changes of some of the parameters. This is essential if the digital twin is to mimic transient effects such as aging and properly predict the behavior of its physical twin throughout its life cycle. Another crucial feature of a digital twin is that it is tailored to an individual asset or a group of individual assets. For instance, a digital twin of a train does not just consist of a generic train model; it takes into account the specific features, measurements, routes, etc. of an individual train and constantly adapts its models based on the data coming from that train. This is in contrast to classical computational models, which are rather static and consider generic physical assets instead of individual ones.
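The bidirectional loop described above can be sketched in a few lines. The following toy example is entirely hypothetical (the linear wear model, the learning rate, and all names are my assumptions): the twin predicts with its current model, and each measurement from the physical asset is used to re-estimate an individual model parameter.

```python
import numpy as np

class WearTwin:
    """Toy digital twin whose model parameter is updated from asset data."""

    def __init__(self, wear_rate=0.0, lr=0.1):
        self.wear_rate = wear_rate  # individual parameter, learned from data
        self.lr = lr                # relaxation factor for the update

    def predict(self, t):
        """Model-based prediction of accumulated wear at time t."""
        return self.wear_rate * t

    def assimilate(self, t, measured_wear):
        """Data-driven update: nudge the rate toward the observed one."""
        observed_rate = measured_wear / t
        self.wear_rate += self.lr * (observed_rate - self.wear_rate)

twin = WearTwin()
true_rate = 0.05  # the individual asset's actual wear rate
for t in range(1, 101):
    measurement = true_rate * t   # noise-free data stream, for clarity
    twin.assimilate(t, measurement)
# The twin's parameter now tracks this specific asset, so its predictions
# can support decisions such as scheduling maintenance before a threshold.
```

The point of the sketch is the direction of the two arrows: measurements flow into the model (`assimilate`), and model output flows back to decision-making (`predict`), which is precisely the bidirectional exchange emphasized by the definition.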
In the following, the importance of mathematical methods for digital twins is demonstrated by means of an example from the book. This prediction use case deals with the modeling of underground reservoirs, e.g., of oil, gas, or groundwater. The flow in such a medium may be described by a diffusion equation of the form
$$ \frac{\partial p}{\partial t} -\nabla \cdot (k\nabla p) = f\quad \text{in }\Omega $$
with corresponding initial and boundary conditions. Here, \(p\) denotes the pressure, \(k\) the permeability parameter, \(f\) a source term, and \(\Omega \) the spatial domain. We are interested in the pressure at some location \(\hat{x}\in \Omega \), which may correspond to an oil well, for example. In particular, given data of the pressure at \(\hat{x}\) over some past time interval, we aim to predict the pressure in a future time interval. Since the parameter \(k\) is typically unknown, we are interested not only in predictions but also in a quantification of the uncertainty resulting from the uncertainty in \(k\). In the book, a data-driven modeling approach is presented for this task. First, the original initial-boundary value problem is discretized using the finite element method in space and the implicit Euler method in time. Then, to generate training data, the discretized system is solved for several values of the permeability parameter \(k\) drawn from a prior distribution, and the computed pressure time series at \(\hat{x}\) are stored. Afterwards, several statistical methods are applied, including a principal component analysis for each time interval, which in this case allows the data to be reduced to just one degree of freedom per time interval. For these single coordinates, a regression is performed to estimate the conditional probability of the coordinate \(\tilde{c}_{1}\) in the future time interval given the coordinate \(c_{1}\) in the past time interval.
In total, these steps yield a statistical prediction model which is used as follows: As a preliminary step, the coordinate \(c_{1}\) is sampled uniformly, and these samples are combined with the associated principal component to obtain a corresponding library of pressure time series. Then, for a new data set \(p_{\mathrm{obs}}\) over the past time interval, we solve an optimization problem and search for the pressure time series within the library that is closest to \(p_{\mathrm{obs}}\) with respect to the \(L_{1}\)-norm. Based on the \(c_{1}\) coordinate corresponding to this closest pressure time series, we compute the posterior distribution of \(\tilde{c}_{1}\). Combining this distribution with the associated principal component, we obtain a posterior distribution for the pressure in the future time interval, i.e., both a prediction and an uncertainty quantification, as desired. In summary, this prediction use case requires mathematical methods from several areas: mathematical modeling via partial differential equations, numerical mathematics, statistics and probability theory, and optimization. It should be emphasized that while this use case is rather academic in nature, the book also provides references to real-world applications where this approach has been used, e.g., to optimize the placement of oil wells.
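The pipeline described above can be sketched end to end in a simplified setting. All concrete choices below are my assumptions, not the book's setup: a 1-D version of the diffusion equation discretized by finite differences (rather than finite elements), a uniform prior on a scalar permeability \(k\), one principal component per time window, and a simple linear regression with Gaussian residuals in place of the book's statistical model.

```python
import numpy as np

nx, nt = 50, 60                       # spatial points, time steps
x = np.linspace(0.0, 1.0, nx)
dx, dt = x[1] - x[0], 0.01
ix_hat = nx // 2                      # observation location x_hat

def solve_pressure(k):
    """Implicit Euler / finite differences for p_t - k p_xx = f, p = 0 on the boundary."""
    f = np.exp(-((x - 0.3) ** 2) / 0.01)           # fixed source term
    r = k * dt / dx**2
    A = np.diag((1 + 2 * r) * np.ones(nx)) \
      + np.diag(-r * np.ones(nx - 1), 1) + np.diag(-r * np.ones(nx - 1), -1)
    A[0, :] = 0.0
    A[-1, :] = 0.0
    A[0, 0] = A[-1, -1] = 1.0                      # Dirichlet boundary rows
    p, series = np.zeros(nx), []
    for _ in range(nt):
        rhs = p + dt * f
        rhs[0] = rhs[-1] = 0.0
        p = np.linalg.solve(A, rhs)
        series.append(p[ix_hat])
    return np.array(series)                        # pressure at x_hat over time

# Training data: sample k from the prior, split each series into past/future.
rng = np.random.default_rng(1)
ks = rng.uniform(0.3, 1.5, size=40)
series = np.array([solve_pressure(k) for k in ks])
past, future = series[:, :30], series[:, 30:]

def pca1(data):
    """PCA with a single component: mean, component, and coordinates c_1."""
    mean = data.mean(axis=0)
    centered = data - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    comp = vt[0]
    return mean, comp, centered @ comp

mean_p, comp_p, c1 = pca1(past)
mean_f, comp_f, c1_tilde = pca1(future)

# Regression of the future coordinate on the past one; the residual spread
# stands in for the conditional distribution of c_1-tilde given c_1.
slope, intercept = np.polyfit(c1, c1_tilde, 1)
resid_std = np.std(c1_tilde - (slope * c1 + intercept))

# Library of pressure time series from uniformly sampled c_1 coordinates.
library_c1 = np.linspace(c1.min(), c1.max(), 200)
library = mean_p + np.outer(library_c1, comp_p)

# New observation: L1-closest library member, then the regressed posterior.
p_obs = solve_pressure(0.7)[:30]                   # "measured" past data
best = np.argmin(np.abs(library - p_obs).sum(axis=1))
c1_post = slope * library_c1[best] + intercept
future_mean = mean_f + c1_post * comp_f            # predicted future pressure
future_band = resid_std * np.abs(comp_f)           # crude uncertainty band
```

Every stage of the review's summary appears once: the discretized forward solves generate training data, PCA compresses each window to one coordinate, the regression links past to future, and the \(L_{1}\) library search plus the residual spread deliver a prediction with an uncertainty band.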
The book is divided into three parts and the first part comprises seven chapters which provide the basis for the methods and concepts discussed in the remainder of the book. To this end, fundamentals from different topics in applied mathematics are introduced including probability theory, numerical simulation, optimization, machine learning, and control theory. Moreover, the first chapter provides a general introduction to digital twins including a definition and their characteristic features. In addition, a special emphasis is placed on the so-called inference cycle which may be used for describing the perpetual process of scientific discovery. More precisely, this cycle is described by a periodic sequence of experiments, induction, abduction, and deduction and is proposed to serve as a compass for digital twins. Accordingly, the inference cycle is revisited multiple times in the book in order to classify the considered methods with respect to the four phases of the inference cycle. Another important idea which is briefly introduced in the first chapter is to consider the decision-making of a digital twin as a process with three nested loops: The outermost loop consists of the decision-making itself, the middle loop involves optimization where models have to be evaluated repeatedly, and the inner loop comprises the actual evaluation of the models or generation of data.
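The three nested loops mentioned above can be made concrete with a deliberately contrived toy example; the maintenance-threshold setting, the cost model, and all names here are my inventions, used only to show how the loops nest.

```python
import numpy as np

def evaluate_model(threshold, load):
    """Inner loop: one model evaluation (toy cost model, an assumption).

    Lower thresholds mean earlier, cheaper-to-fix servicing but more
    frequent maintenance; the exponential risk term is purely illustrative.
    """
    failure_risk = np.exp(-threshold)
    maintenance_cost = threshold
    return 10.0 * load * failure_risk + maintenance_cost

def optimize(load):
    """Middle loop: repeated model evaluations to find the best action."""
    candidates = np.linspace(0.1, 5.0, 50)
    costs = [evaluate_model(t, load) for t in candidates]
    return candidates[int(np.argmin(costs))]

# Outer loop: decision-making as new information about the asset arrives.
decisions = []
for load in [0.5, 1.0, 2.0]:          # e.g. successively observed load levels
    decisions.append(optimize(load))
# Heavier observed loads push the optimal maintenance threshold upward.
```

The nesting makes the computational burden visible: every outer decision triggers an optimization, and every optimization step triggers model evaluations, which is why the book devotes so much attention to reduced-order models for the innermost loop.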
Part II of the book consists of seven chapters and is dedicated to the most important methods required for constructing digital twins. Addressed topics include the treatment of inverse problems, data assimilation, scientific machine learning, and reduced-order methods which enable a digital twin to make sufficiently accurate predictions approximately in real time. The final chapter of Part II, Chapter 14, is intended as a bridge between the methodology presented in the first two parts of the book and the more application-oriented discussion in the third part. It is emphasized that there is no single approach to realizing a digital twin. Instead, there are many choices to be made, and these are summarized along with brief guidance on how to decide on these options in practice. Furthermore, the triple-loop approach from the first chapter is revisited, and it is briefly indicated which of the methods discussed in the first two parts of the book are suitable for which of the loops. At the end of Chapter 14, the inference cycle is also recalled while addressing related concepts from the machine learning literature, such as predictive or active learning.
The third and shortest part of the book includes four chapters which focus on how to realize a digital twin in practice. The discussion begins by presenting several toy examples which may be used as a first step to test the methods and algorithms used in a digital twin. Afterwards, real-world applications are presented with a special emphasis on digital twins in which the author was personally involved. In particular, a short discussion on the application of digital twins in medicine is presented, followed by a chapter addressing applications in environmental science. In the latter chapter, two applications are treated in more detail: the simulation of the biosonar of whales and underground reservoir modeling as outlined above. Finally, the book closes with an overview of applications of digital twins in advanced material design and with a reference to the author’s GitHub account for further examples of digital twins.
The book is aimed at students, scientists, and practitioners who are interested in digital twins and, in particular, the mathematical tools required for their construction and realization. Some mathematical background is certainly advantageous for reading the book, but in principle it also appears suitable for undergraduate students and scientists from other disciplines, as it requires little mathematical prior knowledge. The book introduces the basics of numerous techniques from various areas of applied mathematics, while focusing on their importance in the context of digital twins. Although the focus is on both model-based and data-driven approaches, overall the book leaves me with the impression that data-driven aspects are more prominent than model-based techniques. The fact that the book is aimed at a rather broad readership and not primarily at mathematicians is reflected throughout, for instance, in the introductory chapter where digital twins are introduced and explained. The reader will find neither a rigorous mathematical definition of a digital twin nor a mathematically precise explanation of the individual phases of the inference cycle. However, it should be emphasized that this rather application-oriented focus fits well with most of the literature on digital twins, since the concept originates from applications and rigorous mathematical descriptions of it are rarely found in the literature.
All in all, the book provides a comprehensive overview of mathematical methods for digital twins. In particular, the wide range of approaches discussed in the book without losing the focus on digital twins and without requiring much expert knowledge makes it a valuable and unique contribution. In general, I believe that the book serves as a well-suited introduction to digital twins and as a good starting point for delving into the various methods that are at the core of a digital twin.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.