Invited article
How to be a gray box: dynamic semi-physical modeling
Introduction
The traditional view of neural networks is that of ‘black-box models’, i.e. models that are obtained from data alone, through an elaborate parameter estimation process called training. Although these methods have gained wide acceptance in industry, it is understandable that, for processes that have been investigated extensively from the point of view of physics (or chemistry, biology, etc.), there is some reluctance to forsake altogether a knowledge-based model in order to design a black-box model even though the latter is expected to be more accurate. Very frequently, a knowledge-based model is available, which is not satisfactory for the purpose of interest, but still accounts for the main features of the dynamics of the process. In such a case, it is desirable to take advantage of the existing knowledge while keeping the flexibility of parameterized models trained from data. In the present paper, we review a technique called semi-physical modeling, or knowledge-based neural modeling, which allows the model designer to incorporate, into the structure of the neural network, whatever prior knowledge is available, provided the latter is expressed as algebraic or differential equations. We present a general methodology for designing semi-physical models; we emphasize the importance of the discretization scheme used to transform differential equations arising from physics into discrete-time equations that are suitable for numerical processing. Traditionally, recurrent neural networks have been used in the framework of explicit discretization schemes; we show that they can also be used within the framework of implicit discretization schemes, which may improve the stability of the recurrent network model to a large extent.
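The effect of the discretization scheme on stability can be illustrated with a minimal sketch. The linear test equation dx/dt = -a x, the values of `a` and `h`, and the function names below are illustrative assumptions; they are not taken from the paper's didactic example or its industrial application. With an explicit Euler scheme the discrete-time recurrence diverges as soon as a·h > 2, whereas the implicit Euler scheme stays stable for any positive step size.

```python
import numpy as np

def explicit_euler(a, h, x0, n_steps):
    """Explicit scheme: x[k+1] = x[k] + h * f(x[k]), with f(x) = -a * x."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] + h * (-a * x[k])
    return x

def implicit_euler(a, h, x0, n_steps):
    """Implicit scheme: x[k+1] = x[k] + h * f(x[k+1]).

    For the linear f(x) = -a * x the implicit equation can be solved
    in closed form: x[k+1] = x[k] / (1 + a * h).
    """
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] / (1.0 + a * h)
    return x

# Hypothetical values chosen so that a * h = 2.5 > 2 (explicit scheme unstable).
a, h = 10.0, 0.25
x_exp = explicit_euler(a, h, x0=1.0, n_steps=20)
x_imp = implicit_euler(a, h, x0=1.0, n_steps=20)

print("explicit |x[20]| =", abs(x_exp[-1]))  # grows with each step
print("implicit |x[20]| =", abs(x_imp[-1]))  # decays toward zero
```

In a nonlinear semi-physical model the implicit equation generally has no closed-form solution and must be solved numerically at each time step, which is why an implicit scheme is harder to implement than an explicit one.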
The first part of the paper is devoted to the presentation of the mathematical framework of semi-physical neural modeling. The various steps of the design of a model are explained and illustrated by a didactic example. An original training algorithm is introduced, for use with models that are based on an implicit discretization scheme. In the second part of the paper, we describe an industrial application that makes use of this design strategy.
Dynamic semi-physical neural modeling: mathematical framework
A knowledge-based model is a mathematical description of the phenomena that occur in a process, based on the equations of physics and chemistry (or biology, sociology, etc.); typically, the equations involved in the model may be transport equations, equations of thermodynamics, mass conservation equations, etc. They contain parameters that have a physical meaning (e.g. activation energies, diffusion coefficients, etc.), and they may also contain a small number of parameters that are determined
Outline of the problem
We present a problem that was investigated in the framework of a collaboration between our group and 3M Inc.
The process to be modeled is the oven drying of a thin coating on an impermeable substrate. The coating (polymer phase) contains a single nonvolatile component—the polymer—and a single volatile component—the solvent. The solvent diffuses through the polymer and evaporates into the gas phase when reaching the surface. No diffusion takes place through the substrate. The evaporation of the
Conclusion
We have shown that dynamic semi-physical (knowledge-based) neural modeling can be a very powerful strategy for the design of models that combine the best of two worlds: the legibility of knowledge-based models and the flexibility of training from experimental data. It allows the integration of the mathematical equations derived from a knowledge-based model into the structure of the neural network. A very important issue, namely the stability of the recurrent neural network model, has been
Acknowledgements
The authors are very grateful to Dr Romdhane, Dr Stoppiglia and Dr Kinoghlu, of 3M, for many fruitful, stimulating and enlightening discussions.