Elsevier

Neural Networks

Volume 14, Issue 9, November 2001, Pages 1161-1172

Invited article
How to be a gray box: dynamic semi-physical modeling

https://doi.org/10.1016/S0893-6080(01)00096-X

Abstract

A general methodology for gray-box, or semi-physical, modeling is presented. The technique is intended to combine the best of two worlds: knowledge-based modeling, in which mathematical equations describing a process are derived from a physical (or chemical, biological, etc.) analysis, and black-box modeling, in which a parameterized model is designed and its parameters are estimated solely from measurements made on the process. Gray-box modeling is valuable whenever a knowledge-based model exists but is not fully satisfactory and cannot be improved by further analysis (or can be improved only at a very large computational cost). We describe the design methodology of a gray-box model and illustrate it with a didactic example. We emphasize the importance of the choice of the discretization scheme used for transforming the differential equations of the knowledge-based model into a set of discrete-time recurrent equations. Finally, an application to a real, complex industrial process is presented.

Introduction

The traditional view of neural networks is that of ‘black-box models’, i.e. models that are obtained from data alone, through an elaborate parameter estimation process called training. Although these methods have gained wide acceptance in industry, there is understandable reluctance, for processes that have been investigated extensively from the point of view of physics (or chemistry, biology, etc.), to abandon a knowledge-based model altogether in favor of a black-box model, even though the latter is expected to be more accurate. Very frequently, a knowledge-based model is available that is not satisfactory for the purpose of interest but still accounts for the main features of the dynamics of the process. In such a case, it is desirable to take advantage of the existing knowledge while keeping the flexibility of parameterized models trained from data.

In the present paper, we review a technique called semi-physical modeling, or knowledge-based neural modeling, which allows the model designer to incorporate into the structure of the neural network whatever prior knowledge is available, provided that knowledge is expressed as algebraic or differential equations. We present a general methodology for designing semi-physical models, and we emphasize the importance of the discretization scheme used to transform differential equations arising from physics into discrete-time equations that are suitable for numerical processing. Traditionally, recurrent neural networks have been used in the framework of explicit discretization schemes; we show that they can also be used within the framework of implicit discretization schemes, which may improve the stability of the recurrent network model to a large extent.
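The idea of embedding prior knowledge in a trainable model can be illustrated with a minimal sketch, which is not the method of the paper. It assumes a hypothetical scalar process whose knowledge-based model is dx/dt = -k·x with a known k, and adds a single correction term theta·x³ whose parameter is estimated from data by least squares. In the paper the trainable part is a neural network; the single parameter here merely stands in for it. All names (`simulate_process`, `fit_correction`, `gray_box_step`) and all numerical values are assumptions made for the example.

```python
# Hypothetical gray-box sketch: known physics plus one trainable correction.
# All names and values are illustrative assumptions, not taken from the paper.

K_KNOWN = 1.0    # parameter of the knowledge-based model dx/dt = -k*x
C_HIDDEN = 0.5   # unmodeled effect in the "real" process: dx/dt = -k*x - c*x**3
H = 0.01         # sampling period, also used as the explicit Euler step


def simulate_process(x0, n_steps):
    """Generate synthetic 'measurements' from the full (hidden) dynamics."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        xs.append(x + H * (-K_KNOWN * x - C_HIDDEN * x ** 3))
    return xs


def fit_correction(xs):
    """Estimate theta in dx/dt = -k*x + theta*x**3 by one-step least squares.

    The part of the measured derivative that the known physics does not
    explain is regressed on the basis function x**3.
    """
    num = 0.0
    den = 0.0
    for x_now, x_next in zip(xs[:-1], xs[1:]):
        residual = (x_next - x_now) / H + K_KNOWN * x_now
        phi = x_now ** 3
        num += residual * phi
        den += phi * phi
    return num / den


def gray_box_step(x, theta):
    """One explicit Euler step of the semi-physical model."""
    return x + H * (-K_KNOWN * x + theta * x ** 3)


measurements = simulate_process(x0=2.0, n_steps=500)
theta_hat = fit_correction(measurements)
print(f"estimated theta = {theta_hat:.4f}")
```

Because the synthetic data are generated with the same explicit Euler step that the model uses, the estimate should recover the hidden coefficient almost exactly. With real measurements, noise and discretization error would both affect the fit.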

The first part of the paper is devoted to the presentation of the mathematical framework of semi-physical neural modeling. The various steps of the design of a model are explained and illustrated by a didactic example. An original training algorithm is introduced, for use with models that are based on an implicit discretization scheme. In the second part of the paper, we describe an industrial application that makes use of this design strategy.
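Why the discretization scheme matters for stability can be seen on a toy example. The sketch below is an illustration only, not the training algorithm introduced in the paper. It discretizes the assumed linear test equation dx/dt = -a·x (with a > 0) using the explicit and the implicit Euler schemes. The function names and numerical values are hypothetical.

```python
# Explicit vs. implicit Euler on the test equation dx/dt = -a*x, a > 0.
# Illustrative assumption: a scalar linear process, chosen so that the
# implicit update can be written in closed form.


def explicit_euler(a, h, x0, n_steps):
    """x[k+1] = x[k] + h * f(x[k]) with f(x) = -a*x."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        xs.append(x + h * (-a * x))
    return xs


def implicit_euler(a, h, x0, n_steps):
    """x[k+1] = x[k] + h * f(x[k+1]); for f(x) = -a*x this solves to
    x[k+1] = x[k] / (1 + a*h)."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        xs.append(x / (1.0 + a * h))
    return xs


A = 10.0   # fast dynamics (time constant 0.1)
STEP = 0.25   # step size deliberately larger than 2/A

explicit_traj = explicit_euler(A, STEP, x0=1.0, n_steps=20)
implicit_traj = implicit_euler(A, STEP, x0=1.0, n_steps=20)

print("explicit |x_20| =", abs(explicit_traj[-1]))
print("implicit |x_20| =", abs(implicit_traj[-1]))
```

For this equation the explicit recurrence multiplies the state by (1 - a·h) at each step, so it diverges as soon as a·h > 2, whereas the implicit recurrence multiplies it by 1/(1 + a·h), which lies between 0 and 1 for any positive step. For a nonlinear model the implicit update generally has no closed form and must be solved numerically at each step.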

Section snippets

Dynamic semi-physical neural modeling: mathematical framework

A knowledge-based model is a mathematical description of the phenomena that occur in a process, based on the equations of physics and chemistry (or biology, sociology, etc.); typically, the equations involved in the model may be transport equations, equations of thermodynamics, mass conservation equations, etc. They contain parameters that have a physical meaning (e.g. activation energies, diffusion coefficients, etc.), and they may also contain a small number of parameters that are determined

Outline of the problem

We present a problem that was investigated in the framework of a collaboration between our group and 3M Inc.

The process to be modeled is the oven drying of a thin coating on an impermeable substrate. The coating (polymer phase) contains a single nonvolatile component—the polymer—and a single volatile component—the solvent. The solvent diffuses through the polymer and evaporates into the gas phase when reaching the surface. No diffusion takes place through the substrate. The evaporation of the

Conclusion

We have shown that dynamic semi-physical (knowledge-based) neural modeling can be a very powerful strategy for the design of models that combine the best of two worlds: the legibility of knowledge-based models and the flexibility of training from experimental data. It allows the integration of the mathematical equations derived from a knowledge-based model into the structure of the neural network. A very important issue, namely the stability of the recurrent neural network model, has been

Acknowledgements

The authors are very grateful to Dr Romdhane, Dr Stoppiglia and Dr Kinoghlu, of 3M, for many fruitful, stimulating and enlightening discussions.

Cited by (69)

  • On Recurrent Neural Networks for learning-based control: Recent results and ideas for future developments

    2022, Journal of Process Control
    Citation Excerpt :

    Depending on the case at hand, such grey-box approaches consist of shaping the structure of the RNN model in specific ways or using, during the training procedure, a suitable loss function, so as to impose consistency with the physical knowledge of the system. Contributions in this direction have been proposed by many authors, see e.g. the definition of Semi-Empirical NN [92], the use of canonical forms [93], or the methods described in [94,95]. As for control, a notable and almost unique contribution to this emerging field has been presented in [17], where a methodological approach has been devised and applied to a simulated chemical process.

  • A novel implicit hybrid machine learning model and its application for reinforcement learning

    2021, Computers and Chemical Engineering
    Citation Excerpt :

    Higher order explicit methods have also been leveraged (Lovelett et al., 2019). Oussar et al. demonstrated that implicit methods can be used to solve hybrid models, using the fixed-point method (Oussar and Dreyfus, 2001). Oussar et al. warned against the use of implicit schemes because of their complex implementation.

  • Including steady-state information in nonlinear models: An application to the development of soft-sensors

    2021, Engineering Applications of Artificial Intelligence
    Citation Excerpt :

    The proposed methodology does not assume any particular structures and the auxiliary information is incorporated in the parameter estimation stage by changing the cost function in a multiobjective fashion. In terms of model class, grey-box identification was first implemented using linear structures (Tulleken, 1993; Eskinat et al., 1993; Johansen, 1996), but it seems more powerful for nonlinear structures, including polynomial models (Corrêa et al., 2002; Nepomuceno et al., 2003; Aguirre et al., 2004a; Barbosa et al., 2011), radial basis functions (RBF) (Aguirre et al., 2007; Chen et al., 2009, 2011; de Almeida Rego et al., 2014), fuzzy systems (Abonyi et al., 2001; Abdelazim and Malik, 2005; Sánchez et al., 2014), multilayer perceptron (MLP) or recurrent (RNN) neural networks (Psichogios and Ungar, 1992; Thompson and Kramer, 1994; Braake et al., 1998; Oussar and Dreyfus, 2001; Aguirre et al., 2004; Wu et al., 2020). Models that are linear with respect to the parameters usually lead to convex problems (Corrêa et al., 2002; Nepomuceno et al., 2003; Aguirre et al., 2004a; Barbosa et al., 2011), that are easier to deal with and the static curve can be sometimes determined analytically depending on the model structure (Aguirre et al., 2004a).
