Elsevier

Neurocomputing

Volume 101, 4 February 2013, Pages 170-180

Neural network design and model reduction approach for black box nonlinear system identification with reduced number of parameters

https://doi.org/10.1016/j.neucom.2012.08.013

Abstract

In this paper, a dedicated recurrent neural network design and a model reduction approach are proposed in order to improve the balance between complexity and quality of black box nonlinear system identification models. The proposed neural network design, based on a three-layer architecture, helps to reduce the number of parameters of the model after the training phase without any significant loss of estimation accuracy. Nevertheless, the proposed architecture remains sufficiently general to provide a wide range of models among the most encountered in the literature. This reduction, achieved by a convenient choice of the activation functions and of the initial conditions of the synaptic weights, is developed in two steps. The first step is to train the proposed architecture under two reasonable assumptions. Then the recurrent three-layer neural network is transformed into a two-layer representation with fewer neurons, that is, a significantly reduced number of parameters. The constructed architecture provides models with a reasonably reduced number of parameters and convenient estimation accuracy. To validate the proposed approach, we identify the Wiener–Hammerstein benchmark nonlinear system proposed in SYSID2009 [1].

Introduction

Models are used in engineering fields for analysis, supervision, fault detection, prediction, estimation of unmeasurable variables, optimization and model-based process control [2], [3], [4], [5], [6], [7], [8], [9], [10]. Therefore, accurate process representations with a reasonably low complexity are naturally required [4]. In the sequel, for convenience, we shall denote the skillful balance between accuracy and complexity by model quality [11].

Basically, a model can be constructed along two routes or a combination of them [12]. One route is called physical modeling: this approach is based on the physical mechanisms that govern the system's behavior. The models obtained in this way are adequate approximations of the real process [11]. But in many cases involving complex nonlinear systems, it is very difficult or impossible to derive dynamic models based on all the physical processes involved [4], [13], [14]. Numerous systems in industrial fields exhibit nonlinearities and uncertainties, and can profitably be considered as partially or totally black box [15].

Black-box models use general mathematical approximation functions in order to describe the system's input/output relations. One of the most important advantages of black box system identification techniques is the limited physical insight required to develop the model [16], but as a trade-off, these techniques imply the use of model structures that are as flexible as possible. Often, this flexibility leads to a high number of parameters [17]. Since model quality is defined in terms of model complexity and its capability to reproduce the system behavior [11], the aim of this paper is to tackle this constraint with the design of a recurrent neural network adapted to system identification purposes and a model reduction approach. We shall see that this original procedure yields ready-to-use models with a small number of parameters in a black box approach.

Artificial neural networks consist of a large number of interconnected processing elements known as neurons that operate as microprocessors [18]. They are characterized by the following features [19], [20], [21], [22], [23], [24], [25]: the capability to approximate nonlinear functions via activation functions, the ability to process many inputs and outputs, and synaptic weights that are automatically adjusted by the learning algorithm during training. Recent research results show that neural networks are very effective for modeling complex nonlinear systems when the plant is considered as a black box [15], [5], especially those which are hard to describe mathematically [18], [26], [27]. However, the main problem of neural networks is that they require a large number of neurons to deal with complex systems [25]. More neurons promise a better approximation but lead to a more complex model too [18], [28].

Even though modern CPU, FPGA and graphics processor architectures are nowadays at our disposal, in applications like adaptive control, and in particular inverse control, the adaptive controllers have the same complexity as the reference models [6], [7], [8]. If low order reference models are introduced, the number of parameters to be computed will be reduced [29]; therefore, models with a small number of parameters are preferred. By reducing the number of neurons we directly reduce the number of parameters (synaptic weights) of the neural network model.
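Since every synaptic weight corresponds to a connection between two consecutive layers, the saving obtained by removing a layer of neurons can be made concrete with a quick count for a fully connected feedforward network; the layer sizes below are hypothetical, chosen only to illustrate the point.

```python
# Parameter (synaptic weight) count of a fully connected feedforward
# network: one weight per connection between consecutive layers.
# The layer sizes are hypothetical placeholders, not taken from the paper.

def n_weights(layer_sizes):
    """Number of weights between consecutive fully connected layers (no biases)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

three_layer = n_weights([20, 2, 1])   # e.g. 20 inputs, 2 hidden neurons, 1 output
two_layer = n_weights([20, 1])        # hidden layer folded into the output

print(three_layer, two_layer)  # 42 weights vs 20 weights
```

Removing one intermediate layer thus directly shrinks the weight count, which is the quantity the reduction approach targets.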

Various works have addressed the problem of balancing a large number of neurons against approximation accuracy. Some techniques consist in determining the "optimal" number of neurons. The trial and error method is one of them [30], but this approach is laborious and may not arrive at an optimal structure [31]. Network pruning techniques have also been successfully used for structural optimization [32], [6], [33]. These techniques consist in selecting a fairly large number of neurons, to be gradually reduced in the course of a series of trainings in order to find the "optimal" number of neurons. However, these methods still suffer from slow convergence [31]. More recently, evolutionary techniques such as genetic algorithms (GAs) and particle swarm optimization (PSO) have been employed to optimize weights, layers, numbers of input–output nodes and neurons, and to derive optimal neural network structures [31], [34]. Ge et al. [35] proposed a learning algorithm centered on a dissimilation particle swarm optimization, serving to compute the optimal synaptic weights, the learning rate, and the architecture evolution (optimal number of neurons in the hidden layer). Xie et al. [27] developed an algorithm to optimize the neural network by means of a connection matrix and genetic algorithms. The matrix represents the connections between neurons in adjacent layers (direct matrix mapping encoding, DMME) and the genetic algorithms help to find the optimal number of neurons in the hidden layer. Coelho and Wicthoff [4] use genetic programming (GP) to estimate the evolution of the architecture in order to generate more appropriate or accurate models. In [36] a hybrid multiobjective evolutionary artificial neural network, based on a combination of GAs and singular value architectural recombination (SVAR), is proposed to prune the neural network.
A disadvantage of the previously outlined techniques is the excessive time required to find the most convenient number of neurons, since the neural network must be trained every time the model is modified or restructured [26]. Moreover, to solve the problem of finding the best trade-off between model complexity and model accuracy, rather subjective criteria are typically used to decide whether the evolution of the neural network is appropriate and sufficient; thus, we can consider that this problem still remains unsolved.

Other techniques trying to solve the same problem are based on the design of the neural network. In [24] a novel time-delay recurrent neural network (TDRNN) is proposed to generate a simple structure. In [28] a neural network using a competitive scheme is proposed in order to provide an effective method with less network complexity. In [37] the selection of an appropriate FLANN (functional link artificial neural network) structure as the backbone of the model offers low complexity by means of a single layer ANN structure.

In the same vein, to overcome the disadvantages of the "architecture evolution techniques" and with the conviction that the improvement of model quality is linked to a suitable neural network design, we decided to tackle the problem by proposing a dedicated neural network design and a model reduction approach. In this sense, the main contribution of the paper is to develop a particular architecture depending on two factors: (1) the activation functions in each layer and (2) the initial conditions of the synaptic weights. The reader shall notice that the proposed architecture nevertheless remains sufficiently general to provide a wide range of useful model types. The model reduction approach is developed in two steps: the first step consists in training a recurrent three-layer neural network chosen to tolerate an initially large number of neurons. In a second step, the three-layer architecture is transformed into a two-layer representation with a significantly reduced number of weights, retaining the approximation accuracy of the previous three-layer model. The resulting neural network provides models with a reasonably reduced number of parameters and a convenient estimation accuracy. These model types are commonly used for model-based control techniques [5], [6], [7], [8]. The learning algorithm used to optimize the synaptic weights is the classical steepest descent algorithm in a backpropagation configuration, since the main purpose of this paper is not to optimize the learning algorithm.
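As a rough sketch of what classical steepest descent with backpropagation looks like in this setting, the following toy example trains a tiny feedforward net (tanh hidden layer, linear output neuron) on a synthetic nonlinear target. It is not the authors' recurrent architecture; all sizes, rates and data are illustrative assumptions.

```python
import numpy as np

# Minimal steepest-descent (gradient descent) training with backpropagation
# for a tiny feedforward net: tanh hidden layer, linear output neuron.
# Illustrative sketch only, not the paper's recurrent three-layer network.

rng = np.random.default_rng(0)
u = rng.uniform(-1, 1, size=(200, 1))       # input samples
y = np.sin(2 * u)                           # toy nonlinear target

W1 = rng.normal(scale=0.5, size=(1, 4))     # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(4, 1))     # hidden -> output weights
eta = 0.2                                   # learning rate

for _ in range(5000):
    h = np.tanh(u @ W1)                     # hidden activations
    y_hat = h @ W2                          # linear output layer
    e = y_hat - y                           # prediction error
    # Backpropagation: gradients of the mean squared error.
    gW2 = h.T @ e / len(u)
    gW1 = u.T @ ((e @ W2.T) * (1 - h**2)) / len(u)
    W2 -= eta * gW2                         # steepest-descent updates
    W1 -= eta * gW1

mse = float(np.mean((np.tanh(u @ W1) @ W2 - y) ** 2))
print(f"final MSE: {mse:.4f}")
```

The update rule is the plain gradient step W := W - eta * dE/dW; the paper's full training procedure (Appendix A) follows the same principle on the recurrent architecture.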

In the sequel, the paper is organized as follows: first we introduce the new neural network structure which allows us to investigate the balance between complexity and accuracy (see Section 2). In Section 3 the theorem on which our model reduction approach is based is given, as well as the related description of the method. In Section 4, the paper discusses the results of the identification of a benchmark system. Subsequently, conclusions and perspectives are given in Section 5. Finally, the neural network training procedure is presented in Appendix A.

Section snippets

Neural network design

As mentioned above, the new neural network design is presented in this section. Fig. 1 shows a recurrent three-layer neural network with 2n neurons in the input layer, two neurons in the hidden layer and one neuron in the output layer. By different combinations of activation functions, the proposed architecture allows us to easily generate four classical models first presented in [7]; according to Narendra and Parthasarathy, the proposed models are motivated by models which have been used in…
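A minimal forward-pass sketch of the architecture just described, with selectable activation functions φ1, φ2, φ3, might look as follows; the weights and the regressor half-length n are random placeholders, not values taken from the paper.

```python
import numpy as np

# Sketch of a forward pass through the described three-layer architecture:
# a 2n-neuron input layer, a two-neuron hidden layer and one output neuron.
# Choosing different activations phi1/phi2/phi3 yields different model
# classes; all weights here are random placeholders for illustration.

def identity(t):
    return t

def forward(x, W1, W2, W3, phi1=np.tanh, phi2=identity, phi3=identity):
    a1 = phi1(W1 @ x)         # input layer (2n neurons)
    a2 = phi2(W2 @ a1)        # hidden layer (two neurons)
    return phi3(W3 @ a2)      # single output neuron

rng = np.random.default_rng(1)
n = 3                          # hypothetical regressor half-length
x = rng.normal(size=2 * n)     # 2n-element regressor (past inputs/outputs)
W1 = rng.normal(size=(2 * n, 2 * n))
W2 = rng.normal(size=(2, 2 * n))
W3 = rng.normal(size=(1, 2))

y = forward(x, W1, W2, W3)     # nonlinear phi1, linear phi2 and phi3
print(y.shape)                 # (1,)
```

Swapping, e.g., `phi1=identity, phi2=np.tanh` changes which part of the regressor enters the model nonlinearly, which is how the combinations of activation functions select among the model classes.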

A new model reduction method

As already explained, this approach provides models of relevant quality in the sense defined in the preceding paragraph under two simple assumptions, whose purpose is to achieve two design conditions. The first one is a neural architecture design condition and the second one is a training design condition.

Assumption 1

At least all the activation functions of one layer should be chosen linear, that is, φ1(T)=T or φ2(T)=T or φ3(T)=T in Fig. 1.

The reader shall notice that Assumption 1 is not very restrictive,…
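The algebra that Assumption 1 enables is simply that consecutive linear maps compose into one: if, say, φ2 and φ3 are the identity, then W3 (W2 h) = (W3 W2) h, so the linear hidden layer can be folded into the output layer. A numeric sketch with arbitrary placeholder sizes:

```python
import numpy as np

# With a linear hidden layer (phi2(T) = T) and a linear output neuron,
# the last two weight matrices compose into a single matrix:
#     W3 @ (W2 @ h) == (W3 @ W2) @ h.
# This is the algebraic fact behind the three-layer -> two-layer
# reduction; sizes below are arbitrary placeholders.

rng = np.random.default_rng(2)
h = np.tanh(rng.normal(size=6))    # output of the (nonlinear) input layer
W2 = rng.normal(size=(2, 6))       # linear hidden layer weights
W3 = rng.normal(size=(1, 2))       # output neuron weights

y_three_layer = W3 @ (W2 @ h)      # original three-layer computation
W_reduced = W3 @ W2                # merged 1x6 weight matrix
y_two_layer = W_reduced @ h        # reduced two-layer computation

assert np.allclose(y_three_layer, y_two_layer)
print("reduced model reproduces the original exactly")
```

Note that the merged matrix has 6 weights where W2 and W3 together had 14, with no change at all in the computed output.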

Results

For the sake of clarity, we have chosen in the sequel to present only simulation results to illustrate the interest of our approach, even though satisfactory results have also been obtained on experimental setups such as an acoustic duct and a piezoelectric actuator. The identified system was the case study reported in the Wiener–Hammerstein benchmark, according to the description of [1].

Conclusion

A new neural network design for black box nonlinear system identification is proposed. The special design leads to an accurate model with a straightforward structure. We conclude that the model reduction procedure does not entail a loss of accuracy during the reduction or the training of the neural network, as commonly happens when other traditional techniques are used.

With the proposed model reduction approach the number of neurons initially chosen to identify a system does not…


References (39)

  • H. Ge et al., Identification and control of nonlinear systems by a dissimilation particle swarm optimization-based Elman neural network, Nonlinear Anal. Real World Appl. (2008)
  • B. Majhi et al., Robust identification of nonlinear complex systems using low complexity and particle swarm optimization technique, Expert Syst. Appl. (2011)
  • W. Yu, Multiple recurrent neural networks for stable adaptive control, Neurocomputing (2006)
  • J. Schoukens, J. Suykens, L. Ljung, Wiener–Hammerstein benchmark, in: 15th IFAC Symposium on System Identification, ...
  • K. Hangos et al., Analysis and Control of Nonlinear Process Systems (2004)
  • L. Coelho et al., Nonlinear identification using a B-spline neural network and chaotic immune approaches, Mech. Syst. Signal Process. (2009)
  • M. Noorgard et al., Neural Networks for Modelling and Control of Dynamic Systems (2000)
  • K. Narendra et al., Identification and control of dynamical systems using neural networks, IEEE Trans. Neural Networks (1990)
  • Z. Yan et al., Modeling and control of nonlinear discrete-time systems based on compound neural networks, Chin. J. Chem. Eng. (2009)

    Héctor Manuel Romero Ugalde received his Bachelor's degree in Electronic Engineering from the University of Veracruz (2002–2006 Xalapa, Ver. Mexico), and the Master of Science degree in Electronic Engineering from the National Research and Technological Development Center (CENIDET) (2006–2008 Cuernavaca, Mor. MEXICO).

Now he is pursuing the doctorate degree in automatic control under the co-supervision of CENIDET and Arts et Metiers ParisTech (2008–2012 Aix en Provence, France). His current research interests include neural networks and nonlinear system identification.

    Jean Claude Carmona is a full professor in Automatic Control at Arts et Metiers ParisTech, engineering high school in Aix en Provence, France.

    He received his PhD in automatic control in 1990 then his HDR degree in 2004 in the Aix Marseille University.

His main research interests include identification methods, particularly robust estimation and nonlinear identification methods dedicated to robust control. His main application domains are complex mechanical systems, especially for vibration control purposes, and hybrid multisource systems using renewable energies.

    Victor M. Alvarado was born in Mexico in 1968. He received the Ph.D. degree in automatic control from Laboratoire d’Automatique de Grenoble in France, 2001. He is presently an Associate Professor of automatic control in the Electronics Engineering Department of CENIDET, Cuernavaca, Mexico. His research interests encompass theory and applications in systems identification, model validation, and process control.

    Juan Reyes-Reyes earned his electronics technician high school degree at the Technological Institute of Saltillo (Instituto Tecnológico de Saltillo) from 1987 to 1990, at the same institute (1990–1994 Saltillo, Coah. MEXICO) he earned his Industrial and Electronics Engineer bachelor's degree.

    Additionally, he achieved his Electrical Engineering Master in Sciences' degree (1995–1997 Mexico City) at the Advanced Research and Studies Center (CINVESTAV), in the same center he also achieved his Automatic Control Doctor in Sciences' degree (1998–2001).

He has a Level 1 membership grade in the National Researchers System (Sistema Nacional de Investigadores, MEXICO) with the membership number 32033. He is a member of the CONACYT Engineering and Industry Recognized Referees Record (MEXICO), membership number RCEA-07-19975-2010. He was a research professor at the Electronics Engineering Department (2009) of the National Research and Technological Development Center (CENIDET) in Cuernavaca, Mor. Mexico.

Nowadays (June 2012) he is a research professor at the Technological Institute of Zacatepec (Instituto Tecnológico de Zacatepec).

    His areas of interest are control of uncertain non-linear systems, dynamic neural networks, passivity, Lyapunov analysis, fault tolerant systems, hardware and software integration.
