Elsevier

Neurocomputing

Volume 101, 4 February 2013, Pages 170-180

Neural network design and model reduction approach for black box nonlinear system identification with reduced number of parameters

https://doi.org/10.1016/j.neucom.2012.08.013

Abstract

In this paper, a dedicated recurrent neural network design and a model reduction approach are proposed in order to improve the balance between complexity and quality of black box nonlinear system identification models. The proposed neural network design, based on a three-layer architecture, helps to reduce the number of parameters of the model after the training phase without any significant loss of estimation accuracy. Nevertheless, the proposed architecture remains sufficiently general to provide a wide range of models among the most encountered in the literature. This reduction, achieved by a convenient choice of the activation functions and of the initial conditions of the synaptic weights, is developed in two steps. The first step is to train the proposed architecture under two reasonable assumptions. Then the recurrent three-layer neural network is transformed into a two-layer representation with fewer neurons, that is, a significantly reduced number of parameters. The constructed architecture provides models with a reasonably reduced number of parameters and convenient estimation accuracy. To validate the proposed approach, we identify the Wiener–Hammerstein benchmark nonlinear system proposed in SYSID2009 [1].

Introduction

Models are used in engineering fields for analysis, supervision, fault detection, prediction, estimation of unmeasurable variables, optimization and model-based process control [2], [3], [4], [5], [6], [7], [8], [9], [10]. Therefore, accurate process representations with a reasonably low complexity are naturally required [4]. In the sequel, for convenience, we shall denote the skillful balance between accuracy and complexity by model quality [11].

Basically, a model can be constructed along two routes or a combination of them [12]. One route is called physical modeling: this approach is based on the physical mechanisms that govern the system's behavior. The models obtained in this way are adequate approximations of the real process [11]. But in many cases involving complex nonlinear systems, it is very difficult or impossible to derive dynamic models based on all the physical processes involved [4], [13], [14]. Numerous systems in industrial fields exhibit nonlinearities and uncertainties, and can profitably be considered as partially or totally black box [15].

Black-box models use general mathematical approximation functions in order to describe the system's input/output relations. One of the most important advantages of black box system identification techniques is the limited physical insight required to develop the model [16], but as a trade-off, these techniques imply the use of model structures that are as flexible as possible. Often, this flexibility leads to a high number of parameters [17]. Since model quality is defined in terms of model complexity and its capability to reproduce the system behavior [11], the aim of this paper is to tackle this constraint with the design of a recurrent neural network adapted to system identification purposes and a model reduction approach. We shall see that this original procedure yields ready-to-use models with a small number of parameters in a black box approach.

Artificial neural networks consist of a large number of interconnected processing elements known as neurons that operate as microprocessors [18]. They are characterized by the following features [19], [20], [21], [22], [23], [24], [25]: the capability to approximate nonlinear functions via activation functions, the ability to process many inputs and outputs, and synaptic weights that are automatically adjusted by the learning algorithm during training. Recent research results show that neural networks are very effective for modeling complex nonlinear systems when the plant is considered as a black box [15], [5], especially those which are hard to describe mathematically [18], [26], [27]. However, the main problem of neural networks is that they require a large number of neurons to deal with complex systems [25]. More neurons promise a better approximation but lead to a more complex model too [18], [28].

Even though modern CPU, FPGA and graphics processor architectures are nowadays at our disposal, in applications like adaptive control, and in particular inverse control, the adaptive controllers have the same complexity as the reference models [6], [7], [8]. If low order reference models are introduced, the number of parameters to be computed will be reduced [29]; therefore, models with a small number of parameters are preferred. By reducing the number of neurons we directly reduce the number of parameters (synaptic weights) of the neural network model.
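Since every synaptic weight corresponds to a connection between two consecutive layers, the saving obtained by removing a layer of neurons can be made concrete with a quick count for a fully connected feedforward network; the layer sizes below are hypothetical, chosen only to illustrate the point.

```python
# Parameter (synaptic weight) count of a fully connected feedforward
# network: one weight per connection between consecutive layers.
# The layer sizes are hypothetical placeholders, not taken from the paper.

def n_weights(layer_sizes):
    """Number of weights between consecutive fully connected layers (no biases)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

three_layer = n_weights([20, 2, 1])   # e.g. 20 inputs, 2 hidden neurons, 1 output
two_layer = n_weights([20, 1])        # hidden layer folded into the output

print(three_layer, two_layer)  # 42 weights vs 20 weights
```

Removing one intermediate layer thus directly shrinks the weight count, which is the quantity the reduction approach targets.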

Various works have addressed the problem of balancing a large number of neurons against approximation accuracy. Some techniques consist in determining the "optimal" number of neurons. The trial and error method is one of them [30], but this approach is laborious and may not arrive at an optimal structure [31]. Network pruning techniques have also been successfully used for structural optimization [32], [6], [33]. These techniques consist in selecting a fairly large number of neurons, to be gradually reduced in the course of a series of trainings in order to find the "optimal" number of neurons. However, these methods still suffer from slow convergence [31]. More recently, evolutionary techniques such as genetic algorithms (GAs) and particle swarm optimization (PSO) have been employed to optimize weights, layers, numbers of input–output nodes and neurons, and to derive optimal neural network structures [31], [34]. Ge et al. [35] proposed a learning algorithm centered on a dissimilation particle swarm optimization, serving to compute the optimal synaptic weights, the learning rate, and the architecture evolution (optimal number of neurons in the hidden layer). Xie et al. [27] developed an algorithm to optimize the neural network by means of a connection matrix and genetic algorithms. The matrix represents the connections between neurons in adjacent layers (direct matrix mapping encoding, DMME) and the genetic algorithms help to find the optimal number of neurons in the hidden layer. Coelho and Wicthoff [4] use genetic programming (GP) to estimate the evolution of the architecture in order to generate more appropriate or accurate models. In [36] a hybrid multiobjective evolutionary artificial neural network, based on a combination of GAs and singular value architectural recombination (SVAR), is proposed to prune the neural network.
A disadvantage of the previously outlined techniques is the excessive time required to find the most convenient number of neurons, since the neural network must be trained every time the model is modified or restructured [26]. Moreover, to solve the problem of finding the best trade-off between model complexity and model accuracy, rather subjective criteria are typically used to decide whether the evolution of the neural network is appropriate and sufficient; thus, we can consider that this problem still remains unsolved.

Other techniques trying to solve the same problem are based on the design of the neural network. In [24] a novel time-delay recurrent neural network (TDRNN) is proposed to generate a simple structure. In [28] a neural network using a competitive scheme is proposed in order to provide an effective method with less network complexity. In [37] the selection of an appropriate FLANN (functional link artificial neural network) structure as the backbone of the model offers low complexity by means of a single layer ANN structure.

In the same vein, to overcome the disadvantages of the "architecture evolution techniques" and with the conviction that the improvement of model quality is linked to a suitable neural network design, we decided to tackle the problem by proposing a dedicated neural network design and a model reduction approach. In this sense, the main contribution of the paper is to develop a particular architecture depending on two factors: (1) the activation functions in each layer and (2) the initial conditions of the synaptic weights. The reader shall notice that the proposed architecture nevertheless remains sufficiently general to provide a wide range of useful model types. The model reduction approach is developed in two steps: the first step consists in training a recurrent three-layer neural network chosen to tolerate an initially large number of neurons. In a second step, the three-layer architecture is transformed into a two-layer representation with a significantly reduced number of weights, retaining the approximation accuracy of the previous three-layer model. The resulting neural network provides models with a reasonably reduced number of parameters and a convenient estimation accuracy. These model types are commonly used for model-based control techniques [5], [6], [7], [8]. The learning algorithm used to optimize the synaptic weights is the classical steepest descent algorithm in a backpropagation configuration, since the main purpose of this paper is not to optimize the learning algorithm.
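As a rough sketch of what classical steepest descent with backpropagation looks like in this setting, the following toy example trains a tiny feedforward net (tanh hidden layer, linear output neuron) on a synthetic nonlinear target. It is not the authors' recurrent architecture; all sizes, rates and data are illustrative assumptions.

```python
import numpy as np

# Minimal steepest-descent (gradient descent) training with backpropagation
# for a tiny feedforward net: tanh hidden layer, linear output neuron.
# Illustrative sketch only, not the paper's recurrent three-layer network.

rng = np.random.default_rng(0)
u = rng.uniform(-1, 1, size=(200, 1))       # input samples
y = np.sin(2 * u)                           # toy nonlinear target

W1 = rng.normal(scale=0.5, size=(1, 4))     # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(4, 1))     # hidden -> output weights
eta = 0.2                                   # learning rate

for _ in range(5000):
    h = np.tanh(u @ W1)                     # hidden activations
    y_hat = h @ W2                          # linear output layer
    e = y_hat - y                           # prediction error
    # Backpropagation: gradients of the mean squared error.
    gW2 = h.T @ e / len(u)
    gW1 = u.T @ ((e @ W2.T) * (1 - h**2)) / len(u)
    W2 -= eta * gW2                         # steepest-descent updates
    W1 -= eta * gW1

mse = float(np.mean((np.tanh(u @ W1) @ W2 - y) ** 2))
print(f"final MSE: {mse:.4f}")
```

The update rule is the plain gradient step W := W - eta * dE/dW; the paper's full training procedure (Appendix A) follows the same principle on the recurrent architecture.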

In the sequel, the paper is organized as follows: first we introduce the new neural network structure which allows us to investigate the balance between complexity and accuracy (see Section 2). In Section 3 the theorem on which our model reduction approach is based is given, as well as the related description of the method. In Section 4, the paper discusses the results of the identification of a benchmark system. Subsequently, conclusions and perspectives are given in Section 5. Finally, the neural network training procedure is presented in Appendix A.

Section snippets

Neural network design

As mentioned above, the new neural network design is presented in this section. Fig. 1 shows a recurrent three-layer neural network with 2n neurons in the input layer, two neurons in the hidden layer and one neuron in the output layer. By different combinations of activation functions, the proposed architecture allows us to easily generate four classical models first presented in [7]; according to Narendra and Parthasarathy, the proposed models are motivated by models which have been used in…
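A minimal forward-pass sketch of the architecture just described, with selectable activation functions φ1, φ2, φ3, might look as follows; the weights and the regressor half-length n are random placeholders, not values taken from the paper.

```python
import numpy as np

# Sketch of a forward pass through the described three-layer architecture:
# a 2n-neuron input layer, a two-neuron hidden layer and one output neuron.
# Choosing different activations phi1/phi2/phi3 yields different model
# classes; all weights here are random placeholders for illustration.

def identity(t):
    return t

def forward(x, W1, W2, W3, phi1=np.tanh, phi2=identity, phi3=identity):
    a1 = phi1(W1 @ x)         # input layer (2n neurons)
    a2 = phi2(W2 @ a1)        # hidden layer (two neurons)
    return phi3(W3 @ a2)      # single output neuron

rng = np.random.default_rng(1)
n = 3                          # hypothetical regressor half-length
x = rng.normal(size=2 * n)     # 2n-element regressor (past inputs/outputs)
W1 = rng.normal(size=(2 * n, 2 * n))
W2 = rng.normal(size=(2, 2 * n))
W3 = rng.normal(size=(1, 2))

y = forward(x, W1, W2, W3)     # nonlinear phi1, linear phi2 and phi3
print(y.shape)                 # (1,)
```

Swapping, e.g., `phi1=identity, phi2=np.tanh` changes which part of the regressor enters the model nonlinearly, which is how the combinations of activation functions select among the model classes.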

A new model reduction method

As already explained, this approach provides models of relevant quality in the sense defined in the preceding paragraph under two simple assumptions, whose purpose is to achieve two design conditions. The first one is a neural architecture design condition and the second one is a training design condition.

Assumption 1

At least all the activation functions of one layer should be chosen linear, that is, φ1(T)=T or φ2(T)=T or φ3(T)=T in Fig. 1.

The reader shall notice that Assumption 1 is not very restrictive,…
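The algebra that Assumption 1 enables is simply that consecutive linear maps compose into one: if, say, φ2 and φ3 are the identity, then W3 (W2 h) = (W3 W2) h, so the linear hidden layer can be folded into the output layer. A numeric sketch with arbitrary placeholder sizes:

```python
import numpy as np

# With a linear hidden layer (phi2(T) = T) and a linear output neuron,
# the last two weight matrices compose into a single matrix:
#     W3 @ (W2 @ h) == (W3 @ W2) @ h.
# This is the algebraic fact behind the three-layer -> two-layer
# reduction; sizes below are arbitrary placeholders.

rng = np.random.default_rng(2)
h = np.tanh(rng.normal(size=6))    # output of the (nonlinear) input layer
W2 = rng.normal(size=(2, 6))       # linear hidden layer weights
W3 = rng.normal(size=(1, 2))       # output neuron weights

y_three_layer = W3 @ (W2 @ h)      # original three-layer computation
W_reduced = W3 @ W2                # merged 1x6 weight matrix
y_two_layer = W_reduced @ h        # reduced two-layer computation

assert np.allclose(y_three_layer, y_two_layer)
print("reduced model reproduces the original exactly")
```

Note that the merged matrix has 6 weights where W2 and W3 together had 14, with no change at all in the computed output.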

Results

For the sake of clarity, we have chosen in the sequel to present only simulation results to illustrate the interest of our approach, even though satisfactory results have also been obtained on experimental setups such as an acoustic duct and a piezoelectric actuator. The identified system was the case study reported in the Wiener–Hammerstein benchmark, according to the description of [1].

Conclusion

A new neural network design for black box nonlinear system identification is proposed. The special design leads to an accurate model with a straightforward structure. We conclude that the model reduction procedure does not entail a loss of accuracy during the reduction or the training of the neural network, as commonly happens when other traditional techniques are used.

With the proposed model reduction approach the number of neurons initially chosen to identify a system does not…


References (39)

  • H. Ge et al., Identification and control of nonlinear systems by a dissimilation particle swarm optimization-based Elman neural network, Nonlinear Anal. Real World Appl. (2008)
  • B. Majhi et al., Robust identification of nonlinear complex systems using low complexity and particle swarm optimization technique, Expert Syst. Appl. (2011)
  • W. Yu, Multiple recurrent neural networks for stable adaptive control, Neurocomputing (2006)
  • J. Schoukens, J. Suykens, L. Ljung, Wiener–Hammerstein benchmark, in: 15th IFAC Symposium on System Identification, ...
  • K. Hangos et al., Analysis and Control of Nonlinear Process Systems (2004)
  • L. Coelho et al., Nonlinear identification using a B-spline neural network and chaotic immune approaches, Mech. Syst. Signal Process. (2009)
  • M. Noorgard et al., Neural Networks for Modelling and Control of Dynamic Systems (2000)
  • K. Narendra et al., Identification and control of dynamical systems using neural networks, IEEE Trans. Neural Networks (1990)
  • Z. Yan et al., Modeling and control of nonlinear discrete-time systems based on compound neural networks, Chin. J. Chem. Eng. (2009)

    Héctor Manuel Romero Ugalde received his Bachelor's degree in Electronic Engineering from the University of Veracruz (2002–2006 Xalapa, Ver. Mexico), and the Master of Science degree in Electronic Engineering from the National Research and Technological Development Center (CENIDET) (2006–2008 Cuernavaca, Mor. MEXICO).

Now he is pursuing the doctorate degree in automatic control under the co-supervision of CENIDET and Arts et Metiers ParisTech (2008–2012 Aix en Provence, France). His current research interests include neural networks and nonlinear system identification.

    Jean Claude Carmona is a full professor in Automatic Control at Arts et Metiers ParisTech, engineering high school in Aix en Provence, France.

    He received his PhD in automatic control in 1990 then his HDR degree in 2004 in the Aix Marseille University.

His main research interests include identification methods, particularly robust estimation and nonlinear identification methods dedicated to robust control. His main application domains are complex mechanical systems, especially for vibration control purposes, and hybrid multisource systems using renewable energies.

    Victor M. Alvarado was born in Mexico in 1968. He received the Ph.D. degree in automatic control from Laboratoire d’Automatique de Grenoble in France, 2001. He is presently an Associate Professor of automatic control in the Electronics Engineering Department of CENIDET, Cuernavaca, Mexico. His research interests encompass theory and applications in systems identification, model validation, and process control.

    Juan Reyes-Reyes earned his electronics technician high school degree at the Technological Institute of Saltillo (Instituto Tecnológico de Saltillo) from 1987 to 1990, at the same institute (1990–1994 Saltillo, Coah. MEXICO) he earned his Industrial and Electronics Engineer bachelor's degree.

    Additionally, he achieved his Electrical Engineering Master in Sciences' degree (1995–1997 Mexico City) at the Advanced Research and Studies Center (CINVESTAV), in the same center he also achieved his Automatic Control Doctor in Sciences' degree (1998–2001).

He has a Level 1 membership grade in the National Researchers System (Sistema Nacional de Investigadores, MEXICO) with the membership number 32033. He is a member of the CONACYT Engineering and Industry Recognized Referees Record (MEXICO), membership number RCEA-07-19975-2010. He was a research professor at the Electronics Engineering Department (2009) of the National Research and Technological Development Center (CENIDET) in Cuernavaca, Mor. Mexico.

Nowadays (June 2012) he is a research professor at the Technological Institute of Zacatepec (Instituto Tecnológico de Zacatepec).

    His areas of interest are control of uncertain non-linear systems, dynamic neural networks, passivity, Lyapunov analysis, fault tolerant systems, hardware and software integration.
