Survey of random neural network applications

https://doi.org/10.1016/S0377-2217(99)00481-6

Abstract

This paper presents a survey of various engineering applications based on the random neural network (RNN) model [Neural Computation 1(4) (1989) 502–511; 2(2) (1990) 239–247; Comptes-Rendus Acad. Sci. t. Série II 310(2) (1990) 177–180; Applied Math. Modelling 15 (1991) 534–541; Neural Computation 5(1) (1993) 154–164], together with a summary of recent image processing techniques based on the model, namely still image compression, image enlargement, and image fusion. The advantage of the RNN model is that it is closer to biophysical reality and mathematically more tractable than standard neural methods, especially when used as a recurrent structure.

Introduction

Neural networks and related techniques are powerful tools that have proven efficient in real-world engineering applications such as associative memory, computer communications [6], [7], combinatorial optimization, system identification and control, function approximation, pattern recognition, and signal processing [8], where problems are often ill-defined or difficult to formulate analytically. Neural networks possess properties which give them advantages over other methods. Because of their parallel architecture, they can overcome many otherwise limiting computational difficulties. Since they are trained on example data, they adapt readily to changes in the input data, as training can continue during the processing of new information. Their high degree of connectivity allows neural networks to self-organize, which is important when the structure of the data is not known beforehand. Finally, owing to the analogy between neural networks and neurobiological systems, existing biological networks can serve as models for designing artificial neural networks.

This paper will touch upon previous applications of the random neural network (RNN) model, which is described in detail in [1], [2], [3], [4], [5], together with a brief description of recent image processing methods [9], [30] also based on the same model. Hence, it is helpful to explain the dynamics of the model first.

In the random neural network model by Gelenbe [1], [2], [4], signals in the form of impulses which have unit amplitude travel among the neurons. Positive signals represent excitation, whereas negative signals represent inhibition to the receiving neuron. Thus, an excitatory impulse is interpreted as a “+1” signal, while an inhibitory impulse is interpreted as a “−1” signal. Each neuron i has a state ki(t), which is its potential at time t, represented by a non-negative integer.

When the potential of neuron i is positive, the neuron is referred to as being `excited', and it can transmit impulses (fire). Impulses are sent out at a Poisson rate $r_i$, with independent, identically exponentially distributed inter-impulse intervals. A transmitted impulse arrives at neuron j as an excitatory signal with probability $p^+_{ij}$ and as an inhibitory signal with probability $p^-_{ij}$; it may also leave the network with probability $d_i$, so that $d_i + \sum_{j=1}^{n} [p^+_{ij} + p^-_{ij}] = 1$. To make these quantities easier to work with, let $w^+_{ij} = r_i p^+_{ij}$ and $w^-_{ij} = r_i p^-_{ij}$; the firing rate of neuron i is then $r_i = \sum_{j=1}^{n} [w^+_{ij} + w^-_{ij}]$. The $w$ matrices can be viewed as analogous to the synaptic weights of connectionist models, though here they specifically represent rates of excitatory and inhibitory impulse emission. Since the $w$ matrices are formed as products of rates and probabilities, they are guaranteed to be non-negative.
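As a small illustration of this parameterization, the following sketch (our own, not from the paper) builds the weight matrices $w^+_{ij} = r_i p^+_{ij}$ and $w^-_{ij} = r_i p^-_{ij}$ from arbitrary rates and routing probabilities, assuming for simplicity that no impulses leave the network ($d_i = 0$), and checks that the firing rate is recovered as the row sum of the weights. All names and values are illustrative.

```python
# Illustrative sketch (not from the paper): building the RNN weight matrices
# w+_{ij} = r_i p+_{ij} and w-_{ij} = r_i p-_{ij}, and checking that
# r_i = sum_j (w+_{ij} + w-_{ij}) when d_i = 0.
import numpy as np

rng = np.random.default_rng(0)
n = 4                                   # number of neurons (arbitrary for the example)
r = rng.uniform(0.5, 2.0, size=n)       # firing rates r_i

# Random routing probabilities, normalized so p+_{ij} + p-_{ij} sums to 1 per row (d_i = 0).
p_plus = rng.uniform(size=(n, n))
p_minus = rng.uniform(size=(n, n))
total = (p_plus + p_minus).sum(axis=1, keepdims=True)
p_plus, p_minus = p_plus / total, p_minus / total

w_plus = r[:, None] * p_plus            # w+_{ij} = r_i p+_{ij}
w_minus = r[:, None] * p_minus          # w-_{ij} = r_i p-_{ij}

# With d_i = 0, the firing rate is recovered as the row sum of the weights.
assert np.allclose(r, (w_plus + w_minus).sum(axis=1))
```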

Exogenous excitatory and inhibitory signals, i.e., signals arriving at the neuron from a source outside the network, also arrive at neuron i, at rates $\Lambda_i$ and $\lambda_i$, respectively. These are analogous to the inputs received by the input neurons of a connectionist model, although here, again, they represent rates.

Fig. 1 shows the representation of a neuron in the RNN using the model parameters that have been defined above. In this figure, only the transitions to and from a single neuron i are considered in a recurrent fashion. All the other neurons can be interpreted as the replicates of neuron i.

At this point, it is necessary to consider the dynamics of the RNN model by analyzing the possible state transitions. Within a time interval of Δt, several transitions can occur which change a neuron's state ki(t):

  • The potential ki(t) of a neuron will decrease by one whenever it fires, regardless of the type of the signal emitted (excitation or inhibition). Also, when an exogenous inhibitory signal arrives from outside the network to neuron i, its potential drops to ki(t)−1 at time t+Δt. Moreover, neuron i might receive an inhibitory impulse from another neuron j, whose effect will again be to decrement the value of ki at time t by one.

  • Arrival of an exogenous excitatory signal from outside, or an excitatory impulse from another neuron within the network will result in incrementing the neuron potential by one, yielding ki(t)+1.

  • Needless to say, the value of the ith neuron's state remains unchanged when none of the events described above occur.

When self-inhibition is allowed, the value of the neuron's state can drop by two units in a single time step; however, this case will not be considered in the following expressions. Self-excitation is also not of interest in this model because, in its presence, the potential of the neuron may increase without bound, leading to instability. There are also boundary conditions which prevent some of the transitions from occurring. First, a neuron can fire only when it has a positive potential, as explained above. Second, when the neuron has a potential of zero, the arrival of new inhibitory signals does not decrease its value further. All of these constraints are unified in a single expression when the state transitions are written in mathematical form.
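The transition rules above can be made concrete with a small Monte Carlo simulation. The sketch below (our own, not from the paper) uses a discrete time step dt small enough that at most one event per neuron is likely per step; all parameter values are arbitrary, self-loops are disallowed, and inhibition of a neuron at zero potential has no effect, as stated above.

```python
# Illustrative sketch (not from the paper): discrete-time Monte Carlo simulation
# of the RNN transition rules, estimating the fraction of time each neuron is excited.
import numpy as np

rng = np.random.default_rng(1)
n, dt, steps = 3, 0.01, 200_000

r = np.array([1.0, 1.5, 0.8])                 # firing rates
Lam = np.array([0.6, 0.3, 0.4])               # exogenous excitatory rates
lam = np.array([0.2, 0.1, 0.3])               # exogenous inhibitory rates
p_plus = np.array([[0.0, 0.4, 0.2],
                   [0.3, 0.0, 0.3],
                   [0.2, 0.2, 0.0]])
p_minus = np.array([[0.0, 0.2, 0.1],
                    [0.1, 0.0, 0.2],
                    [0.3, 0.1, 0.0]])
d = 1.0 - (p_plus + p_minus).sum(axis=1)      # departure probabilities

k = np.zeros(n, dtype=int)                    # neuron potentials
excited_time = np.zeros(n)

for _ in range(steps):
    # exogenous arrivals
    k += rng.random(n) < Lam * dt             # +1 excitatory arrival
    k -= (rng.random(n) < lam * dt) & (k > 0) # -1 inhibitory arrival (no effect at 0)
    # firing of excited neurons
    for i in range(n):
        if k[i] > 0 and rng.random() < r[i] * dt:
            k[i] -= 1                         # firing always decrements the sender
            dest = rng.choice(2 * n + 1, p=np.r_[p_plus[i], p_minus[i], d[i]])
            if dest < n:                      # excitatory impulse to neuron dest
                k[dest] += 1
            elif dest < 2 * n and k[dest - n] > 0:
                k[dest - n] -= 1              # inhibitory impulse (no effect at 0)
    excited_time += (k > 0)

print("empirical q_i ≈", excited_time / steps)
```

The empirical fractions of time each neuron spends excited can be compared with the analytical steady-state probabilities derived next.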

Let $k(t) = (k_1(t), \ldots, k_n(t))$ be the vector of signal potentials at time t, let $k = (k_1, \ldots, k_n)$ be a particular value of this vector, and define the probability $p(k,t) = \Pr[k(t) = k]$. The behavior of the probability distribution of the network state can be derived through the following equations. Since $\{k(t) : t \geq 0\}$ is a continuous-time Markov chain, it satisfies an infinite system of Chapman–Kolmogorov equations:

$$
\begin{aligned}
p(k, t+\Delta t) = \sum_i \Big\{ & p(k_i^+, t)\, r(i) d(i) \Delta t + p(k_i^-, t)\, \Lambda(i) \Delta t\, 1[k_i(t) > 0] + p(k_i^+, t)\, \lambda(i) \Delta t \\
& + p(k, t)\, (1 - \Lambda(i)\Delta t)(1 - \lambda(i)\Delta t)(1 - r(i)\Delta t)\, 1[k_i(t) > 0] \\
& + \sum_j \big[\, p(k_{ij}^{+-}, t)\, r(i) p^+(i,j) \Delta t\, 1[k_j(t) > 0] + p(k_{ij}^{++}, t)\, r(i) p^-(i,j) \Delta t \\
& \qquad + p(k_i^+, t)\, r(i) p^-(i,j) \Delta t\, 1[k_j(t) = 0] \,\big] \Big\} + o(\Delta t),
\end{aligned}
$$

where $1[x] = 1$ if $x$ is true and $0$ otherwise, and $k_i^+$ (resp. $k_i^-$) denotes the vector $k$ with $k_i$ replaced by $k_i + 1$ (resp. $k_i - 1$), while $k_{ij}^{+-}$ (resp. $k_{ij}^{++}$) denotes $k$ with $k_i$ replaced by $k_i + 1$ and $k_j$ replaced by $k_j - 1$ (resp. $k_j + 1$). For the steady-state analysis, let $p(k)$ denote the stationary probability distribution, equal to $\lim_{t\to\infty} \Pr[k(t) = k]$ if it exists. In steady state, the stationary probability distribution $p(k)$ must satisfy the global balance equations:

$$
\begin{aligned}
p(k) \sum_i \big[ \Lambda(i) + [\lambda(i) + r(i)]\, 1[k_i > 0] \big] = \sum_i \Big\{ & p(k_i^+)\, r(i) d(i) + p(k_i^-)\, \Lambda(i)\, 1[k_i > 0] + p(k_i^+)\, \lambda(i) \\
& + \sum_j \big[\, p(k_{ij}^{+-})\, r(i) p^+(i,j)\, 1[k_j > 0] + p(k_{ij}^{++})\, r(i) p^-(i,j) \\
& \qquad + p(k_i^+)\, r(i) p^-(i,j)\, 1[k_j = 0] \,\big] \Big\}.
\end{aligned}
$$

The stationary probability distribution associated with the model gives the quantity that is taken to be the output of the network:

$$q_i = \lim_{t\to\infty} \Pr[k_i(t) > 0], \qquad i = 1, \ldots, n,$$

which reduces to the form

$$q_i = \lambda^+(i) / [r(i) + \lambda^-(i)], \qquad (2)$$

where the $\lambda^+(i)$, $\lambda^-(i)$ for $i = 1, \ldots, n$ satisfy the system of non-linear simultaneous equations

$$\lambda^+(i) = \sum_j q_j w^+_{ji} + \Lambda(i), \qquad \lambda^-(i) = \sum_j q_j w^-_{ji} + \lambda(i). \qquad (3)$$

To put Eq. (2) into words, the steady-state probability that neuron i is excited is simply the ratio of the total rate of arriving excitatory signals to the total rate of arriving inhibitory signals plus the firing rate of that particular neuron.
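As a concrete illustration, the following sketch (our own, not from the paper) solves the coupled non-linear system (2)–(3) by simple fixed-point iteration. The function name solve_q, the starting point, and the clipping of $q_i$ at 1 are our own choices; the weight matrices and rate vectors are assumed to be numpy arrays as in the earlier sketch.

```python
# Illustrative sketch (not from the paper): fixed-point iteration for the
# steady-state excitation probabilities q_i of Eqs. (2)-(3).
import numpy as np

def solve_q(w_plus, w_minus, Lam, lam, tol=1e-10, max_iter=10_000):
    """Iterate q <- lambda+ / (r + lambda-) until convergence."""
    n = len(Lam)
    r = (w_plus + w_minus).sum(axis=1)          # firing rates r(i)
    q = np.full(n, 0.5)                         # arbitrary starting point
    for _ in range(max_iter):
        lam_plus = q @ w_plus + Lam             # lambda+(i) = sum_j q_j w+_{ji} + Lambda(i)
        lam_minus = q @ w_minus + lam           # lambda-(i) = sum_j q_j w-_{ji} + lambda(i)
        q_new = np.minimum(lam_plus / (r + lam_minus), 1.0)  # clip: q_i is a probability
        if np.max(np.abs(q_new - q)) < tol:
            return q_new
        q = q_new
    return q
```

Called with the w_plus and w_minus matrices of the first sketch and the exogenous rate vectors, this returns the vector $q = (q_1, \ldots, q_n)$, which can be compared with the empirical excitation frequencies from the simulation sketch.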

The algorithm chooses the set of network parameters W in order to learn a training set of K input–output pairs (ι, Y), where the set of successive inputs is denoted ι = {ι₁, …, ι_K}, and $\iota_k = (\Lambda_k, \lambda_k)$ are pairs of excitatory and inhibitory signal flow rates entering each neuron from outside the network:

$$\Lambda_k = [\Lambda_k(1), \ldots, \Lambda_k(n)], \qquad \lambda_k = [\lambda_k(1), \ldots, \lambda_k(n)].$$

The successive desired outputs are the vectors Y = {y₁, …, y_K}, where each vector $y_k = (y_{1k}, \ldots, y_{nk})$ has elements $y_{ik} \in [0,1]$ corresponding to the desired output values for each neuron. The network adjusts its parameters to produce the set of desired output vectors in a manner that minimizes the cost function $E_k$:

$$E_k = \frac{1}{2} \sum_{i=1}^{n} a_i (q_i - y_{ik})^2, \qquad a_i \geq 0.$$

In this network, all neurons are by default treated as output neurons; therefore, if a neuron j is to be removed from the network output, and hence from the error function, it suffices to set $a_j = 0$.
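For completeness, a minimal sketch (our own, not from the paper) of this cost for a single training pair, with array names following the earlier sketches:

```python
# Illustrative sketch (not from the paper): the quadratic cost E_k for one
# training pair; setting a[j] = 0 drops neuron j from the error.
import numpy as np

def cost(q, y, a):
    """E_k = 0.5 * sum_i a_i (q_i - y_i)^2."""
    return 0.5 * np.sum(a * (q - y) ** 2)
```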

Recall that the steady-state output of every neuron in the network is given by Eqs. (2) and (3).

Both of the $n \times n$ weight matrices $W^+_k = \{w^+_k(i,j)\}$ and $W^-_k = \{w^-_k(i,j)\}$ must be adjusted after each input is presented, by computing, for each input $\iota_k = (\Lambda_k, \lambda_k)$, new values $W^+_k$ and $W^-_k$ of the weight matrices. Since each weight represents a rate times a probability, only solutions for which all values in the matrices are non-negative are valid.

Let $w(u,v)$ denote either weight term, i.e., $w(u,v) \equiv w^-(u,v)$ or $w(u,v) \equiv w^+(u,v)$. The weights are updated using the gradient descent method:

$$w_{\mathrm{new}}(u,v) = w_{\mathrm{old}}(u,v) - \eta\, \partial E / \partial w(u,v).$$

The partial derivative of the cost function can be computed and substituted to obtain the update difference equation

$$w_k(u,v) = w_{k-1}(u,v) - \eta \sum_{i=1}^{n} a_i (q_{ik} - y_{ik}) \left[ \partial q_i / \partial w(u,v) \right]_k, \qquad (4)$$

where $\eta > 0$ is the learning parameter, held constant over each iteration of training, and

  • 1.

    $q_{ik}$ is calculated using the input $\iota_k$ and the weights $w(u,v) = w_{k-1}(u,v)$ in Eqs. (2) and (3).

  • 2.

    $[\partial q_i / \partial w(u,v)]_k$ is evaluated at the values $q_i = q_{ik}$, $w^+(u,v) = w^+_{k-1}(u,v)$ and $w^-(u,v) = w^-_{k-1}(u,v)$.

To compute $[\partial q_i / \partial w(u,v)]_k$, the following equation is derived from expressions (2) and (3):

$$
\frac{\partial q_i}{\partial w(u,v)} = \sum_j \frac{\partial q_j}{\partial w(u,v)}\, \frac{w^+(j,i) - w^-(j,i)\, q_i}{r(i) + \lambda^-(i)} \;-\; 1[u=i]\, \frac{q_i}{r(i) + \lambda^-(i)} \;+\; 1[w(u,v) \equiv w^+(u,i)]\, \frac{q_u}{r(i) + \lambda^-(i)} \;-\; 1[w(u,v) \equiv w^-(u,i)]\, \frac{q_u q_i}{r(i) + \lambda^-(i)}.
$$

Let $q = (q_1, \ldots, q_n)$, and define the $n \times n$ matrix

$$W = \big\{ [w^+(i,j) - w^-(i,j)\, q_j] / (r(j) + \lambda^-(j)) \big\}, \qquad i, j = 1, \ldots, n.$$

The vector equations can now be written as

$$\partial q / \partial w^+(u,v) = \partial q / \partial w^+(u,v)\, W + \gamma^+(u,v)\, q_u, \qquad \partial q / \partial w^-(u,v) = \partial q / \partial w^-(u,v)\, W + \gamma^-(u,v)\, q_u, \qquad (5)$$

where the elements of the n-vectors $\gamma^+(u,v) = [\gamma^+_1(u,v), \ldots, \gamma^+_n(u,v)]$ and $\gamma^-(u,v) = [\gamma^-_1(u,v), \ldots, \gamma^-_n(u,v)]$ are

$$
\gamma^+_i(u,v) = \begin{cases} -1/(r(i) + \lambda^-(i)) & \text{if } u = i,\ v \neq i, \\ +1/(r(i) + \lambda^-(i)) & \text{if } u \neq i,\ v = i, \\ 0 & \text{for all other values of } (u,v), \end{cases}
$$

$$
\gamma^-_i(u,v) = \begin{cases} -(1 + q_i)/(r(i) + \lambda^-(i)) & \text{if } u = i,\ v = i, \\ -1/(r(i) + \lambda^-(i)) & \text{if } u = i,\ v \neq i, \\ -q_i/(r(i) + \lambda^-(i)) & \text{if } u \neq i,\ v = i, \\ 0 & \text{for all other values of } (u,v). \end{cases}
$$

Notice that

$$\partial q / \partial w^+(u,v) = \gamma^+(u,v)\, q_u\, [I - W]^{-1}, \qquad \partial q / \partial w^-(u,v) = \gamma^-(u,v)\, q_u\, [I - W]^{-1},$$

where $I$ denotes the $n \times n$ identity matrix. Hence the main computational effort in this algorithm is to obtain $[I - W]^{-1}$, which is of time complexity $O(n^3)$, or $O(mn^2)$ if an $m$-step relaxation method is used.
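To make the mechanics concrete, here is a minimal sketch (our own, not from the paper) of this computation for a single weight $w(u,v)$, using a direct matrix inverse. The function name dq_dw is our own; the inputs follow the earlier sketches, with q obtained, for example, from solve_q().

```python
# Illustrative sketch (not from the paper): derivative vectors dq/dw+(u,v) and
# dq/dw-(u,v) via the matrix [I - W]^{-1} and the gamma vectors defined above.
import numpy as np

def dq_dw(q, w_plus, w_minus, Lam, lam, u, v):
    n = len(q)
    r = (w_plus + w_minus).sum(axis=1)
    D = r + (q @ w_minus + lam)                       # denominators r(i) + lambda-(i)
    W = (w_plus - w_minus * q[None, :]) / D[None, :]  # W[i,j] = (w+(i,j) - w-(i,j) q_j) / D(j)

    # gamma vectors; the additive construction reproduces the case analysis above,
    # including the u = v cases.
    gamma_plus = np.zeros(n)
    gamma_minus = np.zeros(n)
    gamma_plus[u] -= 1.0 / D[u]
    gamma_plus[v] += 1.0 / D[v]
    gamma_minus[u] -= 1.0 / D[u]
    gamma_minus[v] -= q[v] / D[v]

    inv = np.linalg.inv(np.eye(n) - W)                # main O(n^3) cost of the algorithm
    return q[u] * gamma_plus @ inv, q[u] * gamma_minus @ inv
```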

From the above, the complete learning algorithm for the network can be given. First, initialize the matrices $W^+_0$ and $W^-_0$ in some appropriate manner; this initialization can be made at random if no better method can be determined. Choose a value of $\eta$, and then, for each successive value of k, starting with k = 1, proceed as follows (a sketch of one complete training iteration is given after the list):

  • 1.

    Set the input values to ιk=(Λk,λk).

  • 2.

    Solve the system of non-linear equations given in (2) and (3) with these values, perhaps by using an iterative method such as Gauss–Seidel.

  • 3.

    Solve the system of linear equations (5) with the results of (2).

  • 4.

    Using Eq. (4) and the results of Steps 2 and 3, update the matrices $W^+_k$ and $W^-_k$. Since the “best” matrices (in terms of gradient descent of the quadratic cost function) which satisfy the non-negativity constraint are sought, if in any step k of the algorithm the iteration yields a negative value for a term, there are two alternatives:

    • (a) Set the term to zero and stop the iteration for this term in this step k; in the next step, k+1, iterate on this term with the same rule, starting from its current zero value.

    • (b) Go back to the previous value of the term and iterate with a smaller value of η.
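The following sketch (our own, not from the paper) assembles one such iteration for a single training pair, reusing the hypothetical helpers solve_q() and dq_dw() defined in the earlier sketches, and implementing alternative (a) for the non-negativity constraint.

```python
# Illustrative sketch (not from the paper): one gradient-descent iteration of the
# learning algorithm for a single training pair (Lam_k, lam_k) -> y_k.
import numpy as np

def train_step(w_plus, w_minus, Lam_k, lam_k, y_k, a, eta=0.1):
    n = len(y_k)
    q = solve_q(w_plus, w_minus, Lam_k, lam_k)        # Step 2: solve (2)-(3)
    err = a * (q - y_k)                               # weighted output error

    new_plus, new_minus = w_plus.copy(), w_minus.copy()
    for u in range(n):
        for v in range(n):
            dq_p, dq_m = dq_dw(q, w_plus, w_minus, Lam_k, lam_k, u, v)  # Step 3: solve (5)
            new_plus[u, v] -= eta * err @ dq_p        # Step 4: update rule (4)
            new_minus[u, v] -= eta * err @ dq_m
    # Alternative (a): reset negative terms to zero to keep the rates valid.
    return np.maximum(new_plus, 0.0), np.maximum(new_minus, 0.0)
```

Looping this step over the K training pairs, and over several epochs, gives the full gradient-descent training procedure described above.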

This general scheme can be specialized to feed-forward networks by noting that the matrix $[I - W]$ will be triangular, yielding a computational complexity of $O(n^2)$, rather than $O(n^3)$, for each gradient iteration. Furthermore, in a feed-forward network the equations given in (2) and (3) simplify in that $q_i$ depends only on $q_j$ for $j < i$, which reduces the computational effort required to solve them.


Survey of previous RNN applications

The RNN model has proven successful in a variety of applications, used either in a feed-forward or a fully recurrent architecture. In most problems, the RNN yields strong generalization, even when the training data set is small compared to the actual testing data. The model also learns quickly, owing to the computational simplicity of its weight-update process. In the following, we briefly describe past work related to the RNN.

Associative memory: In

Summary of recent image processing techniques

Applications involving images require a huge number of operations, which suggests that the most efficient computational model for image processing would be a parallel, highly interconnected neural network. In the following sections, we will summarize our recent research in image processing (i.e., on still image compression, image enlargement, and sensor image fusion), where the RNN is the common tool that is deployed. We will illustrate our results and evaluate the performances of

Conclusions

We have surveyed the wide range of engineering applications of the RNN model [1], [2], [3], [4], [5], and demonstrated some of the recent image processing techniques based on the model. For still image compression, it is shown that the proposed network model is successful in compressing and decompressing images that it did not see before. The obtained visual and PSNR quality is acceptable. The robustness analysis reveals the applicability of the RNN compressor/decompressor for hardware

Acknowledgements

This research was supported by the Office of Naval Research under Grant No. N00014-97-1-0112, the Army Research Office under Grant No. DAAH04-96-1-0388 and the IBM Corporation with an equipment grant and a research grant to Duke University.

References

  • E. Gelenbe et al., Global behaviour of homogeneous random neural systems, Applied Mathematical Modelling (1991).
  • E. Gelenbe, Distributed associative memory and the computation of membership functions, Information Sciences (1991).
  • E. Gelenbe, Random neural networks with negative and positive signals and product form solution, Neural Computation (1989).
  • E. Gelenbe, Stability of the random neural network model, Neural Computation (1990).
  • E. Gelenbe, Réseaux neuronaux aléatoires stables, Comptes-Rendus Acad. Sci. t. Série II (1990).
  • E. Gelenbe, Learning in the recurrent random neural network, Neural Computation (1993).
  • E. Gelenbe, Neural Nets and Teletraffic: Adapting the Interface, Keynote Address, Fifth International Conference on...
  • E. Gelenbe, User oriented video: Quality, traffic and computational trade-offs, Invited plenary paper, in: Proceedings...
  • E. Gelenbe et al., Learning neural networks for detection and classification of synchronous recurrent transient signals, Signal Processing (1998).
  • H. Bakırcıoğlu, E. Gelenbe, T. Koçak, Image processing with the random neural network model, Invited paper, Proceedings...
  • J. Bourrely et al., Mémoires associatives: Évaluation et architectures, Comptes-Rendus Acad. Sci. Paris t. Série II (1989).
  • E. Gelenbe, A. Stafylopatis, A. Likas, An extended random network model with associative memory capabilities, in:...
  • E. Gelenbe et al., Dynamical random neural network approach to the traveling salesman problem, ELEKTRIK: Turkish Journal of Electrical Engineering and Computer Sciences (1994).
  • E. Gelenbe, F. Batty, Minimum cost graph covering with the random neural network, in: O. Balci, R. Sharda, S. Zenios...
  • A. Ghanwani, A qualitative comparison of neural network models applied to the vertex covering problem, ELEKTRIK: Turkish Journal of Electrical Engineering and Computer Sciences (1994).