Elsevier

Neural Networks

Volume 10, Issue 6, August 1997, Pages 1069-1081

Approximations of Functions by a Multilayer Perceptron: a New Approach

https://doi.org/10.1016/S0893-6080(97)00010-5

Abstract

We provide a radically elementary proof of the universal approximation property of the one-hidden layer perceptron based on the Taylor expansion and the Vandermonde determinant. It works for both $L^q$ and uniform approximation on compact sets. This approach naturally yields some bounds for the design of the hidden layer and convergence results (including some rates) for the derivatives. A partial answer to Hornik's conjecture on the universality of the bias is proposed. An extension to vector valued functions is also carried out. © 1997 Elsevier Science Ltd.

Section snippets

Earlier Results

The density of perceptron functions in the set of continuous functions, with respect to the topology of uniform convergence on compact sets, was first established in Hornik et al. (1989). They proved that if the activation function $\psi$ is a “squashing” function—that is, non-decreasing with finite limits at $\pm\infty$—then the family of related perceptron functions $x \mapsto \sum_{i=0}^{m} \lambda_i\,\psi(\alpha_i \cdot x + \theta_i)$, with $\alpha_i \in \mathbb{R}^d$ and $\theta_i \in \mathbb{R}$, is everywhere dense in $C(\mathbb{R}^d,\mathbb{R})$ endowed with the topology of uniform convergence on compact sets.
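To make the objects concrete, here is a minimal sketch, not taken from the paper, of such a perceptron function; the logistic sigmoid is used as one example of a squashing function, and all names are illustrative.

```python
import numpy as np

def squashing(t):
    # Logistic sigmoid: non-decreasing, with finite limits 0 and 1 at -/+ infinity,
    # hence a "squashing" function in the sense of Hornik et al. (1989).
    return 1.0 / (1.0 + np.exp(-t))

def perceptron_function(x, lambdas, alphas, thetas, psi=squashing):
    """One-hidden-layer perceptron x -> sum_i lambda_i * psi(alpha_i . x + theta_i).

    x       : input, shape (d,)
    lambdas : output weights lambda_i, shape (m,)
    alphas  : input weights alpha_i, shape (m, d)
    thetas  : biases theta_i, shape (m,)
    """
    return float(np.sum(lambdas * psi(alphas @ x + thetas)))
```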

APPROXIMATION ON COMPACT SETS OF R

Assume first that $\psi$ is $C^n$. For any polynomial $P$ of degree $p \le n$, we exhibit a sequence of $(p+1,\psi)$-perceptrons that $U_K^{(n)}$-converges to $P$ on every compact set $K$ of $\mathbb{R}$.

Proposition 1. Let $\psi \in C^n(\mathbb{R},\mathbb{R})$ be such that $\forall k$, $0 \le k \le n$, $\psi^{(k)}(0) \ne 0$. Let $p \in \{0,\dots,n\}$ and let $(c_i)_{0 \le i \le n}$ be non-zero, pairwise distinct real numbers. Then, for every polynomial $P$ such that $d^{\circ}P = p$, there exist $p+1$ rational functions $\lambda_i(h) := Q_i(1/h)$, where the $Q_i$ are polynomials of degree $p$, such that, for every compact set $K$ of $\mathbb{R}$, $\sum_{i=0}^{p} \lambda_i(h)\,\psi(c_i h x) \xrightarrow[h \to 0]{U_K} P(x)$.
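The following numerical sketch illustrates the flavour of this construction: the coefficients $\lambda_i(h)$ are obtained by matching the first $p+1$ Taylor coefficients of $\sum_i \lambda_i(h)\,\psi(c_i h x)$ with those of $P$, which amounts to solving a Vandermonde system in the $c_i$. The specific choices below ($\psi = \exp$, so that every $\psi^{(k)}(0) = 1 \ne 0$, the nodes $c_i$, and the target polynomial) are illustrative assumptions, not the paper's formulas.

```python
import numpy as np
from math import factorial

# Target polynomial P(x) = 1 + 2x - x^2, i.e. coefficients b_k with degree p = 2.
b = np.array([1.0, 2.0, -1.0])
p = len(b) - 1

psi = np.exp                    # psi^(k)(0) = 1 for every k, so the hypothesis holds
dpsi0 = np.ones(p + 1)          # psi^(k)(0), k = 0..p

c = np.array([1.0, 2.0, 3.0])   # non-zero, pairwise distinct c_i
V = np.vander(c, p + 1, increasing=True).T   # V[k, i] = c_i**k (Vandermonde matrix)

x = np.linspace(-1.0, 1.0, 201)              # a compact set K = [-1, 1]
P = sum(b[k] * x**k for k in range(p + 1))

for h in (1e-1, 1e-2, 1e-3):
    # Impose sum_i lam_i * c_i**k = b_k * k! / (psi^(k)(0) * h**k) for k = 0..p,
    # so the degree-<=p part of the Taylor expansion of the perceptron equals P.
    rhs = np.array([b[k] * factorial(k) / (dpsi0[k] * h**k) for k in range(p + 1)])
    lam = np.linalg.solve(V, rhs)
    approx = sum(lam[i] * psi(c[i] * h * x) for i in range(p + 1))
    print(f"h = {h:.0e}: sup-error on K = {np.max(np.abs(approx - P)):.3e}")
```

As $h$ decreases, the printed sup-error shrinks, in line with the uniform convergence stated in the proposition.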

APPROXIMATION ON COMPACT SETS OF Rd

The outline of the proof is quite similar to the one-dimensional case, although more technical. The first step is still to approximate the polynomial functions.

Proposition 2. Let $\psi \in C^n(\mathbb{R},\mathbb{R})$ be such that $\forall k$, $0 \le k \le n$, $\psi^{(k)}(0) \ne 0$. Let $p \in \{0,\dots,n\}$ and $P \in \mathbb{R}_p[X_1,\dots,X_d]$, the set of all polynomials of global degree $\le p$. Let $N_p^d := \dim_{\mathbb{R}} \mathbb{R}_p[X_1,\dots,X_d]$. Then there exist $N_p^d$ vectors $c_i \in \mathbb{R}^d$, $1 \le i \le N_p^d$, and $N_p^d$ rational functions $\lambda_i(h) := Q_i(1/h)$, where $Q_1,\dots,Q_{N_p^d} \in \mathbb{R}_p[X]$, such that, for every compact set $K$ of $\mathbb{R}^d$, $\sum_{i=1}^{N_p^d} \lambda_i(h)\,\psi(h\,c_i \cdot x) \xrightarrow[h \to 0]{U_K} P(x)$.
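For reference, the dimension $N_p^d$ appearing above counts the monomials of total degree at most $p$ in $d$ variables (a standard fact, not restated in the snippet): $$N_p^d = \dim_{\mathbb{R}} \mathbb{R}_p[X_1,\dots,X_d] = \binom{p+d}{d}, \qquad \text{e.g. } N_2^2 = 6 \text{ (monomials } 1,\, X_1,\, X_2,\, X_1^2,\, X_1X_2,\, X_2^2\text{)}.$$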

DESIGN OF THE HIDDEN LAYER

This section is devoted to providing some approximation error bounds for an $(n+1)$-perceptron as $n$ goes to $+\infty$. Proposition 2 shows that a perceptron having $N_n^d$ units (or $n$ units in the one-dimensional case) on the hidden layer is at least as efficient a uniform approximator of continuous functions on compact sets of $\mathbb{R}^d$ as the polynomials of global degree $\le n$. So, for a given function, the optimal error estimates (as $n \to +\infty$) are bounded by those of the polynomial of best uniform approximation.
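As a rough illustration of the resulting hidden-layer widths, the following snippet (not from the paper) tabulates $N_n^d = \binom{n+d}{d}$ for a few values of $n$ and $d$:

```python
from math import comb

# Hidden-layer width N_n^d = C(n + d, d) associated with polynomials of
# global degree <= n in d variables (see the dimension formula above).
for d in (2, 3, 4):
    widths = [comb(n + d, d) for n in range(1, 6)]
    print(f"d = {d}: N_n^d for n = 1..5 -> {widths}")
```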

MULTILAYER PERCEPTRON WITH A MULTIDIMENSIONAL OUTPUT SPACE

The aim of this section is to provide some approximation results for vector-valued functions by perceptrons.

Definition. We call an $(m, q, \psi)$-perceptron, $q \ge 2$, any function of the form $x \mapsto \sum_{i=1}^{m} \Lambda_i\,\psi(\alpha_i \cdot x)$, where $x \in \mathbb{R}^d$, $\Lambda_i \in \mathbb{R}^q$, $\alpha_i \in \mathbb{R}^d$, $1 \le i \le m$.

Such a function is in fact the output of a perceptron having $d$ units on the input layer, $m$ units on the hidden layer, and $q$ units on the output layer. A natural question is whether the results obtained for the $(m,\psi)$-perceptron in Section 2 carry over to this vector-valued setting.
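A minimal sketch of the definition above, with illustrative names and $\tanh$ as just one possible choice of $\psi$:

```python
import numpy as np

def vector_perceptron(x, Lambdas, alphas, psi=np.tanh):
    """(m, q, psi)-perceptron: x -> sum_i Lambda_i * psi(alpha_i . x).

    x       : input, shape (d,)
    Lambdas : output weight vectors Lambda_i in R^q, stacked as shape (m, q)
    alphas  : input weight vectors alpha_i in R^d, stacked as shape (m, d)
    Returns a vector in R^q.
    """
    return Lambdas.T @ psi(alphas @ x)
```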

Acknowledgements

We thank the referees for providing additional references, some of which postdate our work.


Cited by (119)

  • Ridge regression and artificial neural network to predict the thermodynamic properties of alkali metal Rankine cycles for space nuclear power

    2022, Energy Conversion and Management
    Citation Excerpt :

    ANN requires less information correlation degree of raw data because it has strong adaptive ability to complex and noisy information. ANN includes multi-layer perceptron (MLP) [48,49], BP neural network [50], convolutional neural network (CNN) [51], cyclic neural network (RNN) [52], etc. Fig. 7 shows the principle flow chart of radial basis neural network.

  • Extreme learning machine collocation for the numerical solution of elliptic PDEs with sharp gradients

    2021, Computer Methods in Applied Mechanics and Engineering
    Citation Excerpt :

    In Section 5, we present the numerical results obtained with the proposed approach for the solution of several benchmark 1D boundary-layer elliptic problems and compare them with both the exact-analytical and the Finite Difference numerical solutions. Similar results apply for continuous functions and for the derivatives of differentiable functions [23,36,38]. In this section, we report the results that guarantee convergence of the proposed ELM collocation method.

View all citing articles on Scopus
View full text