Approximations of Functions by a Multilayer Perceptron: a New Approach
Section snippets
Earlier Results
The density of perceptron functions in the set of continuous functions, with respect to the topology of uniform convergence on compact sets, was first established in Hornik et al. (1989). They proved that if the activation function ψ is a "squashing" function, that is, non-decreasing with finite limits at ±∞, then the family of associated perceptron functions x ↦ Σ_{i=0}^{m} λ_i ψ(α_i · x + θ_i), with α_i ∈ ℝ^d and θ_i ∈ ℝ, is everywhere dense in C(ℝ^d, ℝ) endowed with the topology of uniform convergence on compact sets.
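As a concrete illustration (a minimal NumPy sketch of ours, not code from the paper), the following evaluates such an (m, ψ)-perceptron with the logistic sigmoid as squashing function; all parameter values are illustrative.

```python
import numpy as np

# Sketch of the perceptron family from Hornik et al. (1989):
# x -> sum_i lambda_i psi(alpha_i . x + theta_i), with a "squashing"
# activation, here the logistic sigmoid (non-decreasing, limits 0 and 1).

def psi(t):
    return 1.0 / (1.0 + np.exp(-t))      # squashing activation

def perceptron(x, lam, alpha, theta):
    """(m, psi)-perceptron: x in R^d, lam in R^m, alpha in R^{m x d}, theta in R^m."""
    return psi(x @ alpha.T + theta) @ lam

rng = np.random.default_rng(0)
m, d = 5, 3
x = rng.normal(size=(10, d))             # 10 sample points of R^d
lam, alpha, theta = rng.normal(size=m), rng.normal(size=(m, d)), rng.normal(size=m)
print(perceptron(x, lam, alpha, theta))  # one scalar output per sample point
```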
APPROXIMATION ON COMPACT SETS OF ℝ
Assume first that ψ is C^n. For any polynomial P of degree p ≤ n, we exhibit a sequence of (p + 1, ψ)-perceptrons that U_K^(n)-converges to P on every compact set K of ℝ.
Proposition 1. Let ψ ∈ C^n(ℝ, ℝ) be such that ψ^(k)(0) ≠ 0 for every k, 0 ≤ k ≤ n. Let p ∈ {0, …, n} and let (c_i)_{0 ≤ i ≤ n} be non-zero, pairwise distinct real numbers. Then for every polynomial P of degree exactly p, there exist p + 1 rational functions λ_i(h) := Q_i(1/h), where the Q_i are polynomials of degree p, such that:
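The displayed formula of Proposition 1 is truncated in this snippet, so the code below only sketches our reading of the construction: take ψ = exp (so ψ^(k)(0) = 1 ≠ 0 for every k) and illustrative constants c_i; matching the Taylor coefficients of Σ_i λ_i ψ(c_i h x) with those of P gives a Vandermonde system whose solution λ_i(h) is polynomial in 1/h of degree p, and the sup-norm error on a compact set K vanishes as h → 0.

```python
import numpy as np

# Illustrative target: P(x) = 1 - x + 2x^2, so p = 2 and we use p + 1 units.
p = 2
a = np.array([1.0, -1.0, 2.0])           # coefficients a_0, a_1, a_2 of P
c = np.array([1.0, 2.0, 3.0])            # non-zero, pairwise distinct c_i
psi = np.exp                             # psi^(k)(0) = 1 != 0 for every k
fact = np.array([1.0, 1.0, 2.0])         # k! for k = 0, 1, 2

def weights(h):
    # Matching Taylor coefficients of sum_i lambda_i psi(c_i h x) with P
    # gives the Vandermonde system sum_i lambda_i c_i^k = a_k k! / h^k,
    # so each lambda_i is a polynomial Q_i in 1/h of degree p.
    V = np.vander(c, p + 1, increasing=True).T   # V[k, i] = c_i^k
    b = a * fact / h ** np.arange(p + 1)
    return np.linalg.solve(V, b)

x = np.linspace(-1.0, 1.0, 201)          # compact set K = [-1, 1]
P = a[0] + a[1] * x + a[2] * x**2
for h in (1e-1, 1e-2, 1e-3):
    lam = weights(h)
    approx = sum(l * psi(ci * h * x) for l, ci in zip(lam, c))
    print(f"h = {h:.0e}: sup-norm error on K = {np.abs(approx - P).max():.2e}")
```

The printed errors decay like h until floating-point cancellation takes over, since the weights λ_i(h) grow like h^(-p).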
APPROXIMATION ON COMPACT SETS OF ℝ^d
The outline of the proof is quite similar to the one-dimensional case, although more technical. The first step is again to approximate polynomial functions.
Proposition 2. Let ψ ∈ C^n(ℝ, ℝ) be such that ψ^(k)(0) ≠ 0 for every k, 0 ≤ k ≤ n. Let p ∈ {0, …, n} and P ∈ ℝ_p[X_1, …, X_d], the set of all polynomials of global degree ≤ p. Let N_p^d := dim ℝ_p[X_1, …, X_d]. Then there exist N_p^d vectors c_i ∈ ℝ^d, 1 ≤ i ≤ N_p^d, and N_p^d rational functions λ_i(h) := Q_i(1/h), where Q_1, …, Q_{N_p^d} ∈ ℝ_p[X], such that:
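The system displayed in Proposition 2 is likewise truncated here; under the same reading as above (ψ = exp, weights solving a moment system indexed by the N_p^d multi-indices of total degree ≤ p), here is a sketch for d = 2 and p = 2, where N_p^d = 6. The target polynomial and the vectors c_i are illustrative.

```python
import numpy as np
from itertools import product
from math import factorial

# Our reconstruction of the multivariate construction: with psi = exp and
# generic vectors c_i in R^d, the weights lambda_i(h) solve a moment system
# indexed by the multi-indices of total degree <= p, and
# sum_i lambda_i psi(h c_i . x) -> P uniformly on compacts as h -> 0.

d, p = 2, 2
multi = [k for k in product(range(p + 1), repeat=d) if sum(k) <= p]
N = len(multi)                                   # N_p^d = dim R_p[X_1, X_2] = 6

target = {(1, 1): 1.0}                           # P(x1, x2) = x1 * x2
rng = np.random.default_rng(1)
c = rng.normal(size=(N, d))                      # generic c_i: moment matrix invertible

def weights(h):
    # Row k of the system: sum_i lambda_i c_i^k = a_k * k! / h^{|k|}
    M = np.array([[np.prod(ci ** np.array(k)) for ci in c] for k in multi])
    b = np.array([target.get(k, 0.0) * np.prod([factorial(kj) for kj in k])
                  / h ** sum(k) for k in multi])
    return np.linalg.solve(M, b)

g = np.linspace(-1.0, 1.0, 41)
X = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, d)   # grid on K = [-1, 1]^2
P = X[:, 0] * X[:, 1]
for h in (1e-1, 1e-2, 1e-3):
    lam = weights(h)
    approx = np.exp(h * X @ c.T) @ lam
    print(f"h = {h:.0e}: sup-norm error on K = {np.abs(approx - P).max():.2e}")
```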
DESIGN OF THE HIDDEN LAYER
This section is devoted to approximation error bounds for an (n + 1)-perceptron as n goes to +∞. Proposition 2 shows that a perceptron having N_n^d units (or n units in the one-dimensional case) on the hidden layer is at least as efficient, as a uniform approximator of continuous functions on compact sets of ℝ^d, as the polynomials of global degree ≤ n. So for a given function, the optimal error estimates (as n → +∞) are bounded by those of the polynomial of best uniform approximation.
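As a concrete proxy for this bound (our own illustration, not a computation from the paper), Chebyshev interpolation gives a near-best uniform polynomial approximation on [-1, 1], so its error decay indicates what a hidden layer matching degree-n polynomials can achieve in the one-dimensional case; the target function is illustrative.

```python
import numpy as np

# Near-best polynomial error via Chebyshev interpolation: by the result
# above, a hidden layer matching polynomials of degree <= n inherits
# (up to the construction) at least this uniform accuracy on [-1, 1].

f = lambda x: np.exp(np.sin(3 * x))          # sample smooth target on [-1, 1]
x = np.linspace(-1.0, 1.0, 2001)

for n in (2, 4, 8, 16):
    # n + 1 Chebyshev points, then exact interpolation of degree n
    nodes = np.cos((2 * np.arange(n + 1) + 1) * np.pi / (2 * (n + 1)))
    coef = np.polynomial.chebyshev.chebfit(nodes, f(nodes), n)
    err = np.abs(np.polynomial.chebyshev.chebval(x, coef) - f(x)).max()
    print(f"degree n = {n:2d} -> near-best polynomial error {err:.2e}")
```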
MULTILAYER PERCEPTRON WITH A MULTIDIMENSIONAL OUTPUT SPACE
The aim of this section is to provide some approximation results for vector-valued functions by perceptrons.
Definition. We call an (m, q, ψ)-perceptron, q ≥ 2, any function of the form x ↦ Σ_{i=1}^{m} Λ_i ψ(α_i · x), where x ∈ ℝ^d, Λ_i ∈ ℝ^q and α_i ∈ ℝ^d, 1 ≤ i ≤ m.
Such a function is in fact the output of a perceptron having d units on the input layer, m units on the hidden layer and q units on the output layer. A natural question is whether the results obtained for the (m, ψ)-perceptron in Section 2 extend to this setting.
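A minimal sketch of this definition (names and values are ours): the q output components share the same m hidden units ψ(α_i · x), so the (m, q, ψ)-perceptron stacks q scalar (m, ψ)-perceptrons with a common hidden layer.

```python
import numpy as np

# (m, q, psi)-perceptron: x -> sum_i Lambda_i psi(alpha_i . x),
# with Lambda_i in R^q; all q outputs reuse the same hidden units.

def psi(t):
    return np.tanh(t)                          # any suitable activation

def perceptron_q(x, Lam, alpha):
    """x in R^d, alpha in R^{m x d}, Lam in R^{m x q} -> output in R^q."""
    return psi(alpha @ x) @ Lam

rng = np.random.default_rng(2)
d, m, q = 3, 4, 2
x = rng.normal(size=d)
Lam, alpha = rng.normal(size=(m, q)), rng.normal(size=(m, d))
print(perceptron_q(x, Lam, alpha))             # a point of R^q
```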
Acknowledgements
We thank the referees for providing additional references, some of which postdate our work.
References (14)
Hornik, K. (1993). Some new results on neural network approximation. Neural Networks.
Hornik, K., et al. (1989). Multilayer feedforward networks are universal approximators. Neural Networks.
Leshno, M., et al. (1993). Multilayer feedforward networks with a non-polynomial activation function can approximate any function. Neural Networks.
Barron, A. R. (1993). Universal approximation bounds for superpositions of a sigmoidal function. IEEE Transactions on Information Theory.
DeVore, R. A., et al. (1994). Optimal non-linear approximation. Manuscripta Mathematica.
Ellacott, S. W. (1994). Aspects of the numerical analysis of neural networks. In A. Iserles (Ed.), Acta Numerica (pp. …).
Hornik, K., et al. (1990). Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Networks.