Abstract
Model complexity of shallow (one-hidden-layer) perceptron networks computing multivariable functions on finite domains is investigated. Lower bounds on the growth of the number of network units or on the sizes of output weights are derived in terms of variations of the functions to be computed. A concrete construction is presented of a class of functions that cannot be computed by perceptron networks with considerably smaller numbers of units and output weights than the sizes of the functions' domains. In particular, functions on d-dimensional Boolean cubes are constructed that cannot be computed by shallow perceptron networks whose numbers of hidden units and sizes of output weights depend on d polynomially. A subclass of these functions is described whose elements can be computed by two-hidden-layer networks with a number of units depending on d linearly.
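To make the network model under study concrete, the following is a minimal sketch (not taken from the paper) of a one-hidden-layer signum-perceptron network on a Boolean cube. As an illustration it represents XOR, i.e. parity on {0,1}^2, a textbook function that no single perceptron computes but two signum units with output weights of size 1/2 do; the paper's own lower-bound construction is different and concerns how unit counts and output weights must grow with the dimension d.

```python
import numpy as np
from itertools import product

def shallow_perceptron(x, v, w, b):
    """One-hidden-layer signum-perceptron network:
    f(x) = sum_i v_i * sign(w_i . x + b_i)."""
    return float(v @ np.sign(w @ x + b))

# Illustrative parameters (chosen by hand, not from the paper):
# two hidden units, both reading x1 + x2, with thresholds placed
# between 0 and 1 ones and between 1 and 2 ones respectively.
w = np.array([[1.0, 1.0],
              [1.0, 1.0]])
b = np.array([-0.5, -1.5])
v = np.array([0.5, -0.5])    # output weights of size 1/2

for x in product((0, 1), repeat=2):
    x = np.array(x, dtype=float)
    print(x, shallow_perceptron(x, v, w, b))
```

Evaluating the network on all four vertices of the Boolean square reproduces the XOR truth table, using only two hidden units and output weights of absolute value 1/2.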
Acknowledgments
This work was partially supported by the Czech Grant Agency grant 15-18108S and institutional support of the Institute of Computer Science RVO 67985807.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Kůrková, V. (2016). Lower Bounds on Complexity of Shallow Perceptron Networks. In: Jayne, C., Iliadis, L. (eds) Engineering Applications of Neural Networks. EANN 2016. Communications in Computer and Information Science, vol 629. Springer, Cham. https://doi.org/10.1007/978-3-319-44188-7_21
Print ISBN: 978-3-319-44187-0
Online ISBN: 978-3-319-44188-7