Synopsis
We estimate a neural network’s ability to generalize from examples using ideas from statistical mechanics. We discuss the connection between this approach and other powerful concepts from mathematical statistics, computer science, and information theory that are useful in explaining the performance of such machines. For the simplest network, the perceptron, we introduce a variety of learning problems that can be treated exactly by the replica method of statistical physics.
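As a rough illustration of what "generalizing from examples" means for a perceptron, the sketch below sets up a teacher-student scenario: a student perceptron is trained on examples labeled by a teacher perceptron, and its generalization error is computed from the overlap R of the two weight vectors as arccos(R)/π, which holds for spherically symmetric inputs. The Hebbian training rule, the Gaussian inputs, the system size, and all function names are assumptions chosen for brevity; they are not taken from the chapter, and no replica calculation is performed here.

```python
import math
import random


def generalization_error(teacher, student):
    """Probability that student and teacher disagree on a random input.

    For spherically symmetric inputs this equals arccos(R) / pi, where R is
    the normalized overlap (cosine) between the two weight vectors.
    """
    dot = sum(t * s for t, s in zip(teacher, student))
    norm_t = math.sqrt(sum(t * t for t in teacher))
    norm_s = math.sqrt(sum(s * s for s in student))
    if norm_t == 0.0 or norm_s == 0.0:
        return 0.5  # an all-zero weight vector carries no information
    r = max(-1.0, min(1.0, dot / (norm_t * norm_s)))  # guard against rounding
    return math.acos(r) / math.pi


def hebb_student(teacher, num_examples, rng):
    """Train a student with the Hebb rule (an illustrative choice):
    add label * input to the weights for every teacher-labeled example."""
    n = len(teacher)
    student = [0.0] * n
    for _ in range(num_examples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        field = sum(t * xi for t, xi in zip(teacher, x))
        label = 1.0 if field >= 0.0 else -1.0
        for i in range(n):
            student[i] += label * x[i]
    return student


def learning_curve(n=50, alphas=(0.5, 2.0, 8.0), seed=0):
    """Return (alpha, error) pairs, where alpha = examples / input dimension."""
    rng = random.Random(seed)
    teacher = [rng.gauss(0.0, 1.0) for _ in range(n)]
    curve = []
    for alpha in alphas:
        student = hebb_student(teacher, int(alpha * n), rng)
        curve.append((alpha, generalization_error(teacher, student)))
    return curve


if __name__ == "__main__":
    for alpha, err in learning_curve():
        print(f"alpha = {alpha:4.1f}  generalization error = {err:.3f}")
```

Running the script prints the error for a few values of alpha; under these assumptions the error should shrink as the number of examples per weight grows.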
Copyright information
© 1996 Springer Science+Business Media New York
About this chapter
Cite this chapter
Opper, M., Kinzel, W. (1996). Statistical Mechanics of Generalization. In: Domany, E., van Hemmen, J.L., Schulten, K. (eds) Models of Neural Networks III. Physics of Neural Networks. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-0723-8_5
DOI: https://doi.org/10.1007/978-1-4612-0723-8_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-6882-6
Online ISBN: 978-1-4612-0723-8
eBook Packages: Springer Book Archive