Statistical Mechanics of Generalization

  • Chapter
Models of Neural Networks III

Part of the book series: Physics of Neural Networks ((NEURAL NETWORKS))

Synopsis

We estimate a neural network’s ability to generalize from examples using ideas from statistical mechanics. We discuss the connection between this approach and other powerful concepts from mathematical statistics, computer science, and information theory that are useful in explaining the performance of such machines. For the simplest network, the perceptron, we introduce a variety of learning problems that can be treated exactly by the replica method of statistical physics.
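As a rough illustration of what "generalizing from examples" means for a perceptron, the sketch below sets up a teacher-student scenario: a student perceptron learns from examples labelled by a teacher perceptron, and its generalization error is the probability of disagreeing with the teacher on a fresh input. Everything here is an assumption made for illustration and is not taken from the chapter: the Gaussian inputs, the simple Hebbian learning rule, the function names, and the sizes. For inputs with a spherically symmetric distribution, the disagreement probability equals the angle between the two weight vectors divided by π.

```python
import math
import random


def sign(x):
    """Perceptron output: +1 or -1."""
    return 1.0 if x >= 0 else -1.0


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gaussian_vector(n, rng):
    """Random input with independent standard normal components (assumption)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def hebb_student(teacher, num_examples, rng):
    """Train a student with the Hebb rule on examples labelled by the teacher."""
    n = len(teacher)
    student = [0.0] * n
    for _ in range(num_examples):
        x = gaussian_vector(n, rng)
        label = sign(dot(teacher, x))
        for i in range(n):
            student[i] += label * x[i]
    return student


def generalization_error(student, teacher):
    """Disagreement probability: angle between weight vectors divided by pi."""
    norm = math.sqrt(dot(student, student) * dot(teacher, teacher))
    overlap = max(-1.0, min(1.0, dot(student, teacher) / norm))
    return math.acos(overlap) / math.pi


def empirical_error(student, teacher, num_tests, rng):
    """Monte Carlo estimate of the same quantity on fresh random inputs."""
    n = len(teacher)
    mistakes = 0
    for _ in range(num_tests):
        x = gaussian_vector(n, rng)
        if sign(dot(student, x)) != sign(dot(teacher, x)):
            mistakes += 1
    return mistakes / num_tests


rng = random.Random(0)
N = 50  # input dimension (illustrative choice)
teacher = gaussian_vector(N, rng)
for alpha in (1, 4, 16):  # alpha = number of examples per weight
    student = hebb_student(teacher, alpha * N, rng)
    eps = generalization_error(student, teacher)
    print(f"alpha={alpha:2d}  generalization error={eps:.3f}")
```

Plotting the error against the number of examples per weight gives a learning curve, the kind of quantity that a statistical mechanics treatment aims to compute analytically in the limit of large networks.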

Copyright information

© 1996 Springer Science+Business Media New York

About this chapter

Cite this chapter

Opper, M., Kinzel, W. (1996). Statistical Mechanics of Generalization. In: Domany, E., van Hemmen, J.L., Schulten, K. (eds) Models of Neural Networks III. Physics of Neural Networks. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-0723-8_5

  • DOI: https://doi.org/10.1007/978-1-4612-0723-8_5

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-6882-6

  • Online ISBN: 978-1-4612-0723-8

  • eBook Packages: Springer Book Archive
