Prediction with Gaussian Processes: From Linear Regression to Linear Prediction and Beyond

Chapter in Learning in Graphical Models

Part of the book series: NATO ASI Series (ASID, volume 89)

Abstract

The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor based on priors over functions, rather than on priors over parameters. This leads into a more general discussion of Gaussian processes in Section 4. Section 5 deals with further issues, including hierarchical modelling and the setting of the parameters that control the Gaussian process, the covariance functions for neural network models, and the use of Gaussian processes in classification problems.
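To make the function-space view concrete, here is a minimal sketch (not taken from the chapter; the squared-exponential covariance, the noise level, and all variable names are assumptions made for illustration) of computing the Gaussian process predictive mean and variance for noisy regression data:

    import numpy as np

    def sq_exp_kernel(a, b, length_scale=1.0, signal_var=1.0):
        # Squared-exponential covariance between two sets of 1-D inputs.
        diff = a[:, None] - b[None, :]
        return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

    def gp_predict(x_train, y_train, x_test, noise_var=0.1):
        # Predictive mean and variance under a zero-mean GP prior over functions.
        K = sq_exp_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
        K_star = sq_exp_kernel(x_train, x_test)       # covariances k(x_i, x*)
        K_ss = sq_exp_kernel(x_test, x_test)

        L = np.linalg.cholesky(K)                     # stable solve of K alpha = y
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))

        mean = K_star.T @ alpha                       # E[f(x*) | data]
        v = np.linalg.solve(L, K_star)
        var = np.diag(K_ss) - np.sum(v ** 2, axis=0)  # Var[f(x*) | data]
        return mean, var

    # Toy usage: noisy observations of a sine function.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 5.0, 20)
    y = np.sin(x) + 0.1 * rng.standard_normal(x.shape)
    mu, s2 = gp_predict(x, y, np.linspace(0.0, 5.0, 100))

The length scale, signal variance, and noise variance here play the role of the parameters controlling the Gaussian process mentioned in the abstract; their setting is among the topics treated in Section 5 of the chapter.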

Copyright information

© 1998 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Williams, C.K.I. (1998). Prediction with Gaussian Processes: From Linear Regression to Linear Prediction and Beyond. In: Jordan, M.I. (ed.) Learning in Graphical Models. NATO ASI Series, vol 89. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5014-9_23

  • DOI: https://doi.org/10.1007/978-94-011-5014-9_23

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-6104-9

  • Online ISBN: 978-94-011-5014-9
