Skip to main content
Log in

A fast ‘Monte-Carlo cross-validation’ procedure for large least squares problems with noisy data

  • Published:
Numerische Mathematik Aims and scope Submit manuscript

Summary

We propose a fast Monte-Carlo algorithm for calculating reliable estimates of the trace of the influence matrixA τ involved in regularization of linear equations or data smoothing problems, where τ is the regularization or smoothing parameter. This general algorithm is simply as follows: i) generaten pseudo-random valuesw 1, ...,w n , from the standard normal distribution (wheren is the number of data points) and letw=(w 1, ...,w n )T, ii) compute the residual vectorwA τ w, iii) take the ‘normalized” inner-product (w T(wA τ w))/(w T w) as an approximation to (1/n)tr(I−A τ). We show, both by theoretical bounds and by numerical simulations on some typical problems, that the expected relative precision of these estimates is very good whenn is large enough, and that they can be used in practice for the minimization with respect to τ of the well known Generalized Cross-Validation (GCV) function. This permits the use of the GCV method for choosing τ in any particular large-scale application, with only a similar amount of work as the standard residual method. Numerical applications of this procedure to optimal spline smoothing in one or two dimensions show its efficiency.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Abramowitz, M., Stegun, I.A.: Handbook of mathematical functions, 9th Ed. New York: Dover 1972

    Google Scholar 

  2. Angel, E.S., Jain, A.K.: Restoration of images degraded by spatially varying pointspread functions by a conjugate gradient method. Appl. Optics17, 2186–2190 (1978)

    Google Scholar 

  3. Anselone, P.M., Laurent, P.J.: A general method for the construction of interpolating or smoothing spline-functions. Numer. Math.12, 66–82 (1968)

    Google Scholar 

  4. Apprato, D., Arcangeli, R., Manzanilla, R.: Sur la construction de surface de classeC k à partir d'un grand nombre de données de Lagrange. M2 AN, RAIRO Anal. Numer.21, 529–555 (1987)

    Google Scholar 

  5. Bates, D.M., Wahba, G.: Computational methods for generalized cross-validation with large data sets. Technical report no697. Department of Statistics. University of Wisconsin-Madison 1982

  6. Bates, D.M., Wahba, G.: A truncated singular value decomposition and other methods for generalized cross-validation. Technical report no715. Department of Statistics. University of Wisconsin-Madison 1983

  7. Bellman, R.: Introduction to matrix analysis, 2nd Ed. New York: Mc Graw-Hill 1970

    Google Scholar 

  8. Craven, P., Wahba, G.: Smoothing noisy data with spline functions. Numer. Math.31, 377–403 (1979)

    Google Scholar 

  9. Dyn, N., Levin, D., Rippa, S.: Numerical procedures for surface fitting of scattered data by radial function. SIAM J. Sci. Stat. Comput.7, 639–659 (1986)

    Google Scholar 

  10. Eldén, L.: Algorithms for the regularization of ill-conditioned least squares problem. BIT17, 134–145 (1977)

    Google Scholar 

  11. Eldén, L.: A note on the computation of the generalized cross-validation function for ill-conditioned least squares problems. BIT24, 467–472 (1984)

    Google Scholar 

  12. Girard, D.: Practical optimal regularization of large linear systems. M2AN, RAIRO Anal. Numer.20, 75–87 (1986)

    Google Scholar 

  13. Girard, D.: Optimal regularized reconstruction in computerized tomography. SIAM J. Sci. Stat. Comput.8, 934–950 (1987)

    Google Scholar 

  14. Girard, D.: Un algorithme rapide pour le calcul de la trace de l'inverse d'une grande matrice. Rapport de recherche RR 665-M, TIM3, Informatique et Mathématiques Appliquées de Grenoble (May 1987)

  15. Girard, D.: Un algorithme simple et rapide pour la validation croisée généralisée sur des problèmes de grande taille. RR 669-M, TIM3, Informatique et Mathématiques Appliquées de Grenoble (May 1987)

  16. Girard, D.: Détection de discontinuités dans un signal (ou une image) par Inf-Convolution Spline et Validation Croisée: un algorithme rapide non paramétré. RR 702-I-M, TIM3, Informatique et Mathématiques Appliquées de Grenoble (Jan. 1988)

  17. Golub, G.H., Heath, M., Wahba, G.: Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics21, 215–224 (1979)

    Google Scholar 

  18. Golub, G.H., Van Loan, C.: Matrix computations. North Oxford Academic 1983

  19. Hou, H.S., Andrews, H.C.: Least squares image restoration using spline basis functions. IEEE Trans. Comput.c-28, 856–873 (1977)

    Google Scholar 

  20. Hutchinson, M.F., de Hoog, F.R.: Smoothing noisy data with spline functions. Numer. Math.47, 99–106 (1985)

    Google Scholar 

  21. de Hoog, F.R., Hutchinson, M.F.: An efficient method for calculating smoothing splines using orthogonal transformations. Numer. Math.50, 311–319 (1987)

    Google Scholar 

  22. IMSL: Library Reference Manual, edition 9. Houston: IMSL 1982

    Google Scholar 

  23. Laurent, P.J.: Inf-convolution spline pour l'approximation de données discontinues. Rapport de Recherche IMAG no270 (1981); also M2AN, RAIRO Anal. Numer.20, 89–111 (1986)

  24. Laurent, P.J., Utreras, F.: Optimal smoothing of noisy broken data. J. Approx. Theory Appl.2, 71–94 (1986)

    Google Scholar 

  25. Reinsch, C.M.: Smoothing by spline functions. Numer. Math.10, 173–183 (1967)

    Google Scholar 

  26. Reinsch, C.M.: Smoothing by spline functions II. Numer. Math.16, 451–454 (1971)

    Google Scholar 

  27. Silverman, B.W.: A fast and efficient cross-validation method for smoothing parameter choice in spline regression. J. Amer. Statist. Assist.79, 584–589 (1984)

    Google Scholar 

  28. Silverman, B.W.: Some aspect of the spline smoothing approach to non-parametric regression curve fitting. J. R. Statist. Soc. B47, 1–52 (1985)

    Google Scholar 

  29. Utreras, F.: Utilisation de la méthode de validation croisée pour le lissage par fonctions spline à une ou deux variables. Thèse. Université de Grenoble 1979

  30. Utreras, F.: Sur le choix du paramètre d'ajustement dans le lissage par fonctions spline. Numer. Math.34, 15–28 (1980)

    Google Scholar 

  31. Utreras, F.: Optimal smoothing of noisy data using spline functions. SIAM J. Sci. Stat. Comput.2, 349–362 (1981)

    Google Scholar 

  32. Utreras, F.: Natural spline functions, their associated eigenvalue problem. Numer. Math.42, 107–117 (1983)

    Google Scholar 

  33. von Neumann, J.: Distribution of the ratio of the mean square successive difference to the variance. Annals Math. Stat.12, 367–395 (1941)

    Google Scholar 

  34. Wahba, G.: The approximate solution of linear operator equations when the data are noisy. SIAM J. Numer. Anal.14, 651–667 (1977)

    Google Scholar 

  35. Wahba, G.: Variational methods for multidimensional inverse problems. Tech. Rep. no755. Depart. of Statistics. University of Wisconsin 1984

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Girard, A. A fast ‘Monte-Carlo cross-validation’ procedure for large least squares problems with noisy data. Numer. Math. 56, 1–23 (1989). https://doi.org/10.1007/BF01395775

Download citation

  • Received:

  • Issue Date:

  • DOI: https://doi.org/10.1007/BF01395775

Subject Classifications

Navigation