
2020 | OriginalPaper | Chapter

4. Resampling

Author: Joe Suzuki

Published in: Statistical Learning with Math and R

Publisher: Springer Nature Singapore


Abstract

In general, more than one statistical model can explain a given phenomenon, and the more complex the model, the more easily it fits the observed data. However, a good fit does not tell us whether the estimated model predicts well on new data that were not used for estimation. In stock price forecasting, for example, analyzing price movements up to yesterday so that the fitting error is small is of little value if the analysis says nothing about tomorrow's prices. In this book, choosing a model more complex than the true statistical model is referred to as overfitting. (The term is common in data science and machine learning, but its definition varies with context, so the author fixes a single definition here.) In this chapter, we first learn about cross-validation, a method for evaluating learning performance without being affected by overfitting. Furthermore, because the data used for learning are randomly sampled, the learning result may differ significantly from sample to sample even when the data follow the same distribution. In some cases, as in linear regression, the confidence interval and the variance of an estimate can be evaluated. We then learn about bootstrapping, a method for assessing the dispersion of learning results.
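
As a rough illustration of the two ideas named in the abstract, the following sketch runs k-fold cross-validation and a bootstrap estimate of a standard error on a simple linear regression. It is not taken from the chapter (the book itself works in R); the helper names `fit_line`, `k_fold_cv`, and `bootstrap_se`, the synthetic data, and the choice of interleaved folds are all assumptions made for this example.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares intercept and slope for the model y = a + b * x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def k_fold_cv(xs, ys, k):
    """Mean squared prediction error over k folds.

    Each point is predicted by a model fitted without its own fold.
    Folds are interleaved (i, i + k, i + 2k, ...), which assumes the
    data are already in random order. k == len(xs) gives LOOCV.
    """
    n = len(xs)
    total = 0.0
    for fold in range(k):
        test = set(range(fold, n, k))
        train = [i for i in range(n) if i not in test]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - (a + b * xs[i])) ** 2 for i in test)
    return total / n


def bootstrap_se(xs, ys, stat, reps, rng):
    """Standard deviation of stat over reps resamples drawn with replacement."""
    n = len(xs)
    values = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(stat([xs[i] for i in idx], [ys[i] for i in idx]))
    return statistics.stdev(values)


# Synthetic data (an assumption for illustration): y = 1 + 2x + N(0, 1) noise.
rng = random.Random(0)
xs = [rng.uniform(-3, 3) for _ in range(60)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

cv10 = k_fold_cv(xs, ys, 10)        # 10-fold cross-validation error
loocv = k_fold_cv(xs, ys, len(xs))  # leave-one-out (k = N)
slope_se = bootstrap_se(xs, ys, lambda a, b: fit_line(a, b)[1], 500, rng)

print(f"10-fold CV error: {cv10:.3f}")
print(f"LOOCV error:      {loocv:.3f}")
print(f"bootstrap SE of the slope: {slope_se:.3f}")
```

The cross-validation error is computed only from points held out of each fit, so a more flexible model cannot lower it merely by fitting noise. The bootstrap part resamples the observed pairs with replacement and reports how much the fitted slope varies across resamples, which is one way to gauge dispersion when no closed-form variance is at hand.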

Footnotes
1
Many books give only a restricted formula that is valid for LOOCV (\(k=N\)). This book treats the general formula, which applies to any \(k\).
 
2
Jun Shao, "Linear Model Selection by Cross-Validation," Journal of the American Statistical Association, Vol. 88, No. 422 (Jun. 1993), pp. 486–494.
 
3
In a portfolio, for two brands X and Y, the quantity of X and Y is often estimated.
 
Metadata
Title
Resampling
Author
Joe Suzuki
Copyright Year
2020
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-15-7568-6_4
