Skip to main content

2013 | OriginalPaper | Buchkapitel

4. Over-Fitting and Model Tuning

verfasst von : Max Kuhn, Kjell Johnson

Erschienen in: Applied Predictive Modeling

Verlag: Springer New York

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Many modern classification and regression models are highly adaptable; they are capable of modeling complex relationships. Each model's adaptability is typically governed by a set of tuning parameters, which can allow each model to pinpoint predictive patterns and structures within the data. However, these tuning parameters can very identify predictive patterns that are not reproducible. This is known as “over-fitting.” Models that are over-fit generally have excellent predictivity for the samples on which they were built, but poor predictivity for new samples. Without a methodological approach to building and evaluating models, the modeler will not know if the model is over-fit until the next set of samples are predicted. In Section 4.1 we use a simple example to illustrate the problem of over-fitting. We then describe a systematic process for tuning models (Section 4.2), which is foundational to the remaining parts of the book. Core to model tuning are appropriate ways for splitting (or spending) the data, which is covered in Section 4.3. Resampling techniques (Section 4.4) are an alternative or complementary approach to data splitting. Recommendations for approaches to data splitting are provided in Section 4.7. After evaluating a number tuning parameters via data splitting or resampling, we must choose the final tuning parameters (Section 4.6). We also discuss how to choose the optimal model across several tuned models (Section 4.8) We illustrate how to implement the recommended techniques discussed in this chapter in the Computing Section (4.9). Exercises are provided at the end of the chapter to solidify concepts.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
This model uses a radial basis function kernel, defined in Sect.​ 13.​4. Although not explored here, we used the analytical approach discussed later for determining the kernel parameter and fixed this value for all resampling techniques.
 
2
The authors state that there are 95 samples of known oils. However, we count 96 in their Table 1 (pp. 33–35 of the article).
 
Literatur
Zurück zum Zitat Ambroise C, McLachlan G (2002). “Selection Bias in Gene Extraction on the Basis of Microarray Gene–Expression Data.” Proceedings of the National Academy of Sciences, 99(10), 6562–6566.CrossRefMATH Ambroise C, McLachlan G (2002). “Selection Bias in Gene Extraction on the Basis of Microarray Gene–Expression Data.” Proceedings of the National Academy of Sciences, 99(10), 6562–6566.CrossRefMATH
Zurück zum Zitat Boulesteix A, Strobl C (2009). “Optimal Classifier Selection and Negative Bias in Error Rate Estimation: An Empirical Study on High–Dimensional Prediction.” BMC Medical Research Methodology, 9(1), 85.CrossRef Boulesteix A, Strobl C (2009). “Optimal Classifier Selection and Negative Bias in Error Rate Estimation: An Empirical Study on High–Dimensional Prediction.” BMC Medical Research Methodology, 9(1), 85.CrossRef
Zurück zum Zitat Breiman L, Friedman J, Olshen R, Stone C (1984). Classification and Regression Trees. Chapman and Hall, New York.MATH Breiman L, Friedman J, Olshen R, Stone C (1984). Classification and Regression Trees. Chapman and Hall, New York.MATH
Zurück zum Zitat Brodnjak-Vonina D, Kodba Z, Novi M (2005). “Multivariate Data Analysis in Classification of Vegetable Oils Characterized by the Content of Fatty Acids.” Chemometrics and Intelligent Laboratory Systems, 75(1), 31–43.CrossRef Brodnjak-Vonina D, Kodba Z, Novi M (2005). “Multivariate Data Analysis in Classification of Vegetable Oils Characterized by the Content of Fatty Acids.” Chemometrics and Intelligent Laboratory Systems, 75(1), 31–43.CrossRef
Zurück zum Zitat Caputo B, Sim K, Furesjo F, Smola A (2002). “Appearance–Based Object Recognition Using SVMs: Which Kernel Should I Use?” In “Proceedings of NIPS Workshop on Statistical Methods for Computational Experiments in Visual Processing and Computer Vision,”. Caputo B, Sim K, Furesjo F, Smola A (2002). “Appearance–Based Object Recognition Using SVMs: Which Kernel Should I Use?” In “Proceedings of NIPS Workshop on Statistical Methods for Computational Experiments in Visual Processing and Computer Vision,”.
Zurück zum Zitat Clark R (1997). “OptiSim: An Extended Dissimilarity Selection Method for Finding Diverse Representative Subsets.” Journal of Chemical Information and Computer Sciences, 37(6), 1181–1188.CrossRef Clark R (1997). “OptiSim: An Extended Dissimilarity Selection Method for Finding Diverse Representative Subsets.” Journal of Chemical Information and Computer Sciences, 37(6), 1181–1188.CrossRef
Zurück zum Zitat Clark T (2004). “Can Out–of–Sample Forecast Comparisons Help Prevent Overfitting?” Journal of Forecasting, 23(2), 115–139.CrossRef Clark T (2004). “Can Out–of–Sample Forecast Comparisons Help Prevent Overfitting?” Journal of Forecasting, 23(2), 115–139.CrossRef
Zurück zum Zitat Cohen G, Hilario M, Pellegrini C, Geissbuhler A (2005). “SVM Modeling via a Hybrid Genetic Strategy. A Health Care Application.” In R Engelbrecht, AGC Lovis (eds.), “Connecting Medical Informatics and Bio–Informatics,” pp. 193–198. IOS Press. Cohen G, Hilario M, Pellegrini C, Geissbuhler A (2005). “SVM Modeling via a Hybrid Genetic Strategy. A Health Care Application.” In R Engelbrecht, AGC Lovis (eds.), “Connecting Medical Informatics and Bio–Informatics,” pp. 193–198. IOS Press.
Zurück zum Zitat Defernez M, Kemsley E (1997). “The Use and Misuse of Chemometrics for Treating Classification Problems.” TrAC Trends in Analytical Chemistry, 16(4), 216–221.CrossRef Defernez M, Kemsley E (1997). “The Use and Misuse of Chemometrics for Treating Classification Problems.” TrAC Trends in Analytical Chemistry, 16(4), 216–221.CrossRef
Zurück zum Zitat Dwyer D (2005). “Examples of Overfitting Encountered When Building Private Firm Default Prediction Models.” Technical report, Moody’s KMV. Dwyer D (2005). “Examples of Overfitting Encountered When Building Private Firm Default Prediction Models.” Technical report, Moody’s KMV.
Zurück zum Zitat Efron B (1983). “Estimating the Error Rate of a Prediction Rule: Improvement on Cross–Validation.” Journal of the American Statistical Association, pp. 316–331. Efron B (1983). “Estimating the Error Rate of a Prediction Rule: Improvement on Cross–Validation.” Journal of the American Statistical Association, pp. 316–331.
Zurück zum Zitat Efron B, Tibshirani R (1986). “Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy.” Statistical Science, pp. 54–75. Efron B, Tibshirani R (1986). “Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy.” Statistical Science, pp. 54–75.
Zurück zum Zitat Efron B, Tibshirani R (1997). “Improvements on Cross–Validation: The 632+ Bootstrap Method.” Journal of the American Statistical Association, 92(438), 548–560.MathSciNetMATH Efron B, Tibshirani R (1997). “Improvements on Cross–Validation: The 632+ Bootstrap Method.” Journal of the American Statistical Association, 92(438), 548–560.MathSciNetMATH
Zurück zum Zitat Eugster M, Hothorn T, Leisch F (2008). “Exploratory and Inferential Analysis of Benchmark Experiments.” Ludwigs-Maximilians-Universität München, Department of Statistics, Tech. Rep, 30. Eugster M, Hothorn T, Leisch F (2008). “Exploratory and Inferential Analysis of Benchmark Experiments.” Ludwigs-Maximilians-Universität München, Department of Statistics, Tech. Rep, 30.
Zurück zum Zitat Golub G, Heath M, Wahba G (1979). “Generalized Cross–Validation as a Method for Choosing a Good Ridge Parameter.” Technometrics, 21(2), 215–223.MathSciNetCrossRefMATH Golub G, Heath M, Wahba G (1979). “Generalized Cross–Validation as a Method for Choosing a Good Ridge Parameter.” Technometrics, 21(2), 215–223.MathSciNetCrossRefMATH
Zurück zum Zitat Gowen A, Downey G, Esquerre C, O’Donnell C (2010). “Preventing Over–Fitting in PLS Calibration Models of Near-Infrared (NIR) Spectroscopy Data Using Regression Coefficients.” Journal of Chemometrics, 25, 375–381.CrossRef Gowen A, Downey G, Esquerre C, O’Donnell C (2010). “Preventing Over–Fitting in PLS Calibration Models of Near-Infrared (NIR) Spectroscopy Data Using Regression Coefficients.” Journal of Chemometrics, 25, 375–381.CrossRef
Zurück zum Zitat Hawkins D (2004). “The Problem of Overfitting.” Journal of Chemical Information and Computer Sciences, 44(1), 1–12.CrossRef Hawkins D (2004). “The Problem of Overfitting.” Journal of Chemical Information and Computer Sciences, 44(1), 1–12.CrossRef
Zurück zum Zitat Hawkins D, Basak S, Mills D (2003). “Assessing Model Fit by Cross–Validation.” Journal of Chemical Information and Computer Sciences, 43(2), 579–586.CrossRef Hawkins D, Basak S, Mills D (2003). “Assessing Model Fit by Cross–Validation.” Journal of Chemical Information and Computer Sciences, 43(2), 579–586.CrossRef
Zurück zum Zitat Heyman R, Slep A (2001). “The Hazards of Predicting Divorce Without Cross-validation.” Journal of Marriage and the Family, 63(2), 473.CrossRef Heyman R, Slep A (2001). “The Hazards of Predicting Divorce Without Cross-validation.” Journal of Marriage and the Family, 63(2), 473.CrossRef
Zurück zum Zitat Hothorn T, Leisch F, Zeileis A, Hornik K (2005). “The Design and Analysis of Benchmark Experiments.” Journal of Computational and Graphical Statistics, 14(3), 675–699.MathSciNetCrossRef Hothorn T, Leisch F, Zeileis A, Hornik K (2005). “The Design and Analysis of Benchmark Experiments.” Journal of Computational and Graphical Statistics, 14(3), 675–699.MathSciNetCrossRef
Zurück zum Zitat Hsieh W, Tang B (1998). “Applying Neural Network Models to Prediction and Data Analysis in Meteorology and Oceanography.” Bulletin of the American Meteorological Society, 79(9), 1855–1870.CrossRef Hsieh W, Tang B (1998). “Applying Neural Network Models to Prediction and Data Analysis in Meteorology and Oceanography.” Bulletin of the American Meteorological Society, 79(9), 1855–1870.CrossRef
Zurück zum Zitat Kim JH (2009). “Estimating Classification Error Rate: Repeated Cross–Validation, Repeated Hold–Out and Bootstrap.” Computational Statistics & Data Analysis, 53(11), 3735–3745.MathSciNetCrossRefMATH Kim JH (2009). “Estimating Classification Error Rate: Repeated Cross–Validation, Repeated Hold–Out and Bootstrap.” Computational Statistics & Data Analysis, 53(11), 3735–3745.MathSciNetCrossRefMATH
Zurück zum Zitat Kohavi R (1995). “A Study of Cross–Validation and Bootstrap for Accuracy Estimation and Model Selection.” International Joint Conference on Artificial Intelligence, 14, 1137–1145. Kohavi R (1995). “A Study of Cross–Validation and Bootstrap for Accuracy Estimation and Model Selection.” International Joint Conference on Artificial Intelligence, 14, 1137–1145.
Zurück zum Zitat Martin J, Hirschberg D (1996). “Small Sample Statistics for Classification Error Rates I: Error Rate Measurements.” Department of Informatics and Computer Science Technical Report. Martin J, Hirschberg D (1996). “Small Sample Statistics for Classification Error Rates I: Error Rate Measurements.” Department of Informatics and Computer Science Technical Report.
Zurück zum Zitat Martin T, Harten P, Young D, Muratov E, Golbraikh A, Zhu H, Tropsha A (2012). “Does Rational Selection of Training and Test Sets Improve the Outcome of QSAR Modeling?” Journal of Chemical Information and Modeling, 52(10), 2570–2578.CrossRef Martin T, Harten P, Young D, Muratov E, Golbraikh A, Zhu H, Tropsha A (2012). “Does Rational Selection of Training and Test Sets Improve the Outcome of QSAR Modeling?” Journal of Chemical Information and Modeling, 52(10), 2570–2578.CrossRef
Zurück zum Zitat Mitchell M (1998). An Introduction to Genetic Algorithms. MIT Press. Mitchell M (1998). An Introduction to Genetic Algorithms. MIT Press.
Zurück zum Zitat Molinaro A (2005). “Prediction Error Estimation: A Comparison of Resampling Methods.” Bioinformatics, 21(15), 3301–3307.CrossRef Molinaro A (2005). “Prediction Error Estimation: A Comparison of Resampling Methods.” Bioinformatics, 21(15), 3301–3307.CrossRef
Zurück zum Zitat Olsson D, Nelson L (1975). “The Nelder–Mead Simplex Procedure for Function Minimization.” Technometrics, 17(1), 45–51.CrossRefMATH Olsson D, Nelson L (1975). “The Nelder–Mead Simplex Procedure for Function Minimization.” Technometrics, 17(1), 45–51.CrossRefMATH
Zurück zum Zitat Simon R, Radmacher M, Dobbin K, McShane L (2003). “Pitfalls in the Use of DNA Microarray Data for Diagnostic and Prognostic Classification.” Journal of the National Cancer Institute, 95(1), 14–18.CrossRef Simon R, Radmacher M, Dobbin K, McShane L (2003). “Pitfalls in the Use of DNA Microarray Data for Diagnostic and Prognostic Classification.” Journal of the National Cancer Institute, 95(1), 14–18.CrossRef
Zurück zum Zitat Steyerberg E (2010). Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. Springer, 1st ed. softcover of orig. ed. 2009 edition. Steyerberg E (2010). Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. Springer, 1st ed. softcover of orig. ed. 2009 edition.
Zurück zum Zitat Varma S, Simon R (2006). “Bias in Error Estimation When Using Cross–Validation for Model Selection.” BMC Bioinformatics, 7(1), 91.MathSciNetCrossRef Varma S, Simon R (2006). “Bias in Error Estimation When Using Cross–Validation for Model Selection.” BMC Bioinformatics, 7(1), 91.MathSciNetCrossRef
Zurück zum Zitat Willett P (1999). “Dissimilarity–Based Algorithms for Selecting Structurally Diverse Sets of Compounds.” Journal of Computational Biology, 6(3), 447–457.MathSciNetCrossRef Willett P (1999). “Dissimilarity–Based Algorithms for Selecting Structurally Diverse Sets of Compounds.” Journal of Computational Biology, 6(3), 447–457.MathSciNetCrossRef
Metadaten
Titel
Over-Fitting and Model Tuning
verfasst von
Max Kuhn
Kjell Johnson
Copyright-Jahr
2013
Verlag
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-6849-3_4