Abstract
A widely persisting interpretation of Occam’s razor is that given two classifiers with the same training error, the simpler classifier is more likely to generalize better. Within a long-lasting debate in the machine learning community over Occam’s razor, Domingos (Data Min. Knowl. Discov. 3:409–425, 1999) rejects this interpretation and proposes that model complexity is only a confounding factor usually correlated with the number of models from which the learner selects. It is thus hypothesized that the risk of overfitting (poor generalization) follows only from the number of model tests rather than the complexity of the selected model. We test this hypothesis on 30 UCI data sets using polynomial classification models. The results confirm Domingos’ hypothesis on the 0.05 significance level and thus refutes the above interpretation of Occam’s razor. Our experiments however also illustrate that decoupling the two factors (model complexity and number of model tests) is problematic.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Domingos, P. (1999). The role of Occam’s razor in knowledge discovery. Data Mining and Knowledge Discovery, 3, 409–425.
Esmeir, S., & Markovitch, S. (2007). Occam’s razor just got sharper. In M. Veloso (Ed.), IJCAI’07: proceedings of the 20th international joint conference on artificial intelligence (pp. 768–773). San Mateo: Morgan Kaufmann.
Gamberger, D., & Lavrač, N. (1997). Conditions for Occam’s razor applicability and noise elimination. In M. van Someren & G. Widmer (Eds.), ECML’97: proceedings of the 9th European conference on machine learning (pp. 108–123). Berlin: Springer.
Grünwald, P. (2001). Occam, Bayes, MDL and the real world (presentation slides). In NIPS 2001 workshop on Occam’s razor.
Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. Berlin: Springer.
Jensen, D., & Cohen, P. R. (2000). Multiple comparisons in induction algorithms. Machine Learning, 38, 308–338.
Murphy, P. M., & Pazzani, M. J. (1994). Exploring the decision forest: an empirical investigation of Occam’s razor in decision tree induction. Journal of Artificial Intelligence Research, 1(1), 257–275.
Needham, S. L., & Dowe, D. L. (2001). Message length as an effective Ockham’s razor in decision tree induction. In T. Richardson & T. Jaakkola (Eds.), Proceedings of the 8th international workshop on artificial intelligence and statistics (pp. 253–260).
Nolan, D. (1997). Quantitative parsimony. The British Journal for the Philosophy of Science, 48(2), 329–343.
Paes, A., Železný, F., Zaverucha, G., Page, D., & Srinivasan, A. (2007). ILP through propositionalization and stochastic k-term DNF learning. In S. Muggleton, R. Otero, & A. Tamaddoni-Nezhad (Eds.), ILP’06: proceedings of the 16th international conference on inductive logic programming (pp. 379–393).
Piatetsky-Shapiro, G. (1996). Editorial comments. KDD Nuggets, 96, 28.
Quinlan, J. R., & Cameron-Jones, R. (1995). Oversearching and layered search in empirical learning. In L. P. Kaelbling & A. Saffiotti (Eds.), IJCAI’95: proceedings of the 14th international joint conference on artificial intelligence (pp. 1019–1024). San Mateo: Morgan Kaufmann.
Rückert, U., & Kramer, S. (2003). Stochastic local search in k-term DNF learning. In T. Fawcett & N. Mishra (Eds.), ICML 2003: proceedings of the 20th international conference on machine learning (pp. 648–655). New York: AAAI Press.
Železný, F., Srinivasan, A., & Page, D. (2006). Randomised restarted search in ILP. Machine Learning, 64(1–3), 183–208.
Webb, G. I. (1996). Further experimental evidence against the utility of Occam’s razor. Journal of Artificial Intelligence Research, 4, 397–417.
Author information
Authors and Affiliations
Corresponding author
Additional information
Editor: Johannes Fürnkranz.
Rights and permissions
About this article
Cite this article
Zahálka, J., Železný, F. An experimental test of Occam’s razor in classification. Mach Learn 82, 475–481 (2011). https://doi.org/10.1007/s10994-010-5227-2
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10994-010-5227-2