Semi-analytical method for analyzing models and model selection measures based on moment analysis

Published: 23 March 2009

Abstract

In this article we propose a moment-based method for studying models and model selection measures. By focusing on the probabilistic space of classifiers induced by the classification algorithm, rather than on the space of datasets, we obtain efficient characterizations for computing the moments; we then visualize the resulting formulae, which are too complicated for direct interpretation. Assuming the data are drawn independently and identically distributed from the underlying probability distribution, and averaging over the space of all possible datasets, we establish general relationships between the generalization error, hold-out-set error, cross-validation error, and leave-one-out error. We illustrate the method and the results by studying the behavior of these errors for the naive Bayes classifier.
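The article derives these moments analytically; as a purely illustrative sketch (not the paper's semi-analytical method), the following Python snippet estimates the first two moments of the leave-one-out and hold-out errors by Monte Carlo sampling over datasets, for a toy Bernoulli naive Bayes classifier with a single binary feature. All distribution parameters, sample sizes, and helper names here are assumptions chosen for the example.

```python
import random
import statistics

random.seed(0)

# Assumed toy distribution: binary feature x, binary class y,
# P(y=1)=0.5, P(x=1|y=1)=0.8, P(x=1|y=0)=0.3 (illustrative values only).
P_Y1, P_X1_Y1, P_X1_Y0 = 0.5, 0.8, 0.3

def sample(n):
    """Draw an i.i.d. dataset of n (x, y) pairs from the toy distribution."""
    data = []
    for _ in range(n):
        y = 1 if random.random() < P_Y1 else 0
        px = P_X1_Y1 if y == 1 else P_X1_Y0
        x = 1 if random.random() < px else 0
        data.append((x, y))
    return data

def nb_predict(train, x):
    """Bernoulli naive Bayes prediction with add-one (Laplace) smoothing."""
    n = len(train)
    n1 = sum(y for _, y in train)
    n0 = n - n1
    p1 = (sum(1 for xi, y in train if y == 1 and xi == 1) + 1) / (n1 + 2)
    p0 = (sum(1 for xi, y in train if y == 0 and xi == 1) + 1) / (n0 + 2)
    prior1 = (n1 + 1) / (n + 2)
    lik1 = prior1 * (p1 if x == 1 else 1 - p1)
    lik0 = (1 - prior1) * (p0 if x == 1 else 1 - p0)
    return 1 if lik1 >= lik0 else 0

def loo_error(data):
    """Leave-one-out error: train on all but one point, test on it."""
    errs = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        errs += int(nb_predict(train, x) != y)
    return errs / len(data)

def holdout_error(data, frac=0.5):
    """Hold-out-set error: train on the first frac of the data, test on the rest."""
    k = int(len(data) * frac)
    train, test = data[:k], data[k:]
    return sum(nb_predict(train, x) != y for x, y in test) / len(test)

# Monte Carlo over the space of datasets: estimate the first two moments
# of each error measure (the quantities the paper characterizes analytically).
n_datasets, n = 200, 30
loo = [loo_error(sample(n)) for _ in range(n_datasets)]
ho = [holdout_error(sample(n)) for _ in range(n_datasets)]
print("LOO      mean=%.3f  var=%.5f" % (statistics.mean(loo), statistics.pvariance(loo)))
print("Hold-out mean=%.3f  var=%.5f" % (statistics.mean(ho), statistics.pvariance(ho)))
```

With these toy parameters the Bayes error is 0.25, so both estimated means land somewhat above it for small n; the hold-out error typically shows higher variance than the leave-one-out error because it tests on fewer points per dataset, which is one flavor of the relationships the paper studies exactly.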



• Published in

  ACM Transactions on Knowledge Discovery from Data, Volume 3, Issue 1
  March 2009
  251 pages
  ISSN: 1556-4681
  EISSN: 1556-472X
  DOI: 10.1145/1497577

      Copyright © 2009 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

Publication History

• Published: 23 March 2009
• Accepted: 1 September 2008
• Revised: 1 August 2008
• Received: 1 May 2008


      Qualifiers

      • research-article
      • Research
      • Refereed
