DOI: 10.1145/1143844.1143971

Inference with the Universum

Published: 25 June 2006

ABSTRACT

In this paper we study a new framework introduced by Vapnik (1998) and Vapnik (2006) that is an alternative capacity concept to the large margin approach. In the particular case of binary classification, we are given a set of labeled examples, and a collection of "non-examples" that do not belong to either class of interest. This collection, called the Universum, allows one to encode prior knowledge by representing meaningful concepts in the same domain as the problem at hand. We describe an algorithm to leverage the Universum by maximizing the number of observed contradictions, and show experimentally that this approach delivers accuracy improvements over using labeled data alone.
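For a concrete picture of how a Universum term can enter a training objective, the sketch below assumes the commonly used Universum-SVM style formulation: labeled examples incur a standard hinge loss, while Universum points incur an ε-insensitive loss that pulls the decision function toward zero on them, so the classifier stays non-committal on the "non-examples." This is an illustrative NumPy sketch under those assumptions, not the exact optimization solved in the paper; the names fit_universum, C, C_u, and eps are placeholders introduced here.

```python
# Sketch of a linear Universum-style classifier (an assumed formulation, not
# necessarily the paper's exact algorithm): hinge loss on labeled points plus an
# eps-insensitive loss that keeps the decision function near zero on Universum
# points. Trained by plain (sub)gradient descent for illustration only.
import numpy as np

def universum_objective(w, b, X, y, X_u, C=1.0, C_u=0.5, eps=0.1):
    """Regularizer + hinge loss on labeled data + eps-insensitive loss on Universum."""
    f_lab = X @ w + b
    f_uni = X_u @ w + b
    hinge = np.maximum(0.0, 1.0 - y * f_lab)          # labeled: enforce a margin
    near_zero = np.maximum(0.0, np.abs(f_uni) - eps)  # Universum: stay near the boundary
    return 0.5 * w @ w + C * hinge.sum() + C_u * near_zero.sum()

def fit_universum(X, y, X_u, C=1.0, C_u=0.5, eps=0.1, lr=1e-3, n_iter=2000, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        f_lab = X @ w + b
        f_uni = X_u @ w + b
        # Subgradient of the hinge term: active where the margin is violated.
        active = y * f_lab < 1.0
        g_w = w - C * (y[active, None] * X[active]).sum(axis=0)
        g_b = -C * y[active].sum()
        # Subgradient of the eps-insensitive Universum term.
        out = np.abs(f_uni) > eps
        sgn = np.sign(f_uni[out])
        g_w += C_u * (sgn[:, None] * X_u[out]).sum(axis=0)
        g_b += C_u * sgn.sum()
        w -= lr * g_w
        b -= lr * g_b
    return w, b

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_pos = rng.normal(loc=[2, 0], scale=1.0, size=(50, 2))
    X_neg = rng.normal(loc=[-2, 0], scale=1.0, size=(50, 2))
    X = np.vstack([X_pos, X_neg])
    y = np.array([1.0] * 50 + [-1.0] * 50)
    X_u = rng.normal(loc=[0, 3], scale=1.0, size=(30, 2))  # synthetic "non-examples"
    w, b = fit_universum(X, y, X_u)
    print(f"training accuracy: {(np.sign(X @ w + b) == y).mean():.2f}")
    print(f"objective: {universum_objective(w, b, X, y, X_u):.2f}")
```

The ε-insensitive term is one simple proxy for the paper's goal of maximizing contradictions on the Universum: the more Universum points fall inside the [-ε, ε] band of the decision function, the less the chosen hyperplane commits to either class on them.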

References

  1. Baird, H. (1990). Document image defect models. Proceedings, IAPR Workshop on Syntactic and Structural Pattern Recognition (pp. 38--46). Murray Hill, NJ.
  2. Bernardo, J. M., & Smith, A. F. M. (1994). Bayesian theory. John Wiley and Sons.
  3. Boser, B. E., Guyon, I. M., & Vapnik, V. (1992). A training algorithm for optimal margin classifiers. Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory (pp. 144--152). Pittsburgh, PA: ACM Press.
  4. Grandvalet, Y., Canu, S., & Boucheron, S. (1997). Noise injection: Theoretical prospects. Neural Computation, 9, 1093--1108.
  5. Leen, T. K. (1995). From data distributions to regularization in invariant learning. Advances in Neural Information Processing Systems 7. Cambridge, MA: MIT Press.
  6. Lewis, D. D., Yang, Y., Rose, T., & Li, F. (2004). RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5, 361--397.
  7. Mangasarian, O. L. (1965). Linear and nonlinear separation of patterns by linear programming. Operations Research, 13, 444--452.
  8. Niyogi, P., Girosi, F., & Poggio, T. (1998). Incorporating prior information in machine learning by creating virtual examples. Proceedings of the IEEE, 86, 2196--2209.
  9. Schölkopf, B., Burges, C., & Vapnik, V. (1996). Incorporating invariances in support vector learning machines. Artificial Neural Networks --- ICANN'96 (pp. 47--52). Berlin: Springer Lecture Notes in Computer Science, Vol. 1112.
  10. Shawe-Taylor, J., Bartlett, P. L., Williamson, R. C., & Anthony, M. (1998). Structural risk minimization over data-dependent hierarchies. IEEE Transactions on Information Theory, 44, 1926--1940.
  11. Vapnik, V. (2006). Estimation of dependences based on empirical data. Berlin: Springer-Verlag. 2nd edition.
  12. Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley.
  13. Zhong, P., & Fukushima, M. (2006). A new multi-class support vector algorithm. Optimization Methods and Software, 21, 359--372.

Published in

ICML '06: Proceedings of the 23rd international conference on Machine learning
June 2006, 1154 pages
ISBN: 1595933832
DOI: 10.1145/1143844

Copyright © 2006 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States

Published: 25 June 2006


Acceptance rate: ICML '06 accepted 140 of 548 submissions (26%).
