DOI: 10.1145/1031171.1031186

Hierarchical document categorization with support vector machines

Published: 13 November 2004

ABSTRACT

Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques like Support Vector Machines and related large margin methods have been successfully applied to this task, despite the fact that they ignore the inter-class relationships. In this paper, we propose a novel hierarchical classification method that generalizes Support Vector Machine learning and is based on discriminant functions structured to mirror the class hierarchy. Our method can work with arbitrary, not necessarily singly connected taxonomies and can deal with task-specific loss functions. All parameters are learned jointly by optimizing a common objective function corresponding to a regularized upper bound on the empirical loss. We present experimental results on the WIPO-alpha patent collection to show the competitiveness of our approach.
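
The idea of a discriminant function that mirrors the class hierarchy, together with a taxonomy-aware loss, can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the toy taxonomy `PARENT`, the additive per-node linear score in `score`, and the path-length loss in `tree_loss` are hypothetical names and choices, and the sketch covers only a tree-shaped taxonomy rather than the arbitrary taxonomies the paper addresses. It shows prediction and loss only, not the joint large-margin training.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical toy taxonomy (an assumption): child -> parent, None marks the root.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "chemistry": "root",
    "physics": "root",
    "organic": "chemistry",
    "inorganic": "chemistry",
    "optics": "physics",
}


def ancestors(node: str) -> List[str]:
    """Return the node and all of its ancestors, excluding the root."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None and PARENT[current] is not None:
        path.append(current)
        current = PARENT[current]
    return path


def score(weights: Dict[str, Sequence[float]], x: Sequence[float], label: str) -> float:
    """Assumed discriminant: sum of linear scores <w_z, x> over nodes z on the path to label."""
    return sum(
        sum(w_i * x_i for w_i, x_i in zip(weights[z], x))
        for z in ancestors(label)
    )


def predict(weights: Dict[str, Sequence[float]], x: Sequence[float], labels: Sequence[str]) -> str:
    """Pick the candidate label with the highest hierarchical score."""
    return max(labels, key=lambda y: score(weights, x, y))


def tree_loss(y_true: str, y_pred: str) -> int:
    """Assumed task-specific loss: number of edges on the tree path between two nodes."""
    return len(set(ancestors(y_true)) ^ set(ancestors(y_pred)))
```

Because sibling classes share the weight vectors of their common ancestors, evidence for a parent category contributes to the score of every class beneath it, and under this loss a mistake between siblings costs less than a mistake across top-level branches.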

Published in

CIKM '04: Proceedings of the thirteenth ACM international conference on Information and knowledge management
November 2004
678 pages
ISBN: 1581138741
DOI: 10.1145/1031171

Copyright © 2004 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States
