ABSTRACT
Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques like Support Vector Machines and related large margin methods have been successfully applied to this task, although they ignore the inter-class relationships. In this paper, we propose a novel hierarchical classification method that generalizes Support Vector Machine learning and is based on discriminant functions structured to mirror the class hierarchy. Our method can work with arbitrary, not necessarily singly connected, taxonomies and can deal with task-specific loss functions. All parameters are learned jointly by optimizing a common objective function corresponding to a regularized upper bound on the empirical loss. We present experimental results on the WIPO-alpha patent collection to show the competitiveness of our approach.
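To make the idea of a discriminant function that mirrors the class hierarchy concrete, the following is a minimal sketch and not the method of the paper itself. It assumes a toy tree-shaped taxonomy, one weight vector per non-root node, and a score for each leaf class obtained by summing the node scores along its path to the root. The taxonomy, the names `PARENT`, `LEAVES`, `discriminant`, `predict`, and `tree_loss`, and the path-length loss are all illustrative assumptions; the abstract only states that arbitrary taxonomies and task-specific losses are supported.

```python
import numpy as np

# Hypothetical toy taxonomy (an assumption for illustration): child -> parent.
# None marks the root. This sketch only handles trees.
PARENT = {"root": None, "A": "root", "B": "root",
          "A1": "A", "A2": "A", "B1": "B"}
LEAVES = ["A1", "A2", "B1"]


def ancestors(node):
    """Return the node and its ancestors, excluding the root."""
    path = []
    while node is not None and PARENT[node] is not None:
        path.append(node)
        node = PARENT[node]
    return path


def discriminant(weights, x, leaf):
    """Score a leaf class as the sum of per-node linear scores on its path.

    Classes that share ancestors share the corresponding weight vectors,
    which is one simple way to let the scoring function mirror the hierarchy.
    """
    return sum(float(weights[z] @ x) for z in ancestors(leaf))


def predict(weights, x):
    """Pick the leaf class with the highest hierarchical score."""
    return max(LEAVES, key=lambda y: discriminant(weights, x, y))


def tree_loss(y_true, y_pred):
    """Assumed task-specific loss: number of edges between two leaves.

    Each non-root node stands for the edge to its parent, so the symmetric
    difference of the two ancestor sets counts the edges on the path.
    """
    return len(set(ancestors(y_true)) ^ set(ancestors(y_pred)))
```

With such a loss, confusing two sibling classes costs less than confusing classes from different branches, which is the kind of inter-class information a flat multi-class classifier does not use.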
Index Terms
- Hierarchical document categorization with support vector machines