ABSTRACT
Like many purely data-driven machine learning methods, Support Vector Machine (SVM) classifiers are learned exclusively from the evidence presented in the training dataset; thus, a larger training dataset is required for better performance. In some applications, human knowledge may be available that, in principle, could compensate for the lack of data. In this paper, we propose a simple generalization of the SVM, the Weighted Margin SVM (WMSVM), that permits the incorporation of prior knowledge. We show that Sequential Minimal Optimization can be used to train a WMSVM. We discuss the issues that arise when incorporating prior knowledge using this rather general formulation. The experimental results show that the proposed methods of incorporating prior knowledge are effective.
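The abstract does not state the WMSVM objective, so the following is only an illustrative sketch of the general idea of weighting training examples by how much their labels are trusted. It assumes a linear model, a hinge loss whose per-example penalty is scaled by a hypothetical confidence value `v[i]` in (0, 1], and plain subgradient descent in place of the Sequential Minimal Optimization named above. The function names `train_weighted_hinge` and `predict`, the confidence vector `v`, and all hyperparameters are assumptions made for illustration and are not the authors' notation or formulation.

```python
import random


def train_weighted_hinge(X, y, v, C=1.0, epochs=200, lr=0.01, seed=0):
    """Subgradient descent on
    0.5 * ||w||^2 + C * sum_i v[i] * max(0, 1 - y[i] * (w . x_i + b)).

    v[i] is an assumed label-confidence weight: violations on examples
    with small v[i] are penalized less than violations on trusted ones.
    """
    rng = random.Random(seed)
    n, dim = len(X), len(X[0])
    w = [0.0] * dim
    b = 0.0
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # Regularizer gradient, split evenly across the n examples.
            grad_w = [wj / n for wj in w]
            grad_b = 0.0
            if y[i] * score < 1.0:
                # Hinge term is active: add its confidence-weighted subgradient.
                grad_w = [g - C * v[i] * y[i] * xj for g, xj in zip(grad_w, X[i])]
                grad_b = -C * v[i] * y[i]
            w = [wj - lr * g for wj, g in zip(w, grad_w)]
            b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy data: the third example of each class carries a less trusted label.
X = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]]
y = [1, 1, 1, -1, -1, -1]
v = [1.0, 1.0, 0.5, 1.0, 1.0, 0.5]
w, b = train_weighted_hinge(X, y, v)
print(predict(w, b, [2.5, 2.5]), predict(w, b, [-2.5, -2.5]))
```

In this sketch prior knowledge enters only through `v`: an example labeled by a rule of thumb rather than by direct evidence could be given a smaller weight so that it influences the decision boundary less.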
Index Terms
- Incorporating prior knowledge with weighted margin support vector machines