ABSTRACT
In this paper, a new geometrical approach is proposed for data reduction to speed up the training of the support vector machine (SVM) on large data sets. In this approach, the largest possible hyperspherical sector centered at each pattern and containing only patterns of the same class is constructed; patterns that tend to lie far from the decision boundary are then removed from the training set. Experiments show the effectiveness of the proposed approach in speeding up the training process while maintaining the same level of accuracy as the standard SVM.
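The intuition behind the abstract can be sketched in code. The following is a minimal, hypothetical approximation only: instead of the paper's hyperspherical sectors, it uses each pattern's distance to its nearest opposite-class pattern (the radius of the largest same-class hypersphere centered on it) as a proxy for distance to the decision boundary, and keeps only the patterns with the smallest such radii. The function name, `keep_ratio` parameter, and the hypersphere-radius simplification are all assumptions, not the paper's actual algorithm.

```python
import numpy as np

def reduce_training_set(X, y, keep_ratio=0.5):
    """Hypothetical sketch of geometric data reduction for SVM training.

    For each pattern, compute the distance to its nearest pattern of the
    opposite class (the radius of the largest hypersphere around it that
    contains only same-class patterns). Patterns with large radii are far
    from the decision boundary and unlikely to become support vectors,
    so only the fraction `keep_ratio` with the smallest radii is kept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(X)
    radii = np.empty(n)
    for i in range(n):
        # Distance from pattern i to its nearest opposite-class pattern.
        enemies = X[y != y[i]]
        radii[i] = np.min(np.linalg.norm(enemies - X[i], axis=1))
    # Keep the patterns closest to the (estimated) decision boundary.
    k = max(1, int(keep_ratio * n))
    keep = np.argsort(radii)[:k]
    return X[keep], y[keep]
```

The reduced set `(X_kept, y_kept)` would then be passed to a standard SVM trainer (e.g. LIBSVM); because SVM training cost grows superlinearly with the number of patterns, shrinking the training set this way is what yields the speedup the abstract describes.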