ABSTRACT
In this paper we propose a new memetic algorithm (MASVM) for fast and efficient selection of a valuable training set for support vector machines (SVMs). This is a crucial step, especially in the case of large and noisy data sets, since SVM training has high time and memory complexity. The majority of state-of-the-art methods exploit analysis of the data geometry, both in the input and kernel space. Although evolutionary algorithms have proven very efficient for this purpose, they have not yet been studied extensively in this context. Here, we propose a new method employing an adaptive genetic algorithm enhanced by several refinement techniques. The refinements utilize a pool of the support vectors identified so far at various steps of the algorithm. An extensive experimental study performed on well-known benchmark, real-world, and artificial data sets clearly confirms the efficacy, robustness, and convergence capabilities of the proposed approach, and shows that it is competitive with other state-of-the-art techniques.
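The selection scheme outlined above — a genetic algorithm evolving candidate training subsets, refined with a pool of support vectors found so far — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: a 1-NN classifier stands in for the (expensive) SVM inner learner, the toy data and all parameter values are assumptions, and the "support vector pool" is approximated by remembering members of the best subset each generation.

```python
import random

# Toy 2-D data: two roughly separated clusters (a stand-in for a real training set).
random.seed(0)
data = [((random.gauss(0, 1), random.gauss(0, 1)), 0) for _ in range(60)] + \
       [((random.gauss(3, 1), random.gauss(3, 1)), 1) for _ in range(60)]

def predict(subset_idx, point):
    # 1-NN on the selected subset -- a cheap stand-in for training an SVM
    # on the subset and classifying with it.
    nearest = min(subset_idx, key=lambda i: (data[i][0][0] - point[0]) ** 2
                                          + (data[i][0][1] - point[1]) ** 2)
    return data[nearest][1]

def fitness(subset_idx):
    # Accuracy on the full set, as a proxy for validation accuracy
    # of a classifier trained only on the selected subset.
    hits = sum(predict(subset_idx, p) == y for p, y in data)
    return hits / len(data)

def evolve(pop_size=10, subset_size=8, generations=15):
    n = len(data)
    pop = [random.sample(range(n), subset_size) for _ in range(pop_size)]
    sv_pool = set()  # memetic memory: "valuable" sample indices found so far
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        sv_pool.update(pop[0])  # remember members of the best subset
        offspring = []
        while len(offspring) < pop_size - 2:  # elitism: keep the top two
            a, b = random.sample(pop[:5], 2)
            child = list(set(random.sample(a + b, subset_size)))
            # Memetic refinement: inject a pooled index into the child.
            child[random.randrange(len(child))] = random.choice(list(sv_pool))
            child = list(set(child))
            while len(child) < subset_size:  # repair to fixed subset size
                k = random.randrange(n)
                if k not in child:
                    child.append(k)
            offspring.append(child)
        pop = pop[:2] + offspring
    return max(pop, key=fitness)

best = evolve()
```

The subset found this way should classify the toy data well; swapping the 1-NN proxy for an actual SVM (e.g. scikit-learn's `SVC`, whose `support_` attribute exposes the trained support vectors) would bring the sketch closer to the approach the abstract describes.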