DOI: 10.1145/2576768.2598370

A memetic algorithm to select training data for support vector machines

Published: 12 July 2014

ABSTRACT

In this paper we propose a new memetic algorithm (MASVM) for fast and efficient selection of a valuable training set for support vector machines (SVMs). This is a crucial step, especially in the case of large and noisy data sets, since SVM training has high time and memory complexity. The majority of state-of-the-art methods exploit analysis of the data geometry, in both the input and the kernel space. Although evolutionary algorithms have proven very efficient for this purpose, they have not been extensively studied so far. Here, we propose a new method employing an adaptive genetic algorithm enhanced with refinement techniques. The refinements utilize a pool of the support vectors identified at various steps of the algorithm. An extensive experimental study performed on well-known benchmark, real-world, and artificial data sets clearly confirms the efficacy, robustness, and convergence capabilities of the proposed approach, and shows that it is competitive with other state-of-the-art techniques.
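To make the idea behind the abstract concrete, the following is a minimal, illustrative sketch of evolutionary training-set selection with a support-vector-pool refinement, written in Python with scikit-learn. The synthetic data, subset size, population size, genetic operators, and the refine() helper are assumptions chosen for brevity; they are not the operators or parameter settings of the MASVM described in the paper.

```python
# Sketch only: a genetic algorithm evolves fixed-size subsets of the training
# set; a memetic refinement re-injects support vectors collected from earlier
# fitness evaluations. All names and parameters below are illustrative.
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

SUBSET, POP, GENS = 100, 20, 30        # subset size, population size, generations
sv_pool = set()                        # support vectors identified so far (indices into X_tr)

def fitness(chrom):
    """Train an SVM on the selected subset and score it on a validation set."""
    clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr[chrom], y_tr[chrom])
    sv_pool.update(chrom[clf.support_].tolist())   # remember the support vectors found
    return clf.score(X_val, y_val)

def random_chrom():
    return rng.choice(len(X_tr), size=SUBSET, replace=False)

def crossover(a, b):
    """Child = union of the parents' genes, trimmed back to SUBSET samples."""
    return rng.choice(np.union1d(a, b), size=SUBSET, replace=False)

def refine(chrom):
    """Memetic step: swap a few genes for vectors drawn from the support-vector pool."""
    if not sv_pool:
        return chrom
    swap = rng.choice(list(sv_pool), size=min(5, len(sv_pool)), replace=False)
    return rng.choice(np.union1d(chrom, swap), size=SUBSET, replace=False)

pop = [random_chrom() for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)            # keep the fitter half as parents
    parents = pop[: POP // 2]
    children = []
    while len(children) < POP - len(parents):
        i, j = rng.choice(len(parents), size=2, replace=False)
        children.append(refine(crossover(parents[i], parents[j])))
    pop = parents + children

best = max(pop, key=fitness)
print("best validation accuracy: %.3f" % fitness(best))
```

The design choice this sketch tries to mirror is that support vectors found while evaluating candidate subsets are not discarded but fed back into later individuals, which is the role the abstract ascribes to the pool of support vectors identified so far.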


Published in

        GECCO '14: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation
        July 2014
        1478 pages
ISBN: 9781450326629
DOI: 10.1145/2576768

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 12 July 2014


        Qualifiers

        • research-article

        Acceptance Rates

GECCO '14 Paper Acceptance Rate: 180 of 544 submissions, 33%
Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
