
2017 | OriginalPaper | Chapter

An Alternating Genetic Algorithm for Selecting SVM Model and Training Set

Authors: Michal Kawulok, Jakub Nalepa, Wojciech Dudzik

Published in: Pattern Recognition

Publisher: Springer International Publishing


Abstract

Support vector machines (SVMs) have proven highly effective in numerous pattern recognition tasks. Although training SVMs on large data sets is challenging, this obstacle can be mitigated by selecting a small yet representative subset of the entire training set. Another crucial and extensively investigated problem is selecting the SVM model. Many methods have been proposed to address these two problems independently; however, to the best of our knowledge, how to combine the two processes effectively has not been explored. Notably, a different SVM model may be optimal depending on the subset selected for training, so performing both operations simultaneously is potentially beneficial. In this paper, we propose a new method that selects both the training set and the SVM model using a genetic algorithm which alternately optimizes two different populations. We demonstrate that our approach is competitive with sequential optimization of the hyperparameters followed by training set selection. We report results for several benchmark data sets and visualize the results obtained for artificial sets of 2D points.
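
The alternating scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the genetic operators, population sizes, or fitness function, so everything below is an assumption made for illustration. In particular, the fitness is a simple RBF kernel vote standing in for a trained SVM (so the example needs only the Python standard library), the only hyperparameter is a kernel width `gamma`, and all function names (`make_blobs`, `kernel_vote_accuracy`, `evolve`, `alternating_ga`) are hypothetical.

```python
import math
import random


def make_blobs(n, rng):
    """Toy 2D data: two overlapping Gaussian classes labelled -1 and +1."""
    data = []
    for _ in range(n):
        label = rng.choice((-1, 1))
        data.append(((rng.gauss(label, 1.0), rng.gauss(label, 1.0)), label))
    return data


def kernel_vote_accuracy(subset, gamma, train, valid):
    """Stand-in fitness (NOT an SVM): validation accuracy of an RBF kernel
    vote cast only by the training points whose indices are in `subset`."""
    correct = 0
    for (x, y), label in valid:
        score = 0.0
        for i in subset:
            (px, py), plabel = train[i]
            score += plabel * math.exp(-gamma * ((x - px) ** 2 + (y - py) ** 2))
        correct += (1 if score >= 0 else -1) == label
    return correct / len(valid)


def evolve(population, fitness, crossover, mutate, rng):
    """One elitist generation: keep the better half, refill with offspring."""
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[: max(2, len(ranked) // 2)]
    children = []
    while len(survivors) + len(children) < len(population):
        a, b = rng.sample(survivors, 2)
        children.append(mutate(crossover(a, b)))
    return survivors + children


def alternating_ga(train, valid, subset_size=8, pop_size=6, phases=6,
                   gens_per_phase=3, seed=0):
    """Alternate between evolving training subsets and evolving gamma values."""
    rng = random.Random(seed)
    n = len(train)

    def cross_subset(a, b):
        # Child draws its indices from the union of both parents.
        pool = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, subset_size)))

    def mutate_subset(s):
        # Swap one selected index for one that is currently unselected.
        kept = list(s)
        kept.pop(rng.randrange(len(kept)))
        kept.append(rng.choice([i for i in range(n) if i not in s]))
        return tuple(sorted(kept))

    def cross_gamma(a, b):
        return math.sqrt(a * b)  # geometric mean: gamma is searched on a log scale

    def mutate_gamma(g):
        return g * math.exp(rng.gauss(0.0, 0.5))

    subsets = [tuple(sorted(rng.sample(range(n), subset_size)))
               for _ in range(pop_size)]
    gammas = [10 ** rng.uniform(-2, 2) for _ in range(pop_size)]
    best_subset, best_gamma = subsets[0], gammas[0]
    history = []
    for phase in range(phases):
        if phase % 2 == 0:
            # Subset phase: the hyperparameter stays fixed at the best one so far.
            fit = lambda s: kernel_vote_accuracy(s, best_gamma, train, valid)
            for _ in range(gens_per_phase):
                subsets = evolve(subsets, fit, cross_subset, mutate_subset, rng)
            best_subset = max(subsets, key=fit)
        else:
            # Model phase: the training subset stays fixed at the best one so far.
            fit = lambda g: kernel_vote_accuracy(best_subset, g, train, valid)
            for _ in range(gens_per_phase):
                gammas = evolve(gammas, fit, cross_gamma, mutate_gamma, rng)
            best_gamma = max(gammas, key=fit)
        history.append(kernel_vote_accuracy(best_subset, best_gamma, train, valid))
    return best_subset, best_gamma, history


if __name__ == "__main__":
    data_rng = random.Random(1)
    train, valid = make_blobs(60, data_rng), make_blobs(40, data_rng)
    subset, gamma, history = alternating_ga(train, valid)
    print("selected indices:", subset)
    print("gamma:", gamma)
    print("validation accuracy per phase:", history)
```

Because each phase is elitist and the best individual of the idle population is held fixed, the validation accuracy recorded after each phase cannot decrease in this sketch. To move closer to the setting of the paper, the stand-in fitness would be replaced by the validation accuracy of an SVM trained on the selected subset with the candidate hyperparameters.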


Footnotes
1. Available at http://sun.aei.polsl.pl/~mkawulok/mcpr2017 (along with visual results).
 
Metadata
Title: An Alternating Genetic Algorithm for Selecting SVM Model and Training Set
Authors: Michal Kawulok, Jakub Nalepa, Wojciech Dudzik
Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-59226-8_10
