Published in: The Journal of Supercomputing 10/2021

24-03-2021

Speeding up the testing and training time for the support vector machines with minimal effect on the performance

Author: Hamid Reza Ghaffari

Abstract

The support vector machine (SVM) suffers from long training times on large data sets because of its high memory requirements and computational cost. The main problem arises during the training phase, which is computationally expensive and grows with the size of the input data set. High boundary complexity among the classes is another major problem in most data sets and reduces generalizability. This study therefore presents a method for reducing both the number of training samples and the boundary complexity, in which the training set is divided into boundary, non-boundary, and harmful patterns. The method has four phases. In the first phase, neighborhood information is determined for each sample with respect to the other samples. In the second phase, harmful patterns are removed to reduce the data complexity. In the third phase, the remaining training samples are divided into boundary and non-boundary patterns. In the fourth phase, representatives of the non-boundary data are determined, and a reduced set is formed by combining these representatives with the boundary patterns. The proposed method is tested on 33 data sets and evaluated against five of the most successful instance-based condensation algorithms. Experiments show that the proposed method outperforms the other methods reported in the research literature.
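The four phases above can be sketched in code. The following is a minimal, hypothetical illustration only, not the paper's exact algorithm: a simple k-nearest-neighbour rule stands in for the neighborhood analysis, a sample whose neighbours all belong to other classes is treated as "harmful", and per-class centroids stand in for the non-boundary representatives. All function names and the choice of k are assumptions.

```python
import math

def _neighbors(X, i, k):
    """Indices of the k nearest samples to X[i] (Euclidean distance)."""
    order = sorted(range(len(X)), key=lambda j: math.dist(X[i], X[j]))
    return [j for j in order if j != i][:k]

def reduce_training_set(X, y, k=3):
    """Return a reduced (X, y) built from boundary patterns plus
    per-class representatives of the non-boundary patterns."""
    # Phase 1: neighborhood information for every sample.
    nbrs = {i: _neighbors(X, i, k) for i in range(len(X))}
    # Phase 2: drop "harmful" patterns - samples none of whose
    # neighbours share their class label.
    kept = [i for i in nbrs if any(y[j] == y[i] for j in nbrs[i])]
    # Phase 3: split the rest into boundary / non-boundary patterns.
    boundary = [i for i in kept if any(y[j] != y[i] for j in nbrs[i])]
    interior = [i for i in kept if i not in boundary]
    # Phase 4: one centroid per class represents the non-boundary data;
    # combine the centroids with the boundary patterns.
    reps = []
    for c in set(y[i] for i in interior):
        members = [X[i] for i in interior if y[i] == c]
        centroid = tuple(sum(col) / len(members) for col in zip(*members))
        reps.append((centroid, c))
    reduced_X = [X[i] for i in boundary] + [p for p, _ in reps]
    reduced_y = [y[i] for i in boundary] + [c for _, c in reps]
    return reduced_X, reduced_y
```

On two well-separated clusters, this sketch collapses each cluster's interior to a single centroid while keeping the points near the class boundary, which is the general shape of the reduction the abstract describes.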


Metadata
Title
Speeding up the testing and training time for the support vector machines with minimal effect on the performance
Author
Hamid Reza Ghaffari
Publication date
24-03-2021
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 10/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-021-03729-0
