Skip to main content
Erschienen in: The Journal of Supercomputing 8/2019

06.03.2019

CBCH (clustering-based convex hull) for reducing training time of support vector machine

verfasst von: Pardis Birzhandi, Hee Yong Youn

Erschienen in: The Journal of Supercomputing | Ausgabe 8/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Support vector machine (SVM) is an efficient machine learning technique widely applied to various classification problems due to its robustness. However, the training time grows dramatically as the number of training data increases. As a result, the applicability of SVM to large-scale datasets is somewhat limited. In SVM, only a few training samples called support vectors (SVs) affect the construction of hyperplane. Therefore, removing training data irrelevant to the SVs does not degrade the performance of SVM. In this paper the clustering-based convex hull (CBCH) scheme is introduced which allows to efficiently remove insignificant data and thereby reduce the training time of SVM. The CBCH scheme initially applies k-mean clustering algorithm to the given training data points, and then, the convex hull of each cluster is obtained. Only the vertices of the convex hulls and the data points relevant to the SVs are included as training data points. Computer simulation over various sizes and types of datasets reveals that the proposed scheme is considerably faster and more accurate than the existing SVM classifiers. The proposed algorithm is based on geometric interpretation of the SVM and applicable to both linearly separable and linearly inseparable datasets.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Yao Y, Liu Y, Yu Y, Xu H, Lv W, Li Z, Chen X (2013) K-SVM: an effective SVM algorithm based on K-means clustering. J Comput 8:2632–2639 Yao Y, Liu Y, Yu Y, Xu H, Lv W, Li Z, Chen X (2013) K-SVM: an effective SVM algorithm based on K-means clustering. J Comput 8:2632–2639
2.
Zurück zum Zitat Varadwaj P, Purohit N, Arora B (2009) Detection of splice sites using support vector machine. In: International Conference on Contemporary Computing, pp 493–502 Varadwaj P, Purohit N, Arora B (2009) Detection of splice sites using support vector machine. In: International Conference on Contemporary Computing, pp 493–502
3.
Zurück zum Zitat Kumar MA, Gopal M (2010) A comparison study on multiple binary-class SVM methods for unilabel text categorization. Pattern Recogn Lett 31:1437–1444CrossRef Kumar MA, Gopal M (2010) A comparison study on multiple binary-class SVM methods for unilabel text categorization. Pattern Recogn Lett 31:1437–1444CrossRef
4.
Zurück zum Zitat Mitra V, Wang C-J, Banerjee S (2007) Text classification: a least square support vector machine approach. Appl Soft Comput 7:908–914CrossRef Mitra V, Wang C-J, Banerjee S (2007) Text classification: a least square support vector machine approach. Appl Soft Comput 7:908–914CrossRef
5.
Zurück zum Zitat Sánchez AVD (2003) Advanced support vector machines and kernel methods. Neurocomputing 55:5–20CrossRef Sánchez AVD (2003) Advanced support vector machines and kernel methods. Neurocomputing 55:5–20CrossRef
6.
Zurück zum Zitat Dong J, Krzyżak A, Suen CY (2005) An improved handwritten Chinese character recognition system using support vector machine. Pattern Recogn Lett 26:1849–1856CrossRef Dong J, Krzyżak A, Suen CY (2005) An improved handwritten Chinese character recognition system using support vector machine. Pattern Recogn Lett 26:1849–1856CrossRef
7.
Zurück zum Zitat Yang Y, Yu D, Cheng J (2007) A fault diagnosis approach for roller bearing based on IMF envelope spectrum and SVM. Measurement 40:943–950CrossRef Yang Y, Yu D, Cheng J (2007) A fault diagnosis approach for roller bearing based on IMF envelope spectrum and SVM. Measurement 40:943–950CrossRef
8.
Zurück zum Zitat Abbasion S, Rafsanjani A, Farshidianfar A, Irani N (2007) Rolling element bearings multi-fault classification based on the wavelet denoising and support vector machine. Mech Syst Signal Process 21:2933–2945CrossRef Abbasion S, Rafsanjani A, Farshidianfar A, Irani N (2007) Rolling element bearings multi-fault classification based on the wavelet denoising and support vector machine. Mech Syst Signal Process 21:2933–2945CrossRef
9.
Zurück zum Zitat Zeng M, Yang Y, Zheng J, Cheng J (2015) Maximum margin classification based on flexible convex hulls. Neurocomputing 149:957–965CrossRef Zeng M, Yang Y, Zheng J, Cheng J (2015) Maximum margin classification based on flexible convex hulls. Neurocomputing 149:957–965CrossRef
10.
Zurück zum Zitat Bennett KP, Bredensteiner EJ (2000) Duality and geometry in SVM classifiers. In: ICML, pp 57–64 Bennett KP, Bredensteiner EJ (2000) Duality and geometry in SVM classifiers. In: ICML, pp 57–64
11.
Zurück zum Zitat Vapnik VN, Kotz S (1982) Estimation of dependences based on empirical data. Springer, New YorkMATH Vapnik VN, Kotz S (1982) Estimation of dependences based on empirical data. Springer, New YorkMATH
12.
Zurück zum Zitat Platt J (1998) Sequential minimal optimization: a fast algorithm for training support vector machines. Technical report MSR-TR-98-14, Microsoft research Platt J (1998) Sequential minimal optimization: a fast algorithm for training support vector machines. Technical report MSR-TR-98-14, Microsoft research
13.
Zurück zum Zitat Awad M, Khan L, Bastani F, Yen I-L (2004) An effective support vector machines (SVMs) performance using hierarchical clustering. In: 16th IEEE International Conference on Tools with Artificial Intelligence, pp 663–667 Awad M, Khan L, Bastani F, Yen I-L (2004) An effective support vector machines (SVMs) performance using hierarchical clustering. In: 16th IEEE International Conference on Tools with Artificial Intelligence, pp 663–667
14.
Zurück zum Zitat Yu H, Yang J, Han J, Li X (2005) Making SVMs scalable to large data sets using hierarchical cluster indexing. Data Min Knowl Discov 11:295–321MathSciNetCrossRef Yu H, Yang J, Han J, Li X (2005) Making SVMs scalable to large data sets using hierarchical cluster indexing. Data Min Knowl Discov 11:295–321MathSciNetCrossRef
15.
Zurück zum Zitat Heisele B, Serre T, Prentice S, Poggio T (2003) Hierarchical classification and feature reduction for fast face detection with support vector machines. Pattern Recogn 36:2007–2017CrossRefMATH Heisele B, Serre T, Prentice S, Poggio T (2003) Hierarchical classification and feature reduction for fast face detection with support vector machines. Pattern Recogn 36:2007–2017CrossRefMATH
16.
Zurück zum Zitat Sohn S, Dagli CH (2001) Advantages of using fuzzy class memberships in self-organizing map and support vector machines. In: Proceedings 2001. IJCNN’01. International Joint Conference on Neural Networks, pp 1886–1890 Sohn S, Dagli CH (2001) Advantages of using fuzzy class memberships in self-organizing map and support vector machines. In: Proceedings 2001. IJCNN’01. International Joint Conference on Neural Networks, pp 1886–1890
17.
Zurück zum Zitat Cervantes J, Li X, Yu W (2006) Support vector machine classification based on fuzzy clustering for large data sets. In: Mexican International Conference on Artificial Intelligence, pp 572–582 Cervantes J, Li X, Yu W (2006) Support vector machine classification based on fuzzy clustering for large data sets. In: Mexican International Conference on Artificial Intelligence, pp 572–582
18.
Zurück zum Zitat Almasi ON, Rouhani M (2016) Fast and de-noise support vector machine training method based on fuzzy clustering method for large real world datasets. Turk J Electr Eng Comput Sci 24:219–233CrossRef Almasi ON, Rouhani M (2016) Fast and de-noise support vector machine training method based on fuzzy clustering method for large real world datasets. Turk J Electr Eng Comput Sci 24:219–233CrossRef
19.
Zurück zum Zitat Cervantes J, Li X, Yu W, Li K (2008) Support vector machine classification for large data sets via minimum enclosing ball clustering. Neurocomputing 71:611–619CrossRef Cervantes J, Li X, Yu W, Li K (2008) Support vector machine classification for large data sets via minimum enclosing ball clustering. Neurocomputing 71:611–619CrossRef
20.
Zurück zum Zitat Shen X-J, Mu L, Li Z, Wu H-X, Gou J-P, Chen X (2016) Large-scale support vector machine classification with redundant data reduction. Neurocomputing 172:189–197CrossRef Shen X-J, Mu L, Li Z, Wu H-X, Gou J-P, Chen X (2016) Large-scale support vector machine classification with redundant data reduction. Neurocomputing 172:189–197CrossRef
21.
Zurück zum Zitat Shen X, Li Z, Jiang Z, Zhan Y (2013) Distributed SVM classification with redundant Data removing. Green Computing and Communications (GreenCom), 2013 IEEE and Internet of Things (iThings/CPSCom), IEEE International Conference on and IEEE Cyber, Physical and Social Computing. IEEE, pp 866–870 Shen X, Li Z, Jiang Z, Zhan Y (2013) Distributed SVM classification with redundant Data removing. Green Computing and Communications (GreenCom), 2013 IEEE and Internet of Things (iThings/CPSCom), IEEE International Conference on and IEEE Cyber, Physical and Social Computing. IEEE, pp 866–870
22.
Zurück zum Zitat Koggalage R, Halgamuge S (2004) Reducing the number of training samples for fast support vector machine classification. Neural Inf Process Lett Rev 2:57–65 Koggalage R, Halgamuge S (2004) Reducing the number of training samples for fast support vector machine classification. Neural Inf Process Lett Rev 2:57–65
23.
Zurück zum Zitat De Almeida MB, de Pádua Braga A, Braga JP (2000) SVM-KM: speeding SVMs learning with a priori cluster selection and k-means. In: Proceedings-2000. Sixth Brazilian Symposium on Neural Networks, pp 162–167 De Almeida MB, de Pádua Braga A, Braga JP (2000) SVM-KM: speeding SVMs learning with a priori cluster selection and k-means. In: Proceedings-2000. Sixth Brazilian Symposium on Neural Networks, pp 162–167
24.
25.
Zurück zum Zitat Xu W, Dong L (2016) A novel relative density based support vector machine. Opt Int J Light Electron Opt 127:10348–10354CrossRef Xu W, Dong L (2016) A novel relative density based support vector machine. Opt Int J Light Electron Opt 127:10348–10354CrossRef
26.
Zurück zum Zitat Li C, Liu K, Wang H (2011) The incremental learning algorithm with support vector machine based on hyperplane-distance. Appl Intell 34:19–27CrossRef Li C, Liu K, Wang H (2011) The incremental learning algorithm with support vector machine based on hyperplane-distance. Appl Intell 34:19–27CrossRef
27.
Zurück zum Zitat Xia S, Xiong Z, Luo Y, Dong L (2015) A method to improve support vector machine based on distance to hyperplane. Opt Int J Light Electron Opt 126:2405–2410CrossRef Xia S, Xiong Z, Luo Y, Dong L (2015) A method to improve support vector machine based on distance to hyperplane. Opt Int J Light Electron Opt 126:2405–2410CrossRef
28.
Zurück zum Zitat Sun Z, Guo Z, Liu C, Wang X, Liu J, Liu S (2017) Fast extended one-versus-rest multi-label support vector machine using approximate extreme points. IEEE Access 5:8526–8535CrossRef Sun Z, Guo Z, Liu C, Wang X, Liu J, Liu S (2017) Fast extended one-versus-rest multi-label support vector machine using approximate extreme points. IEEE Access 5:8526–8535CrossRef
29.
Zurück zum Zitat Crisp DJ, Burges CJ (2000) A geometric interpretation of v-SVM classifiers. In: Advances in neural information processing systems, pp 244–250 Crisp DJ, Burges CJ (2000) A geometric interpretation of v-SVM classifiers. In: Advances in neural information processing systems, pp 244–250
30.
Zurück zum Zitat Mavroforakis ME, Sdralis M, Theodoridis S (2006) A novel SVM geometric algorithm based on reduced convex hulls. In: 18th International Conference on pattern Recognition (ICPR’06), pp 564–568 Mavroforakis ME, Sdralis M, Theodoridis S (2006) A novel SVM geometric algorithm based on reduced convex hulls. In: 18th International Conference on pattern Recognition (ICPR’06), pp 564–568
31.
Zurück zum Zitat Osuna E, De Castro O (2002) Convex hull in feature space for support vector machines. Springer, New York, pp 411–419MATH Osuna E, De Castro O (2002) Convex hull in feature space for support vector machines. Springer, New York, pp 411–419MATH
32.
Zurück zum Zitat Mavroforakis ME, Theodoridis S (2006) A geometric approach to support vector machine (SVM) classification. IEEE Trans Neural Netw 17:671–682CrossRef Mavroforakis ME, Theodoridis S (2006) A geometric approach to support vector machine (SVM) classification. IEEE Trans Neural Netw 17:671–682CrossRef
33.
Zurück zum Zitat Chau AL, Li X, Yu W (2013) Large data sets classification using convex–concave hull and support vector machine. Soft Comput 17:793–804CrossRef Chau AL, Li X, Yu W (2013) Large data sets classification using convex–concave hull and support vector machine. Soft Comput 17:793–804CrossRef
34.
Zurück zum Zitat Chau AL, Li X, Yu W (2013) Convex-concave hull for classification with support vector machine. In: 2012 IEEE 12th International Conference on Data Mining Workshops, pp 431–438 Chau AL, Li X, Yu W (2013) Convex-concave hull for classification with support vector machine. In: 2012 IEEE 12th International Conference on Data Mining Workshops, pp 431–438
35.
Zurück zum Zitat Nalepa J, Kawulok M (2018) Selecting training sets for support vector machines: a review. Artificial Intelligence Review, pp 1–44 Nalepa J, Kawulok M (2018) Selecting training sets for support vector machines: a review. Artificial Intelligence Review, pp 1–44
36.
Zurück zum Zitat Nalepa J, Kawulok M (2014) Adaptive genetic algorithm to select training data for support vector machines. In: European Conference on the Applications of Evolutionary Computation, pp 514–525 Nalepa J, Kawulok M (2014) Adaptive genetic algorithm to select training data for support vector machines. In: European Conference on the Applications of Evolutionary Computation, pp 514–525
37.
Zurück zum Zitat Kawulok M, Nalepa J (2012) Support vector machines training data selection using a genetic algorithm. In: Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), pp 557–565 Kawulok M, Nalepa J (2012) Support vector machines training data selection using a genetic algorithm. In: Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), pp 557–565
38.
Zurück zum Zitat Nalepa J, Kawulok M (2014) A memetic algorithm to select training data for support vector machines. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, pp 573–580 Nalepa J, Kawulok M (2014) A memetic algorithm to select training data for support vector machines. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, pp 573–580
39.
Zurück zum Zitat Nalepa J, Kawulok M (2016) Adaptive memetic algorithm for minimizing distance in the vehicle routing problem with time windows. Soft Comput 20:2309–2327CrossRef Nalepa J, Kawulok M (2016) Adaptive memetic algorithm for minimizing distance in the vehicle routing problem with time windows. Soft Comput 20:2309–2327CrossRef
40.
Zurück zum Zitat Nalepa J, siminski K, Kawulok M (2015) Towards parameter-less support vector machines. In: ACPR, pp 211–215 Nalepa J, siminski K, Kawulok M (2015) Towards parameter-less support vector machines. In: ACPR, pp 211–215
42.
Zurück zum Zitat Theodoridis S, Mavroforakis M (2007) Reduced convex hulls: a geometric approach to support vector machines [lecture notes]. IEEE Signal Process Mag 24:119–122CrossRef Theodoridis S, Mavroforakis M (2007) Reduced convex hulls: a geometric approach to support vector machines [lecture notes]. IEEE Signal Process Mag 24:119–122CrossRef
43.
Zurück zum Zitat Li Y, Wang Y, He G (2012) Clustering-based distributed support vector machine in wireless sensor networks. J Inf Comput Sci 9:1083–1096 Li Y, Wang Y, He G (2012) Clustering-based distributed support vector machine in wireless sensor networks. J Inf Comput Sci 9:1083–1096
44.
Zurück zum Zitat De Berg M, Van Kreveld M, Overmars M, Schwarzkopf OC (2000) Computational geometry. In: Computational geometry. Springer, pp 1–17 De Berg M, Van Kreveld M, Overmars M, Schwarzkopf OC (2000) Computational geometry. In: Computational geometry. Springer, pp 1–17
45.
Zurück zum Zitat Li X, Cervantes J, Yu W (2010) A novel SVM classification method for large data sets. In: 2010 IEEE International Conference on Granular Computing, pp 297–302 Li X, Cervantes J, Yu W (2010) A novel SVM classification method for large data sets. In: 2010 IEEE International Conference on Granular Computing, pp 297–302
46.
Zurück zum Zitat Wang J, Wu X, Zhang C (2005) Support vector machines based on K-means clustering for real-time business intelligence systems. Int J Bus Intell Data Min 1:54–64CrossRef Wang J, Wu X, Zhang C (2005) Support vector machines based on K-means clustering for real-time business intelligence systems. Int J Bus Intell Data Min 1:54–64CrossRef
47.
Zurück zum Zitat Inaba M, Katoh N, Imai H (1994) Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering. In: Proceedings of the tenth annual symposium on Computational geometry, pp 332–339 Inaba M, Katoh N, Imai H (1994) Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering. In: Proceedings of the tenth annual symposium on Computational geometry, pp 332–339
Metadaten
Titel
CBCH (clustering-based convex hull) for reducing training time of support vector machine
verfasst von
Pardis Birzhandi
Hee Yong Youn
Publikationsdatum
06.03.2019
Verlag
Springer US
Erschienen in
The Journal of Supercomputing / Ausgabe 8/2019
Print ISSN: 0920-8542
Elektronische ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-019-02795-9

Weitere Artikel der Ausgabe 8/2019

The Journal of Supercomputing 8/2019 Zur Ausgabe