Published in: Pattern Analysis and Applications 4/2018

18.04.2017 | Theoretical Advances

A fast classification strategy for SVM on the large-scale high-dimensional datasets

Authors: I-Jing Li, Jiunn-Lin Wu, Chih-Hung Yeh

Abstract

Classification on large-scale, high-dimensional datasets faces three challenges: (1) both the training phase and the classification phase impose a heavy computational burden; (2) storing the many training samples requires large memory; and (3) decision rules are difficult to determine in high-dimensional data. The nonlinear support vector machine (SVM) is a popular classifier that performs well on high-dimensional datasets, but it is prone to overfitting, especially when the data are unevenly distributed. Recently, the profile support vector machine (PSVM) was proposed to address this problem: because local learning is superior to global learning, multiple linear SVM models are trained to achieve performance similar to that of a single nonlinear SVM. However, PSVM is inefficient in the training phase. In this paper, we propose a fast classification strategy for PSVM that speeds up both training and classification. We first select border samples near the decision boundary from the training set. The reduced training set is then partitioned into several local subsets with the MagKmeans algorithm, for which we propose a fast search method to find the optimal solution. Each cluster is used to train a linear SVM model. Both artificial and real datasets are used to evaluate the proposed method. Experimental results show that it avoids both overfitting and underfitting, and that the proposed strategy is effective and efficient.
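The pipeline the abstract describes (select border samples near the decision boundary, cluster the reduced set, train one linear model per cluster, and route queries to the nearest cluster) can be sketched roughly as follows. This is an illustrative approximation, not the paper's implementation: plain k-means stands in for MagKmeans, a perceptron stands in for the linear SVM solver, and the `border_samples` helper with its `keep_ratio` parameter is a hypothetical name introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class problem: two Gaussian blobs in the plane.
X = np.vstack([rng.normal(-2.0, 1.0, (100, 2)), rng.normal(2.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

def border_samples(X, y, keep_ratio=0.5):
    """Keep points whose nearest opposite-class neighbour is close."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    opposite = y[None, :] != y[:, None]
    nearest_opp = np.where(opposite, d, np.inf).min(axis=1)
    return nearest_opp <= np.quantile(nearest_opp, keep_ratio)

mask = border_samples(X, y)
Xb, yb = X[mask], y[mask]

def kmeans(X, k, iters=50):
    """Plain k-means (MagKmeans additionally balances class membership)."""
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

centers, cluster = kmeans(Xb, 2)

def train_linear(X, y, epochs=200):
    """Perceptron as a stand-in for a linear SVM solver."""
    Xa = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    t = 2 * y - 1                              # labels in {-1, +1}
    w = np.zeros(Xa.shape[1])
    for _ in range(epochs):
        for xi, ti in zip(Xa, t):
            if ti * (w @ xi) <= 0:
                w += ti * xi
    return w

# One local linear model per cluster; skip clusters holding a single class.
models = {j: train_linear(Xb[cluster == j], yb[cluster == j])
          for j in range(2) if len(np.unique(yb[cluster == j])) == 2}

def predict(x):
    """Route a query to its nearest cluster and apply that local model."""
    j = int(np.argmin(np.linalg.norm(centers - x, axis=1)))
    if j in models:
        return int(models[j] @ np.append(x, 1.0) > 0)
    return int(yb[cluster == j].mean() > 0.5)  # single-class cluster: majority label

acc = np.mean([predict(x) == t for x, t in zip(X, y)])
```

Routing each query to a single cluster is what keeps classification fast: only one small linear model is evaluated per query, and each model was trained on only a fraction of the (already reduced) training set.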


Metadata
Title
A fast classification strategy for SVM on the large-scale high-dimensional datasets
Authors
I-Jing Li
Jiunn-Lin Wu
Chih-Hung Yeh
Publication date
18.04.2017
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 4/2018
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-017-0620-0
