Published in: International Journal of Machine Learning and Cybernetics 4/2014

01-08-2014 | Original Article

A nonlinear least squares quasi-Newton strategy for LP-SVR hyper-parameters selection

Authors: Pablo Rivas-Perea, Juan Cota-Ruiz, Jose-Gerardo Rosiles

Abstract

This paper studies the problem of hyper-parameter selection for a linear programming-based support vector machine for regression (LP-SVR). The proposed model is a generalized method that minimizes a nonlinear least-squares problem using a globalization strategy, inexact computation of first-order information, and an existing analytical method for estimating the initial point in the hyper-parameter space. The minimization problem consists of finding the set of hyper-parameters that minimizes any generalization error function for different problems. In particular, this research explores the cases of two-class, multi-class, and regression problems. Simulation results across standard data sets suggest that the algorithm achieves statistically insignificant variability when measuring the residual error; when compared to other methods for hyper-parameter search, the proposed method produces the lowest root mean squared error in most cases. Experimental analysis suggests that the proposed approach is better suited for large-scale applications in the particular case of an LP-SVR. Moreover, due to its mathematical formulation, the proposed method can be extended to estimate any number of hyper-parameters.

Metadata
Title
A nonlinear least squares quasi-Newton strategy for LP-SVR hyper-parameters selection
Authors
Pablo Rivas-Perea
Juan Cota-Ruiz
Jose-Gerardo Rosiles
Publication date
01-08-2014
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 4/2014
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-013-0153-9
