nach oben

Artificial Intelligence Review

Erschienen in:

16.01.2018

Problem formulations and solvers in linear SVM: a review

verfasst von: Vinod Kumar Chauhan, Kalpana Dahiya, Anuj Sharma

Erschienen in: Artificial Intelligence Review | Ausgabe 2/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Support vector machine (SVM) is an optimal margin based classification technique in machine learning. SVM is a binary linear classifier which has been extended to non-linear data using Kernels and multi-class data using various techniques like one-versus-one, one-versus-rest, Crammer Singer SVM, Weston Watkins SVM and directed acyclic graph SVM (DAGSVM) etc. SVM with a linear Kernel is called linear SVM and one with a non-linear Kernel is called non-linear SVM. Linear SVM is an efficient technique for high dimensional data applications like document classification, word-sense disambiguation, drug design etc. because under such data applications, test accuracy of linear SVM is closer to non-linear SVM while its training is much faster than non-linear SVM. SVM is continuously evolving since its inception and researchers have proposed many problem formulations, solvers and strategies for solving SVM. Moreover, due to advancements in the technology, data has taken the form of ‘Big Data’ which have posed a challenge for Machine Learning to train a classifier on this large-scale data. In this paper, we have presented a review on evolution of linear support vector machine classification, its solvers, strategies to improve solvers, experimental results, current challenges and research directions.

Vorheriger Artikel A review on multi-class TWSVM

Nächster Artikel Selecting training sets for support vector machines: a review

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

datasets used in the experiments are available at link: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/.

Abe S (2015) Fuzzy support vector machines for multilabel classification. Pattern Recogn 48(6):2110–2117MATHCrossRef

Abe S, Inoue T (2002) Fuzzy support vector machines for multiclass problems. In: European symposium on artificial neural networks, pp 113–118

Beck A, Tetruashvili L (2013) On the convergence of block coordinate descent type methods. SIAM J Optim 23(4):2037–2060MathSciNetMATHCrossRef

Becker S, Fadili M (2012) A quasi-newton proximal splitting method. In: Advances in neural information processing systems 25: 26th annual conference on neural information processing systems 2012. Proceedings of a meeting held December 3–6, 2012, Lake Tahoe, Nevada, pp 2627–2635

Blondel M, Seki K, Uehara K (2013) Block coordinate descent algorithms for large-scale sparse multiclass classification. Mach Learn 93(1):31–52MathSciNetMATHCrossRef

Boser BE, Guyon IM, Vapnik VN (1992) A training algorithm for optimal margin classifiers. In: Proceedings of the 5th annual ACM workshop on computational learning theory, ACM Press, pp 144–152

Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In: Lechevallier Y, Saporta G (eds) Proceedings of the 19th international conference on computational statistics (COMPSTAT’2010), Springer, Paris, pp 177–187. http://leon.bottou.org/papers/bottou-2010

Bottou L (2012) Neural networks: tricks of the trade. 2nd edn, Springer, Berlin, pp 421–436

Bottou L, Lin CJ (2007) Support vector machine solvers. Science 3(1):1–27

Bottou L, Curtis FE, Nocedal J (2016) Optimization methods for large-scale machine learning. arXiv:1606.04838

Bredensteiner EJ, Bennett KP (1999) Computational optimization: a tribute to olvi mangasarian, vol I, Springer, Boston, chap Multicategory Classification by Support Vector Machines, pp 53–79

Byrd RH, Hansen SL, Nocedal J, Singer Y (2016) A stochastic quasi-newton method for large-scale optimization. SIAM J Optim 26(2):1008–1031MathSciNetMATHCrossRef

Cao LJ, Keerthi SS, Ong CJ, Zhang JQ, Periyathamby U, Fu XJ, Lee HP (2006) Parallel sequential minimal optimization for the training of support vector machines. IEEE Trans Neural Netw 17(4):1039–1049CrossRef

Cauchy AL (1847) Méthode générale pour la résolution des systèmes d’équations simultanées. Compte Rendu des S’eances de L’Acad’emie des Sciences XXV S’erie A(25):536–538

Chang CC, Hsu CW, Lin CJ (2000) The analysis of decomposition methods for support vector machines. IEEE Transactions on Neural Networks 11(4):1003–1008

Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2:27:1–27:27CrossRef

Chang KW, Hsieh CJ, Lin CJ (2008) Coordinate descent method for large-scale l2-loss linear support vector machines. J Mach Learn Res 9:1369–1398MathSciNetMATH

Chapelle O (2007) Training a support vector machine in the primal. Neural Comput 19(5):1155–1178MathSciNetMATHCrossRef

Chauhan VK, Dahiya K, Sharma A (2017a) Mini-batch block-coordinate based stochastic average adjusted gradient methods to solve big data problems. In: Zhang ML, Noh YK (eds) Proceedings of the ninth Asian conference on machine learning, PMLR, Proceedings of machine learning research, vol 77, pp 49–64. http://proceedings.mlr.press/v77/chauhan17a.html

Chauhan VK, Dahiya K, Sharma A (2017b) Trust region levenberg-marquardt method for linear svm. In: Ninth international conference on advances in pattern recognition (ICAPR-2017)

Yc C, Su C (2016) Distance-based margin support vector machine for classification. Appl Math Comput 283:141–152MathSciNet

Collobert R, Bengio S, Bengio Y (2002) A parallel mixture of svms for very large scale problems. Neural Comput 14(5):1105–1114MATHCrossRef

Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297MATH

Cotter A, Shalev-Shwartz S, Srebro N (2013) Learning optimally sparse support vector machines. In: Proceedings of the 30th ICML, pp 266–274

Crammer K, Singer Y (2002) On the learnability and design of output codes for multiclass problems. Mach Learn 47(1995):201–233MATHCrossRef

De S, Taylor G, Goldstein T (2015) Variance reduction for distributed stochastic gradient descent. arXiv:1512.01708

Defazio A, Bach F, Lacoste-Julien S (2014) Saga: a fast incremental gradient method with support for non-strongly convex composite objectives. In: Proceedings of the 27th international conference on neural information processing systems, MIT Press, Cambridge, NIPS’14, pp 1646–1654

Evgeniou T, Pontil M (2004) Regularized multi–task learning. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, USA, KDD ’04, pp 109–117

Evgeniou T, Pontil M, Poggio T (2000) Regularization networks and support vector machines. In: Advances in computational mathematics, MIT Press, Cambridge, pp 1–50

Fan R, Chang K, Hsieh C, Wang X, Lin C (2008) LIBLINEAR: a library for large linear classification. JMLR 9:1871–1874MATH

Farquhar JDR, Meng H, Szedmak S, Hardoon DR, Shawe-Taylor J (2006) Two view learning: Svm-2k, theory and practice. In: Advances in neural information processing systems, MIT Press, Cambridge

Ferris MC, Munson TS (2002) Interior-point methods for massive support vector machines. SIAM J Optim 13(3):783–804MathSciNetMATHCrossRef

Fletcher R, Reeves CM (1964) Function minimization by conjugate gradients. Comput J 7(2):149–154. https://doi.org/10.1093/comjnl/7.2.149 MathSciNetMATHCrossRef

Franc V, Sonnenburg S (2008) Optimized cutting plane algorithm for support vector machines. In: Proceedings of the 25th international conference on machine learning, ACM, New York, NY, ICML ’08, pp 320–327

Friedman J (1996) Another approach to polychotomous classification. Stanford University, Tech Rep, Dept Statistics

Fung G, Mangasarian OL (2001) Proximal support vector machine classifiers. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, KDD ’01, pp 77–86

Gao Y, Sun S (2010) An empirical evaluation of linear and nonlinear kernels for text classification using support vector machines. In: 2010 seventh international conference on fuzzy systems and knowledge discovery, vol 4, pp 1502–1505, https://doi.org/10.1109/FSKD.2010.5569327

Gibson N (2011) Gradient-based methods for optimization. part i

Graf HP, Cosatto E, Bottou L, Durdanovic I, Vapnik V (2005) Parallel support vector machines: the cascade SVM. In: Advances in neural information processing systems pp 521–528

Guermeur Y (2002) Combining discriminant models with new multi-class svms. Pattern Anal Appl 5(2):168–179MathSciNetMATHCrossRef

Hestenes MR, Stiefel E (1952) Methods of conjugate gradients for solving linear systems. J Res Natl Bur Stand 49(6):409–436MathSciNetMATHCrossRef

Hsieh CJ, Chang KW, Lin CJ, Keerthi SS, Sundararajan S (2008) A dual coordinate descent method for large-scale linear svm. In: Proceedings of the 25th international conference on machine learning, ACM, New York, NY, ICML ’08, pp 408–415

Hsieh CJ, Si S, Dhillon I (2014) A divide-and-conquer solver for Kernel support vector machines. JMLR W&Cp 32(1):566–574

Hsu CW, Lin CJ (2002) A comparison of methods for multiclass support vector machines. IEEE Trans Neural Netw 13(2):415–425. https://doi.org/10.1109/72.991427 CrossRef

Jayadeva, Khemchandani R, Chandra S (2004) Fast and robust learning through fuzzy linear proximal support vector machines. Neurocomput 61(C):401–411CrossRef

Jayadeva KR, Chandra S (2007) Twin support vector machines for pattern classification. IEEE Trans Pattern Anal Mach Intell 29(5):905–910MATHCrossRef

Khemchandani R, Chandra S (2017) Twin support vector machines–models, extensions and applications, studies in computational intelligence, vol 659. Springer, Berlin

Ji Y, Sun S (2011) Multitask multiclass support vector machines. In: 2011 IEEE 11th international conference on data mining workshops, pp 512–518

Ji Y, Sun S (2013) Multitask multiclass support vector machines: model and experiments. Pattern Recogn 46(3):914–924MATHCrossRef

Joachims T (1999) Transductive inference for text classification using support vector machines. In: Proceedings of the sixteenth international conference on machine learning, Morgan Kaufmann Publishers Inc., San Francisco, CA, ICML ’99, pp 200–209

Joachims T (2006) Training linear SVMs in linear time. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, USA, KDD ’06, pp 217–226

Joachims T, Dortmund U, Joachimscsuni-dortmundde T (1999) Making large-scale SVM learning practical. In: Advances in Kernel methods–support vector learning, pp 41–56

Joachims T, Finley T, Yu CNJ (2009) Cutting-plane training of structural svms. Mach Learn 77(1):27–59MATHCrossRef

Johnson R, Zhang T (2013) Accelerating stochastic gradient descent using predictive variance reduction. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems 26, Curran Associates, Inc., pp 315–323

Kao W, Chung K, Sun C, Lin C (2004) Decomposition methods for linear support vector machines. Neural Comput 16:1689–1704MATHCrossRef

Keerthi SS, DeCoste D (2005) A modified finite newton method for fast solution of large scale linear svms. JMLR 6:341–361MathSciNetMATH

Keerthi SS, Decoste D (2006) Building support vector machines with reduced classifier complexity. JMLR 7:1493–1515MathSciNetMATH

Keerthi SS, Sundararajan S, Chang KW, Hsieh CJ, Lin CJ (2008) A sequential dual method for large scale multi-class linear svms. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, KDD ’08, pp 408–416

Knerr S, Personnaz L, Dreyfus G (1990) Neurocomputing: algorithms, architectures and applications. Springer, Berlin, pp 41–50CrossRef

Konečný J, Richtárik P (2013) Semi-stochastic gradient descent methods, 1:19. arXiv:1312.1666

Kressel UHG (1999) Advances in Kernel methods. MIT Press, Cambridge, MA, pp 255–268

Ladicky L, Torr P (2011) Locally linear support vector machines. ICML, Stockholm, pp 985–992

Lazar A (2010) A comparison of linear support vector machine algorithms on large non-sparse datasets. In: Proceedings–9th ICMLA

Lee CP, Lin CJ (2013) A study on l2-loss (squared hinge-loss) multi-class svm. Neural Comput 25:1302–1323MathSciNetMATHCrossRef

Lee CP, Roth D (2015) Distributed box-constrained quadratic optimization for dual linear SVM. In: Proceedings of the 32nd ICML, 37

Lee YJ, Mangasarian O (2001) Ssvm: a smooth support vector machine for classification. Comput Optim Appl 20(1):5–22MathSciNetMATHCrossRef

Levenberg K (1944) A method for the solution of certain problems in least squares. Quart Appl Math 2:164–168MathSciNetMATHCrossRef

Li L, Li Y, Su H, Chu J (2006) Intelligent computing: international conference on intelligent computing, ICIC 2006, Kunming, China, August 16–19, 2006. Proceedings, Part I, Springer Berlin Heidelberg, Berlin, Heidelberg, chap Least Squares Support Vector Machines Based on Support Vector Degrees, pp 1275–1281

Li M, Andersen DG, Park JW, Smola AJ, Ahmed A, Josifovski V, Long J, Shekita EJ, Su BY (2014a) Scaling distributed machine learning with the parameter server. In: Proceedings of the 11th USENIX conference on operating systems design and implementation, USENIX Association, Berkeley, CA, OSDI’14, pp 583–598

Li M, Zhang T, Chen Y, Smola AJ (2014b) Efficient mini-batch training for stochastic optimization. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, KDD ’14, pp 661–670

Lin CF, Wang SD (2002) Fuzzy support vector machines. IEEE Trans Neural Netw 13(2):464–471CrossRef

Lin CJ (2001) On the convergence of the decomposition method for support vector machines. IEEE Trans Neural Netw 12(6):1288–1298CrossRef

Lin CJ, Weng RC, Keerthi SS (2008) Trust region newton method for logistic regression. JMLR 9:627–650MathSciNetMATH

Lin CY, Tsai CH, Lee CP, Lin CJ (2014) Large-scale logistic regression and linear support vector machines using spark. In: 2014 IEEE international conference on Big Data (Big Data) pp 519–528

Lin HT, Lin CJ, Weng RC (2007) A note on platt’s probabilistic outputs for support vector machines. Mach Learn 68(3):267–276CrossRef

Liu D, Shi Y, Tian Y, Huang X (2016) Ramp loss least squares support vector machine. J Comput Sci 14:61–68MathSciNetCrossRef

Lu Z, Xiao L (2015) On the complexity analysis of randomized block-coordinate descent methods. Math Program 152(1):615–642MathSciNetMATHCrossRef

Luss R, d’Aspremont A (2009) Support vector machine classification with indefinite kernels. Math Program Comput 1(2):97–118MathSciNetMATHCrossRef

Mangasarian OL (2001) A finite newton method for classification problems. Tech. rep., Data Mining Institute, Computer Sciences Department, University of Wisconsin

Mangasarian OL (2006) Exact 1-norm support vector machines via unconstrained convex differentiable minimization. J Mach Learn Res 7:1517–1530MathSciNetMATH

Mangasarian OL, Musicant DR (2001) Lagrangian support vector machines. JMLR 1:161–177MathSciNetMATH

Mangasarian OL, Wild EW (2006) Multisurface proximal support vector machine classification via generalized eigenvalues. IEEE Trans Pattern Anal Mach Intell 28(1):69–74CrossRef

Manno A, Palagi L, Sagratella S (2015) A class of parallel decomposition algorithms for SVMs training, pp 1–24

Marquardt D (1963) An algorithm for least-squares estimation of nonlinear parameters. SIAM J Appl Math 11:431–441MathSciNetMATHCrossRef

Moré JJ (1978) Numerical analysis. In: Proceedings of the biennial conference held at Dundee, June 28–July 1, 1977, Springer, Berlin, pp 105–116

Needell D, Srebro N, Ward R (2014) Stochastic gradient descent, weighted sampling, and the randomized kaczmarz algorithm. In: Proceedings of the 27th international conference on neural information processing systems, MIT Press, Cambridge, MA, NIPS’14, pp 1017–1025

Nesterov Y (2012) Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J Optim 22(2):341–362MathSciNetMATHCrossRef

Nie F, Huang Y, Wang X, Huang H (2014) New primal SVM solver with linear computational cost for big data classifications. In: Proceedings of the 31st international conference on international conference on machine learning, Vol 32, JMLR.org, ICML’14, pp II–505–II–513

Nie F, Wang X, Huang H (2017) Multiclass capped Lp-norm SVM for robust classifications. The 31st AAAI Conference on Artificial Intelligence (AAAI), San Francisco, USA

Nocedal WS (1999) Numerical optimization. Springer, New YorkMATHCrossRef

Parikh N, Boyd S (2014) Proximal algorithms. Found Trends Optim 1(3):127–239CrossRef

Parlett B (1998) The symmetric eigenvalue problem. Society for Industrial and Applied Mathematics, Philadelphia, PA

Peng X (2010) A nu-twin support vector machine (nu-tsvm) classifier and its geometric algorithms. Inf Sci 180(20):3863–3875MATHCrossRef

Platt J (1998) Sequential minimal optimization: a fast algorithm for training support vector machines. Microsoft Res pp MSR–TR–98–14

Platt JC, Cristianini N, Shawe-Taylor J (2000) Large margin dags for multiclass classification. Adv Neural Inf Process Syst 12:547–553

Prez-Cruz F, Weston J, Herrmann D, Schölkopf B (2003) Extension of the nu-SVM range for classification. Nato Sci Series Sub Series iii Comput Syst Sci 190:179–196

Ranganathan A (2004) The levenberq-marquardt algorithm

Recht B, Re C, Wright S, Niu F (2011) Hogwild: a lock-free approach to parallelizing stochastic gradient descent. In: Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ (eds) Advances in neural information processing systems 24, Curran Associates, Inc., pp 693–701

Richtárik P, Takáč M (2014) Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math Program 144(1–2):1–38MathSciNetMATHCrossRef

Richtárik P, Takáč M (2016) Parallel coordinate descent methods for big data optimization. Math Program 156(1):433–484MathSciNetMATHCrossRef

Schmidt M, Le Roux N, Bach F (2016) Minimizing finite sums with the stochastic average gradient. Math Program pp 1–30

Schölkopf B, Smola AJ, Williamson RC, Bartlett PL (2000) New support vector algorithms. Neural Comput 12(5):1207–1245CrossRef

Shalev-Shwartz S, Singer Y (2006) Online learning meets optimization in the dual, pp 423–437

Shalev-Shwartz S, Zhang T (2013a) Accelerated mini-batch stochastic dual coordinate ascent. In: NIPS’ 13 Proceedings of the 26th International Conference on Neural Information Processing Systems, Vol 1. pp 378–385

Shalev-Shwartz S, Zhang T (2013b) Stochastic dual coordinate ascent methods for regularized loss. JMLR 14(1):567–599MathSciNetMATH

Shalev-Shwartz S, Singer Y, Srebro N (2007) Pegasos: primal estimated sub-gradient solver for SVM. In: Proceedings of the 24th international conference on machine learning, ACM, New York, NY, ICML ’07, pp 807–814

Shawe-Taylor J, Sun S (2011) A review of optimization methodologies in support vector machines. Neurocomputing 74(17):3609–3618CrossRef

Shi Z, Liu R (2016) A better convergence analysis of the block coordinate descent method for large scale machine learning. arXiv:1608.04826

Stefano M, Mikhail B (2011) Laplacian support vector machines trained in the primal. JMLR 12:1149–1184MathSciNetMATH

Steinwart I (2004) Sparseness of support vector machinessome asymptotically sharp bounds. Adv Neural Inf Process Syst 16:1069–1076

Stempfel G, Ralaivola L (2009) Learning SVMs from sloppily labeled data. In: Proceedings of the 19th international conference on artificial neural networks: Part I, Springer, Berlin, ICANN ’09, pp 884–893

Sun S, Xie X (2016) Semisupervised support vector machines with tangent space intrinsic manifold regularization. IEEE Trans Neural Netw Learn Syst 27(9):1827–1839MathSciNetCrossRef

Suykens J, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300CrossRef

Tak M, Bijral A, Richtrik P, Srebro N (2013) Mini-batch primal and dual methods for SVMs. In: 30th international conference on machine learning, Springer, Berlin, pp 537–552

Teo CH, Smola A, Vishwanathan SV, Le QV (2007) A scalable modular convex solver for regularized risk minimization. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, KDD ’07, pp 727–736

Teo CH, Vishwanthan S, Smola AJ, Le QV (2010) Bundle methods for regularized risk minimization. JMLR 11:311–365MathSciNetMATH

Tsai CH, Lin CJCY (2014) Incremental and decremental training for linear classification. Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’14, pp 343–352

Tseng P (2001) Convergence of a block coordinate descent method for nondifferentiable minimization. J Optim Theory Appl 109(3):475–494MathSciNetMATHCrossRef

Tsochantaridis I, Hofmann T, Joachims T, Altun Y (2004) Support vector machine learning for interdependent and structured output spaces. In: Proceedings of the twenty-first international conference on machine learning, ACM, New York, NY, ICML ’04, p 104

Vapnik VN (1998) Statistical learning theory, 1st edn. Wiley, New YorkMATH

Wang Z, Crammer K, Vucetic S (2012) Breaking the curse of kernelization : budgeted stochastic gradient descent for large-scale SVM training. JMLR 13(1):3103–3131MathSciNetMATH

Wen T, Edelman A, Gorsich D (2003) Contemporary mathematics. American Mathematical Society, Boston, MA, chap A Fast Projected Conjugate Gradient Algorithm for Training Support Vector Machines, pp 245–263

Weston J, Watkins C (1999) Support vector machines for multi-class pattern recognition. In: ESANN

Willoughby RA (1979) Solutions of ill-posed problems (a. n. tikhonov and v. y. arsenin). SIAM Rev 21(2):266–267CrossRef

Woodsend K, Gondzio J (2009) Hybrid MPI/OpenMP parallel linear support vector machine training. JMLR 10:1937–1953MathSciNetMATH

Wright SJ (2015) Coordinate descent algorithms. Math Program 151(1):3–34MathSciNetMATHCrossRef

Xiao L, Zhang T (2014) A proximal stochastic gradient method with progressive variance reduction. SIAM J Optim 24(4):2057–2075MathSciNetMATHCrossRef

Xie X, Sun S (2015) Multi-view twin support vector machines. Intell Data Anal 19(4):701–712CrossRef

Xu C, Tao D, Xu C (2013) A survey on multi-view learning, pp 1–59. arXiv:1304.5634v1

Xu Y, Yin W (2013) A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM J Imaging Sci 6(3):1758–1789MathSciNetMATHCrossRef

Xu Y, Yin W (2015) Block stochastic gradient iteration for convex and nonconvex optimization. SIAM J Optim 25(3):1686–1716MathSciNetMATHCrossRef

Yang Y, Chen J, Zhu J (2016) Distributing the stochastic gradient sampler for large-scale lda. In: Proceedings of the 22Nd ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, KDD ’16, pp 1975–1984

Ye J, Xiong T (2007) SVM versus least squares SVM. In: JMLR W&P, pp 644–651

Yuan GX, Chang KW, Hsieh CJ, Lin CJ (2010) A comparison of optimization methods and software for large-scale L1-regularized linear classification. JMLR 11:3183–3234MathSciNetMATH

Yuan GX, Ho CH, Lin CJ (2012) Recent advances of large-scale linear classification. Proc IEEE 100(9):2584–2603CrossRef

Zanghirati G, Zanni L (2003) A parallel solver for large quadratic programs in training support vector machines. Parallel Comput 29(4):535–551MathSciNetMATHCrossRef

Zanni L, Serafini T, Zanghirati G (2006) Parallel software for training large scale support vector machines on multiprocessor systems. JMLR 7:1467–1492MathSciNetMATH

Zhang A, Gu Q (2016) Accelerated stochastic block coordinate descent with optimal sampling. In: Proceedings of the 22Nd ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, KDD ’16, pp 2035–2044

Zhang T (2004) Solving large scale linear prediction problems using stochastic gradient descent algorithms. In: Proceedings of the twenty-first international conference on machine learning, ACM, New York, NY, ICML ’04

Zhu J, Rosset S, Hastie T, Tibshirani R (2003) 1-norm support vector machines. In: Neural information processing systems, MIT Press, Cambridge, p 16

Titel: Problem formulations and solvers in linear SVM: a review
verfasst von: Vinod Kumar Chauhan
Kalpana Dahiya
Anuj Sharma
Publikationsdatum: 16.01.2018
Verlag: Springer Netherlands
Erschienen in: Artificial Intelligence Review / Ausgabe 2/2019
Print ISSN: 0269-2821
Elektronische ISSN: 1573-7462
DOI: https://doi.org/10.1007/s10462-018-9614-6

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Weitere Artikel der Ausgabe 2/2019

A survey on techniques to handle face recognition challenges: occlusion, single sample per subject and expression

A survey on facial soft biometrics for video surveillance and forensic applications

Scene analysis and search using local features and support vector machine for effective content-based image retrieval

Diversity based cluster weighting in cluster ensemble: an information theory approach

A journey of Indian languages over sentiment analysis: a systematic review

A review on multi-class TWSVM