
03 January 2020 | Original Article

Stochastic trust region inexact Newton method for large-scale machine learning

Authors: Vinod Kumar Chauhan, Anuj Sharma, Kalpana Dahiya

Published in: International Journal of Machine Learning and Cybernetics | Issue 7/2020


Abstract

Stochastic approximation methods are currently one of the major research directions for dealing with large-scale machine learning problems. Within this area, the focus is shifting from stochastic first-order methods to stochastic second-order methods because of their faster convergence and the growing availability of computing resources. In this paper, we propose a novel stochastic trust region inexact Newton method, called STRON, for solving large-scale learning problems; it uses the conjugate gradient method to solve the trust region subproblem inexactly. The method uses progressive subsampling in the calculation of gradient and Hessian values to take advantage of both the stochastic and the full-batch regimes. We extend STRON with existing variance reduction techniques to deal with noisy gradients, and with preconditioned conjugate gradient as the subproblem solver, and show empirically that these extensions do not work as expected for large-scale learning problems. Finally, our empirical results demonstrate the efficacy of the proposed method against existing methods on benchmark datasets.
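To make the core idea concrete, the sketch below shows one trust-region inexact Newton step with subsampled derivatives: the gradient and Hessian-vector products are evaluated on a subsample of the data, and the trust-region subproblem is solved inexactly with Steihaug's conjugate gradient method [38]. This is an illustrative sketch only, not the authors' implementation (which is available in LIBS2ML [14]); the logistic loss, the function names, the subsample size, the trust-region radius, and all constants are assumptions chosen for the example.

    import numpy as np

    def logistic_loss_grad(w, X, y, lam=1e-4):
        """Gradient of the l2-regularized logistic loss on the subsample (X, y), y in {-1, +1}."""
        p = 1.0 / (1.0 + np.exp(-y * (X @ w)))           # probability of the correct label
        return -(X.T @ (y * (1.0 - p))) / X.shape[0] + lam * w

    def hessian_vec(w, X, v, lam=1e-4):
        """Hessian-vector product of the same loss, computed without forming the Hessian."""
        s = 1.0 / (1.0 + np.exp(-(X @ w)))               # sigmoid of the margins
        d = s * (1.0 - s)                                # diagonal curvature weights
        return (X.T @ (d * (X @ v))) / X.shape[0] + lam * v

    def _to_boundary(s, d, radius):
        """Positive tau such that ||s + tau * d|| = radius."""
        a, b, c = d @ d, 2.0 * (s @ d), s @ s - radius ** 2
        return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

    def steihaug_cg(grad, hvp, radius, tol=1e-4, max_iter=50):
        """Approximately minimize g^T s + 0.5 s^T H s subject to ||s|| <= radius (Steihaug-CG)."""
        s = np.zeros_like(grad)
        r = -grad.copy()
        d = r.copy()
        rr = r @ r
        for _ in range(max_iter):
            Hd = hvp(d)
            dHd = d @ Hd
            if dHd <= 0.0:                               # negative curvature: move to the boundary
                return s + _to_boundary(s, d, radius) * d
            alpha = rr / dHd
            if np.linalg.norm(s + alpha * d) >= radius:  # step would leave the region: truncate
                return s + _to_boundary(s, d, radius) * d
            s, r = s + alpha * d, r - alpha * Hd
            rr_new = r @ r
            if np.sqrt(rr_new) <= tol:                   # inexact (truncated) Newton stopping rule
                break
            d = r + (rr_new / rr) * d
            rr = rr_new
        return s

    # One subsampled trust-region step on synthetic data (illustration only).
    rng = np.random.default_rng(0)
    X_full = rng.standard_normal((10_000, 50))
    y_full = np.sign(X_full @ rng.standard_normal(50) + 0.1 * rng.standard_normal(10_000))
    w = np.zeros(50)
    idx = rng.choice(10_000, size=2_000, replace=False)  # current subsample
    Xs, ys = X_full[idx], y_full[idx]
    g = logistic_loss_grad(w, Xs, ys)
    step = steihaug_cg(g, lambda v: hessian_vec(w, Xs, v), radius=1.0)
    w = w + step                                          # accept/reject and radius update omitted

An outer loop in the spirit of the paper would grow the subsample size across iterations (progressive subsampling), recompute the gradient and Hessian-vector products on the current subsample, call the subproblem solver, and then accept or reject the step and adjust the trust-region radius using the standard ratio of actual to predicted reduction.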

Footnotes
2. Experimental results can be reproduced using the library LIBS2ML [14].
References
1. Agarwal N, Bullins B, Hazan E (2017) Second-order stochastic optimization for machine learning in linear time. J Mach Learn Res 18(116):1–40
2. Allen-Zhu Z (2017) Katyusha: the first direct acceleration of stochastic gradient methods. J Mach Learn Res (to appear), full version. arXiv:1603.05953
3. Bellavia S, Krejic N, Jerinkic NK (2018) Subsampled inexact Newton methods for minimizing large sums of convex functions. Optimization Online. arXiv:1811.05730
4. Berahas AS, Nocedal J, Takac M (2016) A multi-batch L-BFGS method for machine learning. Adv Neural Inf Process Syst 29:1055–1063
6. Bollapragada R, Nocedal J, Mudigere D, Shi HJ, Tang PTP (2018) A progressive batching L-BFGS method for machine learning. Proc 35th Int Conf Mach Learn, PMLR, Proc Mach Learn Res 80:620–629
7. Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, New York
9. Byrd RH, Hansen SL, Nocedal J, Singer Y (2016) A stochastic quasi-Newton method for large-scale optimization. SIAM J Optim 26(2):1008–1031
10. Cauchy AL (1847) Méthode générale pour la résolution des systèmes d'équations simultanées. Comptes Rendus des Séances de l'Académie des Sciences, Série A 25:536–538
14. Chauhan VK, Sharma A, Dahiya K (2019) LIBS2ML: a library for scalable second order machine learning algorithms. arXiv:1904.09448
17. Defazio A, Bach F, Lacoste-Julien S (2014) SAGA: a fast incremental gradient method with support for non-strongly convex composite objectives. In: Proceedings of the 27th international conference on neural information processing systems, NIPS'14, MIT Press, Cambridge, MA, USA, pp 1646–1654
18. Fan R, Chang K, Hsieh C, Wang X, Lin C (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9:1871–1874
19. Fanhua S, Zhou K, Cheng J, Tsang IW, Zhang L, Tao D (2018) VR-SGD: a simple stochastic variance reduction method for machine learning. arXiv:1802.09932
20. Fletcher R (1980) Practical methods of optimization, vol 1: unconstrained optimization. John Wiley & Sons
21. Hsia CY, Zhu Y, Lin CJ (2017) A study on trust region update rules in Newton methods for large-scale linear classification. Proc Ninth Asian Conf Mach Learn, PMLR, Proc Mach Learn Res 77:33–48
22. Hsia CY, Chiang WL, Lin CJ (2018) Preconditioned conjugate gradient methods in truncated Newton frameworks for large-scale linear classification. In: Proceedings of the tenth Asian conference on machine learning, PMLR, Proceedings of Machine Learning Research
23. Johnson R, Zhang T (2013) Accelerating stochastic gradient descent using predictive variance reduction. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems, vol 26. Curran Associates Inc., New York, pp 315–323
24. Kolte R, Erdogdu M, Ozgur A (2015) Accelerating SVRG via second-order information. In: NIPS workshop on optimization for machine learning
25. Le Roux N, Schmidt M, Bach F (2012) A stochastic gradient method with an exponential convergence rate for strongly-convex optimization with finite training sets. Technical report, INRIA
27. Lin CJ, Weng RC, Keerthi SS (2008) Trust region Newton method for logistic regression. J Mach Learn Res 9:627–650
28. Liu DC, Nocedal J (1989) On the limited memory BFGS method for large scale optimization. Math Program 45(1):503–528
30. Mokhtari A, Ribeiro A (2014) RES: regularized stochastic BFGS algorithm. IEEE Trans Signal Process 62(23):6089–6104
31. Moritz P, Nishihara R, Jordan MI (2016) A linearly-convergent stochastic L-BFGS algorithm. In: AISTATS
34. Schmidt M, Le Roux N, Bach F (2017) Minimizing finite sums with the stochastic average gradient. Math Program 162:83–112
35. Schraudolph NN, Yu J, Günter S (2007) A stochastic quasi-Newton method for online convex optimization. In: Meila M, Shen X (eds) Proceedings of the eleventh international conference on artificial intelligence and statistics, PMLR, Proc Mach Learn Res, vol 2, pp 436–443
36. Shalev-Shwartz S, Zhang T (2013) Stochastic dual coordinate ascent methods for regularized loss. J Mach Learn Res 14(1):567–599
37. Shalev-Shwartz S, Singer Y, Srebro N (2007) Pegasos: primal estimated sub-gradient solver for SVM. In: Proceedings of the 24th international conference on machine learning, ICML'07, ACM, New York, NY, USA, pp 807–814
38. Steihaug T (1983) The conjugate gradient method and trust regions in large scale optimization. SIAM J Numer Anal 20(3):626–637
39. Zhang Y, Xiao L (2015) Stochastic primal-dual coordinate method for regularized empirical risk minimization. Proc 32nd Int Conf Mach Learn, ICML'15 37:353–361
Metadata
Title
Stochastic trust region inexact Newton method for large-scale machine learning
Authors
Vinod Kumar Chauhan
Anuj Sharma
Kalpana Dahiya
Publication date
03 January 2020
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 7/2020
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-019-01055-9
