Published in: Foundations of Computational Mathematics 2/2017

30.11.2015

Random Gradient-Free Minimization of Convex Functions

Authors: Yurii Nesterov, Vladimir Spokoiny


Abstract

In this paper, we prove new complexity bounds for methods of convex optimization based only on computation of the function value. The search directions of our schemes are normally distributed random Gaussian vectors. It appears that such methods usually need at most n times more iterations than the standard gradient methods, where n is the dimension of the space of variables. This conclusion is true for both nonsmooth and smooth problems. For the latter class, we also present an accelerated scheme with the expected rate of convergence \(O\Big ({n^2 \over k^2}\Big )\), where k is the iteration counter. For stochastic optimization, we propose a zero-order scheme and justify its expected rate of convergence \(O\Big ({n \over k^{1/2}}\Big )\). We also give some bounds for the rate of convergence of the random gradient-free methods to stationary points of nonconvex functions, in both the smooth and nonsmooth cases. Our theoretical results are supported by preliminary computational experiments.
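To illustrate the kind of scheme the abstract describes, here is a minimal Python sketch of a gradient-free iteration driven by a two-point function-value estimate along a random Gaussian direction. The step size `h`, the smoothing parameter `mu`, and the test function are illustrative placeholders, not the paper's recommended, problem-dependent choices derived from Lipschitz constants.

```python
import numpy as np

def gradient_free_oracle(f, x, mu, rng):
    """Two-point zero-order gradient estimate at x: a finite
    difference of function values along a random Gaussian direction u."""
    u = rng.standard_normal(x.shape)
    return (f(x + mu * u) - f(x)) / mu * u

def random_search(f, x0, h, mu, iters, seed=0):
    """Simple gradient-free iteration x_{k+1} = x_k - h * g_mu(x_k).
    h and mu are placeholder constants in this sketch."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        x = x - h * gradient_free_oracle(f, x, mu, rng)
    return x

# Toy usage: a smooth convex quadratic in n = 10 variables.
n = 10
f = lambda x: 0.5 * float(x @ x)
x_final = random_search(f, x0=np.ones(n), h=0.1 / n, mu=1e-6, iters=5000)
print(f(x_final))  # close to 0
```

The step size is scaled down by the dimension n, consistent with the abstract's observation that such methods need roughly n times more iterations than standard gradient methods.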


Footnotes
1
In [15], u was uniformly distributed over the unit ball. In our comparison, we use a direct translation of the constructions in [15] into the language of the normal Gaussian distribution.
 
2
The presence of this oracle is the main reason why we call our methods gradient free (not derivative free!). Indeed, the directional derivative is a much simpler object than the gradient: it can be easily defined for a very large class of functions. The definition of the gradient (or subgradient), by contrast, is much more involved. It is well known that in the nonsmooth case, the collection of partial derivatives is not a subgradient of a convex function. For nonsmooth nonconvex functions, even the possibility of computing a single subgradient needs serious mathematical justification [17]. On the other hand, if we have access to a program computing the value of our function, then a program computing its directional derivatives can be obtained by trivial forward automatic differentiation.
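To make the last remark concrete, here is a hedged dual-number sketch of forward automatic differentiation in Python. The class `Dual` and the helper `directional_derivative` are illustrative names introduced here; only `+` and `*` are overloaded, which is enough for polynomial test functions.

```python
class Dual:
    """Dual number a + b*eps with eps**2 = 0: pushing the direction u
    through a program for f yields the directional derivative f'(x; u)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def _lift(self, other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def directional_derivative(f, x, u):
    """f'(x; u) at the cost of a single dual-valued evaluation of f."""
    return f([Dual(xi, ui) for xi, ui in zip(x, u)]).dot

# Example: f(x) = x0*x1 + x1*x1 at x = (2, 3); grad f = (3, 8).
f = lambda x: x[0] * x[1] + x[1] * x[1]
print(directional_derivative(f, [2.0, 3.0], [1.0, 0.0]))  # 3.0
print(directional_derivative(f, [2.0, 3.0], [0.0, 1.0]))  # 8.0
```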
 
3
The rest of the proof is very similar to that of Lemma 2.2.4 in [16]. We present it here for the reader's convenience.
 
References
1.
A. Agarwal, O. Dekel, and L. Xiao, Optimal algorithms for online convex optimization with multi-point bandit feedback, in Proceedings of the 23rd Annual Conference on Learning Theory, 2010, pp. 28-40.
2.
A. Agarwal, D. Foster, D. Hsu, S. Kakade, and A. Rakhlin, Stochastic convex optimization with bandit feedback, SIAM J. on Optimization, 23 (2013), pp. 213-240.
4.
F. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.
5.
A. Conn, K. Scheinberg, and L. Vicente, Introduction to Derivative-Free Optimization, MPS-SIAM Series on Optimization, SIAM, Philadelphia, 2009.
7.
J. Duchi, M.I. Jordan, M.J. Wainwright, and A. Wibisono, Finite sample convergence rate of zero-order stochastic optimization methods, in NIPS, 2012, pp. 1448-1456.
8.
A.D. Flaxman, A.T. Kalai, and B.H. McMahan, Online convex optimization in the bandit setting: gradient descent without a gradient, in Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms, 2005, pp. 385-394.
9.
R. Kleinberg, A. Slivkins, and E. Upfal, Multi-armed bandits in metric spaces, in Proceedings of the 40th Annual ACM Symposium on Theory of Computing, 2008, pp. 681-690.
10.
J.C. Lagarias, J.A. Reeds, M.H. Wright, and P.E. Wright, Convergence properties of the Nelder-Mead simplex algorithm in low dimensions, SIAM J. on Optimization, 9 (1998), pp. 112-147.
11.
J.C. Lagarias, B. Poonen, and M.H. Wright, Convergence of the restricted Nelder-Mead algorithm in two dimensions, SIAM J. on Optimization, 22 (2012), pp. 501-532.
14.
A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro, Robust stochastic approximation approach to stochastic programming, SIAM J. on Optimization, 19 (2009), pp. 1574-1609.
15.
A. Nemirovsky and D. Yudin, Problem Complexity and Method Efficiency in Optimization, John Wiley and Sons, New York, 1983.
16.
Yu. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Kluwer, Boston, 2004.
17.
Yu. Nesterov, Lexicographic differentiation of nonsmooth functions, Mathematical Programming, 104 (2005), pp. 669-700.
18.
Yu. Nesterov, Random gradient-free minimization of convex functions, CORE Discussion Paper # 2011/1, (2011).
19.
Yu. Nesterov, Efficiency of coordinate descent methods on huge-scale optimization problems, SIAM J. on Optimization, 22 (2012), pp. 341-362.
20.
B. Polyak, Introduction to Optimization, Optimization Software, Inc., Publications Division, New York, 1987.
21.
V. Protasov, Algorithms for approximate calculation of the minimum of a convex function from its values, Mathematical Notes, 59 (1996), pp. 69-74.
Metadata
Title
Random Gradient-Free Minimization of Convex Functions
Authors
Yurii Nesterov
Vladimir Spokoiny
Publication date
30.11.2015
Publisher
Springer US
Published in
Foundations of Computational Mathematics / Issue 2/2017
Print ISSN: 1615-3375
Electronic ISSN: 1615-3383
DOI
https://doi.org/10.1007/s10208-015-9296-2
