
BUNDLE METHODS IN THE XXIst CENTURY: A BIRD'S-EYE VIEW

Abstract

Bundle methods are often the algorithms of choice for nonsmooth convex optimization, especially if accuracy in the solution and reliability are a concern. We review several algorithms based on the bundle methodology that have been developed recently and that, unlike their forerunner variants, have the ability to provide exact solutions even if most of the time the available information is inaccurate. We adopt an approach that is by no means exhaustive, but covers different proximal and level bundle methods dealing with inexact oracles, for both unconstrained and constrained problems.

bundle methods; inexact oracles; nonsmooth convex optimization


1 INTRODUCTION

Nonsmooth convex optimization (NCO) appears often in connection with real-life problems that are too difficult to solve directly and need to be decomposed. Typically, Lagrangian relaxation or Benders' decomposition approaches replace the hard problem by a sequence of simpler problems, each corresponding, in general, to one iteration of some nonsmooth convex optimization method.

As in Nonlinear Programming, NCO methods define iterates using oracle information, which in the nonsmooth setting comes in the form of a function value and a subgradient. When dealing with the hard problems mentioned above, usually the oracle information cannot be computed exactly (the procedure may be too time consuming, or just impossible, for example, if differential equations or continuous stochastic processes are involved). We refer to this situation as dealing with an inexact oracle; for many examples in the energy sector, see [49].

In 1975 Claude Lemaréchal and Philip Wolfe independently created bundle methods to minimize a convex function for which only one subgradient at a point is computable, [29] and [54]. The short "historical" story [39] discusses the different developments that resulted in the VU-bundle variants [38], which are superlinearly convergent with respect to so-called serious steps. Such a convergence rate was once deemed unattainable for nonsmooth functions, thus explaining the "science fiction" wording in the title of [39].

Since their introduction, and for practically 25 years, the convergence theory of bundle methods could only handle exact oracles. It was only in 2001 that the work [17] opened the way to inexactness. It was followed by [53] and [27].

For a long time bundle algorithms have been considered the methods of choice for NCO problems requiring reliable and accurate solutions. The added ability of handling inexact oracles made the new variants even more interesting for real-life applications, especially those involving stochastic programming models. Works such as [43, 44, 2] give a flavor of the many different ways oracle inexactness can be exploited within a bundle method - to eventually solve the hard problem in an exact way. As shown by the numerical experience in [43], for a large battery of two-stage stochastic linear programs it is possible to save up to 75% of the total computational time (while still finding an exact solution) just by letting the bundle method control how accurate the oracle information should be at consecutive iterates. The main idea is that there is no gain in "wasting" time by computing exact oracle information at the so-called null steps; we do not enter into more details here, but refer instead to [43].

In this work we focus on the main ideas behind several variants of bundle methods, rather than on technicalities or fine details. After laying down a minimal background in Section 2, we consider separately unconstrained and constrained methods in Sections 3 and 4, respectively. To illustrate the importance of modeling when solving real-life optimization problems, Section 5 contains an example of hydroreservoir management that can be cast into two different formulations, calling for unconstrained or constrained methods. Section 6 concludes with some closing remarks and comments on future research.

2 ORACLE ASSUMPTIONS AND ALGORITHMIC PATTERN

As explained in [48], the versatility of bundle methods dealing with inexact oracles is best exploited when the structure of the NCO problem is taken into account. For this reason, we consider below two structured optimization problems, defined over a polyhedral set X ⊂ ℜn and involving two convex functions, f : ℜn → ℜ and h : ℜn → ℜ, possibly nonsmooth:

• unconstrained or linearly constrained problem

    min { f(x) + h(x) : x ∈ X };    (1a)

• problem with nonlinear convex constraints

    min { f(x) : h(x) ≤ 0, x ∈ X }.    (1b)

The interest in considering separately the two functions f and h lies in their oracles. More precisely, the function f is easy: for each given xk ∈ X the f-oracle provides the exact value and a subgradient

    f(xk) and gk ∈ ∂f(xk).    (2)

In contrast, the function h is hard to evaluate and, hence, only approximate values are available. Accordingly, at any given point xk ∈ X the h-oracle outputs the estimates

    hxk = h(xk) − ηhk and sk ∈ ℜn such that h(·) ≥ hxk + 〈sk, · − xk〉 − η+k,    (3)

where the errors ηhk and η+k can be unknown. It follows from the oracle definition that h(xk) ≥ hxk + 〈sk, xk − xk〉 − η+k = h(xk) − (ηhk + η+k), showing that ηhk + η+k ≥ 0. Hence, the approximate subgradient sk is an (ηhk + η+k)-subgradient of h at xk:

    sk ∈ ∂(ηhk + η+k) h(xk).

An exact oracle corresponds to ηhk = η+k = 0. When only η+k = 0, the oracle is said to be of the lower type: the linearization generated with the oracle information stays below the function everywhere. In contrast, when η+k > 0, the oracle is of the upper type, and the linearizations may cut off portions of the graph of h. Several examples of such oracles are given in [44]; we just mention here that Lagrangian relaxation is a typical source of oracles of the lower type.
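A lower-type oracle can be illustrated with a small Python sketch, under the assumption that h is a finite maximum of linear functions and that the oracle scans only part of the index set (the function `inexact_h_oracle` and its arguments are illustrative, not from the literature):

```python
# Illustrative lower-type inexact oracle for h(x) = max_i <a_i, x>:
# scanning only the first n_scanned rows yields a value h_x <= h(x)
# together with the gradient s of the best scanned cut, so the
# linearization h_x + <s, . - x> stays below h everywhere (eta_+ = 0).
def inexact_h_oracle(x, rows, n_scanned):
    """rows: list of vectors a_i; only the first n_scanned are examined."""
    best_val, best_row = None, None
    for a in rows[:n_scanned]:
        v = sum(ai * xi for ai, xi in zip(a, x))
        if best_val is None or v > best_val:
            best_val, best_row = v, a
    return best_val, best_row          # (h_x, s): lower-type information

rows = [(1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
hx, s = inexact_h_oracle((1.0, 1.0), rows, n_scanned=2)
assert hx <= 4.0                       # underestimates h((1,1)) = 4
```

Scanning all rows recovers the exact oracle, a flavor of the "on-demand accuracy" idea mentioned above.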

Algorithms in the bundle family use oracles (2) and (3) to define linearizations at iterates xj, for instance

    f̄j(x) := f(xj) + 〈gj, x − xj〉 and h̄j(x) := hxj + 〈sj, x − xj〉.    (4)

At the k-th iteration, the bundle is defined by past oracle information accumulated in some index set Jk ⊂ {1, 2, ..., k}. Putting together the corresponding linearizations gives cutting-plane models, which can be built individually for each function, or for their sum:

    f̌k(x) := max j∈Jk f̄j(x), ȟk(x) := max j∈Jk h̄j(x), and f̌hk(x) := max j∈Jk (f̄j + h̄j)(x).    (5)

By definition, the models above are piecewise linear convex functions.
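A cutting-plane model is easy to realize in code; the following minimal Python sketch (with illustrative names) stores one linearization per bundle index and evaluates their pointwise maximum:

```python
# Minimal sketch: a cutting-plane model stores one linearization per
# bundle index j and evaluates their pointwise maximum, as in (5).
def linearization(fx, g, xj):
    """Return the affine function x -> f(xj) + <g, x - xj>."""
    return lambda x: fx + sum(gi * (xi - xji) for gi, xi, xji in zip(g, x, xj))

def cutting_plane_model(bundle):
    """bundle: list of (f(xj), gj, xj); the model is the max of the cuts."""
    cuts = [linearization(fx, g, xj) for fx, g, xj in bundle]
    return lambda x: max(c(x) for c in cuts)

# Example: model of f(x) = x1^2 built from cuts at 0 and 1 (gradient 2x).
model = cutting_plane_model([(0.0, (0.0,), (0.0,)), (1.0, (2.0,), (1.0,))])
assert model((0.5,)) <= 0.25   # the model underestimates a convex f
```

For an exact (or lower-type) oracle each cut, and hence the model, underestimates the convex function everywhere.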

The modeling functions are used to replace f and h in (1) and define iterates {xk} ⊂ X at which the oracles will be called. Such a replacement can be done in different ways, to be specified later on. For now, and to cover both problems in (1), we just write the resulting k-th subproblem as the minimization of an abstract objective function Ok, depending on a parameter set pars, specific to each considered bundle variant. Accordingly, the next iterate is given by

    xk+1 ∈ arg min { Ok(x) : x ∈ X }.    (6)

Typically, the bundle-variant-dependent set pars contains

  • the convex model f̌hk of f + h (or the sum of the models f̌k and ȟk of f and h);

  • a stability center x̂k, usually an element in the sequence {xk}, together with a test to determine when the center can be updated (for example, to xk+1);

  • an optimality measure to stop the iterations (such as upper and lower bounds for the optimal value of the problem);

  • certain algorithmic parameters, to be updated at each iteration.

The main concern when specifying (6) is that the corresponding subproblem must be easy to solve, because it is solved at each iteration of the bundle method. A simple illustration, not belonging to the bundle family but still fitting the abstract format above, is the cutting-plane method of [9] and [19], with subproblems fully characterized by taking Ok = f̌k + ȟk with no stabilization, yielding a linear programming problem (LP) in (6):

    xk+1 ∈ arg min { f̌k(x) + ȟk(x) : x ∈ X }
(recall that the set X is defined by linear equality and inequality constraints). Bundle methods can be seen as stabilized forms of the cutting-plane algorithm, well known for its instability and slow convergence. Different stabilization devices result in different bundle algorithms, such as the proximal variant [33] or the level one [30]. For these methods, the respective parameter sets pars involve more objects than just the model and, as shown in Sections 3 and 4 below, give quadratic programming (QP) subproblems (6).
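The cutting-plane iteration can be sketched in a few lines of pure Python for a one-dimensional toy problem; here the "LP" is solved by enumerating interval endpoints and pairwise cut intersections (a valid shortcut only in one dimension, chosen to keep the sketch self-contained):

```python
# Sketch of the cutting-plane (Kelley) method with an exact oracle for
# the toy problem min x**2 over [-1, 2]; the 1D LP is solved by
# enumerating candidate points (endpoints and cut intersections).
f = lambda x: x * x
df = lambda x: 2 * x
a, b = -1.0, 2.0

cuts = []                                   # (f(xj), gj, xj) triples
model = lambda x: max(fx + g * (x - xj) for fx, g, xj in cuts)

x = 2.0
for k in range(40):
    cuts.append((f(x), df(x), x))
    candidates = [a, b]
    for i in range(len(cuts)):
        for j in range(i + 1, len(cuts)):
            fi, gi, xi = cuts[i]
            fj, gj, xj = cuts[j]
            if gi != gj:                    # intersection of two cuts
                xc = (fj - gj * xj - fi + gi * xi) / (gi - gj)
                if a <= xc <= b:
                    candidates.append(xc)
    x = min(candidates, key=model)          # minimizer of the model on [a, b]
    if f(x) - model(x) < 1e-6:              # the model value is a lower bound
        break

assert abs(x) < 1e-3                        # minimizer of x**2 is 0
```

Tracking the early iterates (2, then -1, then 0.5, ...) already displays the zig-zagging behavior that stabilization is designed to suppress.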

Once the parameter set and the modeling objective function in (6) are specified, the method needs to define updating rules for the elements in pars, including a stopping test. The main steps of all the algorithms are given below.

Algorithmic Pattern 1. Oracles for f and h, a model Ok, and a rule for defining the parameter set pars are given.

STEP 0. For an initial iterate x1, call the oracles to obtain the information (f(x1), g1) and (hx1, s1). Set k = 1, and initialize J1 = {1} and x̂1 = x1.

STEP 1. Solve (6) to obtain xk+1 .

STEP 2. Perform a stopping test based on the optimality measure in pars.

STEP 3. If xk+1 was not considered close enough to optimal, call the oracles at xk+1 to obtain the information

(f(xk+1), gk+1) and (hxk+1, sk+1).

STEP 4. Depending on a test measuring progress towards the goal of solving (1), specified in pars, decide whether or not the stability center is to be replaced: either set x̂k+1 = xk+1 (serious step) or keep x̂k+1 = x̂k (null step).

STEP 5. Update pars by the given rule. Increase k by 1 and loop to Step 1.

The above pattern is merely a sketch of an algorithm. Different models and different updating rules for the stability center define different methods. The stopping test performed in Step 2 depends, naturally, on the bundle method variant. In what follows we will make each Step in the Algorithmic Pattern 1 more precise, for different bundle method variants. In order to ease the presentation, we split our analysis into the two instances considered in (1), starting with the linearly constrained problem (1a).
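The control flow of the pattern can be rendered schematically in Python; the oracle, subproblem solver, stopping test and descent test are caller-supplied callables, and all names below are illustrative rather than taken from the surveyed papers:

```python
# Schematic rendering of Algorithmic Pattern 1 (illustrative names).
def bundle_pattern(x1, oracle, solve_subproblem, stopped, serious,
                   update_pars, kmax=100):
    center = x1
    bundle = [(x1,) + oracle(x1)]                        # Step 0
    pars = {}
    for k in range(kmax):
        x_new = solve_subproblem(bundle, center, pars)   # Step 1
        if stopped(x_new, bundle, center, pars):         # Step 2
            return x_new
        bundle.append((x_new,) + oracle(x_new))          # Step 3
        if serious(x_new, center, bundle, pars):         # Step 4: serious step
            center = x_new
        pars = update_pars(pars)                         # Step 5
    return center

# Degenerate 1D instantiation, only to exercise the control flow:
result = bundle_pattern(
    x1=5.0,
    oracle=lambda x: (x * x, 2 * x),                       # exact oracle for x^2
    solve_subproblem=lambda b, c, p: c - 0.25 * b[-1][2],  # crude prox-like step
    stopped=lambda x, b, c, p: abs(x) < 1e-6,
    serious=lambda x, c, b, p: b[-1][1] < c * c,           # descent achieved
    update_pars=lambda p: p,
)
assert abs(result) < 1e-6
```

The different bundle variants below are obtained by plugging in specific subproblems, descent tests, and parameter updates.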

3 UNCONSTRAINED OR LINEARLY CONSTRAINED SETTING

The algorithms considered in this section are suitable for problems with simple constraints (such as linear) or none at all, as in (1a). We start with the most well known variant, the proximal bundle method [32, 25, 33, 15]; see also [37, 17, 53, 7, 38, 27, 24, 11, 45, 34, 44, 50].

3.1 Proximal Bundle Methods

This family defines subproblems based on the equivalence between minimizing a convex function and finding a fixed point of the proximal point operator [40]:

    x̄ minimizes f + h  ⟺  x̄ = arg min x∈ℜn { f(x) + h(x) + (1/2t)|x − x̄|²M },

for some proximal stepsize t > 0 and a norm |·|M induced by a positive definite matrix M. To make the equivalence above implementable, methods in this variant replace x̄ by the current stability center x̂k and f + h by some model, which can aggregate the linearizations and use f̌hk, or disaggregate the information and use the individual cutting-plane models f̌k and ȟk for f and h.

Instead of the M-norm, the generalized bundle methods [15] use as stabilizing term a closed convex function satisfying certain properties.

The corresponding QP subproblems (6) are

  • Aggregate model: xk+1 = arg min x∈X { f̌hk(x) + (1/2tk)|x − x̂k|²M };

  • Disaggregate model: xk+1 = arg min x∈X { f̌k(x) + ȟk(x) + (1/2tk)|x − x̂k|²M }.

Since the objective function in both subproblems is strongly convex, the new iterate xk+1 defined by (6) is unique.
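In one dimension the (strongly convex) proximal subproblem can be solved by a simple ternary search, which the following Python sketch uses for the aggregate model (names, bounds and tolerances are illustrative):

```python
# Toy sketch: proximal bundle subproblem with the aggregate
# cutting-plane model in 1D, minimized by ternary search on a
# strongly convex objective (illustrative names and tolerances).
def prox_subproblem(cuts, center, t, lo=-10.0, hi=10.0, iters=200):
    """cuts: list of (f(xj), gj, xj); minimizes model(x) + |x-center|^2/(2t)."""
    def obj(x):
        model = max(fx + g * (x - xj) for fx, g, xj in cuts)
        return model + (x - center) ** 2 / (2.0 * t)
    for _ in range(iters):                 # ternary search: convex objective
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if obj(m1) <= obj(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

# Cuts of f(x) = |x| at -1 and 1 reproduce |x| exactly; center 1, t = 1:
x_new = prox_subproblem([(1.0, -1.0, -1.0), (1.0, 1.0, 1.0)], 1.0, 1.0)
assert abs(x_new) < 1e-6       # prox of |x| at 1 with t = 1 is 0
```

The strong convexity added by the quadratic term is precisely what makes the minimizer unique, in contrast with the pure cutting-plane LP.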

Depending on the structure of the problem to be solved, other modeling choices are possible. In [50], for instance, the hard function is the composition of a smooth mapping with a positively homogeneous convex function. The model in pars results from composing a cutting-plane model of the latter with a first-order Taylor expansion of the former (at the current stability center). Telecommunication networks often exhibit separable structure, which is exploited in [16] via a special Dantzig-Wolfe decomposition approach and in [31] by using a model that sums a cutting-plane model of the hard function and a second-order expansion of the (smooth) easy term.

For simplicity, in what follows we focus on a proximal bundle variant using the aggregate model and |·|M = |·|, the Euclidean norm.

Subproblem definition

We now provide all the necessary ingredients to make the Algorithmic Pattern 1 an implementable method when the proximal subproblem uses the aggregate model, so that (6) becomes

    xk+1 = arg min x∈X { f̌hk(x) + (1/2tk)|x − x̂k|² },    (7)

and the unique solution to (7) is

    xk+1 = x̂k − tk (Gk + Nk), with Gk ∈ ∂f̌hk(xk+1) and Nk ∈ ∂iX(xk+1),    (8)

where iX denotes the indicator function of the set X. As for the simplicial multiplier αk in (8), it satisfies, for all j ∈ Jk:

    αjk ≥ 0,  ∑ j∈Jk αjk = 1,  and  αjk [f̌hk(xk+1) − (f̄j + h̄j)(xk+1)] = 0.

The complementarity relations above, in particular, make it possible to eliminate a posteriori inactive linearizations without losing relevant information, while keeping the QP subproblems not too large.

The resulting selection mechanism takes, in Step 5 of the Algorithmic Pattern 1, a bundle set satisfying

    Jk+1 ⊃ {j ∈ Jk : αjk > 0} ∪ {k + 1}

to define the next model given in (5). An even more economical bundle, called compressed, can be built using the aggregate linearization defined in (11) below. As shown in [27] and [44], these strategies save storage without impairing convergence.
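The selection mechanism itself is a one-liner; the Python sketch below (with illustrative names) keeps the cuts whose QP multipliers are active and appends the newest linearization:

```python
# Sketch of the selection mechanism: keep only the cuts whose QP
# multipliers are (numerically) positive, plus the newest cut k+1.
def select_bundle(J_k, alpha, k, tol=1e-12):
    """J_k: current index set; alpha: dict j -> multiplier from the QP."""
    return {j for j in J_k if alpha.get(j, 0.0) > tol} | {k + 1}

J_next = select_bundle({1, 2, 3}, {1: 0.7, 2: 0.0, 3: 0.3}, k=3)
assert J_next == {1, 3, 4}     # the inactive cut 2 is discarded
```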

Stability Center and Optimality Certificate

The rule to update the stability center x̂k must assess the quality of the candidate xk+1 in terms of the goal of solving (1a). Therefore, if the iterate xk+1 provides enough decrease with respect to the "best" available value, f(x̂k) + hx̂k, the stability center is changed to xk+1. In the bundle jargon, setting x̂k+1 = xk+1 is called making a "serious" step. This occurs when, for an Armijo-like parameter m ∈ (0, 1),

    f(xk+1) + hxk+1 ≤ f(x̂k) + hx̂k − m vk.    (9)

The rightmost term above is a predicted improvement in the sense that it measures the distance between the best available value f(x̂k) + hx̂k and the best value predicted by the model,

    vk := f(x̂k) + hx̂k − f̌hk(xk+1).    (10)

In this variant, the certificate of optimality depends on two terms: the predicted improvement employed in (9) and the direction, given by the sum of the gradient and the normal element in (8). The following object, denoted by ɸk, combines both terms and ensures convergence of the method whenever there exists a subsequence {(ɸk, Gk + Nk)}k∈K that converges to (ɸ, 0), for some ɸ ≤ 0 [44, Theorem 3.2]. For this reason, an optimality measure for proximal bundle algorithms is to require both ɸk ≤ 0 and Gk + Nk to be sufficiently small. When the serious-step sequence is bounded, this test boils down to the more traditional criterion, which checks whether the approximate subgradient Gk + Nk is small enough (the corresponding error єk ≥ 0, which depends on the errors ηhk + η+k, must also be sufficiently small).
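The two-part stopping test can be phrased directly in code; the Python fragment below is an illustrative sketch (the tolerances and function name are assumptions, not from the papers):

```python
# Sketch of the two-part optimality test described above: stop when
# phi_k is (essentially) nonpositive and the aggregate direction
# G_k + N_k is small; the tolerances are illustrative.
def proximal_stop(phi_k, agg_norm, tol_phi=1e-8, tol_dir=1e-6):
    return phi_k <= tol_phi and agg_norm <= tol_dir

assert proximal_stop(-0.5, 1e-7)         # both conditions hold: stop
assert not proximal_stop(-0.5, 0.1)      # direction still large: continue
```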

Impact of Inexactness

When the hard function h is exactly evaluated, or the oracle is of lower type, the definition of Gk + Nk in (8) implies that the aggregate linearization stays always below the sum of functions:

    f̄ak(x) := f̌hk(xk+1) + 〈Gk + Nk, x − xk+1〉 ≤ f(x) + h(x) for all x ∈ X.    (11)

In particular, the predicted improvement employed in (9) is always nonnegative for exact oracles, and so is the difference f(x̂k) + hx̂k − f̌hk(xk+1). In contrast, for lower and upper oracles this difference can be negative, entailing difficulties in the convergence analysis. In order to overcome this drawback, whenever the difference is too negative the prox-parameter can be increased in Step 2 of the Algorithmic Pattern 1, as in [27]. A related rule, based on a relative criterion and on a parameter β ∈ (0, 1) chosen at the initialization step, was considered in [3] and [44]. By making use of such a simple rule, and preventing tk from decreasing before getting a "good candidate" xk+1, the proximal bundle algorithm remains convergent. Clearly, convergence is ensured up to the oracle accuracy: the "optimal" value found by the algorithm has an error not larger than the precision ηh + η+. The partly asymptotically exact variants let the accuracy vary with the iterates in a manner that drives ηhk + η+k to 0 at serious steps, hence yielding exact solutions in the limit; we refer to [44] for further information.
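A noise-attenuation rule of the kind just described can be sketched as follows; the specific relative test and factors below are assumed simplified forms for illustration, not the exact rules of [27], [3] or [44]:

```python
# Illustrative noise-attenuation rule (assumed simplified form): when
# the predicted improvement v_k turns "too negative" -- a symptom of
# oracle noise -- the prox-stepsize is increased instead of stepping.
def update_stepsize(t, v_k, agg_norm_sq, beta=0.5, factor=2.0):
    """v_k: predicted improvement; agg_norm_sq: |G_k + N_k|^2."""
    if v_k < -beta * t * agg_norm_sq:   # relative test detects noise
        return t * factor               # attenuate noise: larger stepsize
    return t                            # otherwise keep the stepsize

assert update_stepsize(1.0, -2.0, 1.0) == 2.0   # noise detected: t doubled
assert update_stepsize(1.0, 0.3, 1.0) == 1.0    # normal step: t kept
```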

It is often the case that the solution of the QP subproblem takes a significant proportion of the total computational time. The aggregate linearization (11) plays an important role in controlling the QP size and making iterations less time consuming. Proximal bundle methods, and some level variants with restricted memory, work with models in (5) whose index set Jk can be compressed in Step 5 of the Algorithmic Pattern 1. More precisely, for convergence purposes it is enough to include in the new bundle set Jk+1 the linearization of the last iterate and the aggregate linearization (11). Nevertheless, the speed of convergence may be impaired if the bundle is too economical; in this case, the selection mechanism may offer a better trade-off between the number of iterations needed to reach convergence and the time spent in each individual iteration.

3.2 Level Bundle Methods

The level bundle method was proposed in [30] for convex unconstrained and constrained optimization problems, saddle-point problems, and variational inequalities. When compared to the proximal family, the level class can be considered as making primal-dual approximations, in the following sense. Start by rewriting the proximal point operator equivalence, multiplying the minimand therein by the positive factor t:

    x̄ minimizes f + h  ⟺  x̄ = arg min x∈ℜn { t (f(x) + h(x)) + (1/2)|x − x̄|²M }.

Then, interpreting this factor as a multiplier gives the relation

    x̄ = arg min { (1/2)|x − x̄|²M : f(x) + h(x) ≤ f(x̄) + h(x̄) }.

Once again, to make the equivalence above implementable, methods in the level variant replace x̄ by the current stability center, f + h by some (aggregate or disaggregate) model, and the right-hand side term in the constraint by a level parameter fklev. Also, the M-norm is extended to general stability functions derived from a smooth strongly convex function ω : ℜn → ℜ+, which can yield a subproblem that is conic and no longer quadratic. Typically, the parameter set pars in the level variant contains the model, the stability center x̂k, the stability function ω, the level parameter fklev, and upper and lower bounds fkup and fklow for the optimal value.

As mentioned in [6], the function ω can be chosen to exploit the geometry of the feasible set X (which may be non-polyhedral), to get nearly dimension-independent complexity estimates of the maximum number of iterations necessary to reach an optimality gap fkup − fklow inferior to a given tolerance.

Complexity estimates are obtained if the feasible set X is compact, an assumption present in most level bundle methods [30, 26, 12, 6, 28, 47]. Level bundle methods able to deal with unbounded feasible sets have an asymptotic analysis instead; we refer to [8, 43, 5, 35] and [12, 43, 35] for variants handling exact and inexact oracles, respectively.

Subproblem Definition

In this case, subproblem (6) has the form

    xk+1 ∈ arg min { ω(x − x̂k) : x ∈ X ∩ Xklev }.    (13)

For the (canonical) stability function ω(x) = 〈x, x〉 in particular, subproblem (13) is the quadratic program

    xk+1 = arg min { (1/2)|x − x̂k|² : x ∈ X ∩ Xklev },

corresponding to the orthogonal projection of the stability center x̂k onto the polyhedron X ∩ Xklev, for

    Xklev := {x ∈ ℜn : f̌hk(x) ≤ fklev},    (14)

a nonempty set, since the level parameter satisfies fklev ≥ fklow ≥ min x∈X f̌hk(x).

Similarly to the proximal variant, the optimality conditions for the QP subproblem above give an iterate xk+1 as in (8), only that now tk is replaced by µk = ∑ j∈Jk αjk, for a Lagrange multiplier αk that is no longer simplicial but conical and satisfies, for all j ∈ Jk,

    αjk ≥ 0  and  αjk [(f̄j + h̄j)(xk+1) − fklev] = 0.

Unlike the proximal bundle variant, in the level family the dual variables of problem (13) can be unbounded.
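With the Euclidean stability function and a single cut, the projection defining the level iterate has a closed form; the following Python sketch (illustrative names, no set X) projects the center onto one cut's level halfspace:

```python
# Toy sketch of the level subproblem with the Euclidean stability
# function: project the center onto the single-cut level set
# {x : f0 + <g, x - x0> <= f_lev} (closed form for one halfspace).
def project_on_level_halfspace(center, cut, f_lev):
    fx, g, x0 = cut
    # cut value at the center and squared norm of the cut gradient
    val = fx + sum(gi * (ci - xi) for gi, ci, xi in zip(g, center, x0))
    if val <= f_lev:                    # center already in the level set
        return list(center)
    nrm2 = sum(gi * gi for gi in g)
    step = (val - f_lev) / nrm2
    return [ci - step * gi for ci, gi in zip(center, g)]

# Cut of f(x) = x1 + x2 at the origin, level 0, center (1, 1):
x_new = project_on_level_halfspace((1.0, 1.0), (0.0, (1.0, 1.0), (0.0, 0.0)), 0.0)
assert max(abs(v) for v in x_new) < 1e-12   # projection lands at the origin
```

With several cuts (and the polyhedron X) the projection is no longer explicit, which is why the level subproblem is a QP in general.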

Stability Center and Optimality Certificate

The rule to update the stability center is more flexible than in the proximal family. For exact oracles, the initial article [30] sets x̂k = xk for all k and, hence, does not have the restricted memory feature: with this approach compression is not possible. The proximal level variant [26] updates the center using the proximal bundle method rule (9) and, as such, works with restricted memory; similarly for [8] and [35]. Finally, [5] keeps the stability center fixed along the whole iterative process by setting x̂k = x̂1 for all k.

The optimality certificate checks the gap between the upper and lower bounds that also define the level parameter,

    fklev := fklow + γ (fkup − fklow),

for some coefficient γ ∈ (0, 1). For polyhedral feasible sets X, the computation of the lower bound fklow amounts to solving an LP; for more economical alternative definitions of the lower bound, we refer to [43].

When the feasible set X is compact, the stopping test in Step 2 of the Algorithmic Pattern 1 checks if the optimality gap fkup − fklow is below some tolerance. For the unbounded setting, the stopping test of the proximal bundle method must be considered as well: stop either when the optimality gap is sufficiently small, or when both |Gk + Nk| and ɸk are small.
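The level-parameter update and the compact-case stopping test amount to two lines of code; the rule f_lev = f_low + γ (f_up − f_low), γ ∈ (0, 1), is one common choice and is assumed here for illustration:

```python
# Sketch of the level-parameter update and the compact-case stopping
# test (assumed rule f_lev = f_low + gamma*(f_up - f_low)).
def level_parameter(f_up, f_low, gamma=0.5):
    return f_low + gamma * (f_up - f_low)

def gap_stop(f_up, f_low, tol=1e-6):
    return f_up - f_low <= tol          # optimality gap below tolerance

assert level_parameter(3.0, 1.0) == 2.0
assert gap_stop(1.0 + 1e-9, 1.0) and not gap_stop(3.0, 1.0)
```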

Impact of Inexactness

It was shown in [41] that if the feasible set is compact, no additional procedure is needed to cope with inexactness from the oracle. The non-compact setting is more intricate. Basically, the level parameter remains fixed when the oracle error is deemed too large, using the test (12) to detect large "noise". For more information on how to deal with oracle errors in level bundle methods when the feasible set in (1a) is unbounded, see [35].

3.3 Doubly Stabilized Bundle Methods

It is often observed empirically that proximal bundle methods generally perform better than the level variants when X = ℜn in problem (1a). On the other hand, the situation is reversed when the feasible set is a compact polyhedron. The doubly stabilized bundle method [46] combines in a single algorithm the main features of the proximal and level methods, in an effort to exploit the best features of both variants.

A possible parameter set in this variant is

Subproblem Definition

Denoting by i(·) the indicator function of the level set (14), the objective function of subproblem (6) is given by

As long as the level set Xklev is nonempty, the new iterate is given by the following particularization of subproblem (6):

which can be rewritten as

Lemma 2 in [46] shows that the solution xk+1 of the QP (15) solves either (7) or (13) (when ω(x) = 〈x, x〉 in the latter problem), so the doubly stabilized variant indeed combines the proximal and level methods. The numerical experiments reported in [46] show a gain in performance, likely due to the increased versatility of this variant.
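To make the combination concrete, here is a one-dimensional numerical sketch of the doubly stabilized subproblem, minimizing the cutting-plane model plus a proximal term subject to the level constraint. The toy data and the use of `scipy.optimize.minimize` are illustrative assumptions, not the implementation of [46]:

```python
import numpy as np
from scipy.optimize import minimize

# Toy bundle for f(x) = |x|: linearizations f(x_j) + g_j (x - x_j).
bundle = [(-1.0, 1.0, -1.0),   # (x_j, f(x_j), g_j)
          ( 2.0, 2.0,  1.0)]

def model(x):
    """Cutting-plane model: max over the stored linearizations."""
    return max(fj + gj * (x - xj) for xj, fj, gj in bundle)

xhat, t, f_lev = 2.0, 1.0, 0.3   # center, prox-stepsize, level parameter

# Doubly stabilized subproblem:
#   min  model(x) + (1/2t) |x - xhat|^2   s.t.  model(x) <= f_lev
res = minimize(lambda z: model(z[0]) + (z[0] - xhat) ** 2 / (2 * t),
               x0=[1.0],
               constraints=[{"type": "ineq",
                             "fun": lambda z: f_lev - model(z[0])}])
x_next = res.x[0]
# Here the pure proximal step would be x = 1, but the level constraint
# model(x) <= 0.3 is active, so x_next is (approximately) 0.3.
```

Depending on the data, the solution is either the proximal step (level constraint inactive, µ = 1) or a "level step" on the boundary of the level set, mirroring the dichotomy established in [46].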

The unique QP solution is characterized as follows; see [46, Prop. 2.1]:

The multipliers αk ≥ 0 and µk ≥ 1 satisfy the following relations for all j ∈ Jk:

As in the previous methods, the αkj-multipliers can be employed to select active linearizations. With this variant, the compression mechanism is also applicable.

Stability Center and Optimality Certificate

The rule to update the stability center is (9), as in the proximal bundle method. The stopping test combines the proximal and level criteria: the algorithm stops either when both |Gk + Nk| and ɸk are small, or when the optimality gap is close to zero.

An additional bonus is that the µk-multipliers can be used to update the prox-stepsize. Specifically, in view of the optimality conditions above, given some γ ∈ (0,1), the rule

  • sets tk+1 = γ tk in case of a null step, and tk+1 = µk tk if xk+1 is a serious iterate.
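A minimal sketch of this rule (function and variable names are ours):

```python
def update_prox_stepsize(t, mu, serious_step, gamma=0.5):
    """Prox-stepsize update driven by the level multiplier mu >= 1:
    shrink by gamma in (0,1) at null steps, expand by mu at serious steps."""
    return mu * t if serious_step else gamma * t

print(update_prox_stepsize(1.0, mu=2.0, serious_step=True))   # 2.0
print(update_prox_stepsize(1.0, mu=2.0, serious_step=False))  # 0.5
```

Since µk ≥ 1 always, the stepsize never shrinks at serious steps, in line with the convergence requirements discussed in Section 6.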

Impact of Inexactness

As shown in [46], a distinctive feature of this variant is that it can be employed with both exact and inexact oracles without any changes, as long as the last "proximal step" linearization is kept in the information bundle.

Having laid down the background for problems with simple constraints, we now turn our attention to the constrained problem (1b).

4 CONSTRAINED SETTING

In spite of having only one scalar constraint, formulation (1b) is general and covers problems with m scalar convex constraints hj(x) ≤ 0, j = 1,...,m, simply by taking
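In practice, a single constraint oracle can be built from the m individual ones by returning the max value together with a subgradient of an active constraint (a generic sketch; the constraints below are illustrative):

```python
def aggregate_constraint(x, constraints):
    """h(x) = max_j hj(x); a subgradient of any active hj is a
    subgradient of h at x (the max of convex functions is convex)."""
    vals = [(hj(x), gj(x)) for hj, gj in constraints]
    return max(vals, key=lambda vg: vg[0])

# Illustrative data: h1(x) = x - 1, h2(x) = -2x, with their gradients.
constraints = [(lambda x: x - 1.0, lambda x: 1.0),
               (lambda x: -2.0 * x, lambda x: -2.0)]
h_val, h_sub = aggregate_constraint(2.0, constraints)
# h(2) = max(1, -4) = 1, attained by h1, so the returned subgradient is 1.
```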

A key definition in NCO is the so-called improvement function, a merit function that transforms (1b) into an unconstrained problem. The potential interest of such a reformulation is that it allows algorithms to work without forcing feasibility. Indeed, constrained bundle methods such as those in [36] and [21, Ch. 5] can only work with feasible points. Although such a setting can be useful in some cases, there are also applications for which computing one feasible point is as difficult as solving the optimization problem itself (an example is the hydroreservoir management problem in [3, 2] mentioned in Section 5 below). For such applications, infeasible methods are of foremost importance. Using exact penalty objective functions as in [20, 22] is one possibility, keeping in mind that estimating a suitable value of the penalty parameter is sometimes a delicate task. In what follows we focus on methods developed under a different paradigm, making use of improvement functions. For another alternative, we refer to the filter method in [18].

4.1 Improvement Function

When the optimal value υopt of problem (1b) is known, a direct reformulation of (1b) consists in minimizing the function max{ƒ(x) − υopt, h(x)} over the set X. Since the optimal value is often not available, one instead considers approximations of the form

which will in turn be modeled by cutting-plane functions.
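A direct sketch of the improvement function in (16), with hypothetical targets `tau1` and `tau2`:

```python
def improvement(f_x, h_x, tau1, tau2=0.0):
    """Improvement function F_k(x) = max{f(x) - tau1, h(x) - tau2}:
    nonpositive only when x both meets the optimality target and is
    (target-)feasible."""
    return max(f_x - tau1, h_x - tau2)

# With tau1 = v_opt, a feasible minimizer gives a nonpositive value:
print(improvement(f_x=1.0, h_x=-0.5, tau1=1.0))   # 0.0
print(improvement(f_x=1.5, h_x=-0.5, tau1=1.0))   # 0.5 (not yet optimal)
```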

The improvement function (16) finds a compromise between optimality and feasibility, represented by the first and second terms in the max-operation, respectively. The target choice depends on the method: [30] takes τk1 := ƒklow, a lower bound for υopt, and sets τk2 := 0; in [23], the first target is the sum of the objective function value at the current stability center and a weighted measure of its feasibility; other rules are considered in [23, 51, 3].

As for the unconstrained case, we review different bundle variants suited to (1b), starting with the proximal family.

4.2 Constrained Proximal Bundle Methods

The method introduced in [51] takes the current serious-step function value as the first target, sets τk2 = 0, and applies an unconstrained proximal bundle method to the function

When the next serious iterate is generated, the first target is replaced accordingly, making the necessary changes to the bundle. These changes are not straightforward because, as explained in [51], ensuring descent for the improvement function above does not necessarily entail descent for the objective in (1b).

The parameter set characterizing this family of methods is

Subproblem Definition

In this variant, the objective function in (6) is

yielding a QP problem whose unique solution is as in (8), with the model value ƒ̌k(xk+1) therein replaced by

Stability Center and Optimality Certificate

The stability center, which also defines the first target, is changed when the current improvement function is sufficiently reduced using a test akin to (9), i.e.,

Because the improvement function is a max-function, the test requires progress on both the optimality and feasibility criteria. Note also that, as Fk(x̂k) = max{0, h(x̂k)}, when the center is infeasible it is possible that ƒ(xk+1) > ƒ(x̂k). So, outside the feasible region, the method is not monotone with respect to ƒ, but it is monotone with respect to h, thus decreasing infeasibility. The situation is reversed once a feasible stability center is found.

The non-straightforward changes in the bundle alluded to above refer to the fact that, when the stability center changes, the bundle that was used to model the function Fk must be adapted to model the new improvement function Fk+1(x) = max{ƒ(x) − ƒ(xk+1), h(x)}. We refer to Section 3 in [51] for details.

Approximate optimality is declared when the center satisfies the inclusion Gk + Nk ∈ ∂ϵk Fk(x̂k) with both |Gk + Nk| and ϵk sufficiently small. When calculations are exact, or for inexact oracles with bounded X, this test boils down to the one using ɸk from (10).

Impact of Inexactness

Suppose now that the constraint h in (1b) is hard to compute. The ideas above were generalized to handle inexact oracles in [3], using a test similar to (12), mutatis mutandis. The extension addresses two important issues, described below.

  • The targets in the improvement function are more general:

    for parameters ρk ∈ [0, ∞] and σk ∈ [0, 1] such that 1 − σk + ρk is bounded away from zero. The function Fk from (16) is recovered by taking σk = ρk = 0 at all iterations, a setting that may slow down convergence close to the boundary of the feasible set, due to zigzagging. Taking ρk > 0 prevents oscillation close to a solution and may be beneficial for the speed of convergence. Note also that the target choice allows for driving ρk to infinity, as in an exact penalty approach.

  • Rather than requiring descent for the improvement function, the stability center is updated when either the center is feasible and the new point is feasible and gives descent for the objective function, or the center is infeasible and the new point reduces infeasibility. This weaker condition is likely to declare serious steps more often.

4.3 Constrained Bundle Methods in the Level Family

The first constrained level method able to deal with inexactness was proposed in [12], assuming the oracle is of the lower type with vanishing errors, that is, ηhk → 0. Further improvements, incorporating the on-demand accuracy setting from [43], were considered in [13]. The basic idea in these articles is to apply some unconstrained level bundle method to solve the linearly constrained problem minx∈X Fk(x), where

Since the algorithms in [30, 26, 12, 13] make use of certain dual variables to update the level parameter for the cutting-plane model of Fk, below we group them in the family of primal-dual level bundle methods. In contrast, the methods proposed in [2] do not set the objective function in the subproblems to be Fk, but rather use the improvement function as a measure to certify optimality. The authors justify this choice as being more natural when dealing with inexactness from the oracle. The method was extended in [42] to solve convex mixed-integer nonlinear programming problems.

For both types of methods, the parameter set is

4.3.1 Primal-Dual Constrained Level Bundle Methods

The algorithms in [30, 12] define the objective function in (6) as

In these primal-dual variants, the level set Xklev depends on a dual variable αk ∈ (0,1), computed by solving a one-dimensional optimization problem related to the max-operation defining Fk, so that αk ƒ̌k(x) + (1 − αk) ȟk(x) is a cutting-plane approximation of the improvement function. Accordingly, the iterate xk+1 solves the QP subproblem:

The level parameter ƒklev is updated by solving an LP, as in the unconstrained variant.

Stability Center and Optimality Certificate

The rule proposed in [30] to update the stability center is simple: just take x̂k as the current iterate xk. Differently, [26] updates x̂k only after observing enough progress in the improvement function Fk (as for the proximal variants). The new stability center can be any point in the information bundle (for example, a feasible iterate with lowest objective function value). The choice x̂k = x1 for all k is also possible for the method proposed in [26].

The optimality certificate is given by the improvement function: the algorithm stops with a δTol-solution when minj≤k Fk(xj) ≤ δTol.

Impact of Inexactness

Publication [12] works with lower oracles, so η+ ≡ 0, and the algorithm assumes the oracle error ηhk is driven to zero along the iterative process: eventually the method coincides with the variant for exact oracles.

4.3.2 Constrained Level Bundle Methods

The method proposed in [2] defines the objective function in (6) as

where the level set Xklev is the intersection of the level set for the objective model k with an outer approximation of the constraint set. Accordingly, the next iterate solves the QP subproblem:

Lemma 2 in [2] ensures that ƒklev is a lower bound for the optimal value υopt of (1b) whenever the feasible set of QP (17) is empty.

The rule given in [2] to update the lower bound and the level parameter does not rely on dual variables, and is much simpler:

Unlike the primal-dual family, no additional optimization subproblem needs to be solved at each iteration (neither to define the dual variable αk nor to obtain ƒklow). In this method, both selection and compression of the bundle are possible.
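One common instance of such a rule can be sketched as follows; the specific form, with γ ∈ (0,1), is an assumption patterned on standard level updates, not a verbatim transcription of [2]:

```python
def refresh_bounds(f_up, f_low, f_lev, level_set_empty, gamma=0.5):
    """If the level set X_lev^k was detected empty (the QP is infeasible),
    the current level value becomes a certified lower bound; the level is
    then re-set between the updated bounds."""
    if level_set_empty:
        f_low = f_lev                       # f_lev is a valid lower bound
    f_lev = f_low + gamma * (f_up - f_low)  # squeeze the level into the gap
    return f_low, f_lev

print(refresh_bounds(1.0, 0.0, 0.5, level_set_empty=True))   # (0.5, 0.75)
```

Note that the infeasibility of the QP is precisely the event that certifies the new lower bound, so no extra LP is needed.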

Stability Center and Optimality Certificate

Similarly to the unconstrained level bundle method given in [43], a new stability center x̂k is chosen in the information bundle whenever Xklev is empty. The stopping test is identical to the one in [30, 26, 12], checking that minj≤k Fk(xj) is sufficiently small.

Impact of Inexactness

The algorithm is robust to noise, in the sense that no additional test is required to cope with inexactness. With an exact oracle, the improvement function is always nonnegative. Negative values of Fk(·) may arise when the oracle is inexact, in which case the method terminates with an (ηh + η+ + δTol)-solution to problem (1b). For lower oracles, driving to zero the error ηhk at certain iterates, named substantial, ensures that asymptotically an exact solution is found; see [2].

5 ONE PROBLEM, TWO FORMULATIONS

When dealing with real-life problems, modeling is an issue as important as solving the resulting optimization problem. As an illustration, we now consider an example from hydroreservoir management and represent the same physical problem in two different manners, yielding either a problem of the form (1a) or of the form (1b).

Figure 1 represents a set of power plants along a hydrological basin in which the reservoirs uphill have a volume subject to random streamflows due to melted snow. In the figure, there are N reservoirs, each one with volume Vi(t) for i = 1,...,N and a certain time step t, out of which the first r are subject to uncertain inflows. For the full model details, we refer to [1].

Figure 1
Schematic Representation of a Hydrovalley.

Since power plants are cascaded along the same basin, water released uphill fills the reservoirs downhill. The increased volume allows the plants downhill either to produce hydropower by directing water through turbines or to release water that is pumped to refill uphill reservoirs. These operations make the plants interdependent and, as such, the best practice is to manage them together rather than individually.

In particular, an important concern is to keep the reservoirs above some min-zone constraints, i.e., lower bounds set by the system operator for navigation or irrigation purposes. Since the reservoir volumes are uncertain, so are the min-zones values, and the optimization problem that defines the remuneration for each power plant has a stochastic nature. A possibility to define a turbine throughput and pumping strategy that is suitable for the whole hydrovalley is to endow the optimization problem with probabilistic constraints.

Suppose a (random) unit price π, a (random) lower bound ℓ, and a probability level p ∈ (0, 1) are given. Letting x ∈ ℝn denote the abstract vector gathering the decision variables (water released through the turbines, pumping, etc.), the hydroreservoir management problem has the form

When streamflows are represented by some autoregressive model with Gaussian innovations, the joint chance constraints above define a convex feasible set and

is a convex function, whose differentiability depends on the covariance matrix of the Gaussian noises. The oracle computing values for h and a (sub)gradient needs to perform numerical integration in high dimensions. More precisely, since the horizon typically has T = 48 or 96 time steps, the vector x, formed by subvectors for each plant, each of length T, has a dimension of several hundred.
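To illustrate why this oracle is inexact, a Monte Carlo estimate of the joint probability already returns only a noisy value. The sketch below uses independent Gaussian min-zone bounds and fixed volumes; all data are hypothetical toy choices, and the constraint value is a simplified placeholder for the log-form used in practice:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 3 reservoirs, Gaussian min-zone bounds ell; the decision x
# determines the volumes V (here fixed for illustration).
ell_samples = rng.normal(size=(100_000, 3))   # samples of the random bound
V = np.array([2.0, 2.0, 2.0])                 # reservoir volumes
p = 0.8                                       # required probability level

def prob_feasible(volumes):
    """Monte Carlo estimate of P(ell <= V), componentwise (inexact!)."""
    return np.all(ell_samples <= volumes, axis=1).mean()

# Chance constraint holds iff P(ell <= V(x)) >= p; here the estimated
# probability is close to Phi(2)^3, about 0.93, so it is satisfied.
h_val = p - prob_feasible(V)
```

The sampling error in `prob_feasible` is exactly the kind of oracle inexactness the methods of this survey are designed to absorb.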

We are therefore in our initial setting: the hydroreservoir management problem fits structure (1b) with a linear ƒ (·) := 〈π, ·〉, easy to evaluate, and a difficult constraint of the form (18).

To see how to cast the hydroreservoir problem in the format (1a), we follow [52, Chapter 4.3.2] and consider the set of p-efficient points, a generalization of the p-quantile:

where F(ɀ) = ℙ(ℓ ≤ ɀ) is the distribution function of the min-zone bound ℓ.

As explained in [52, Chapter 4.3.3], the equivalent reformulation of the hydroreservoir problem

can be dualized by introducing Lagrange multipliers 0 ≤ u ∈ ℝn for the constraint x ≥ ɀ. The resulting dual problem

fits setting (1a). Oracles estimating the value and gradient of the hard function h, involving integer programming, are considered in [10].

We have thus come full circle: starting with a problem as in (1b), we found an equivalent problem in the form (1a). The decision of which formulation to choose to eventually solve the hydroreservoir management problem will depend on the availability of the oracle and of NCO solvers. For instance, designing an oracle for (18) requires a good code for numerical integration, while the hard function depending on p-efficient points calls for combinatorial optimization techniques. A final, no less important, consideration is to analyze what the desired output of the problem is (only the optimal value, or a solution, or a feasible point, etc.).

6 FINAL COMMENTS

We gave a panorama of state-of-the-art bundle algorithms that were developed in the last 10-15 years to deal with inaccurate oracle information. Applications in energy, finance, transportation or planning often require solving with high precision and in a reliable manner large-scale problems for which the oracle information is hard, if not impossible, to obtain. The new generation of bundle methods reviewed in this work constitutes an important set of optimization tools for such real-life problems.

Our presentation covers methods for unconstrained, linearly constrained and nonlinearly constrained convex nonsmooth optimization problems. As made clear by the hydroreservoir example in Section 5, the choice of which specific formulation and bundle variant to employ depends largely on the structure of the problem to be solved. For instance, for large-scale unconstrained optimization problems the proximal bundle method seems more suitable than the level one, since each iterate can be determined by maximizing a quadratic function over a simplex (this is a QP subproblem, dual to (7)). On the other hand, if the feasible set is bounded, the level bundle variant might be preferable, because having a lower bound allows the method to build an optimality gap, a stopping criterion that is in general easier to satisfy. The same holds if the optimal value of the problem is known in advance. The doubly stabilized variant can be a good solver for the repeated solution of different types of problems, since it provides an automatic mechanism to combine the advantages of the proximal and level methods in the unconstrained or linearly constrained setting.

Notwithstanding, each bundle variant has parameters to be dealt with, such as the prox-stepsize or the level parameter. The convergence theory for each method relies on the proper updating of such parameters. For the proximal family, for instance, the stepsize cannot increase at null steps and (as for subgradient methods) must form a divergent series. For the level family, ƒ klev is often a convex combination of the current upper and lower bounds. Depending on the updating rule for the lower bound, the level parameter can make the subproblem infeasible. As long as the subproblem is a QP, this is not a major concern: most of modern QP solvers return with an exit flag informing infeasibility. In this case, the lower bound and level parameter are easily updated, see Subsection 4.3.2. Furthermore, to prevent the level set from being empty, the lower bound can be computed by solving an LP at each iteration, as discussed in Subsection 3.2.

An important point to note is that for lower and on-demand accuracy oracles (not for upper oracles), inexact bundle methods can eventually find an exact solution even if the oracle information is inexact most of the time: the solution quality depends only on how accurate the oracle is at serious steps.

Regarding computational packages, several of the variants described above are freely available for academic use upon request to the authors, including [15] and [44].

We finish by commenting on several future directions of research.

For the proximal family, [4] makes a first incursion into new criteria to define stability centers. Differently from (9), the serious-step sequence may have non-monotone function values. The idea is to accept more serious steps and to avoid the algorithm stalling at null steps. It would be interesting to see whether these new criteria can be related to crucial objects in the level family, such as the optimality gap.

Concerning research impacting applications, we can mention four topics. First, asynchronous parallelization as in [14] deserves further investigation. Second, the issue of primal recovery when solving a problem via Lagrangian relaxation: it remains to understand how the primal solutions are polluted by "noise" when the Lagrangian subproblems are solved inexactly and some inexact bundle algorithm is used. Third, regarding mixed-integer problems, the extension of bundle methods in [42] left open some issues on complexity and restricted memory. Fourth, when in finance or economics there are multiple solutions, a special one can be singled out by, for instance, looking for the solution closest to some reference point. Exploiting the structure of such nonsmooth minimal-norm-solution problems could lead to the design of new specialized algorithms, based on the bundle methodology.

Finally, along the lines of the work [38] for proximal bundle methods, it would be interesting to incorporate VU-decomposition into the level family, to devise superlinearly convergent level bundle methods.

ACKNOWLEDGMENTS

We are grateful to Robert Mifflin for his careful reading and useful remarks. The second author is partially supported by Grants CNPq 303840/2011-0, AFOSR FA9950-11-1-0139, as well as by PRONEX-Optimization and FAPERJ.

REFERENCES

  • [1]
    VAN ACKOOIJ W, HENRION R, MÖLLER A & ZORGATI R. 2013. Joint chance constrained programming for hydro reservoir management. Optimization and Engineering, accepted for publication.
  • [2]
    VAN ACKOOIJ W & DE OLIVEIRA W. 2014. Level bundle methods for constrained convex optimization with various oracles. Computational Optimization and Applications, 57:555-597.
  • [3]
    VAN ACKOOIJ W & SAGASTIZÁBAL C. 2014. Constrained bundle methods for upper inexact oracles with application to joint chance constrained energy problems. SIAM Journal on Optimization, 24(2):733-765. Doi=10.1137/120903099.
    » https://doi.org/10.1137/120903099
  • [4]
    ASTORINO A, FRANGIONI A, FUDULI A & GORGONE E. 2013. A nonmonotone proximal bundle method with (potentially) continuous step decisions. SIAM Journal on Optimization 23:1784-1809.
  • [5]
    BELLO CRUZ JY & DE OLIVEIRA W. 2014. Level bundle-like algorithms for convex optimization. Journal of Global Optimization, 59(4):787-809. Doi=10.1007/s10898-013-0096-4.
    » https://doi.org/10.1007/s10898-013-0096-4
  • [6]
    BEN-TAL A & NEMIROVSKI A. 2005. Noneuclidean restricted memory level method for large-scale convex optimization. Math. Program., 102:407-456.
  • [7]
    BORGHETTI A, FRANGIONI A, LACALANDRA F & NUCCI C. 2003. Lagrangian heuristics based on disaggregated bundle methods for hydrothermal unit commitment. IEEE Transactions on Power Systems, 18:313-323.
  • [8]
    BRANNLUND U, KIWIEL KC & LINDBERG PO. 1995. A descent proximal level bundle method for convex nondifferentiable optimization. Operations Research Letters, 17:121-126.
  • [9]
CHENEY E & GOLDSTEIN A. 1959. Newton's method for convex programming and Tchebycheff approximation. Numerische Mathematik, 1:253-268.
  • [10]
DENTCHEVA D & MARTINEZ G. 2013. Regularization methods for optimization problems with probabilistic constraints. Math. Program., 138:223-251.
  • [11]
    EMIEL G & SAGASTIZÁBAL C. 2010. Incremental-like bundle methods with application to energy planning. Computational Optimization and Applications 46:305-332.
  • [12]
    FÁBIÁN C. 2000. Bundle-type methods for inexact data. Central European Journal of Operations Research, 8 (special issue, T. Csendes and T. Rapcsk, eds.), pp. 35-55.
  • [13]
    FÁBIÁN C. 2013. Computational aspects of risk-averse optimisation in two-stage stochastic models, tech. rep., Institute of Informatics, Kecskemét College, Hungary. Available at: http://www.optimization-online.org/DB_HTML/2012/08/3574.html.
    » http://www.optimization-online.org/DB_HTML/2012/08/3574.html
  • [14]
    FISCHER F & HELMBERG C. 2014. A parallel bundle framework for asynchronous subspace optimization of nonsmooth convex functions. SIAM Journal on Optimization 24(2):795-822. Doi=10.1137/120865987.
    » https://doi.org/10.1137/120865987
  • [15]
    FRANGIONI A. 2002. Generalized bundle methods. SIAM Journal on Optimization 13:117-156.
  • [16]
    FRANGIONI A & GENDRON B. 2013. A stabilized structured Dantzig Wolfe decomposition method. Math. Program. 140:45-76.
  • [17]
    HINTERMÜLLER M. 2001. A proximal bundle method based on approximate subgradients. Computational Optimization and Applications 20:245-266.
  • [18]
    KARAS E, RIBEIRO A, SAGASTIZÁBAL C & SOLODOV M. 2009. A bundle-filter method for nonsmooth convex constrained optimization. Math. Program. 116:297-320.
  • [19]
    KELLEY J. 1960. The cutting-plane method for solving convex programs. Journal of the Society for Industrial and Applied Mathematics, 8:703-712.
  • [20]
    KIWIEL K. 1985. An exact penalty function algorithm for nonsmooth convex constrained minimization problems. IMA J. Numer. Anal., 5:111-119.
  • [21]
KIWIEL K. 1985. Methods of descent for nondifferentiable optimization. Springer-Verlag, Berlin.
  • [22]
KIWIEL K. 1991. Exact penalty functions in proximal bundle methods for constrained convex nondifferentiable minimization. Math. Programming, 52:285-302.
  • [23]
    KIWIEL K. 2008. A method of centers with approximate subgradient linearizations for nonsmooth convex optimization. SIAM Journal on Optimization 18:1467-1489.
  • [24]
    KIWIEL K & LEMARÉCHAL C. 2009. An inexact bundle variant suited to column generation. Math. Program. 118:177-206.
  • [25]
KIWIEL KC. 1995. Approximations in proximal bundle methods and decomposition of convex programs. Journal of Optimization Theory and Applications, 84:529-548. Doi=10.1007/BF02191984.
    » https://doi.org/10.1007/BF02191984
  • [26]
KIWIEL KC. 1995. Proximal level bundle methods for convex nondifferentiable optimization, saddle-point problems and variational inequalities. Math. Program., 69:89-109.
  • [27]
KIWIEL KC. 2006. A proximal bundle method with approximate subgradient linearizations. SIAM Journal on Optimization, 16:1007-1023.
  • [28]
LAN G. 2013. Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization. Math. Program., pp. 1-45.
  • [29]
    LEMARÉCHAL C. 1975. An extension of Davidon methods to nondifferentiable problems. Math. Programming Stud., 3:95-109.
  • [30]
LEMARÉCHAL C, NEMIROVSKII A & NESTEROV Y. 1995. New variants of bundle methods. Math. Program., 69:111-147.
  • [31]
    LEMARÉCHAL C, OUOROU A & PETROU G. 2011. A bundle-type algorithm for routing in telecommunication data networks. Comput. Optim. Appl., 44:385-409.
  • [32]
LEMARÉCHAL C & SAGASTIZÁBAL C. 1994. An approach to variable metric bundle methods. Lecture Notes in Control and Information Science, 197:144-162.
  • [33]
LEMARÉCHAL C & SAGASTIZÁBAL C. 1997. Variable metric bundle methods: From conceptual to implementable forms. Math. Program., 76:393-410. Doi=10.1007/BF02614390.
    » https://doi.org/10.1007/BF02614390
  • [34]
    LIN H. 2012. An inexact spectral bundle method for convex quadratic semidefinite programming. Computational Optimization and Applications 53:45-89.
  • [35]
    MALICK J, DE OLIVEIRA W & ZAOURAR S. 2013. Nonsmooth optimization using uncontrolled inexact information, tech. rep. Available at: http://www.optimization-online.org/DB_HTML/2013/05/3892.html.
    » http://www.optimization-online.org/DB_HTML/2013/05/3892.html
  • [36]
MIFFLIN R. 1977. An algorithm for constrained optimization with semismooth functions. Mathematics of Operations Research, 2:191-207.
  • [37]
    MIFFLIN R & SAGASTIZÁBAL C. 2000. On VU -theory for functions with primal-dual gradient structure. Siam Journal on Optimization, 11:547-571.
  • [38]
MIFFLIN R & SAGASTIZÁBAL C. 2005. A VU-algorithm for convex minimization. Math. Program., 104:583-608.
  • [39]
    MIFFLIN R & SAGASTIZÁBAL C. 2012. Optimization Stories, vol. Extra Volume ISMP 2012, ed. by M. Grötschel, DOCUMENTA MATHEMATICA, ch. A Science Fiction Story in Nonsmooth Optimization Originating at IIASA, p. 460.
  • [40]
    MOREAU J. 1965. Proximité et dualité dans un espace Hilbertien. Bull. Soc. Math. France, 93:273-299.
  • [41]
DE OLIVEIRA W. 2011. Inexact bundle methods for stochastic optimization. PhD thesis, Federal University of Rio de Janeiro, COPPE/UFRJ, January 2011. (In Portuguese) Available at: http://www.cos.ufrj.br/index.php?option=com_publicacao&task=visualizar&cid%5B0%5D=2192.
» http://www.cos.ufrj.br/index.php?option=com_publicacao&task=visualizar&cid%5B0%5D=2192
  • [42]
DE OLIVEIRA W. 2014. Regularized nonsmooth methods for solving convex MINLP problems, tech. rep. Available at: www.oliveira.mat.br.
    » www.oliveira.mat.br
  • [43]
    DE OLIVEIRA W & SAGASTIZÁBAL C. 2014. Level bundle methods for oracles with on-demand accuracy. Optimization Methods and Software, 29(6):1180-1209. Doi=10.1080/10556788.2013.871282.
    » https://doi.org/10.1080/10556788.2013.871282
  • [44]
    DE OLIVEIRA W, SAGASTIZÁBAL C & LEMARÉCHAL C. 2014. Convex proximal bundle methods in depth: a unified analysis for inexact oracles. Mathematical Programming, pp. 1-37. Springer Berlin Heidelberg. Doi=10.1007/s10107-014-0809-6.
    » https://doi.org/10.1007/s10107-014-0809-6
  • [45]
DE OLIVEIRA W, SAGASTIZÁBAL C & SCHEIMBERG S. 2011. Inexact bundle methods for two-stage stochastic programming. SIAM Journal on Optimization, 21:517-544.
  • [46]
    DE OLIVEIRA W & SOLODOV M. 2013. A doubly stabilized bundle method for nonsmooth convex optimization, tech. rep. Available at: http://www.optimization-online.org/DB_HTML/2013/04/3828.html.
    » http://www.optimization-online.org/DB_HTML/2013/04/3828.html
  • [47]
    RICHTÁRIK P. 2012. Approximate level method for nonsmooth convex minimization. Journal of Optimization Theory and Applications 152:334-350.
  • [48]
    SAGASTIZÁBAL C. 2011. Nonsmooth optimization: Thinking outside of the black box. SIAG/OPT Views-and-News, 220:2-10.
  • [49]
SAGASTIZÁBAL C. 2012. Divide to conquer: decomposition methods for energy optimization. Math. Program., 134:187-222.
  • [50]
    SAGASTIZÁBAL C. 2013. Composite proximal bundle method. Mathematical Programming 140:189-233.
  • [51]
    SAGASTIZÁBAL C & SOLODOV M. 2005. An infeasible bundle method for nonsmooth convex constrained optimization without a penalty function or a filter. SIAM Journal on Optimization 16:146-169.
  • [52]
SHAPIRO A, DENTCHEVA D & RUSZCZYŃSKI A. 2009. Lectures on Stochastic Programming: Modeling and Theory. MPS-SIAM Series on Optimization, SIAM - Society for Industrial and Applied Mathematics and Mathematical Programming Society, Philadelphia.
  • [53]
    SOLODOV MV. 2003. On approximations with finite precision in bundle methods for nonsmooth optimization. Journal of Optimization Theory and Applications 119:151-165.
  • [54]
    WOLFE P. 1975. A method of conjugate subgradients for minimizing nondifferentiable functions. Math. Programming Stud., 3:145-173.

Publication Dates

  • Publication in this collection
    Sep-Dec 2014

History

  • Received
    30 Sept 2013
  • Accepted
    12 Dec 2013
Sociedade Brasileira de Pesquisa Operacional, Rio de Janeiro, RJ, Brazil