
About this Book

Stochastic Recursive Algorithms for Optimization presents algorithms for constrained and unconstrained optimization and for reinforcement learning. Efficient perturbation approaches form a thread unifying all the algorithms considered. Simultaneous perturbation stochastic approximation and smoothed functional estimators for gradient- and Hessian-based methods are presented. These algorithms:

• are easily implemented;

• do not require an explicit system model; and

• work with real or simulated data.

Chapters on their application in service systems, vehicular traffic control and communications networks illustrate this point. The book is self-contained with necessary mathematical results placed in an appendix.

The text provides easy-to-use, off-the-shelf algorithms that are given detailed mathematical treatment, so the material presented will be of significant interest to practitioners, academic researchers and graduate students alike. The breadth of applications makes the book appropriate for readers from similarly diverse backgrounds: workers in relevant areas of computer science, control engineering, management science, applied mathematics, industrial engineering and operations research will find the content of value.

Table of Contents

Frontmatter

Introduction to Stochastic Recursive Algorithms

Frontmatter

Introduction

Optimization methods play a central role in many engineering disciplines. In this chapter, we give a broad overview of the optimization settings and algorithms considered, including simultaneous perturbation approaches, and briefly summarize the later chapters.
S. Bhatnagar, H. Prasad, L. Prashanth

Deterministic Algorithms for Local Search

Most algorithms for stochastic optimization can be viewed as noisy versions of well-known incremental-update deterministic optimization algorithms. Hence, in this chapter we review some of the well-known algorithms for deterministic optimization. We shall study the noisy versions of these algorithms in later chapters.
S. Bhatnagar, H. Prasad, L. Prashanth
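
As a point of reference for the noisy schemes in later chapters, the following minimal Python sketch shows the incremental gradient-descent template in its deterministic form. The quadratic objective and fixed step size are illustrative assumptions, not taken from the book.

    import numpy as np

    def grad_J(theta):
        # Exact gradient of the illustrative objective J(theta) = ||theta - 1||^2.
        return 2.0 * (theta - 1.0)

    theta = np.zeros(5)
    for _ in range(200):
        # Incremental update with the exact (noise-free) gradient.
        theta = theta - 0.1 * grad_J(theta)

    print(theta)  # converges to the minimizer (1, ..., 1)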

Stochastic Approximation Algorithms

Stochastic approximation algorithms have been one of the main focus areas of research on solution methods for stochastic optimization problems. The Robbins-Monro algorithm [17] is a basic stochastic approximation scheme that has been found to be applicable in a variety of settings that involve finding the roots of a function under noisy observations. We first review in this chapter the Robbins-Monro algorithm and its convergence. In cases where one is interested in optimizing the steady-state system performance, i.e., the objective is a long-run average cost function, multi-timescale variants of the Robbins-Monro algorithm have been found useful. We also review multi-timescale stochastic approximation in this chapter since many of the schemes presented in the later chapters shall involve such algorithms.
S. Bhatnagar, H. Prasad, L. Prashanth
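
As a rough illustration of the Robbins-Monro scheme, the sketch below finds the root of a function that is observed only through noisy measurements. The linear test function, noise model and step sizes are illustrative assumptions, not taken from the book.

    import numpy as np

    rng = np.random.default_rng(0)

    def noisy_measurement(theta):
        # Noisy observation of f(theta) = theta - 2, whose root is theta* = 2.
        return (theta - 2.0) + rng.normal(scale=0.1)

    theta = 0.0
    for n in range(1, 5001):
        a_n = 1.0 / n  # step sizes with sum a_n = infinity, sum a_n^2 < infinity
        theta = theta - a_n * noisy_measurement(theta)

    print(theta)  # close to the root 2.0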

Gradient Estimation Schemes

Frontmatter

Kiefer-Wolfowitz Algorithm

In this chapter, we review the Finite Difference Stochastic Approximation (FDSA) algorithm, also known as the Kiefer-Wolfowitz (K-W) algorithm, and some of its variants for finding a local minimum of an objective function. The K-W scheme is a version of the Robbins-Monro stochastic approximation algorithm and incorporates balanced two-sided estimates of the gradient, using two objective function measurements for a scalar parameter. When the parameter is an N-dimensional vector, the algorithm requires 2N function measurements per iteration. A one-sided variant in the latter case requires N + 1 function measurements. We present the original K-W scheme, first for the case of a scalar parameter, and subsequently for a vector parameter of arbitrary dimension. Variants including the one-sided version are then presented. We only consider here the case when the objective function is a simple expectation over noisy cost samples and not when it has a long-run average form. The latter form of the cost objective would require multi-timescale stochastic approximation, the general case of which was discussed in Chapter 3. Stochastic algorithms for long-run average cost objectives will be considered in later chapters.
S. Bhatnagar, H. Prasad, L. Prashanth
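
The following sketch illustrates the balanced two-sided K-W estimate for an N-dimensional parameter, using 2N noisy function measurements per iteration. The quadratic objective, noise model and step-size/perturbation sequences are illustrative assumptions, not taken from the book.

    import numpy as np

    rng = np.random.default_rng(1)

    def noisy_cost(theta):
        # Noisy sample of J(theta) = ||theta - 1||^2; the objective is illustrative.
        return np.sum((theta - 1.0) ** 2) + rng.normal(scale=0.1)

    N = 5
    theta = np.zeros(N)
    for n in range(1, 2001):
        a_n = 1.0 / n              # step size
        delta_n = 1.0 / n ** 0.25  # perturbation width, decaying more slowly than a_n
        grad = np.empty(N)
        for i in range(N):
            e_i = np.zeros(N)
            e_i[i] = 1.0
            # Balanced two-sided estimate: 2N measurements per iteration in total.
            grad[i] = (noisy_cost(theta + delta_n * e_i)
                       - noisy_cost(theta - delta_n * e_i)) / (2.0 * delta_n)
        theta = theta - a_n * grad

    print(theta)  # close to the minimizer (1, ..., 1)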

Gradient Schemes with Simultaneous Perturbation Stochastic Approximation

In this chapter, we discuss an interesting class of gradient search algorithms which estimate the gradient by perturbing all parameter components simultaneously. These algorithms go by the name Simultaneous Perturbation Stochastic Approximation (SPSA) and require only one or two evaluations of the cost objective regardless of the parameter dimension. The original SPSA algorithm due to Spall [26] required two function measurements and used random perturbations with properties that are seen to be easily satisfied by independent, symmetric, zero-mean, ±1 valued, Bernoulli random variables. A one-measurement version of this algorithm was subsequently proposed [27] but did not perform as well as its two-measurement counterpart. In [8], certain deterministic constructions for the perturbation sequences were proposed. It was observed in particular that a construction based on Hadamard matrices considerably improves the performance of one-measurement SPSA over its random perturbation counterpart. We discuss gradient SPSA algorithms in this chapter, both for the case of expected cost as well as long-run average cost objectives.
S. Bhatnagar, H. Prasad, L. Prashanth
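
The sketch below illustrates two-measurement SPSA with independent +/-1 Bernoulli perturbations: all N gradient components are estimated from just two noisy cost measurements per iteration. The quadratic objective, noise model and step-size sequences are illustrative assumptions, not taken from the book.

    import numpy as np

    rng = np.random.default_rng(2)

    def noisy_cost(theta):
        # Noisy sample of J(theta) = ||theta - 1||^2; the objective is illustrative.
        return np.sum((theta - 1.0) ** 2) + rng.normal(scale=0.1)

    N = 10
    theta = np.zeros(N)
    for n in range(1, 2001):
        a_n = 1.0 / n
        delta_n = 1.0 / n ** 0.25
        # Independent, symmetric, zero-mean, +/-1 Bernoulli perturbations.
        Delta = rng.choice([-1.0, 1.0], size=N)
        y_plus = noisy_cost(theta + delta_n * Delta)
        y_minus = noisy_cost(theta - delta_n * Delta)
        # All N gradient components from only two measurements.
        grad_hat = (y_plus - y_minus) / (2.0 * delta_n * Delta)
        theta = theta - a_n * grad_hat

    print(theta)  # close to the minimizer (1, ..., 1)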

Smoothed Functional Gradient Schemes

We studied SPSA-based gradient estimation techniques in the previous chapter. Along similar lines, in this chapter we present smoothed functional (SF) estimators of the gradient. Like SPSA, SF perturbs all components of the parameter vector simultaneously; unlike SPSA, the perturbations are generated using a smoothing function that possesses certain properties. An alternative view of the SF approach is that the gradient is convolved with a smoothing function, which in turn could help in finding the global minimum of the smoothed objective. We discuss SF-based algorithms where the smoothing is done using Gaussian and Cauchy density functions. The regular SF algorithms require only one measurement of the objective function. We also provide two-measurement variants of all the algorithms presented.
S. Bhatnagar, H. Prasad, L. Prashanth
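
The sketch below illustrates the one-measurement Gaussian SF gradient estimate, where a single noisy cost measurement at a Gaussian-perturbed parameter drives each update. The quadratic objective, smoothing width and step sizes are illustrative assumptions, not taken from the book; the two-measurement variant reduces the variance of this estimator.

    import numpy as np

    rng = np.random.default_rng(3)

    def noisy_cost(theta):
        # Noisy sample of J(theta) = ||theta - 1||^2; the objective is illustrative.
        return np.sum((theta - 1.0) ** 2) + rng.normal(scale=0.1)

    N = 3
    beta = 0.5           # width of the Gaussian smoothing kernel
    theta = np.zeros(N)
    for n in range(1, 20001):
        a_n = 1.0 / (n + 100.0)       # small initial steps tame the estimator's variance
        eta = rng.standard_normal(N)  # Gaussian smoothing perturbation
        # One-measurement SF gradient estimate: (eta / beta) * J(theta + beta * eta).
        grad_hat = (eta / beta) * noisy_cost(theta + beta * eta)
        theta = theta - a_n * grad_hat

    print(theta)  # close to the minimizer (1, 1, 1)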

Hessian Estimation Schemes

Frontmatter

Newton-Based Simultaneous Perturbation Stochastic Approximation

In this chapter, we present four different Newton SPSA algorithms for the long-run average cost objective. The various Hessian estimates are derived using the random perturbation technique, with perturbation random variables that are zero-mean, bounded, symmetric, mutually independent and share a common distribution. These algorithms require four, three, two and one simulations, respectively, and are seen to be efficient in practice. Note that, although we discuss Newton SPSA algorithms here only for the long-run average cost setting, all the Hessian estimation schemes below can also be used in the expected cost setting.
S. Bhatnagar, H. Prasad, L. Prashanth
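
As a rough illustration of the Newton SPSA idea, the sketch below uses a four-simulation Hessian estimate built from two independent +/-1 Bernoulli perturbation vectors, averages it on a faster timescale, and takes a Newton step for a simple expected-cost objective. The exact estimates, timescales and projection operator used in the book differ; the quadratic objective, noise model and eigenvalue floor here are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(4)

    def noisy_cost(theta):
        # Noisy sample of an expected cost; the quadratic objective is illustrative.
        return np.sum((theta - 1.0) ** 2) + rng.normal(scale=0.1)

    N = 4
    theta = np.zeros(N)
    H_bar = np.eye(N)                   # running (smoothed) Hessian estimate
    for n in range(1, 3001):
        a_n = 1.0 / n                   # slower timescale: parameter update
        b_n = 1.0 / n ** 0.75           # faster timescale: Hessian averaging
        d1 = d2 = 1.0 / n ** 0.1        # perturbation widths
        Delta = rng.choice([-1.0, 1.0], size=N)      # independent +/-1 Bernoulli
        Delta_hat = rng.choice([-1.0, 1.0], size=N)  # second, independent perturbation
        # Four simulations per iteration.
        y_p = noisy_cost(theta + d1 * Delta)
        y_m = noisy_cost(theta - d1 * Delta)
        y_pp = noisy_cost(theta + d1 * Delta + d2 * Delta_hat)
        y_mp = noisy_cost(theta - d1 * Delta + d2 * Delta_hat)
        # Gradient estimate from the first two simulations (standard SPSA form).
        grad_hat = (y_p - y_m) / (2.0 * d1 * Delta)
        # Hessian estimate from all four simulations.
        H_hat = ((y_pp - y_p) - (y_mp - y_m)) / (2.0 * d1 * d2 * np.outer(Delta_hat, Delta))
        H_hat = 0.5 * (H_hat + H_hat.T)              # symmetrize
        H_bar = (1.0 - b_n) * H_bar + b_n * H_hat    # average on the faster timescale
        # Project to a positive definite matrix before inverting (simple eigenvalue floor).
        w, V = np.linalg.eigh(H_bar)
        H_pd = (V * np.maximum(w, 0.1)) @ V.T
        theta = theta - a_n * np.linalg.solve(H_pd, grad_hat)

    print(theta)  # close to the minimizer (1, 1, 1, 1)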

Newton-Based Smoothed Functional Algorithms

Gaussian SF estimates of the Hessian are derived by taking the convolution of the Hessian of the objective function with a multivariate Gaussian density function. Through an integration-by-parts argument applied twice, this is seen to equal the convolution of the function itself with a scaled multivariate Gaussian density, which results in a one-simulation estimate of the Hessian. The same simulation also yields a one-simulation gradient estimate (see Chapter 6), so one obtains a one-simulation Newton-based SF algorithm. A two-simulation estimate of the Hessian is also derived; it incorporates the same two simulations as the two-simulation gradient estimate of Chapter 6, resulting in a two-simulation Newton SF algorithm. We limit the discussion in this chapter to Gaussian-based SF estimates.
S. Bhatnagar, H. Prasad, L. Prashanth
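
The sketch below illustrates the one-simulation Gaussian SF Newton idea: a single noisy measurement at a Gaussian-perturbed parameter yields both a gradient estimate and a Hessian estimate of the form (eta eta' - I) / beta^2 times the measured cost. The quadratic expected-cost objective, step sizes and positive-definiteness projection are illustrative assumptions, not taken from the book.

    import numpy as np

    rng = np.random.default_rng(5)

    def noisy_cost(theta):
        # Noisy sample of an expected cost; the quadratic objective is illustrative.
        return np.sum((theta - 1.0) ** 2) + rng.normal(scale=0.1)

    N = 3
    beta = 0.5
    theta = np.zeros(N)
    H_bar = np.eye(N)
    for n in range(1, 20001):
        a_n = 1.0 / (n + 100.0)          # slower timescale: parameter update
        b_n = 1.0 / (n + 100.0) ** 0.75  # faster timescale: Hessian averaging
        eta = rng.standard_normal(N)
        y = noisy_cost(theta + beta * eta)          # a single simulation
        grad_hat = (eta / beta) * y                 # one-measurement SF gradient estimate
        H_hat = (np.outer(eta, eta) - np.eye(N)) / beta ** 2 * y  # one-measurement Hessian estimate
        H_bar = (1.0 - b_n) * H_bar + b_n * H_hat
        w, V = np.linalg.eigh(0.5 * (H_bar + H_bar.T))
        H_pd = (V * np.maximum(w, 0.1)) @ V.T       # enforce positive definiteness
        theta = theta - a_n * np.linalg.solve(H_pd, grad_hat)

    print(theta)  # close to the minimizer (1, 1, 1)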

Variations to the Basic Scheme

Frontmatter

Discrete Parameter Optimization

In this chapter, we consider the case where optimization has to be performed over a discrete-valued parameter set with a finite number of points. We present adaptations of the SPSA and SF algorithms discussed previously, using certain projection mappings. We consider here the case of a long-run average cost objective.
S. Bhatnagar, H. Prasad, L. Prashanth
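
As a rough illustration only, the sketch below adapts two-measurement SPSA to a finite grid by simulating at projected (rounded) parameter values while the iterate itself evolves in the convex hull of the grid. The projection map, grid, objective and step sizes are illustrative assumptions; the projection scheme used in the book is different.

    import numpy as np

    rng = np.random.default_rng(6)

    # Finite, discrete parameter set: each component takes values on the grid {0, 1, ..., 10}.
    GRID = np.arange(0.0, 11.0)

    def project(theta):
        # Illustrative projection map: clip to the convex hull of the grid,
        # then round each component to the nearest grid point for simulation.
        return np.clip(np.round(theta), GRID[0], GRID[-1])

    def noisy_cost(theta_discrete):
        # Noisy cost observed only at discrete points; the objective is illustrative.
        return np.sum((theta_discrete - 7.0) ** 2) + rng.normal(scale=0.5)

    N = 4
    theta = np.full(N, 2.0)  # continuous iterate kept inside the hull
    for n in range(1, 3001):
        a_n = 1.0 / n
        # Keep the perturbation at least one grid step so the projected points differ.
        delta_n = max(1.0, 2.0 / n ** 0.25)
        Delta = rng.choice([-1.0, 1.0], size=N)
        # Simulate only at projected (discrete) parameters.
        y_plus = noisy_cost(project(theta + delta_n * Delta))
        y_minus = noisy_cost(project(theta - delta_n * Delta))
        grad_hat = (y_plus - y_minus) / (2.0 * delta_n * Delta)
        theta = np.clip(theta - a_n * grad_hat, GRID[0], GRID[-1])

    print(project(theta))  # typically recovers the discrete minimizer (7, 7, 7, 7)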

Algorithms for Constrained Optimization

This chapter develops algorithms for parameter optimization under multiple functional (inequality) constraints. Both the objective and the constraint functions depend on the parameter and are suitable long-run averages. The Lagrangian relaxation technique is used together with multi-timescale stochastic approximation: algorithms based on gradient and Newton SPSA/SF ideas are presented in which the aforementioned parameter is updated on a faster timescale than the Lagrange multipliers.
S. Bhatnagar, H. Prasad, L. Prashanth
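
The sketch below illustrates the Lagrangian-relaxation idea with two timescales: the parameter is updated via SPSA on the sampled Lagrangian using a larger (faster) step size, while the Lagrange multiplier is updated by projected dual ascent on a smaller (slower) one. The objective, constraint, noise model and step sizes are illustrative assumptions, not taken from the book.

    import numpy as np

    rng = np.random.default_rng(7)

    def noisy_objective(theta):
        # Noisy sample of J(theta) = ||theta||^2; illustrative.
        return np.sum(theta ** 2) + rng.normal(scale=0.1)

    def noisy_constraint(theta):
        # Noisy sample of G(theta) = 1 - sum(theta) <= 0, i.e. sum(theta) >= 1; illustrative.
        return 1.0 - np.sum(theta) + rng.normal(scale=0.1)

    N = 2
    theta = np.zeros(N)
    lam = 0.0
    for n in range(1, 20001):
        a_n = 1.0 / n ** 0.6   # faster timescale: parameter update
        c_n = 1.0 / n          # slower timescale: Lagrange multiplier update
        delta_n = 1.0 / n ** 0.1
        Delta = rng.choice([-1.0, 1.0], size=N)
        # SPSA gradient estimate of the Lagrangian L(theta, lam) = J(theta) + lam * G(theta).
        theta_p = theta + delta_n * Delta
        theta_m = theta - delta_n * Delta
        y_plus = noisy_objective(theta_p) + lam * noisy_constraint(theta_p)
        y_minus = noisy_objective(theta_m) + lam * noisy_constraint(theta_m)
        theta = theta - a_n * (y_plus - y_minus) / (2.0 * delta_n * Delta)
        # Dual ascent on the multiplier, projected to stay nonnegative.
        lam = max(0.0, lam + c_n * noisy_constraint(theta))

    print(theta, lam)  # roughly theta = (0.5, 0.5) with lam near 1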

Reinforcement Learning

Reinforcement learning (RL) in general refers to a rich class of simulation-based approaches geared towards solving problems of stochastic control. Such problems are often formulated using the framework of Markov decision processes (MDPs). Regular solution procedures for MDPs suffer from two major problems: (a) they require complete model information, and (b) the computational effort they require grows exponentially with the size of the state space (the curse of dimensionality). Both problems are effectively tackled by RL. Most RL algorithms are based on stochastic approximation and incorporate outcomes from real or simulated data directly. Further, feature-based function approximation is often used to handle large or high-dimensional state spaces. In this chapter, we discuss algorithms based both on full-state representations and on function approximation. A distinguishing feature of these algorithms is that they are based on simultaneous perturbation techniques. Apart from being easily implementable, some of these algorithms also exhibit significant improvement in performance over other well-known algorithms in the literature.
S. Bhatnagar, H. Prasad, L. Prashanth
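
The book's RL algorithms (including actor-critic schemes with function approximation) are considerably more refined; the toy sketch below only conveys the flavour of simulation-based policy search with SPSA on a two-state MDP of our own making, where the learner treats the MDP purely as a black-box simulator of the long-run average cost.

    import numpy as np

    rng = np.random.default_rng(8)

    # A toy 2-state, 2-action MDP used purely as a black-box simulator;
    # the learner never inspects P or COST directly.
    P = np.array([[0.9, 0.1],     # P[s] = next-state distribution (same for both actions here)
                  [0.1, 0.9]])
    COST = np.array([[1.0, 3.0],  # COST[s, a] = single-stage cost
                     [4.0, 0.5]])

    def avg_cost(theta, horizon=1000):
        # Simulated long-run average cost of the randomized policy
        # with pi(a = 1 | s) = sigmoid(theta[s]).
        s, total = 0, 0.0
        for _ in range(horizon):
            a = int(rng.random() < 1.0 / (1.0 + np.exp(-theta[s])))
            total += COST[s, a]
            s = int(rng.random() < P[s, 1])
        return total / horizon

    theta = np.zeros(2)  # one policy parameter per state
    for n in range(1, 201):
        a_n = 1.0 / n
        delta_n = 1.0 / n ** 0.25
        Delta = rng.choice([-1.0, 1.0], size=2)
        # Two simulations per iteration, regardless of the number of policy parameters.
        y_plus = avg_cost(theta + delta_n * Delta)
        y_minus = avg_cost(theta - delta_n * Delta)
        theta = theta - a_n * (y_plus - y_minus) / (2.0 * delta_n * Delta)

    print(theta)  # typically theta[0] < 0 and theta[1] > 0: the cheaper action in each state is preferred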

Applications

Frontmatter

Service Systems

Service-based economies and business models have gained significant importance in recent times. Clients and service providers exchange value through service interactions and reach service outcomes. Service requests of a client can vary greatly in the skills required to fulfill the request, the expected turn-around time, and the context of the client's business needs. As a result, automation of service operations has been limited and service delivery is largely a labor-intensive business. Hence, it is crucial to optimize labor costs while maintaining the desired quality of service. In this chapter, we describe a study where stochastic optimization methods have been applied to minimize labor costs in a service system. This chapter is based on [2,3,6].
S. Bhatnagar, H. Prasad, L. Prashanth

Road Traffic Control

In this chapter, we describe applications of the reinforcement learning and simultaneous perturbation methods described earlier to the problem of signal control at road traffic junctions. This chapter is based on [4,3].
S. Bhatnagar, H. Prasad, L. Prashanth

Communication Networks

This chapter puts together some of the applications of simultaneous perturbation-based approaches in the context of communication networks. Three applications are studied in particular: (i) the problem of tuning the parameters for the random early detection adaptive queue management scheme, (ii) finding the optimal retransmission probabilities for the slotted Aloha multi-access communication system and (iii) obtaining dynamic price allocation schemes for Internet traffic management.
S. Bhatnagar, H. Prasad, L. Prashanth

Backmatter
