The subdifferential of measurable composite max integrands and smoothing approximation

Burke, James V.; Chen, Xiaojun; Sun, Hailin

doi:10.1007/s10107-019-01441-9

The subdifferential of measurable composite max integrands and smoothing approximation

Full Length Paper
Series B
Published: 24 October 2019

Volume 181, pages 229–264, (2020)
Cite this article

Mathematical Programming Submit manuscript

678 Accesses
8 Citations
Explore all metrics

Abstract

The subdifferential calculus for the expectation of nonsmooth random integrands involves many fundamental and challenging problems in stochastic optimization. It is known that for Clarke regular integrands, the Clarke subdifferential of the expectation equals the expectation of their Clarke subdifferential. In particular, this holds for convex integrands. However, little is known about the calculation of Clarke subgradients for the expectation of non-regular integrands. The focus of this contribution is to approximate Clarke subgradients for the expectation of random integrands by smoothing methods applied to the integrand. A framework for how to proceed along this path is developed and then applied to a class of measurable composite max integrands. This class contains non-regular integrands from stochastic complementarity problems as well as stochastic optimization problems arising in statistical learning.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

On the resolution of misspecified convex optimization and monotone variational inequality problems

Article 22 June 2020

Stochastic proximal gradient methods for nonconvex problems in Hilbert spaces

Article Open access 12 January 2021

Nonsmooth spectral gradient methods for unconstrained optimization

Article 30 January 2017

References

Ahn, M., Pang, J.S., Xin, J.: Difference-of-convex learning I: directional stationarity, optimality, and sparsity. SIAM J. Optim. 27, 1637–1665 (2017)
Article MathSciNet Google Scholar
Aumann, R.J.: Integrals of set-valued functions. J. Math. Anal. Appl. 12, 1–12 (1965)
Article MathSciNet Google Scholar
Blundell, R., Powell, J.L.: Censored regression quantiles with endogenous regression. J. Econom. 141, 65–83 (2007)
Article Google Scholar
Burke, J.V.: Second order necessary and sufficient conditions for convex composite NDO. Math. Program. 38, 287–302 (1987)
Article MathSciNet Google Scholar
Burke, J.V., Hoheisel, T.: Epi-convergent smoothing with applications to convex composite functions. SIAM J. Optim. 23, 1457–1479 (2013)
Article MathSciNet Google Scholar
Burke, J.V., Hoheisel, T., Kanzow, C.: Gradient consistency for integral-convolution smoothing functions. Set-Valued Var. Anal. 21, 359–376 (2013)
Article MathSciNet Google Scholar
Burke, J.V., Luke, D.R.: Variational analysis applied to the problem of optimal phase retieval. SIAM J. Control Optim. 42, 576–595 (2003)
Article MathSciNet Google Scholar
Chen, C., Mangasarian, O.L.: A class of smoothing functions for nonlinear and mixed complementarity problems. Comput. Optim. Appl. 5, 97–138 (1996)
Article MathSciNet Google Scholar
Chen, X.: Smoothing methods for nonsmooth, nonconvex minimization. Math. Program. 134, 71–99 (2012)
Article MathSciNet Google Scholar
Chen, X., Fukushima, M.: A smoothing method for a mathematical program with P-matrix linear complementarity constraints. Comput. Optim. Appl. 27, 223–246 (2004)
Article MathSciNet Google Scholar
Chen, X., Fukushima, M.: Expected residual minimization method for stochastic linear complementarity problems. Math. Oper. Res. 30, 1022–1038 (2005)
Article MathSciNet Google Scholar
Chen, X., Qi, L., Sun, D.: Global and superlinear convergence of the smoothing Newton method and its application to general box constrained variational inequalities. Math. Comput. 67, 519–540 (1998)
Article MathSciNet Google Scholar
Chen, X., Wets, R., Zhang, Y.: Stochastic variational inequalities: residual minimization smoothing sample average approximations. SIAM J. Optim. 22, 649–673 (2012)
Article MathSciNet Google Scholar
Clarke, F.H.: Optimization and Nonsmooth Analysis, Volume 5 of Classics in Applied Mathematics. SIAM, Philadelphia (1990)
Book Google Scholar
Clarke, F.H., Ledyaev, Y.S., Stern, R.J., Wolenski, P.R.: Nonsmooth Analysis and Control Theory. Springer, New York (1998)
MATH Google Scholar
Dunford, N., Schwartz, J.T.: Linear Operators, Part1: General Theory. Wiley, Hoboken (1988)
Google Scholar
Folland, G.B.: Real Analysis, 2nd edn. Wiley, New York (1999)
MATH Google Scholar
Hildenbrand, W.: Core and Equilibria of a Large Economy. Princeton University Press, Princeton (1974)
MATH Google Scholar
Lyapunov, A.: Sur les fonctions-vecteur complétement additives. Bull. Acad. Sci. USSR Ser. Math. 4, 465–478 (1940)
Google Scholar
Mordukhvich, B.: Variational Analysis and Generalized Differentiation II. Springer, Berlin (2006)
Book Google Scholar
Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19, 1574–1906 (2009)
Article MathSciNet Google Scholar
Ralph, D., Xu, H.: Convergence of stationary points of sample average two stage stochastic programs: a generalized equation approach. Math. Oper. Res. 36, 568–592 (2011)
Article MathSciNet Google Scholar
Richter, H.: Verallgemeinerung eines in der statistik benötigten satzes der masstheorie. Math. Ann. 150, 85–90 (1963)
Article MathSciNet Google Scholar
Rockafellar, R.T., Uryasev, S.: Optimization of conditional value-at-risk. J. Risk 2, 21–41 (2000)
Article Google Scholar
Rockafellar, R.T.: Integral functionals, normal integrands and measurable selections. In: Nonlinear Operators in the Calculus of Variations, Volume 543 in Lecture Notes in Mathematics, pp. 157–207. Springer, New York (1976)
Rockafellar, R.T., Wets, R.: On the interchange of subdifferentiation and conditional expectation for convex functions. Stochastics 7, 173–182 (1982)
Article MathSciNet Google Scholar
Rockafellar, R.T., Wets, R.: Variational Analysis. Springer, New York (1998)
Book Google Scholar
Rudin, W.: Principles of Mathematical Analysis, 3rd edn. McGraw-Hill, San Francisco (1976)
MATH Google Scholar
Shapiro, A., Dentcheva, D., Ruszczynski, A.: Lectures on Stochastic Programming: Modeling and Theory. SIAM, Philadelphia (2009)
Book Google Scholar
Tardella, F.: A new proof of the Lyapunov convexity theorem. SIAM J. Control Optim. 28, 478–481 (1990)
Article MathSciNet Google Scholar
Wets, R.: Stochastic programming. In: Nemhauser, G.L., et al. (eds.) Handbooks in OR & MS, vol. 1, pp. 573–629 (1989)
Xu, H.: An implicit programming approach for a class of stochastic mathematical programs with complementarity constraints. SIAM J. Optim. 16, 670–696 (2006)
Article MathSciNet Google Scholar
Xu, H., Ye, J.J.: Necessary optimality conditions for two-stage stochastic programs with equilibaium constraints. SIAM J. Optim. 20, 1685–1715 (2010)
Article MathSciNet Google Scholar
Xu, H., Zhang, D.: Smooth sample average approximation of stationary points in nonsmooth stochastic optimization and applications. Math. Program. 119, 371–401 (2009)
Article MathSciNet Google Scholar

Download references

Acknowledgements

We would like to thank Associate Editor and two referees for their helpful comments.

Author information

Authors and Affiliations

Department of Applied Mathematics, University of Washington, Seattle, USA
James V. Burke
Department of Applied Mathematics, Hong Kong Polytechnic University, Hong Kong, China
Xiaojun Chen
Jiangsu Key Lab for NSLSCS, School of Mathematical Sciences, Nanjing Normal University, Nanjing, 210023, China
Hailin Sun

Authors

James V. Burke
View author publications
You can also search for this author in PubMed Google Scholar
Xiaojun Chen
View author publications
You can also search for this author in PubMed Google Scholar
Hailin Sun
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xiaojun Chen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is support by NSF Grant No. DMS-1908890, Hong Kong Research Grant Council Grant 153000/17P, and NSFC Grant Nos. 11871276 and 11571178.

Appendix: Background

1.1 Finite-dimensional variational analysis

Since we allow mappings to have infinite values, it is convenient to define the extended reals ${{\overline{{\mathbb {R}}}}}:={\mathbb {R}}\cup {\{+\infty \}}$. The effective domain of $f:\,{\mathbb {R}}^n\rightarrow {{\overline{{\mathbb {R}}}}}\,$, denoted ${\text{ dom }\,}f\subseteq {\mathbb {R}}^n$, is the set on which f is finite. To avoid certain pathological mappings the discussion is restricted to proper not everywhere infinite) lower semi-continuous (lsc) functions. Of particular importance is the epigraph of such functions: ${\text{ epi }\, f}:=\left\{ (x,\mu )\,\left| \,f(x)\le \mu \right. \right\} $. We have that f is lsc if and only if ${\text{ epi }\, f}$ is closed, and f is convex if and only if ${\text{ epi }\, f}$ is convex.

Definition 8

(Subderivatives) [27, Exercise 9.15] For a locally Lipschitz function $f:\,{\mathbb {R}}^n\rightarrow {{\overline{{\mathbb {R}}}}}\,$ near a point ${{u}_*}\in {\mathbb {R}}^n$ with $f({{u}_*})$ finite,

(i)
the subderivative$df({{u}_*}):{\mathbb {R}}^n\rightarrow {{\overline{{\mathbb {R}}}}}$ is defined by
$$\begin{aligned} df({{u}_*})(w):= \underset{\tau \downarrow 0}{\lim \inf ~} \frac{f({{u}_*}+\tau w) - f({{u}_*})}{\tau }; \end{aligned}$$
(ii)
the regular subderivative (or the Clarke generalized directional derivative when f is locally Lipschitz) $~{\widehat{d}}f({{u}_*}):{\mathbb {R}}^n\rightarrow {{\overline{{\mathbb {R}}}}}~$ is defined by
$$\begin{aligned} {{{\widehat{d}}}}f({{u}_*})(w):= \underset{u\rightarrow {{u}_*},\,\tau \downarrow 0}{\lim \sup }\frac{f(u+\tau w)-f(u)}{\tau }. \end{aligned}$$

Definition 9

(Subgradients, subdifferentials and subdifferential regularity) Consider a locally Lipschitz function $f:{\mathbb {R}}^n\rightarrow {{\overline{{\mathbb {R}}}}}$, a point $v\in {\mathbb {R}}^n$, and a point ${{u}_*}\in {\mathbb {R}}^n$ with $f({{u}_*})$ finite.

(i)
[27, Theorem 8.49] The vector v is a Clarke subgradient of f at ${{u}_*}$ if v satisfies
$$\begin{aligned} {{{\widehat{d}}}}f({{u}_*})(w)\ge \left\langle v\, ,\, w\right\rangle ~\forall ~w\in {\mathbb {R}}^n. \end{aligned}$$
We call the set of Clarke subgradients v the Clarke subdifferential of f at ${{u}_*}$ and denote this set by $\partial f({{u}_*})$.
(ii)
[27, Corollary 8.19] $f:\,{\mathbb {R}}^n\rightarrow {{\overline{{\mathbb {R}}}}}\,$ is said to be subdifferentially regular (or Clarke regular) at ${{u}_*}\in {\text{ dom }\,}f$ with $\partial f({{u}_*})\ne \emptyset $ if
$$\begin{aligned} df({{u}_*})(w) = {{{\widehat{d}}}}f({{u}_*})(w)\quad \forall \, w\in {\mathbb {R}}^n. \end{aligned}$$
(iii)
[14, Definition 2.6.1] [12] The vector v is a B-subgradient of f at ${{u}_*}$ if
$$\begin{aligned} v=\lim _{u^k\rightarrow {{u}_*}} \nabla f(u^k), \quad \mathrm{where }\,\, f \,\, \text{ is } \text{ differentiable } \text{ at }\,\, u^k. \end{aligned}$$
We call the set of B-subgradients v of f at ${{u}_*}$ the B-subdifferential of f at ${{u}_*}$ and denote this set by $\partial ^B f({{u}_*})$.
(iv)
[27, Definition 8.3] The vector v is an M-subgradient of f at ${{u}_*}$ if there are sequences $ u^k\rightarrow {{u}_*}$ and $v^k \rightarrow v$ with
$$\begin{aligned} \underset{u\rightarrow u^k}{\lim \inf ~}\frac{f(u)-f(u^k) - \left\langle v^k\, ,\, u-u^k\right\rangle }{\Vert u-u^k\Vert }\ge 0. \end{aligned}$$
We call the set of M-subgradients v of f at ${{u}_*}$ the M-subdifferential of f at ${{u}_*}$ and denote this set by $\partial ^M f({{u}_*})$.

Remark 6

In [27], the notion of subdifferential regularity is defined in [27, Definition 7.25]. In the definition given above we employ characterizations of this notion given by the cited results. Note that subdifferential mappings are multi-functions.

Definition 10

(Strict continuity and strict differentiability) Let $H:\,D\rightarrow {\mathbb {R}}^m\,$, $D\subseteq {\mathbb {R}}^n$, and $h:\,{\mathbb {R}}^n\rightarrow {{\overline{{\mathbb {R}}}}}\,$.

(i)
[Strict Continuity [27, Definition 9.1]] We say that H is strictly continuous at ${\bar{x}}\in {\text{ int }\left( D\right) }$ if
$$\begin{aligned} \mathrm {lip}\, H({\bar{x}}):=\limsup _{{\mathop {x\ne x'}\limits ^{x,x'\rightarrow {\bar{x}}}}} \frac{\left\| H(x')-H(x)\right\| }{\left\| x'-x\right\| }<\infty \, . \end{aligned}$$
(ii)
[Strict Differentiability [27, Definition 9.17]] We say that h is strictly differentiable at a point ${\bar{x}}\in {\text{ dom }\,}{h}$ if h is differentiable at ${\bar{x}}$ and
$$\begin{aligned} \lim _{{\mathop {x\ne x'}\limits ^{x,x'\rightarrow {\bar{x}}}}} \frac{h(x')-h(x)-\left\langle \nabla h({\bar{x}})\, ,\, x'-x\right\rangle }{\left\| x'-x\right\| }=0\, . \end{aligned}$$

It is easily seen that if h is continuously differentiable on an open set U, then h is strictly differentiable and subdifferentially regular on U with $\partial h(x)=\{\nabla h(x)\}$ for all $x\in U$ ([27, Theorem 9.18 and Exercise 9.64]).

The notion of strict continuity of f at a point ${\bar{x}}$ implies the existence of a neighborhood of ${\bar{x}}$ on which f is Lipschitz continuous, that is, f is locally Lipschitz continuous at ${\bar{x}}$ where the local Lipschitz modulus is lower bounded by $\mathrm {lip}\, H({\bar{x}})$. In this light, Definition 8 and Definition 9(ii) combine to tell us that

$$\begin{aligned} df({{u}_*})(w) = {{{\widehat{d}}}}f({{u}_*})(w)=\lim _{\tau \downarrow 0} \frac{f({{u}_*}+\tau w)-f({{u}_*})}{\tau } \quad \forall \, w\in {\mathbb {R}}^n, \end{aligned}$$

(55)

wherever f is strictly continuous and subdifferentially regular at ${{u}_*}$. Moreover, in this case, [27, Theorem 8.30] tells us that

$$\begin{aligned} df(x)(v)=\sup \left\{ \left\langle g\, ,\, v\right\rangle \,\left| \,g\in \partial f(x)\right. \right\} . \end{aligned}$$

(56)

Remark 7

(Subdifferentials of Compositions) If $g:\,X\subset {\mathbb {R}}^n\rightarrow {{\overline{{\mathbb {R}}}}}\,$ is given as the composition of two functions $f:\,Y\subset {\mathbb {R}}^m\rightarrow {{\overline{{\mathbb {R}}}}}\,$ and $h:\,X\rightarrow Y\,$, i.e. $g(x)=(f\circ h)(x)=f(h(x))$, then we write $\partial g(x)=\partial (f\circ h)(x).$ On the other hand, we write $\partial f(h(x))$ to denote the subdifferential of f evaluated at h(x).

Theorem 9

(Strict differentiability and the subdifferential) [27, Theorem 9.18] [14, Proposition 2.2.4] Let $h:\,{\mathbb {R}}^n\rightarrow {{\overline{{\mathbb {R}}}}}\,$ with ${\bar{x}}\in {\text{ dom }\,}{h}$. Then h is strictly differentiable at ${\bar{x}}$ if and only if h is strictly continuous at ${\bar{x}}$ and $\partial h({\bar{x}})=\{\nabla h({\bar{x}})\}$.

1.2 Measurable multi-functions

We now review some of the properties of measurable multi-functions used in this paper [2, 15, 18, 27]. For more information on this topic, we refer the interested reader to [27, Chapter 14] and [25].

A multi-function, or multi-valued mapping, S from ${\mathbb {R}}^k$ to ${\mathbb {R}}^s$ is a mapping that takes points in ${\mathbb {R}}^k$ to sets in ${\mathbb {R}}^s$, and is denoted by $S: {\mathbb {R}}^k \rightrightarrows {\mathbb {R}}^s$. The outer limit of S at ${{\bar{x}}}\in {\mathbb {R}}^k$relative to$X\subseteq {\mathbb {R}}^k$ is

$$\begin{aligned} \mathop {\mathrm{Limsup}}_{x\rightarrow _X{{\bar{x}}}} S(x):= \big \{v \in {\mathbb {R}}^s\mid \exists \{x^k\}\rightarrow _X {{\bar{x}}}, \{v^k\}\rightarrow v\in {\mathbb {R}}^s: v^k\in S(x^k) \quad \forall k\in {\mathbb {N}}\big \}\nonumber \\ \end{aligned}$$

(57)

and the inner limit of S at ${{\bar{x}}}$relative toX is

$$\begin{aligned} \mathop {\mathrm{Liminf}}_{x\rightarrow _X{{\bar{x}}}} S(x):= \big \{v \in {\mathbb {R}}^s \mid \forall \{x^k\}\rightarrow _X {{\bar{x}}}, \,\exists \{v^k\}\rightarrow v\in {\mathbb {R}}^s: v^k\in S(x^k) \quad \forall k\in {\mathbb {N}}\big \}. \end{aligned}$$

Here the notation $\{x^k\}\rightarrow _X {{\bar{x}}}$ means that $\{x^k\}\subseteq X$ with $x^k\rightarrow {{\bar{x}}}$. If $X={\mathbb {R}}^k$, we write $x\rightarrow {{\bar{x}}}$ instead of $x\rightarrow _{{\mathbb {R}}^k}{{\bar{x}}}$. We say that S is outer semicontinuous (osc) at ${{\bar{x}}}$relative toX if

$$\begin{aligned} \mathop {\mathrm{Limsup}}_{x\rightarrow _X{{\bar{x}}}} S(x)\subseteq S({{\bar{x}}}). \end{aligned}$$

When the outer and inner limits coincide, we write

$$\begin{aligned} \mathop {\mathrm{Lim}}_{x\rightarrow _X {{\bar{x}}}} S(x):=\mathop {\mathrm{Limsup}}_{x\rightarrow _X {{\bar{x}}}} S(x), \end{aligned}$$

and say that S is contiuous at ${{\bar{x}}}$relative toX.

Let $\varXi $ be a nonempty subset of ${\mathbb {R}}^\ell $ and let ${{\mathcal {A}}}$ be a $\sigma $-field of subsets of $\varXi $, called the measurable subsets of $\varXi $ or the ${{\mathcal {A}}}$-measurable subsets. Let $\rho :\,{{\mathcal {A}}}\rightarrow [0,1]\,$ be a $\sigma $-finite Borel regular, complete, non-atomic, probability measure on ${{\mathcal {A}}}$. The corresponding measure space is denoted $(\varXi ,{{\mathcal {A}}}, \rho )$. A multi-function $\varPsi : \varXi \rightrightarrows {\mathbb {R}}^n$ is said to be ${{\mathcal {A}}}$-measurable, or simply measurable, if for all open sets the set is in ${{\mathcal {A}}}.$ The multi-function $\varPsi $ is said to be ${{\mathcal {A}}}\otimes {{\mathcal {B}}}^n$-measurable if ${{\,\mathrm{gph}\,}}(\varPsi ) = \left\{ (\xi ,v)\,\left| \,v\in \varPsi (\xi )\right. \right\} \in {{\mathcal {A}}}\otimes {{\mathcal {B}}}^n$, where ${{\mathcal {B}}}^n$ denotes the Borel $\sigma $-field on ${\mathbb {R}}^n$ and ${{\mathcal {A}}}\otimes {{\mathcal {B}}}^n$ is the $\sigma $-field on $\varXi \times {\mathbb {R}}^n$ generated by all sets $A\times D$ with $A\in {{\mathcal {A}}}$ and $D\in {{\mathcal {B}}}^n$. If $\varPsi (\xi )$ is closed for each $\xi $ then $\varPsi $ is closed-valued. Similarly, $\varPsi $ is said to be convex-valued if $\varPsi (\xi )$ is convex for each $\xi $. Finally, we note that the completeness of the measure space guarantees the measurability of subsets of $\varXi $ obtained as the projections of measurable subsets of $\varXi \times {\mathbb {R}}^n$:

In particular, this implies that the multi-function $\varPsi $ is ${{\mathcal {A}}}$-measurable if and only if ${{\,\mathrm{gph}\,}}(\varPsi )$ is ${{\mathcal {A}}}\otimes {{{\mathcal {B}}}}^n$-measurable [27, Theorem 14.8].

Let $\varPsi : \varXi \rightrightarrows {\mathbb {R}}^n$, and denote by $~{{\mathcal {S}}}(\varPsi )~$ the set of $\rho $-measurable functions $f:\,\varXi \rightarrow {\mathbb {R}}^n\,$ that satisfy $f(\xi )\in \varPsi (\xi )$ for a.e. $\xi \in \varXi $. We call $~{{\mathcal {S}}}(\varPsi )~$ the set of measurable selections of $\varPsi $.

Theorem 10

(Measurable selections) [27, Corollary 14.6] A closed-valued measurable map $\varPsi : \varXi \rightrightarrows {\mathbb {R}}^n$ always admits a measurable selection.

We say that the measurable multi-function $\varPsi : \varXi \rightrightarrows {\mathbb {R}}^n$ is integrably bounded, or for emphasis $\rho $-integrably bounded, if there is a $\rho $-integrable function $a:\,\varXi \rightarrow {\mathbb {R}}^n_+\,$ such that

$$\begin{aligned} \Vert v\Vert _\infty \le a(\xi ) \end{aligned}$$

(58)

for all pairs $(\xi ,v)\in \varXi \times {\mathbb {R}}^n$ satisfying $v\in \varPsi (\xi ).$ Here and elsewhere we interpret vector inequalities as element-wise inequalities. Let $1\le p\le \infty $. When $\varXi = {\mathbb {R}}^{\ell }$, we let $L^p_m({\mathbb {R}}^{\ell },{{\mathcal {A}}},\rho )$ denote the Banach space of functions mapping ${\mathbb {R}}^{\ell }$ to ${\mathbb {R}}^m$. When $p=2$, $L^2_m({\mathbb {R}}^{\ell },{{\mathcal {A}}},\rho )$ is a Hilbert space with the inner product on the measure space $({\mathbb {R}}^{\ell },{{\mathcal {A}}},\rho )$ given by

$$\begin{aligned} \langle \psi ,\,\phi \rangle _\rho = \int _{{\mathbb {R}}^{\ell }} \left\langle \psi (\xi )\, ,\, \phi (\xi )\right\rangle d\rho , \end{aligned}$$

where $\left\langle \cdot \, ,\, \cdot \right\rangle $ denotes the Euclidean inner product. If $\rho ({\mathbb {R}}^{\ell })<\infty $, then

$$\begin{aligned} L^q_m({\mathbb {R}}^{\ell },{{\mathcal {A}}},\rho )\subseteq L^p_m({\mathbb {R}}^{\ell },{{\mathcal {A}}},\rho )\ \text{ whenever } \ 1\le p\le q\le \infty . \end{aligned}$$

If the function a in (58) is such that $\left\| a(\xi )\right\| _{\scriptscriptstyle p}$ is integrable with respect to the measure $\rho $ on the measure space $(\varXi ,{{\mathcal {A}}},\rho )$, then the multi-function $\varPsi $ is said to be $L^p$-bounded, where $\left\| \cdot \right\| _{\scriptscriptstyle p}$ denotes the p-norm of vectors.

Proposition 2

[7, Proposition 2.2] and [16, Corollary IV.8.4](Weak compactness of measurable selections) Let the multi-function $\varPsi :{\mathbb {R}}^{\ell }\rightrightarrows {\mathbb {R}}^m$ be closed- and convex-valued, and $L^2$-bounded on $L^2_m({\mathbb {R}}^{\ell },{{\mathcal {M}}}^n,\lambda _n)$, where ${{\mathcal {M}}}^n$ is the Lebesgue field on ${\mathbb {R}}^n$ and $\lambda _n$ is n-dimensional Lebesgue measure. Then the set of measurable selections ${{\mathcal {S}}}(\varPsi )$ is a weakly compact, convex set in $L^2_m({\mathbb {R}}^{\ell },{{\mathcal {M}}}^n,\lambda _n)$.

We now develop some properties of integrals of multi-valued mappings. Given a measurable multi-function $\varPsi : \varXi \rightrightarrows {\mathbb {R}}^n$, we define the integral of $\varPsi $ over $\varXi $ with respect to the measure $\rho $ by

$$\begin{aligned} \int \varPsi d\rho :=\left\{ \int _\varXi f d\rho \,\left| \,f\in {\mathcal {S}}(\varPsi )\right. \right\} . \end{aligned}$$

The next theorem, due to Hildenbrand [18], is a restatement of Theorems 3 and 4 of Aumann [2] for multi-functions on the non-atomic measure space $(\varXi ,{{\mathcal {A}}},\rho )$. These results are central to the theory of integrals of multi-valued functions.

Theorem 11

(Integrals of multi-functions) [18, Theorem 4 and Proposition 7] The following properties hold for integrably bounded multi-functions $\varPsi : \varXi \rightrightarrows {\mathbb {R}}^n$ on non-atomic measure spaces $(\varXi ,{{\mathcal {A}}},\rho )$.

(a)
If $\varPsi $ is ${{\mathcal {A}}}\otimes {{\mathcal {B}}}^n$-measurable, then $\int \varPsi d\rho = \int {{\,\mathrm{conv\,}\,}}\varPsi d\rho $.
(b)
If $\varPsi $ is closed valued (not necessarily ${{\mathcal {A}}}\otimes {{\mathcal {B}}}^n$-measurable), then $\int \varPsi d\rho $ is compact.

We conclude this section with a very elementary, but useful lemma on measurable tubes, i.e. multi-valued mappings $\varPsi : \varXi \rightrightarrows {\mathbb {R}}^n$ of the form

$$\begin{aligned} \varPsi (\xi ):=\kappa (\xi ){\mathbb {B}}, \end{aligned}$$

(59)

where ${\mathbb {B}}:=\left\{ x\,\left| \,\left\| x\right\| _2\le 1\right. \right\} $ is the closed unit ball in ${\mathbb {R}}^n$ and $\kappa :\,\varXi \rightarrow {\mathbb {R}}_+\,$ is measurable.

Lemma 11

(Tubes) Let $\varPsi : \varXi \rightrightarrows {\mathbb {R}}^n$ be a measurable tube as in (59) with $\kappa \in L^2_1(\varXi ,{{\mathcal {A}}},\rho )$ non-negative a.e. on $\varXi $. Then, for every $E\in {{\mathcal {A}}}$, $\int _E\varPsi (\xi ) d\rho \subseteq \left[ \int _E\kappa (\xi ) d\rho \right] {\mathbb {B}}\subseteq \left\| {{\kappa }}\right\| _2\rho (E){\mathbb {B}}$.

Proof

The mapping $\varPsi $ in (59) is obviously closed valued and measurable. Therefore, Theorem 10 tells us that ${{\mathcal {S}}}(\varPsi )$ is non-empty. Let $E\in {{\mathcal {A}}}$ and $s\in {{\mathcal {S}}}(\varPsi )$. Then

$$\begin{aligned} \left| \int _Es(\xi ) d\rho \right| \le \int _E|s(\xi )| d\rho \le \int _E \kappa (\xi ) d\rho , \end{aligned}$$

so that $\int _Es(\xi ) d\rho \in \left[ \int _E\kappa (\xi ) d\rho \right] {\mathbb {B}}$. This proves the lemma since $ \int _E \kappa (\xi ) d\rho =\left\langle {{\kappa }}\, ,\, {{\mathcal {X}}}_E\right\rangle \le \left\| {{\kappa }}\right\| _2\rho (E).$

Rights and permissions

Reprints and permissions

About this article

Cite this article

Burke, J.V., Chen, X. & Sun, H. The subdifferential of measurable composite max integrands and smoothing approximation. Math. Program. 181, 229–264 (2020). https://doi.org/10.1007/s10107-019-01441-9

Download citation

Received: 24 March 2018
Accepted: 12 October 2019
Published: 24 October 2019
Issue Date: June 2020
DOI: https://doi.org/10.1007/s10107-019-01441-9

Keywords

Mathematics Subject Classification

90C15

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

The subdifferential of measurable composite max integrands and smoothing approximation

Abstract

Access this article

Similar content being viewed by others

On the resolution of misspecified convex optimization and monotone variational inequality problems

Stochastic proximal gradient methods for nonconvex problems in Hilbert spaces

Nonsmooth spectral gradient methods for unconstrained optimization

References

Acknowledgements