1 Preliminaries

In this paper we investigate nonlinear minmax location problems that generalize both the classical Sylvester problem of location theory (not to be confused with Sylvester's line problem) and the celebrated Apollonius problem. These problems are formulated by means of an extended perturbed minimal time function and treated via a conjugate duality approach, which delivers necessary and sufficient optimality conditions together with characterizations of the optimal solutions in some particular instances. This approach is necessary in order to be able to numerically solve such problems and their corresponding dual problems by means of a proximal method. Applying the mentioned algorithm in MATLAB to the duals of some concrete location problems delivers optimal solutions to the latter faster and with reduced costs. To the best of our knowledge this is the first time such a method is considered for this type of location optimization problems, and the general framework we consider opens the possibility of solving other problems of more or less similar type (arising, for instance, in machine learning) in an analogous way. The original location optimization problem cannot be directly solved numerically by means of the usual algorithms because the involved functions often lack differentiability, while a direct employment of some proximal point method is not possible because of the complicated structure of the objective function, which consists of the maximum of n functions, each containing a composition of functions.

In order to introduce the general nonlinear minmax location problems we propose a new perturbed minimal time function that generalizes the classical minimal time function, introduced more than four decades ago and recently reconsidered by Mordukhovich and Nam in a series of papers (see, for instance, [14,15,16, 19]) and the book [13], as well as several of its recent extensions (cf. [12, 16, 19, 24, 31]). The motivation to investigate such problems is both theoretical and practical, as location type problems arise in various areas of research and real life, such as geometry, physics, economics or health management; applications from these fields are mentioned in our paper as possible interpretations of our results. As suggested, for instance, in [1, 21], solving general location problems as considered in this paper could also prove useful in dealing with some classes of constrained optimization problems, like the ones that appear in machine learning. Actually, the fact that the algorithm we propose is able to successfully solve location optimization problems with large data sets in high dimensions faster than its counterparts from the literature makes us confident regarding a future usage of this technique on big data problems arising in machine learning, for instance those approached by means of support vector techniques.

To be able to deal with the considered general nonlinear minmax location problems by means of conjugate duality we rewrite them as multi-composed optimization problems, for which we have recently proposed a duality approach in [10, 25, 27, 29]. The corresponding necessary and sufficient optimality conditions are then derived. While most of the theoretical results are provided in the general framework of Banach spaces, in the more restrictive setting of Hilbert spaces we were also able to provide characterizations of the optimal solutions of the considered problems by means of the dual optimal solutions. Two special cases of the general problem, motivated by economic interpretations, are discussed in more detail, followed by an exact formula of the projection operator onto the epigraph of the maximum of norms, which may prove to be useful in other applications, too. The fourth section of the paper is dedicated to numerical experiments, presented in a finite-dimensional framework that is specific to most of the possible practical applications of our results. Employing a splitting proximal point method from [3], we solve in MATLAB concrete location optimization problems corresponding to the mentioned special cases and their conjugate duals, rewritten as unconstrained minimization problems. The computational results show that the primal optimal solutions are obtained faster when numerically solving the dual problems. One of these concrete examples was numerically solved in [13, 17] by means of a subgradient method, and a comparison of the computational results is provided as well, stressing once again the superiority of the algorithm proposed in the present paper. Another comparison is made in several examples with the log–exponential smoothing accelerated gradient method proposed in [1], one of them involving a large data set in high dimensions, and our method turns out again to converge faster towards the optimal solution of the considered location problem.

Let X be a Hausdorff locally convex space and \(X^*\) its topological dual space endowed with the weak* topology \(w(X^*,X)\). Thus the dual of \(X^*\) is X. For \(x\in X\) and \(x^*\in X^*\), let \(\langle x^*,x\rangle :=x^*(x)\) be the value of the linear continuous functional \(x^*\) at x. A set \(U\subseteq X\) is called convex if \(t x+(1-t)y\in U\) for all \(x,y\in U\) and \(t\in [0,1]\). A nonempty set \(K\subseteq X\) that satisfies the condition \(t K\subseteq K\) for all \(t\ge 0\) is said to be a cone. Note that any cone contains the origin of the space it lies in, denoted by \(0_{X}\) for the space X. Consider a convex cone \(K\subseteq X\), which induces on X a partial ordering relation “\(\leqq _K\)”, defined by \(\leqq _K:=\{(x,y)\in X\times X: y-x\in K\}\), i.e. for \(x,y \in X\) it holds \(x \leqq _{K} y \Leftrightarrow y-x \in K\). We attach to X a largest element with respect to “\(\leqq _K\)”, denoted by \(+\infty _K\), which does not belong to X, and denote \({\overline{X}}=X\cup \{+ \infty _K\}\). Then it holds \(x\leqq _K +\infty _{K}\) for all \(x\in {\overline{X}}\). We write \(x\le _K y\) when \(x \leqq _{K} y\) and \(x \ne y\), \(\le :=\leqq _{{\mathbb {R}}_+}\) and \(<:=\le _{{\mathbb {R}}_+}\). On \({\overline{X}}\) consider the following operations and conventions: \(x+(+\infty _{K})=(+\infty _{K})+x:=+\infty _{K}\) for all \(x \in {\overline{X}}\text { and }\lambda \cdot (+\infty _{K}):=+\infty _{K}\) for all \(\lambda \in [0,+\infty ]\). \(K^*:=\{x^*\in X^*:~\langle x^*,x\rangle \ge 0~\forall x\in K\}\) is the dual cone of K and by convention \(\langle x^{*},+\infty _K \rangle :=+\infty \) for all \(x^{*}\in K^{*}\). By a slight abuse of notation we denote the extended real space \({\overline{{\mathbb {R}}}}={\mathbb {R}}\cup \{\pm \infty \}\) and consider on it the following operations and conventions: \(\lambda +(+\infty )=(+\infty )+\lambda :=+\infty \) for all \(\lambda \in {[}-\infty ,+\infty ],~ \lambda +(-\infty )=(-\infty )+\lambda :=-\infty \) for all \(\lambda \in {[}-\infty ,+\infty ), ~\lambda \cdot (+\infty ):=+\infty \) for all \(\lambda \in [0,+\infty ],~\lambda \cdot (+\infty ):=-\infty \) for all \(\lambda \in {[}-\infty ,0),~\lambda \cdot (-\infty ):=-\infty \) for all \(\lambda \in (0,+\infty ],~ \lambda \cdot (-\infty ):=+\infty \) for all \(\lambda \in {[}-\infty ,0)\) and \(0(-\infty ):=0\). Given \(S\subseteq X\), we denote its algebraic interior by \(\text {core }S\), its normal cone at \(x\in X\) is \(N_S(x):=\{x^*\in X^*:\langle x^*,y-x\rangle \le 0~\forall ~ y\in S\}\) if \(x\in S\) and \(N_S(x)=\emptyset \) otherwise, its conic hull is \(\text {cone }S:=\{\lambda x:x\in S,~\lambda \ge 0\}\), while if S is convex its strong quasi relative interior (see [6]) is \(\text {sqri}\, S:=\{x\in S:\text {cone }(S-x)\text { is a closed linear subspace}\}\).

For a given function \(f:X\rightarrow {\overline{{\mathbb {R}}}}\) we consider its effective domain \(\text {dom}\, f:=\{x\in X:~f(x)<+\infty \}\) and call f proper if \(\text {dom}\,f \ne \emptyset \) and \(f(x)>-\infty \) for all \(x \in X\). The epigraph of f is \(\text {epi}\, f = \{(x, r)\in X\times {\mathbb {R}}: f(x)\le r\}\). Recall that a function \(f:X\rightarrow {\overline{{\mathbb {R}}}}\) is called convex if \(f(\lambda x+(1-\lambda ) y)\le \lambda f(x)+(1-\lambda )f(y)\) for all \(x,y\in X\) and all \(\lambda \in [0,1]\). For a subset \(A\subseteq X\), its indicator function \(\delta _{A}:X\rightarrow {\overline{{\mathbb {R}}}}\) is

$$\begin{aligned} \delta _{A}(x) := \left\{ \begin{array}{l@{\quad }l} 0,&{} \text {if } x \in A,\\ + \infty , &{}\text {otherwise}, \end{array} \right. \end{aligned}$$

and its support function \(\sigma _{A}:X^*\rightarrow {\overline{{\mathbb {R}}}}\) is \(\sigma _{A}(x^*)=\sup _{x\in A}\langle x^*,x\rangle \). The conjugate function of f with respect to the nonempty subset \(S \subseteq X\) is defined by

$$\begin{aligned} f^{*}_{S}:X^{*} \rightarrow {\overline{{\mathbb {R}}}},\quad f^{*}_{S}(x^{*})=\sup \limits _{x \in S}\{\langle x^{*},x\rangle -f(x)\}. \end{aligned}$$

One has the Young–Fenchel inequality \(f(x)+f^*_S(x^*)\ge \langle x^*,x\rangle \) for all \(x\in S\) and all \(x^*\in X^*\). In the case \(S=X\), \(f^{*}_{S}\) turns into the classical Fenchel–Moreau conjugate function of f denoted by \(f^{*}\). The conjugate of \(f^*\) is said to be the biconjugate function of f and is denoted by \(f^{**}:X\rightarrow {\overline{{\mathbb {R}}}}\). Given the proper functions \(f_i:X\rightarrow \overline{{\mathbb {R}}}\), \(i=1, \ldots , n\), their infimal convolution is \(f_1\square f_2 \square \ldots \square f_n:X\rightarrow {{\overline{{\mathbb {R}}}}}\), \(\big ( f_1\square f_2 \square \ldots \square f_n \big )(x)=\inf \big \{\sum _{i=1}^nf_i(x_i): x_i\in X, i=1, \ldots , n, \sum _{i=1}^nx_i=x \big \}\). We say that the infimal convolution is exact at \(x\in X\) when the infimum in its definition is attained for x.

A function \(f:X\rightarrow {\overline{{\mathbb {R}}}}\) is called lower semicontinuous at \({\overline{x}}\in X\) if \(\liminf _{x\rightarrow {\overline{x}}}f(x)\ge f({\overline{x}})\) and when this function is lower semicontinuous at all \(x\in X\), then we call it lower semicontinuous. The largest lower semicontinuous function nowhere larger than f is its lower semicontinuous envelope \({\bar{f}}:X\rightarrow {\overline{{\mathbb {R}}}}\). Let \(W\subseteq X\) be a nonempty set, then a function \(f:X\rightarrow {\overline{{\mathbb {R}}}}\) is called K-increasing on W if from \(x\leqq _K y\) follows \(f(x)\le f(y)\) for all \(x,y\in W\). When \(W=X\), then we call the function f K-increasing. If we take an arbitrary \(x\in X\) such that \(f(x)\in {\mathbb {R}}\), then we call the set \(\partial f(x):=\{x^*\in X^*:f(y)-f(x)\ge \langle x^*,y-x\rangle ~\forall y\in X\}\) the (convex) subdifferential of f at x, where the elements of this set are called subgradients. Moreover, if \(\partial f(x)\ne \emptyset \), then we say that f is subdifferentiable at x and if \(f(x)\notin {\mathbb {R}}\), then we make the convention that \(\partial f(x):=\emptyset \). Note that the subgradients of f can be characterized by means of \(f^*\), more precisely \(x^*\in \partial f(x)\) if and only if \(f(x)+f^*(x^*)=\langle x^*,x\rangle \), i.e. the Young–Fenchel inequality is fulfilled as an equality for x and \(x^*\).

Let Z be another Hausdorff locally convex space partially ordered by the convex cone \(Q\subseteq Z\) and \(Z^*\) its topological dual space endowed with the weak* topology \(w(Z^*,Z)\). The domain of a vector function \(F:X \rightarrow {\overline{Z}}=Z\cup \{+\infty _{Q}\}\) is \(\text {dom}\,F:=\{x \in X:F(x) \ne +\infty _{Q}\}\). F is called proper if \(\text {dom}\, F \ne \emptyset \). When \(F(\lambda x+(1-\lambda ) y)\leqq _{Q} \lambda F(x)+(1-\lambda )F(y)\) holds for all \(x,y \in X\) and all \(\lambda \in [0,1]\) the function F is said to be Q-convex. The Q-epigraph of F is \(\text {epi}_{Q}F=\{(x,z)\in X\times Z:F(x)\leqq _Q z\}\) and when Q is closed we say that F is Q-\(\text {epi}\)-closed if \(\text {epi}_{Q}F\) is a closed set. Let us mention that in the case \(Z={\mathbb {R}}\) and \(Q={\mathbb {R}}_+\), the notion of Q-\(\text {epi}\)-closedness falls into the one of lower semicontinuity. For a \(z^*\in Q^*\) we define the function \((z^*F):X\rightarrow {\overline{{\mathbb {R}}}}\) by \((z^*F)(x):=\langle z^*,F(x)\rangle \). Then \(\text {dom}(z^*F)=\text {dom}\, F\). Moreover, it is easy to see that if F is Q-convex, then \((z^*F)\) is convex for all \(z^*\in Q^*\). Let us point out that, by the operations we defined on a Hausdorff locally convex space attached with a maximal element and on the extended real space, there holds \(0f=\delta _{\text {dom}\, f}\) and \((0_{Z^*}F)=\delta _{\text {dom}\, F}\) for any \(f:X\rightarrow {\overline{{\mathbb {R}}}}\) and \(F:X \rightarrow {\overline{Z}}\). The vector function F is called positively Q-lower semicontinuous at \(x \in X\) if \((z^*F)\) is lower semicontinuous at x for all \(z^*\in Q^*\). The function F is called positively Q-lower semicontinuous if it is positively Q-lower semicontinuous at every \(x\in X\). Note that if F is positively Q-lower semicontinuous, then it is also Q-\(\text {epi}\)-closed, while the converse statement is not true in general (see [6, Proposition 2.2.19]). \(F:X\rightarrow {\overline{Z}}\) is called (K, Q)-increasing on W if from \(x\leqq _{K} y\) follows \(F(x)\leqq _{Q} F(y)\) for all \(x,y\in W\). When \(W=X\), we call this function (K, Q)-increasing. Last but not least, denote the optimal objective value of an optimization problem (P) by v(P) and note that when an infimum/supremum is attained we write \(\min \)/\(\max \) instead of \(\inf \)/\(\sup \).

Furthermore, let \({\mathcal {H}}\) be a real Hilbert space equipped with the scalar product \(\langle \cdot ,\cdot \rangle _{{\mathcal {H}}}\), where the associated norm \(\Vert \cdot \Vert _{{\mathcal {H}}}\) is defined by \(\Vert y\Vert _{{\mathcal {H}}}:=\sqrt{\langle y,y\rangle _{{\mathcal {H}}}}\) for all \(y\in {\mathcal {H}}\). If \({\mathcal {H}}={\mathbb {R}}^m\), then \(\Vert \cdot \Vert _{{\mathbb {R}}^m}\) is the Euclidean norm and we will write for simplicity just \(\Vert \cdot \Vert \). The proximal point operator of parameter \(\gamma >0\) of a function \(f:{\mathcal {H}}\rightarrow {{\overline{{\mathbb {R}}}}}\) at \(x\in {\mathcal {H}}\) is defined as

$$\begin{aligned} {{\,\mathrm{prox}\,}}_{\gamma f}: {\mathcal {H}} \rightarrow {\mathcal {H}},\ {{\,\mathrm{prox}\,}}_{\gamma f}(x)= \mathop {\hbox {arg min}}\limits _{y\in {\mathcal {H}}} \left\{ \gamma f(y) + \frac{1}{2} \Vert y-x\Vert ^2\right\} . \end{aligned}$$

For more on convex optimization in Hilbert spaces we warmly recommend [3].
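
To fix ideas, here is a small Python sketch (ours, not part of the original text; it assumes the Euclidean setting \({\mathcal {H}}={\mathbb {R}}^m\) with numpy and scipy available, and the helper names are hypothetical) that evaluates the proximal operator in two ways for \(f=\Vert \cdot \Vert \): via the well-known closed form \({{\,\mathrm{prox}\,}}_{\gamma \Vert \cdot \Vert }(x)=(1-\gamma /\Vert x\Vert )_+\, x\) and via a direct numerical minimization of \(\gamma f(y)+\frac{1}{2}\Vert y-x\Vert ^2\).

```python
import numpy as np
from scipy.optimize import minimize

def prox_gamma_norm(x, gamma):
    """Closed-form proximal operator of gamma*||.||_2 (block soft-thresholding)."""
    nx = np.linalg.norm(x)
    return np.zeros_like(x) if nx <= gamma else (1.0 - gamma / nx) * x

def prox_numerical(f, x, gamma):
    """Evaluate prox_{gamma f}(x) by minimizing gamma*f(y) + 0.5*||y - x||^2 directly."""
    res = minimize(lambda y: gamma * f(y) + 0.5 * np.sum((y - x) ** 2), x0=x)
    return res.x

x, gamma = np.array([3.0, -4.0]), 2.0            # ||x|| = 5
print(prox_gamma_norm(x, gamma))                  # closed form: (1 - 2/5) * x = [1.8, -2.4]
print(prox_numerical(np.linalg.norm, x, gamma))   # numerical minimizer, close to the above
```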

2 Nonlinear minmax location problems

As the perturbed minimal time functions play a decisive role in this article, we start this section with some of their properties.

2.1 Properties of the perturbed minimal time function

In order to introduce the perturbed minimal time functions, one needs first to define a gauge. In the literature one can find different functions called gauges, see, for instance, [9] or [22, Section 15]. In the following we call gauge function (known in the literature also as the Minkowski functional) of a set \(C\subseteq X\) the function \(\gamma _C:X\rightarrow {\overline{{\mathbb {R}}}}\), defined by

$$\begin{aligned} \gamma _C(x): = \inf \{\lambda >0:x\in \lambda C\}. \end{aligned}$$

Note that the gauge function can also take the value \(+\infty \) if there does not exist an element \(\lambda >0\) such that \(x\in \lambda C\), as by definition it holds \(\inf \emptyset =+\infty \). From the definition it follows that \(\text {dom}\, \gamma _C = \text {cone }C\) if \(0\in C\) and \(\text {dom}\, \gamma _C = \text {cone }C{\setminus } \{0\}\) if \(0\notin C\). As \(\gamma _\emptyset \equiv + \infty \), we consider further the set C to be nonempty. The conjugate function of \(\gamma _C\) is (cf. [6, Example 2.3.4], as the additional hypotheses imposed there on C are actually not employed for this formula) \(\gamma _C^*:X^*\rightarrow {\overline{{\mathbb {R}}}}\)

$$\begin{aligned} \gamma _C^*(x^*)={\left\{ \begin{array}{ll} 0, &{} \text {if } \sigma _C(x^*)\le 1,\\ +\infty , &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$

Furthermore, the polar set of C is \(C^0:=\{x^*\in X^*:\sigma _C(x^*)\le 1\}\), and by means of the polar set the dual gauge of C is defined by

$$\begin{aligned} \gamma _{C^0}(x^*):=\sup _{x\in C}\langle x^*,x\rangle =\sigma _C(x^*). \end{aligned}$$
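
As a small numerical illustration (ours, added for convenience; it assumes a finite-dimensional setting, a membership oracle for C and the availability of numpy, the helper name being hypothetical), when C is convex and contains \(0_X\) the condition \(x\in \lambda C\) is monotone in \(\lambda \), so \(\gamma _C(x)\) can be evaluated by bisection and checked against the known closed forms for the Euclidean unit ball (where \(\gamma _C=\Vert \cdot \Vert \)) and for a box.

```python
import numpy as np

def gauge(x, in_C, tol=1e-9):
    """Evaluate gamma_C(x) = inf{lam > 0 : x in lam*C} by bisection, given a membership
    oracle in_C of a convex set C containing the origin (membership of x in lam*C is
    then monotone in lam, which is what makes the bisection valid)."""
    hi = 1.0
    for _ in range(80):                 # find some lam with x in lam*C
        if in_C(x / hi):
            break
        hi *= 2.0
    else:
        return np.inf                   # x lies outside cone(C): gamma_C(x) = +infinity
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_C(x / mid):
            hi = mid
        else:
            lo = mid
    return hi

ball = lambda y: np.linalg.norm(y) <= 1.0          # C = Euclidean unit ball
box = lambda y: np.all(np.abs(y) <= [2.0, 0.5])    # C = [-2, 2] x [-0.5, 0.5]

x = np.array([3.0, 1.0])
print(gauge(x, ball), np.linalg.norm(x))           # both ~ sqrt(10)
print(gauge(x, box), max(3.0 / 2.0, 1.0 / 0.5))    # both ~ 2
```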

Remark 2.1

The conjugate of \(\gamma _C\) can equivalently be expressed by

$$\begin{aligned}\gamma _C^*(x^*): = {\left\{ \begin{array}{ll} 0, &{} \text {if } \gamma _{C^0}(x^*)\le 1,\\ +\infty , &{} \text {otherwise}, \end{array}\right. } = \delta _{C^0}(x^*)\;\, \forall x^*\in X^*. \end{aligned}$$

It is well-known that the gauge function fulfills the following generalized Cauchy–Schwarz inequality. Since its proof is rarely found in the literature, we give it here.

Lemma 2.1

It holds \(\gamma _C(x)\gamma _{C^0}(x^*)\ge \langle x^*,x\rangle \) for all \(x\in X\), \(x^*\in X^*\).

Proof

Let \(x^*\in X^*\) and \(x\in X\). If \(\langle x^*,x\rangle \le 0\), then there is nothing to prove, as the gauge and the dual gauge are nonnegative functions.

Let \(\langle x^*,x\rangle >0\). If \(\gamma _C(x)>0\) then we have that

$$\begin{aligned} \gamma _{C^0}(x^*)\ge \sup _{\lambda >0,~\gamma _C(x)\le \frac{1}{\lambda }}\lambda \langle x^*,x\rangle =\langle x^*,x\rangle \sup _{0<\lambda \le \frac{1}{\gamma _C(x)}}\lambda =\frac{1}{\gamma _C(x)}\langle x^*,x\rangle , \end{aligned}$$

i.e. \(\gamma _C(x)\gamma _{C^0}(x^*)\ge \langle x^*,x\rangle \).

Otherwise, if \(\gamma _C(x)=0\), then one has that

$$\begin{aligned} \gamma _{C^0}(x^*)\ge \sup _{\lambda >0,~0=\gamma _C(x)\le \frac{1}{\lambda }}\lambda \langle x^*,x\rangle =+\infty , \end{aligned}$$

i.e. \(\gamma _C(x)\gamma _{C^0}(x^*)=0(+\infty )\ge \langle x^*,x\rangle \).\(\square \)

Given a nonempty set \(\Omega \subset X\) and a proper function \(f:X\rightarrow {\overline{{\mathbb {R}}}}\), we define the extended perturbed minimal time function \({\mathcal {T}}_{\Omega ,f}^C:X\rightarrow {\overline{{\mathbb {R}}}}\) as the infimal convolution of \(\gamma _C\), f and \(\delta _{\Omega }\), i.e. \({\mathcal {T}}_{\Omega ,f}^C:=\gamma _C\square f\square \delta _{\Omega }\), more precisely

$$\begin{aligned} {\mathcal {T}}_{\Omega ,f}^C(x):=\inf _{y\in X,~z\in \Omega }\left\{ \gamma _C(x-y-z)+f(y)\right\} . \end{aligned}$$
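
The following lines (a Python sketch of ours, assuming the Euclidean plane and the availability of numpy and scipy; it is not part of the original exposition) evaluate this infimal convolution directly by a general-purpose solver in a case where the value is known exactly: for \(C\) the Euclidean unit ball, \(f=\Vert \cdot \Vert \) and \(\Omega =\{q\}\) one has \({\mathcal {T}}_{\Omega ,f}^C(x)=\inf _{y\in X}\{\Vert x-y-q\Vert +\Vert y\Vert \}=\Vert x-q\Vert \), the infimum being attained at \(y=0\).

```python
import numpy as np
from scipy.optimize import minimize

# Evaluate T_{Omega,f}^C(x) = inf_{y in X, z in Omega} gamma_C(x - y - z) + f(y) numerically
# for C = Euclidean unit ball (gamma_C = ||.||), f = ||.|| and Omega = {q}.
# Since ||.|| box ||.|| = ||.||, the exact value is ||x - q||, which the solver should recover.
x, q = np.array([4.0, 1.0]), np.array([1.0, 1.0])

obj = lambda y: np.linalg.norm(x - y - q) + np.linalg.norm(y)
res = minimize(obj, x0=np.array([1.0, 1.0]), method="Nelder-Mead")

print(res.fun)                  # ~ 3.0
print(np.linalg.norm(x - q))    # exact value: 3.0
```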

Remark 2.2

To the best of our knowledge the function \({\mathcal {T}}_{\Omega ,f}^C\) has not been considered in this form in the literature yet and it covers as special cases several important functions. For instance, if \(f=\delta _{\{0_{X}\}}\), then one gets the classical minimal time function (see [13,14,15]) \({\mathcal {T}}_{\Omega }^C:X\rightarrow {{\overline{{\mathbb {R}}}}},\ {\mathcal {T}}_{\Omega }^C(x):=\inf \{t\ge 0: (x-t C)\cap \Omega \ne \emptyset \}\), which, when \(\Omega =\{0_X\}\), collapses to the gauge function. In [12] one finds two perturbations of the classical minimal time function, \(\gamma _C\square f\) (introduced in [31] and motivated by a construction specific to differential inclusions) and \(\gamma _C\square (f+ \delta _{\Omega })\), that contains as a special case the perturbed distance function introduced in [24]. The latter function has motivated us to introduce \({\mathcal {T}}_{\Omega ,f}^C\), where the function f and the set \(\Omega \) no longer share the same variable and can thus be split in the dual representations. Other generalizations of the classical minimal time function can be found, for instance, in [5, 16, 19]. The minimal time function and its generalizations have been employed in various areas of research such as location theory (cf. [13, 19]), nonsmooth analysis (cf. [5, 12, 14,15,16, 19, 24, 31]), control theory and Hamilton–Jacobi partial differential equations (mentioned in [5]), best approximation problems (cf. [24]) and differential inclusions (cf. [31]). Note also the connection observed in [19] between the minimal time function and the scalarization function due to Tammer (Gerstewitz) considered in vector optimization.

Moreover, as \({\mathcal {T}}_{\Omega ,f}^C\) is an infimal convolution its conjugate function turns into (see [6, Proposition 2.3.8.(b)]) \(({\mathcal {T}}_{\Omega ,f}^C)^*=\gamma _C^*+f^*+\sigma _{\Omega } = \delta _{C^0}+f^*+\sigma _{\Omega }\) (see Remark 2.1).
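
For instance (a quick illustration of ours in the Euclidean setting), for \(C\) the closed unit ball \(B\) (so that \(C^0=B\)), \(f=\delta _{\{p\}}\) and \(\Omega =\{q\}\) one has \({\mathcal {T}}_{\Omega ,f}^C(x)=\Vert x-p-q\Vert \) and the above formula yields

$$\begin{aligned} ({\mathcal {T}}_{\Omega ,f}^C)^*(x^*)=\delta _{B}(x^*)+\langle x^*,p+q\rangle ={\left\{ \begin{array}{ll} \langle x^*,p+q\rangle , &{} \text {if } \Vert x^*\Vert \le 1,\\ +\infty , &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$

which is indeed the conjugate of the shifted norm \(\Vert \cdot -(p+q)\Vert \).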

Since \(\text {dom}\, \gamma _C\ne \emptyset \) and \(\gamma _C\) is a nonnegative function, it follows by [6, Lemma 2.3.1.(b)] that \(\gamma _C^*=\delta _{C^0}\) is proper, convex and lower semicontinuous. If f has an affine minorant, then \(f^*\) is a proper, convex and lower semicontinuous function, too. Further, under the additional assumption \(C^0\cap \text {dom}\, f^*\cap \text {dom}\,\sigma _\Omega \ne \emptyset \), [6, Theorem 2.3.10] yields that the biconjugate of \({\mathcal {T}}_{\Omega ,f}^C\) is given by \(({\mathcal {T}}_{\Omega ,f}^C)^{**}=\overline{\gamma _C^{**}\square f^{**}\square \sigma ^*_{\Omega }}\), and one can derive as byproducts conjugate and biconjugate formulae for the classical minimal time function and its other extensions mentioned in Remark 2.2.

Theorem 2.1

Let C be convex, closed and contain \(0_X\), \(\Omega \) be closed and convex and \(f:X\rightarrow {\overline{{\mathbb {R}}}}\) be also convex and lower semicontinuous such that \(C^0\cap \text {dom}\, f^*\cap \text {dom}\,\sigma _\Omega \ne \emptyset \). Suppose that one of the following holds

(a)

    \(\text {epi}\,\gamma _C+\text {epi}\,f+(\Omega \times {\mathbb {R}}_+)\) is closed,

(b)

    there exists an element \(x^*\in C^0\cap \text {dom}\, f^*\cap \text {dom}\, \sigma _{\Omega }\) such that two of the functions \(\delta _{C^0}, f^* \) and \(\sigma _{\Omega }\) are continuous at \(x^*\).

Then \({\mathcal {T}}_{\Omega ,f}^C\) is proper, convex and lower semicontinuous and moreover, it holds

$$\begin{aligned} {\mathcal {T}}_{\Omega ,f}^C(x)=\min _{y\in X,~z\in \Omega }\left\{ \gamma _C(x-y-z)+f(y)\right\} \quad \forall x\in X, \end{aligned}$$

i.e. the infimal convolution of \(\gamma _C\), f and \(\delta _{\Omega }\) is exact.

Proof

As C is closed and convex such that \(0_X\in C\), it follows by [27, Theorem 1] that \(\gamma _C\) is proper, convex and lower semicontinuous. Further, the nonemptiness, closedness and convexity of \(\Omega \) imply the properness, convexity and lower semicontinuity of \(\delta _{\Omega }\). Hence, one gets from the Fenchel–Moreau Theorem that \(\gamma _C^{**}=\gamma _C\), \(f^{**}=f\) and \(\delta ^{**}_{\Omega }=\delta _{\Omega }\), and the desired statement follows from [6, Theorem 3.5.8.(a)], taking into consideration that the conjugate functions of f, \(\delta _{\Omega }\) and \(\gamma _C\) are proper, convex and lower semicontinuous (as noted above).\(\square \)

Remark 2.3

Under the hypotheses of Theorem 2.1 the subdifferential of \({\mathcal {T}}_{\Omega ,f}^C\) can be written for any \(x\in X\) as \(\partial {\mathcal {T}}_{\Omega ,f}^C(x)= \partial \gamma _C(x-y-z) \cap \partial f(y) \cap N_\Omega (z)\), where y and z are the points where the minimum in the definition of the infimal convolution is attained. Note, moreover, that the subdifferential of \(\gamma _C\) at any \(x\in X\) coincides with the face of \(C^0\) exposed by x (cf. [11]), i.e. \(\partial \gamma _C (x) = \{x^*\in C^0: \langle x^*, x\rangle = \sigma _{C^0}(x)\}\).
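
In the Euclidean case \(C=B\), the closed unit ball (a small worked example of ours), one has \(C^0=B\) and \(\sigma _{C^0}=\Vert \cdot \Vert \), so the above description of \(\partial \gamma _C\) recovers the familiar subdifferential of the norm,

$$\begin{aligned} \partial \gamma _C(x)=\{x^*\in B:\langle x^*,x\rangle =\Vert x\Vert \}={\left\{ \begin{array}{ll} \{x/\Vert x\Vert \}, &{} \text {if } x\ne 0_X,\\ B, &{} \text {if } x=0_X. \end{array}\right. } \end{aligned}$$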

Remark 2.4

Results connected to the situation when the regularity condition (b) in Theorem 2.1 is fulfilled can be found, for instance, in [23]. The support function of a compact convex set is real valued and continuous; in the absence of compactness, however, it is guaranteed to be continuous only on the (relative) interior of its domain, while an indicator function is continuous over the interior of the corresponding set.

Taking f to be a gauge or an indicator function, one obtains the following geometrical interpretations of the extended perturbed minimal time function.

Remark 2.5

Let \(C,\Omega ,G\subseteq X\) be convex and closed sets such that \(0_X\in C\cap G\). Then

$$\begin{aligned} {\mathcal {T}}_{\Omega ,\gamma _G}^{-C}(x)= & {} \inf _{\begin{array}{c} \alpha ,\beta>0,~z\in \Omega ,~y\in X,\\ x-y-z\in -\alpha C,~y\in \beta G \end{array}}\{\alpha +\beta \}=\inf _{\begin{array}{c} \alpha ,\beta>0,~z\in \Omega ,~k\in X,\\ x-k\in -\alpha C,~k-z\in \beta G \end{array}}\{\alpha +\beta \}\\= & {} \inf _{\begin{array}{c} \alpha ,\beta >0,\\ (x+\alpha C)\cap (\Omega +\beta G)\ne \emptyset \end{array}}\{\alpha +\beta \}. \end{aligned}$$

The last formula suggests interpreting \(\alpha \) as the minimal time needed for the given point x to reach the set \(\Omega \) along the constant dynamics \(-C\), while \(\Omega \) is moving in the direction of x with respect to the constant dynamics characterized by the set G. The value \(\beta \) then gives the minimal time needed for \(\Omega \) to reach x.

Remark 2.6

Let \(S\subseteq X\), \(\Omega \) and C be convex and closed, with \(S\ne \emptyset \) and \(0_X\in C\). Then

$$\begin{aligned} {\mathcal {T}}_{\Omega ,\delta _S}^{-C}(x)= & {} \inf \{\lambda>0:y\in S,~z\in \Omega ,~x-y-z\in -\lambda C\}\\= & {} \inf \{\lambda >0:(x+\lambda C)\cap (S+\Omega )\ne \emptyset \}. \end{aligned}$$

The extended perturbed minimal time function reduces to the classical minimal time function with the target set \(S+\Omega \), i.e. if the set C describes constant dynamics, then \({\mathcal {T}}_{\Omega ,\delta _S}^{-C}(x)\) is the minimal time \(\lambda >0\) needed for the point x to reach the target set \(S+\Omega \) (see for instance [13]). However, one can also write

$$\begin{aligned} {\mathcal {T}}_{\Omega ,\delta _S}^{-C}(x)= & {} \inf \{\lambda>0:y\in S,~z\in \Omega ,~x-y-z\in -\lambda C\}\nonumber \\= & {} \inf \{\lambda >0:(x+S+\lambda C)\cap \Omega \ne \emptyset \}, \end{aligned}$$
(1)

and, when C characterizes again constant dynamics, \({\mathcal {T}}_{\Omega ,\delta _S}^{-C}(x)\) can be understood as the minimal time \(\lambda > 0\) needed for the set S translated by the point x to reach the target set \(\Omega \).
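
As a small numerical illustration of this interpretation (ours, in the Euclidean plane with numpy; the helper name is hypothetical), take \(C\) the Euclidean unit ball (so that \(-C=C\)), \(S=\{p\}\) and \(\Omega \) the closed ball \(B(c,r)\): the infimum in the definition is attained at \(y=p\) and at the projection of \(x-p\) onto \(\Omega \), so that \({\mathcal {T}}_{\Omega ,\delta _{\{p\}}}^{-C}(x)=\max \{\Vert x-p-c\Vert -r,0\}\).

```python
import numpy as np

def T_ball_target(x, p, c, r):
    """T_{Omega, delta_{p}}^{-C}(x) for C = -C = Euclidean unit ball and Omega = B(c, r):
    y is forced to equal p and the optimal z is the projection of x - p onto B(c, r)."""
    u = x - p
    z = c + (u - c) * (r / max(np.linalg.norm(u - c), r))   # projection of u onto B(c, r)
    return np.linalg.norm(u - z)

x, p, c, r = np.array([5.0, 1.0]), np.array([1.0, 0.0]), np.array([1.0, 1.0]), 1.0
print(T_ball_target(x, p, c, r))                    # 2.0
print(max(np.linalg.norm(x - p - c) - r, 0.0))      # closed form: ||x - p - c|| - r = 2.0
```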

Remark 2.7

When C is convex and closed with \(0\in C\), \(\Omega \) is convex and compact and \(f=\delta _{\{0_X\}}\), then \(\sigma _{\Omega }\) is continuous and \({\mathcal {T}}_{\Omega ,\delta _{\{0_X\}}}^C\) is proper, convex and lower semicontinuous by Theorem 2.1, and can be written as \({\mathcal {T}}_{\Omega ,\delta _{\{0_X\}}}^C=\min _{z\in \Omega }\gamma _C(\cdot -z)\). This statement can also be found in the special case \(X={\mathbb {R}}^n\) in [13, Theorem 3.33 and Theorem 4.7]. Moreover, in [12] it is assumed that \(\gamma _C \square (f+\delta _{\Omega })\) is exact, under similar hypotheses that would actually guarantee this outcome.

Remark 2.8

If \(0_{X}\in \text {core }C\), then \(\gamma _C\) has full domain and consequently so does the corresponding extended perturbed minimal time function, since in general \(\text {dom}\, {\mathcal {T}}_{\Omega ,f}^C=\text {dom}\, \gamma _C+\text {dom}\, f+\text {dom}\, \delta _{\Omega }\).

3 Duality results

3.1 Location problem with perturbed minimal time functions

In [27] the authors approached nonlinear minmax location problems by means of conjugate duality in the case where the distances are measured by gauge functions. In this section we investigate such location problems in a more general setting, namely where the distances are measured by perturbed minimal time functions.

Let X be a Banach space (note that most of the following investigations can be extended to a Fréchet space, too) and \(a_i\in {\mathbb {R}}_+\), \(i=1, \ldots , n\), be given nonnegative set-up costs, where \(n\ge 2\) and consider the following generalized location problem

$$\begin{aligned} (P^S_{h,{\mathcal {T}}})&\inf _{x\in S}\max _{1\le i\le n}\left\{ h_i\left( {\mathcal {T}}_{\Omega _i,f_i}^{C_i}(x)\right) +a_i\right\} , \end{aligned}$$

where \(S\subseteq X\) is nonempty, closed and convex, \(C_i\subseteq X\) is closed and convex with \(0_X\in \text {int }C_i\), \(\Omega _i\subseteq X\) is nonempty, convex and compact, \(f_i:X\rightarrow {\overline{{\mathbb {R}}}}\) is proper, convex and lower semicontinuous, \(h_i:{\mathbb {R}}\rightarrow {\overline{{\mathbb {R}}}}\) with \(h_i(x)\in {\mathbb {R}}_+\), if \(x\in {\mathbb {R}}_+\), and \(h_i(x)=+\infty \), otherwise, is proper, convex, lower semicontinuous and increasing on \({\mathbb {R}}_+\), \(i=1, \ldots , n\).

Note that the assumptions made above yield that \(0_{X^*}\in C^0_i\cap \text {dom}\, \sigma _{\Omega _i}\cap \text {dom}\, f_i^*\) and, as \(\gamma ^*_{C_i}=\delta _{C^0_i}\) and \(\sigma _{\Omega _i}\) are continuous functions (since \(0_X\in \text {int }C_i\) and \(\Omega _i\) is convex and compact), one gets by Theorem 2.1 that \({\mathcal {T}}_{\Omega _i,f_i}^{C_i}\) is a proper, convex and lower semicontinuous function with full domain and thus continuous, \(i=1, \ldots , n\). Moreover, since \(h_i\) is a proper, convex, lower semicontinuous and increasing function, \(i=1, \ldots , n\), it follows that the objective function of \((P^S_{h,{\mathcal {T}}})\) is proper, convex and lower semicontinuous, which means that \((P^S_{h,{\mathcal {T}}})\) is a convex optimization problem.

Now we analyze how the location problem \((P^S_{h,{\mathcal {T}}})\) can be understood in the simpler situation where the function \(h_i\) is linear continuous on \((0, +\infty )\) and \(f_i\) is the indicator function of a nonempty, closed and convex subset of X, \(i=1, \ldots , n\).

Remark 3.1

In the context of Remark 2.5, let us consider the following concrete minmax location problem, where \(h_i (x)= x + \delta _{{\mathbb {R}}_+}(x)\), \(x\in {\mathbb {R}}\), \(a_i=0\), and \(C_i\), \(G_i\) and \(\Omega _i\) are closed and convex sets such that \(0_X\in C_i\cap G_i\) for all \(i=1, \ldots , n\),

$$\begin{aligned}&(P^S_{\gamma _G,{\mathcal {T}}}) \inf _{x\in X}\max _{1\le i\le n}\left\{ {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{-C_i}(x)\right\} =\inf _{\begin{array}{c} x\in X,~t\in {\mathbb {R}},~{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{-C_i}(x)\le t,\\ i=1, \ldots , n \end{array}}t\\&\quad =\inf _{\begin{array}{c} x\in X,~t\in {\mathbb {R}},~\inf \{\alpha _i+\beta _i>0:(x+\alpha _i C_i)\cap (z_i+\beta _iG_i)\ne \emptyset \}\le t,\\ z_i\in \Omega _i, i=1, \ldots , n \end{array}}t=\inf _{\begin{array}{c} x\in X,~\alpha _i,~\beta _i,~t> 0,~\alpha _i+\beta _i\le t, z_i\in \Omega _i,\\ (x+\alpha _i C_i)\cap (z_i+\beta _iG_i)\ne \emptyset ,~ i=1, \ldots , n \end{array}}t. \end{aligned}$$

The last formulation allows the following economic interpretation. Given n countries, each with a growing demand \(G_i\) for a product and an average income (or average budget) characterized by the set \(\Omega _i\), \(i=1, \ldots , n\), consider a company, which produces and sells this product, planning to build a production facility. The production speed of the production facility as well as the preference of the company for a country are characterized by the sets \(C_i\), \(i=1, \ldots , n\). Then the objective of the company is to determine a location \({\overline{x}}\) for a production facility such that the total demand for the product can be satisfied in the shortest time, i.e. the company wants to enter all lucrative markets as fast as possible. Additionally, in the general case when the functions \(h_1, \ldots , h_n\) are not necessarily linear (over their domains), these can be seen as cost or production functions, while the set-up costs \(a_1, \ldots , a_n\) are the costs of testing the product to meet the various specifications required by each of the n countries.

Remark 3.2

When \(L_i\subseteq X\), \(i=1, \ldots , n\), are nonempty, closed and convex sets, \(f_i=\delta _{L_i}\) and \(h_i= \cdot + \delta _{{\mathbb {R}}_+}(\cdot )\), \(i=1, \ldots , n\), \((P^S_{h,{\mathcal {T}}})\) reads as (see also Remark 2.6)

$$\begin{aligned} (P^S_{{\mathcal {T}}})&\inf _{x\in S}\max _{1\le i\le n}\left\{ {\mathcal {T}}_{\Omega _i,\delta _{L_i}}^{C_i}(x)+a_i\right\} =\inf \limits _{\begin{array}{c} x\in S,~t\in {\mathbb {R}},\\ {\mathcal {T}}_{\Omega _i,\delta _{L_i}}^{C_i}(x)+a_i\le t,\\ i=1, \ldots , n \end{array}}t= \inf \limits _{\begin{array}{c} x\in S,~t\in {\mathbb {R}},\\ \inf \left\{ \lambda _i>0:(x-\lambda _iC_i)\cap (\Omega _i+L_i)\ne \emptyset \right\} +a_i\le t,\\ i=1, \ldots , n \end{array}}t \end{aligned}$$

and can be seen as finding a point \(x\in S\) and the smallest number \(t>0\) such that

$$\begin{aligned} (x-(t-a_i)C_i)\cap (\Omega _i+L_i)\ne \emptyset \quad \forall i=1, \ldots , n, \end{aligned}$$
(2)

where \(C_i\) defines a generalized ball with radius \(t-a_i\), \(i=1, \ldots , n\) (see [13]). This approach is especially useful if the target set is hard to handle, but can be split into a Minkowski sum of two simpler sets \(\Omega _i\) and \(L_i\), \(i=1, \ldots , n\), as happens for instance with rounded rectangles, which can be written as sums of rectangles and circles. Note also that [2] addresses the situation when the projection onto a Minkowski sum of closed convex sets coincides with the sum of the projections onto these sets. Alternatively, \((P^S_{{\mathcal {T}}})\) can be written as

$$\begin{aligned} (P^S_{{\mathcal {T}}})&\inf _{x\in S}\max _{1\le i\le n}\left\{ {\mathcal {T}}_{\Omega _i,\delta _{L_i}}^{C_i}(x)+a_i\right\} = \inf \limits _{\begin{array}{c} x\in S,~t\in {\mathbb {R}},\\ \inf \left\{ \lambda _i>0:(x-L_i-\lambda _iC_i)\cap \Omega _i\ne \emptyset \right\} +a_i\le t,\\ i=1, \ldots , n \end{array}}t, \end{aligned}$$

which allows the interpretation as finding a point \(x\in S\) and the smallest number \(t>0\) such that

$$\begin{aligned} (x-L_i-(t-a_i)C_i)\cap \Omega _i\ne \emptyset \quad \forall i=1, \ldots ,n. \end{aligned}$$
(3)

Both (2) and (3) are generalizations of the classical Sylvester problem that consists in finding the smallest circle that encloses finitely many given points.
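
Since such problems are solved numerically later in the paper, where the proposed approach is also compared with a subgradient method (cf. [13, 17]), we illustrate the classical Sylvester problem with a plain, slowly convergent subgradient scheme (a Python sketch of ours with hypothetical names, assuming numpy): it minimizes \(\max _{1\le i\le n}\Vert x-p_i\Vert \) for three points whose smallest enclosing circle is their circumcircle, with center \((1,0.75)\) and radius \(1.25\).

```python
import numpy as np

def sylvester_subgradient(points, iters=20000):
    """Plain subgradient method for min_x max_i ||x - p_i||: at every step pick an index
    attaining the maximum, take the corresponding unit vector as a subgradient and use
    the diminishing step size 1/k, keeping track of the best iterate found so far."""
    x = points.mean(axis=0)
    best_x, best_val = x, np.max(np.linalg.norm(points - x, axis=1))
    for k in range(1, iters + 1):
        dists = np.linalg.norm(points - x, axis=1)
        i = int(np.argmax(dists))
        x = x - (1.0 / k) * (x - points[i]) / dists[i]
        val = np.max(np.linalg.norm(points - x, axis=1))
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

pts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0]])
center, radius = sylvester_subgradient(pts)
print(center, radius)    # approximately (1, 0.75) and 1.25 (circumcircle of the triangle)
```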

In order to approach the problem \((P^S_{h,{\mathcal {T}}})\) by means of the conjugate duality concept introduced in [25] (with \(X_0={\mathbb {R}}^n\) partially ordered by the convex cone \(K_0={\mathbb {R}}^n_+\), \(X_1=X^n\) partially ordered by the trivial cone \(K_1=\{0_{X^n}\}\) and \(X_2=X\)), we consider the following functions

$$\begin{aligned} f:{\mathbb {R}}^n\rightarrow {\overline{{\mathbb {R}}}},\quad f(z):=\left\{ \begin{array}{l@{\quad }l} \max \limits _{1\le i\le n}\{h_i(z_i)+a_i\},&{} \text {if } z=(z_1, \ldots ,z_n)^\top \in {\mathbb {R}}^n_+,\\ + \infty ,&{} \text {otherwise}, \end{array} \right. \end{aligned}$$

\(F:X^n\rightarrow {\mathbb {R}}^n\), \(F(y_1, \ldots , y_n):=\left( {\mathcal {T}}_{\Omega _1,f_1}^{C_1}(y_1), \ldots ,{\mathcal {T}}_{\Omega _n,f_n}^{C_n}(y_n)\right) ^\top \) and \(G:X\rightarrow X^n\), \(G(x):=(x, \ldots ,x)\). With these newly introduced functions we can write the optimization problem \((P^S_{h,{\mathcal {T}}})\) as a multi-composed optimization problem (cf. [25, 27])

$$\begin{aligned} (P^S_{h,{\mathcal {T}}})&\inf _{x\in S}(f\circ F\circ G)(x). \end{aligned}$$

Notice that the function f is proper, convex, \({\mathbb {R}}_+^n\)-increasing on \(F(\text {dom}\, F)+K_0=\text {dom}\, f={\mathbb {R}}^n_+\) and lower semicontinuous. Moreover, as the functions \({\mathcal {T}}_{\Omega _i,f_i}^{C_i}\), \(i=1, \ldots , n\), are proper, convex and lower semicontinuous, it is obvious that the function F is proper, \({\mathbb {R}}^n_+\)-convex and \({\mathbb {R}}^n_+\)-\(\text {epi}\)-closed. In addition, as the function G is linear continuous, the function F does not need to be monotone as required in the general theory in [10, 25, 27, 29]. Employing the duality concept introduced in [10, 25] we attach to \((P^S_{h,{\mathcal {T}}})\) the following conjugate dual problem

$$\begin{aligned} (D^S_{h,{\mathcal {T}}}) \sup _{\begin{array}{c} z_i^{*}\in {\mathbb {R}}_+,~w_i^{*}\in X^*,\\ i=1, \ldots , n \end{array}}\Bigg \{\inf \limits _{x\in S}\left\{ \sum _{i=1}^n\langle w_i^{*},x\rangle \right\} -f^*(z^{*})-(z^{*}F)^*(w^{*})\Bigg \}, \end{aligned}$$

where \(z^{*}=(z_1^{*}, \ldots ,z_n^{*})^\top \in {\mathbb {R}}_+^n\) and \(w^{*}=(w_1^{*}, \ldots , w_n^{*})\in (X^*)^n\) are the dual variables. By [27, Theorem 4] one has

$$\begin{aligned} f^*(z_1^{*}, \ldots ,z_n^{*})=\min _{\begin{array}{c} \sum \limits _{i=1}^n \lambda _i\le 1,~\lambda _i\ge 0,\\ i=1, \ldots , n \end{array}}\left\{ \sum _{i=1}^n [(\lambda _ih_i)^*(z_i^{*})-\lambda _ia_i]\right\} , \end{aligned}$$

while \((z^{*}F)^*(w^{*})=\sum _{i=1}^n\left( z_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}\right) ^*(w_i^{*})\), thus \((D^S_{h,{\mathcal {T}}})\) becomes

$$\begin{aligned} (D^S_{h,{\mathcal {T}}})~~\sup \limits _{\begin{array}{c} \sum \limits _{i=1}^n \lambda _i\le 1,~\lambda _i,z_i^{*}\ge 0,\\ w_i^{*}\in X^*,~ i=1, \ldots , n \end{array}}&\Bigg \{-\sigma _S\left( -\sum _{i=1}^n w_i^{*}\right) -\sum \limits _{i=1}^n[(\lambda _ih_i)^*(z_i^{*})-\lambda _ia_i] \\&-\sum \limits _{i=1}^n\left( z_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}\right) ^*(w_i^{*})\Bigg \}. \end{aligned}$$

In order to investigate further this dual problem, we separate in the sum \(\sum _{i=1}^n(\lambda _ih_i)^*\) the terms with \(\lambda _i>0\) and the terms with \(\lambda _i=0\) as well as in \(\sum _{i=1}^n\big (z_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}\big )^*\) the terms with \(z_i^{*}>0\) and the terms with \(z_i^{*}=0\) in \((D^S_{h,{\mathcal {T}}})\). Denote \(I=\left\{ i\in \{1, \ldots , n\}:z_i^{*}>0\right\} \) and \(R=\left\{ r\in \{1, \ldots , n\}:\lambda _r>0\right\} \). If \(i\in \{1, \ldots , n\} {\setminus } I\) it holds \((0\cdot {\mathcal {T}}_{\Omega _i,f_i}^{C_i})^*=\sigma _X= \delta _{\{0_{X^*}\}}\), while when \(i\in I\) one gets

$$\begin{aligned} \left( z_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}\right) ^*(w_i^{*})=\left\{ \begin{array}{l@{\quad }l} z_i^{*}f_i^*\left( \frac{1}{z_i^{*}}w_i^{*}\right) +\sigma _{\Omega _i}(w^{*}_{i}),&{} \text {if } \gamma _{C^0_i}(w^{*}_{i})\le z^{*}_{i},\\ +\infty ,&{}\text {otherwise.} \end{array} \right. \end{aligned}$$
(4)

Further, consider the case \(r\in \{1, \ldots , n\} {\setminus } R\), i.e. \(\lambda _r=0\); then one has, since \(z_r^{*}\ge 0\),

$$\begin{aligned} (0\cdot h_r)^*(z_r^{*})=\sup _{z_{r}\ge 0}\{z^{*}_{r}z_{r}\}=\left\{ \begin{array}{l@{\quad }l} 0, &{}\text {if } z^{*}_{r}=0,\\ +\infty ,&{}\text {otherwise.} \end{array} \right. \end{aligned}$$
(5)

For \(r\in R\), i.e. \(\lambda _r>0\), one obtains

$$\begin{aligned} (\lambda _rh_r)^*(z_r^{*})=\lambda _rh_r^*\left( \frac{z_r^{*}}{\lambda _r}\right) . \end{aligned}$$
(6)
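
Indeed, since \(\lambda _r>0\), formula (6) follows directly from the definition of the conjugate function:

$$\begin{aligned} (\lambda _rh_r)^*(z_r^{*})=\sup _{x\in {\mathbb {R}}}\left\{ z_r^{*}x-\lambda _rh_r(x)\right\} =\lambda _r\sup _{x\in {\mathbb {R}}}\left\{ \frac{z_r^{*}}{\lambda _r}x-h_r(x)\right\} =\lambda _rh_r^*\left( \frac{z_r^{*}}{\lambda _r}\right) . \end{aligned}$$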

Hence, formula (5) implies that if \(r\notin R\), then \(z_r^{*}=0\) (otherwise \((0\cdot h_r)^*(z_r^{*})=+\infty \) and the corresponding dual objective value is \(-\infty \), hence not relevant for the dual problem), which means that \(I\subseteq R\). Therefore \((D^S_{h,{\mathcal {T}}})\) turns into

$$\begin{aligned}&\sup \limits _{\begin{array}{c} \lambda _i,~z_i^{*}\ge 0,~i=1, \ldots , n, \\ I=\left\{ i\in \{1, \ldots , n\}:z_i^{*}>0\right\} \subseteq R=\left\{ r\in \{1, \ldots , n\}:\lambda _r>0\right\} ,\\ w_i^{*}\in X^*,~\gamma _{C^0_i}(w_i^{*})\le z_i^{*},~i\in I,~\sum \limits _{r\in R}\lambda _r\le 1 \end{array}} \Bigg \{-\sigma _S\left( -\sum \limits _{i\in I} w_i^{*}\right) -\sum \limits _{r\in R}\lambda _r\left[ h_r^*\left( \frac{z_r^{*}}{\lambda _r}\right) -a_r\right] \nonumber \\&\quad -\sum \limits _{i\in I}\left[ z_i^{*}f_i^*\left( \frac{1}{z_i^{*}}w_i^{*}\right) +\sigma _{\Omega _i}(w^{*}_{i})\right] \Bigg \}. \end{aligned}$$
(7)

Remark 3.3

Let \(a_i=0\), \(i=1, \ldots , n\). Taking the functions \(h_i\) as in Remark 3.2, their conjugates are \(h_i^*=\delta _{(-\infty , 1]}\), \(i=1, \ldots , n\), and the conjugate dual problem to \((P^S_{{\mathcal {T}}})\) reads as

$$\begin{aligned}&(D^S_{{\mathcal {T}}})~~\sup \limits _{\begin{array}{c} \lambda _i,~z_i^{*}\ge 0,~i=1, \ldots , n,~ \sum \limits _{r\in R}\lambda _r\le 1, \\ I=\left\{ i\in \{1, \ldots , n\}:z_i^{*}>0\right\} \subseteq R=\left\{ r\in \{1, \ldots , n\}:\lambda _r>0\right\} ,\\ z_r^{*}\le \lambda _r,~r\in R,~w_i^{*}\in X^*,~\gamma _{C^0_i}(w_i^{*})\le z_i^{*},~i\in I \end{array}} \left\{ -\sigma _S\left( -\sum \limits _{i\in I} w_i^{*}\right) \right. \\&\quad -\sum \limits _{i\in I}{\left. \left[ z_i^{*}f_i^*\left( \frac{1}{z_i^{*}}w_i^{*}\right) +\sigma _{\Omega _i}(w^{*}_{i})\right] \right\} }. \end{aligned}$$

This dual problem can be simplified as follows.

Proposition 3.1

The problem \((D^S_{{\mathcal {T}}})\) can be equivalently written as

$$\begin{aligned}&({\widetilde{D}}^S_{{\mathcal {T}}}) \sup \limits _{\begin{array}{c} u_i^{*}\ge 0,~i=1, \ldots , n,{\widetilde{I}}=\{i\in \{1, \ldots , n\}:u_i^{*}>0\},\\ v_i^{*}\in X^*, i\in {\widetilde{I}},\gamma _{C_i^0}(v_i^{*})\le u_i^{*},i\in {\widetilde{I}},\sum \limits _{i\in {\widetilde{I}}}u_i^{*}\le 1 \end{array}} \Bigg \{-\sigma _S\left( -\sum \limits _{i\in {\widetilde{I}}} v_i^{*}\right) \\&\quad -\sum \limits _{i\in {\widetilde{I}}}\left[ u_i^{*}f_i^*\left( \frac{1}{u_i^{*}}v_i^{*}\right) +\sigma _{\Omega _i}(v^{*}_{i})\right] \Bigg \}. \end{aligned}$$

Proof

Take first a feasible element \((\lambda ,z^{*},w^{*})=(\lambda _1, \ldots ,\lambda _n,z_1^{*}, \ldots ,z^{*}_n,w^{*}_I)\in {\mathbb {R}}_+^n\times {\mathbb {R}}_+^n\times (X^*)^{|I|}\) to the problem \((D_{{\mathcal {T}}}^{S})\), where by \(w^{*}_I\in (X^*)^{|I|}\) we denote the vector having as components \(w^*_i\) with \(i\in I\), and set \({\widetilde{I}}= I\), \(u_i^{*}=\lambda _i,~ i\in {\widetilde{I}},~u_j^{*}=0,~j\notin {\widetilde{I}}\) and \(v_i^{*}=w_i^{*},~i\in {\widetilde{I}},~v_j^{*}=0_{X^*},~j\notin {\widetilde{I}}\), then it follows from the feasibility of \((\lambda ,z^{*},w^{*})\) that \(\sum _{i\in {\widetilde{I}}} u^{*}_i\le 1,~ u_i^{*}>0,~v_i^{*}\in X^*,~\gamma _{C^0_i}(v^{*}_{i})\le u^{*}_{i},~ i\in {\widetilde{I}}\) and \(u_j^{*}=0,~j\notin {\widetilde{I}}\), i.e. \((u^{*},v^{*})\in {\mathbb {R}}^n_+\times (X^*)^{|{\widetilde{I}}|}\) is feasible to the problem \(({\widetilde{D}}_{{\mathcal {T}}}^{S})\). Hence, it holds \(-\sigma _S\big (-\sum _{i\in I} w_i^{*}\big )-\sum _{i\in I}\big [z_i^*f_i^*((1/z_i^*)w^{*}_{i})+\sigma _{\Omega _i}(w^{*}_{i})\big ] =-\sigma _S\big (-\sum _{i \in {\widetilde{I}}} v_i^{*}\big )-\sum _{i \in {\widetilde{I}}} \big [u_i^*f_i^*((1/u_i^*)v^{*}_{i})+\sigma _{\Omega _i}(v^{*}_{i})\big ]\le v({\widetilde{D}}_{{\mathcal {T}}}^{S})\) for all \((\lambda ,z^{*},w^{*})\) feasible to \((D_{{\mathcal {T}}}^{S})\), i.e. \(v(D_{{\mathcal {T}}}^{S})\le v({\widetilde{D}}_{{\mathcal {T}}}^{S})\).

To prove the opposite inequality, take a feasible element \((u^{*},v^{*})\) of the problem \(({\widetilde{D}}_{{\mathcal {T}}}^{S})\) and set \(I=R={\widetilde{I}}\), \(z_i^{*}=\lambda _i=u_i^{*}\) and \(w_i^{*}=v_i^{*}\) for \(i\in I=R\) and \(z_j^{*}=\lambda _j=0\) for \(j\notin I=R\), then we have from the feasibility of \((u^{*},v^{*})\) that \(\sum _{r\in R}\lambda _r\le 1\), \(z_k^{*}=\lambda _k>0,~k\in R,~\lambda _l=0,~l\notin R\) and \(\gamma _{C^0_i}(w^{*}_{i})\le z^{*}_{i},~i\in I\), which means that \((\lambda ,z^{*},w^{*})\) is a feasible element of \((D_{{\mathcal {T}}}^{S})\) and it holds \(-\sigma _S\big (-\sum _{i\in I} v_i^{*}\big )-\sum _{i\in I}\big [ u_i^*f_i^*((1/u_i^*)v^{*}_{i})+\sigma _{\Omega _i}(v^{*}_{i})\big ] =-\sigma _S\big (-\sum _{i\in I} w_i^{*}\big )-\sum _{i\in I}\big [z_i^*f_i^*((1/z_i^*)w^{*}_{i})+\sigma _{\Omega _i}(w^{*}_{i})\big ]\le v(D_{{\mathcal {T}}}^{S})\) for all \((u^{*},v^{*})\) feasible to \(({\widetilde{D}}_{{\mathcal {T}}}^{S})\), which implies \(v({\widetilde{D}}_{{\mathcal {T}}}^{S})\le v(D_{{\mathcal {T}}}^{S})\). Finally, it follows that \(v({\widetilde{D}}_{{\mathcal {T}}}^{S})= v(D_{{\mathcal {T}}}^{S})\). \(\square \)

Also the general dual problem \((D^S_{h,{\mathcal {T}}})\) can be rewritten as follows.

Proposition 3.2

The problem \((D^S_{h,{\mathcal {T}}})\) can be equivalently written as

$$\begin{aligned} ({\widehat{D}}^S_{h,{\mathcal {T}}})~\sup \limits _{\begin{array}{c} \lambda _i,~z_i^{*}\ge 0,~w_i^{*}\in X^*, \sum \limits _{i=1}^n\lambda _i\le 1, \\ \gamma _{C^0_i}(w_i^{*})\le z_i^{*}, i=1, \ldots , n \end{array}}&\Bigg \{-\sigma _S\left( -\sum \limits _{i=1}^n w_i^{*}\right) -\sum \limits _{i=1}^n\left[ (\lambda _i h_i)^*\left( z_i^{*}\right) -\lambda _ia_i\right. \\&\left. +\,(z_i^{*}f_i)^*\left( w_i^{*}\right) +\sigma _{\Omega _i}(w^{*}_{i})\right] \Bigg \}. \end{aligned}$$

Proof

Let \((\lambda _1, \ldots ,\lambda _n,z_1^{*}, \ldots ,z_n^{*},w_1^{*}, \ldots ,w_n^{*})\) be a feasible solution to \(({\widehat{D}}^S_{h,{\mathcal {T}}})\), then it follows from \(r\notin R=\left\{ r\in \{1, \ldots , n\}:\lambda _r>0\right\} \) by (5) that \(z_r^{*}=0\), i.e. \(I=\left\{ i\in \{1, \ldots , n\}:z_i^{*}>0\right\} \subseteq R\), and for \(i\in \{1, \ldots , n\}{\setminus } I\) we have \(0\le \gamma _{C_i^0}(w_i^{*})\le 0\Leftrightarrow w_i^{*}=0_{X^*}\). This means that \((\lambda _1, \ldots ,\lambda _n,z_1^{*}, \ldots ,z_n^{*},w^{*}_I)\) is feasible to \((D^S_{h,{\mathcal {T}}})\) and by (5) and (6) follows immediately that \(v({\widehat{D}}^S_{h,{\mathcal {T}}})\le v(D^S_{h,{\mathcal {T}}})\).

Conversely, by the previous considerations it is clear that from any feasible solution to \((D^S_{h,{\mathcal {T}}})\) one can immediately construct a feasible solution to \(({\widehat{D}}^S_{h,{\mathcal {T}}})\) such that \(v(D^S_{h,{\mathcal {T}}})\le v({\widehat{D}}^S_{h,{\mathcal {T}}})\) by taking \(w_i^*=0_{X^*}\) for \(i\in \{1, \ldots , n\}{\setminus } I\). \(\square \)

Remark 3.4

The index sets I and R of the dual problem \((D^S_{h,{\mathcal {T}}})\) in (7) give a detailed characterization of the set of feasible solutions and are useful in the further approach. From the numerical point of view, however, they turn the dual (7) into a discrete optimization problem, making it very hard to solve. For this reason we use for theoretical purposes the dual \((D^S_{h,{\mathcal {T}}})\) in the form of (7) and for numerical studies its equivalent formulation provided in Proposition 3.2. In this context, the dual \(({\widetilde{D}}_{{\mathcal {T}}}^{S})\) is equivalent to

$$\begin{aligned} ({\widetilde{D}}_{{\mathcal {T}}}^{S})\sup _{\begin{array}{c} z_i^{*}\ge 0 ,w_i^{*}\in X^*,~\gamma _{C_i^0}(w_i^{*})\le z_i^{*},\\ i=1, \ldots , n,\sum \limits _{i=1}^nz_i^{*}\le 1 \end{array}} \left\{ -\sigma _S\left( -\sum _{i=1}^n w_i^{*}\right) +\sum _{i=1}^n\left[ z_i^{*}a_i-f_i^*(w^{*}_{i})-\sigma _{\Omega _i}(w^{*}_{i})\right] \right\} . \end{aligned}$$
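
To give an idea of how this formulation can be used numerically, the following Python sketch (ours, purely illustrative and not the MATLAB implementation employed later in the paper; it assumes the cvxpy package and specializes to \(S=X={\mathbb {R}}^2\), \(C_i\) the Euclidean unit ball, \(f_i=\delta _{\{p_i\}}\), \(\Omega _i=\{0_X\}\) and \(a_i=0\), so that \({\mathcal {T}}_{\Omega _i,f_i}^{C_i}(x)=\Vert x-p_i\Vert \) and the primal is the classical Sylvester problem) solves \(({\widetilde{D}}_{{\mathcal {T}}}^{S})\) with an off-the-shelf conic solver; by Theorem 3.1 and Proposition 3.1 its optimal value coincides with the primal optimal value.

```python
import numpy as np
import cvxpy as cp

# Simplified dual (D~^S_T) for S = R^2, C_i = Euclidean unit ball, f_i = delta_{p_i},
# Omega_i = {0}, a_i = 0: the primal is the Sylvester problem min_x max_i ||x - p_i||.
P = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0]])
n, d = P.shape

W = cp.Variable((n, d))    # dual variables w_i^*
z = cp.Variable(n)         # dual variables z_i^*

constraints = [z >= 0, cp.sum(z) <= 1, cp.sum(W, axis=0) == 0]   # sigma_S forces sum w_i^* = 0
constraints += [cp.norm(W[i]) <= z[i] for i in range(n)]         # gamma_{C_i^0}(w_i^*) <= z_i^*

# f_i^*(w_i^*) = <w_i^*, p_i> and sigma_{Omega_i}(w_i^*) = 0 in this special case
objective = cp.Maximize(-cp.sum(cp.multiply(W, P)))
prob = cp.Problem(objective, constraints)
prob.solve()
print(prob.value)   # ~ 1.25, the optimal value of the primal (smallest enclosing circle radius)
```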

Weak duality for the primal–dual pair \((P^S_{h,{\mathcal {T}}})-(D^S_{h,{\mathcal {T}}})\) holds by construction, i.e. \(v(P^S_{h,{\mathcal {T}}})\ge v(D^S_{h,{\mathcal {T}}})\), and we show that the considered hypotheses guarantee strong duality, too.

Theorem 3.1

(Strong duality) Between \((P^S_{h,{\mathcal {T}}})\) and \((D^S_{h,{\mathcal {T}}})\) strong duality holds, i.e. \(v(P^S_{h,{\mathcal {T}}}) = v(D^S_{h,{\mathcal {T}}})\) and the conjugate dual problem has an optimal solution \(({\overline{\lambda }}_1, \ldots ,{\overline{\lambda }}_n,{\overline{z}}_1^{*}, \ldots ,{\overline{z}}_n^{*},{\overline{w}}_{{{\overline{I}}}}^{*})\in {\mathbb {R}}_+^n\times {\mathbb {R}}_+^n\times (X^*)^{|\overline{I}|}\) with the corresponding index sets \({\overline{I}}\subseteq {\overline{R}} \subseteq \{1, \ldots , n\}\).

Proof

The conclusion follows by [25, Theorem 4], whose hypotheses are fulfilled as seen below. The properness and convexity properties of the involved functions and sets are guaranteed by the standing assumptions formulated in the beginning of the section. It remains to verify the fulfillment of a regularity condition. We use the generalized interior point regularity condition \((RC^C_2)\) introduced in [25] for multi-composed optimization problems, which is a development of the one given in the general case in [30]. First, notice that f is lower semicontinuous, \(K_0={\mathbb {R}}^n_+\) is closed and has a nonempty interior, S is closed, F is \({\mathbb {R}}^n_+\)-\(\text {epi}\)-closed, while the linear continuous function G is obviously \(\{0_{X^n}\}\)-\(\text {epi}\)-closed. Due to the continuity of G, the assumption \(\text {int }K_1\ne \emptyset \), which is not fulfilled in this case, is not needed (see [25, Remark 5]). The other requirements of the regularity condition are fulfilled as well, namely \(0_{X}\in \text {sqri}((X\cap S)+X)=X\), \(0_{{\mathbb {R}}^n}\in \text {sqri}(F(\text {dom}\, F)-\text {dom}\, f+K_0)=\text {sqri}(F(\text {dom}\, F)-{\mathbb {R}}_+^n+{\mathbb {R}}_+^n)={\mathbb {R}}^n\) and (recall that \(\text {dom}\, {\mathcal {T}}_{\Omega _i,f_i}^{C_i}=X\), \(i=1, \ldots , n\)) \(0_{X^n}\in \text {sqri}(G(\text {dom}\, G\cap \text {dom}\, g\cap S)-\text {dom}\, F+K_1)=\text {sqri}(G(S)-X^n +\{0_{X^n}\})=X^n\).\(\square \)

The next statement is dedicated to deriving necessary and sufficient optimality conditions for the primal–dual pair \((P^S_{h,{\mathcal {T}}})-(D^S_{h,{\mathcal {T}}})\).

Theorem 3.2

(Optimality conditions) (a) Let \({\overline{x}}\in S\) be an optimal solution to the problem \((P_{h,{\mathcal {T}}}^{S})\). Then there exists \(({\overline{\lambda }}_1, \ldots ,{\overline{\lambda }}_n,{\overline{z}}_1^{*}, \ldots ,{\overline{z}}_n^{*},{\overline{w}}_{{{\overline{I}}}}^{*}) \in {\mathbb {R}}_+^n\times {\mathbb {R}}_+^n\times (X^*)^{|{{\overline{I}}}|}\) with the corresponding index sets \({\overline{I}}\subseteq {\overline{R}} \subseteq \{1, \ldots , n\}\) as an optimal solution to \((D_{h,{\mathcal {T}}}^{S})\) such that

(i)

    \(\max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} =\sum \limits _{i\in {\overline{I}}}{\overline{z}}_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}({\overline{x}})- \sum \limits _{r\in {\overline{R}}}{\overline{\lambda }}_r\left[ h_r^*\left( \frac{{\overline{z}}_r^{*}}{{\overline{\lambda }}_r}\right) -a_r\right] = \sum \limits _{r\in {\overline{R}}}{\overline{\lambda }}_r\left[ h_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) +a_r\right] ,\)

(ii)

    \({\overline{\lambda }}_rh_r^*\left( \frac{{\overline{z}}_r^{*}}{{\overline{\lambda }}_r}\right) +{\overline{\lambda }}_rh_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) = {\overline{z}}_r^{*}{\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})~\forall r\in {\overline{R}}\),

(iii)

    \({\overline{z}}_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}({\overline{x}})+ {\overline{z}}_i^{*}f_i^*\left( \frac{1}{{\overline{z}}_i^{*}}{\overline{w}}_i^{*}\right) + \sigma _{\Omega _i}({\overline{w}}^{*}_{i})=\langle {\overline{w}}_i^{*},{\overline{x}}\rangle ~\forall i\in {\overline{I}}\),

(iv)

    \(\sum \limits _{i\in {\overline{I}}}\langle {\overline{w}}_i^{*},{\overline{x}}\rangle =-\sigma _S\left( -\sum \limits _{i\in {\overline{I}}}{\overline{w}}_i^{*}\right) \),

(v)

    \(\max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} = h_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) +a_r~\forall r\in {\overline{R}}\),

(vi)

    \(\sum \limits _{r\in {\overline{R}}} {\overline{\lambda }}_r=1,~{\overline{\lambda }}_k> 0,~k\in {\overline{R}},~{\overline{\lambda }}_l=0,~l\notin {\overline{R}},~{\overline{z}}_i^{*}>0,~i\in {\overline{I}},\) and \({\overline{z}}_j^{*}=0,~j\notin {\overline{I}},\)

(vii)

    \(\gamma _{C^0_i}({\overline{w}}_i^{*})={\overline{z}}_i^{*},~{\overline{w}}_i^{*}\in X^*{\setminus }\{0_{X^*}\},~i\in {\overline{I}}\).

(b) If there exists \({\overline{x}}\in S\) such that for some \(({\overline{\lambda }}_1, \ldots ,{\overline{\lambda }}_n,{\overline{z}}_1^{*}, \ldots ,{\overline{z}}_n^{*},{\overline{w}}_1^{*}, \ldots ,{\overline{w}}_{{\overline{I}}}^{*})\in {\mathbb {R}}_+^n\times {\mathbb {R}}_+^n\times (X^*)^{|{\overline{I}}|}\) with the corresponding index sets \({\overline{I}} \subseteq {\overline{R}} \subseteq \{1, \ldots , n\}\) the conditions (i)–(vii) are fulfilled, then \({\overline{x}}\) is an optimal solution to \((P_{h,{\mathcal {T}}}^{S})\), \(({\overline{\lambda }}_1, \ldots ,{\overline{\lambda }}_n,{\overline{z}}_1^{*}, \ldots ,{\overline{z}}_n^{*},{\overline{w}}^{*}_{{\overline{I}}})\) is an optimal solution to \((D_{h,{\mathcal {T}}}^{S})\) and \(v(P_{h,{\mathcal {T}}}^{S}) = v(D_{h,{\mathcal {T}}}^{S})\).

Proof

(a) By [25, Theorem 5] we obtain the following necessary and sufficient optimality conditions for the primal–dual pair \((P^S_{h,{\mathcal {T}}})-(D^S_{h,{\mathcal {T}}})\)

(i’):

\(\max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} +\sum \limits _{r\in {\overline{R}}}{\overline{\lambda }}_r\left[ h_r^*\left( \frac{{\overline{z}}_r^{*}}{{\overline{\lambda }}_r}\right) -a_r\right] =\sum \limits _{i\in {\overline{I}}}{\overline{z}}_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}({\overline{x}}),\)

(ii’):

\(\sum \limits _{i\in {\overline{I}}}{\overline{z}}_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}({\overline{x}})+\sum \limits _{i\in {\overline{I}}}\left[ {\overline{z}}_i^{*}f_i^*\left( \frac{1}{{\overline{z}}_i^{*}}{\overline{w}}_i^{*}\right) + \sigma _{\Omega _i}({\overline{w}}^{*}_{i})\right] =\sum \limits _{i\in {\overline{I}}}\langle {\overline{w}}_i^{*},{\overline{x}}\rangle \),

(iii’):

\(\sum \limits _{i\in {\overline{I}}}\langle {\overline{w}}_i^{*},{\overline{x}}\rangle +\sigma _S\left( -\sum \limits _{i\in {\overline{I}}}{\overline{w}}_i^{*}\right) =0\),

(iv’):

\(\sum \limits _{r\in {\overline{R}}} {\overline{\lambda }}_r\le 1,~{\overline{\lambda }}_k> 0,~k\in {\overline{R}},~{\overline{\lambda }}_l=0,~l\notin {\overline{R}},~{\overline{z}}_i^{*}>0,~i\in {\overline{I}},\) and \({\overline{z}}_j^{*}=0,~j\notin {\overline{I}},\)

(v’):

\(\gamma _{C^0_i}({\overline{w}}_i^{*})\le {\overline{z}}_i^{*},~{\overline{w}}_i^{*}\in X^*,~i\in {\overline{I}}\).

Additionally, one has by Theorem 3.1 that \(v(P_{h,{\mathcal {T}}}^{S}) = v(D_{h,{\mathcal {T}}}^{S})\), i.e.

$$\begin{aligned} \max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\}= & {} -\sigma _S\left( -\sum \limits _{i\in {\overline{I}}}{\overline{w}}_i^{*}\right) -\sum \limits _{r\in {\overline{R}}}{\overline{\lambda }}_r\left[ h_r^*\left( \frac{{\overline{z}}_r^{*}}{{\overline{\lambda }}_r}\right) -a_r\right] \\&-\sum \limits _{i\in {\overline{I}}}\left[ {\overline{z}}_i^{*}f_i^*\left( \frac{1}{{\overline{z}}_i^{*}}{\overline{w}}_i^{*}\right) + \sigma _{\Omega _i}({\overline{w}}^{*}_{i})\right] , \end{aligned}$$

that can be equivalently written as

$$\begin{aligned}&\left[ \max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} -\sum \limits _{r\in {\overline{R}}}\left( {\overline{\lambda }}_rh_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) +{\overline{\lambda }}_ra_r\right) \right] \\&\quad +\,\sum \limits _{i\in {\overline{I}}}\left[ {\overline{z}}_i^{*}\left( {\mathcal {T}}_{\Omega _i,f_i}^{C_i}({\overline{x}})\right) + {\overline{z}}_i^{*}f_i^*\left( \frac{1}{{\overline{z}}_i^{*}}{\overline{w}}_i^{*}\right) + \sigma _{\Omega _i}({\overline{w}}^{*}_{i})-\langle {\overline{w}}_i^{*},{\overline{x}}\rangle \right] \\&\quad +\,\left[ \sigma _S\left( -\sum \limits _{i\in {\overline{I}}}{\overline{w}}_i^{*}\right) +\sum \limits _{i\in {\overline{I}}}\langle {\overline{w}}_i^{*},{\overline{x}}\rangle \right] +\sum \limits _{i\in {\overline{I}}}\left[ {\overline{\lambda }}_ih_i^*\left( \frac{{\overline{z}}_i^{*}}{{\overline{\lambda }}_i}\right) \right. \\&\quad \left. +\,{\overline{\lambda }}_ih_i\left( {\mathcal {T}}_{\Omega _i,f_i}^{C_i}({\overline{x}})\right) - {\overline{z}}_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}({\overline{x}})\right] \\&\quad +\,\sum \limits _{r\in {\overline{R}}{\setminus } {\overline{I}}}\left[ {\overline{\lambda }}_rh_r^*\left( 0\right) +{\overline{\lambda }}_rh_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) - 0\cdot \left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) \right] =0, \end{aligned}$$

where the last two sums arise from the fact that \({\overline{I}} \subseteq {\overline{R}}\). By [27, Lemma 2], the term within the first pair of brackets above is nonnegative. Moreover, by the Young–Fenchel inequality the terms within the other brackets are nonnegative, too, and hence it follows that all the terms within the brackets must be equal to zero. Combining the last statement with the optimality conditions \((i')\)–\((v')\) yields

  1. (i)

    \(\max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} =\sum \limits _{i\in {\overline{I}}}{\overline{z}}_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}({\overline{x}})-\sum \limits _{r\in {\overline{R}}}{\overline{\lambda }}_r\left[ h_r^*\left( \frac{{\overline{z}}_r^{*}}{{\overline{\lambda }}_r}\right) -a_r\right] =\sum \limits _{r\in {\overline{R}}}\left( {\overline{\lambda }}_rh_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) +{\overline{\lambda }}_ra_r\right) ,\)

  2. (ii)

    \({\overline{\lambda }}_rh_r^*\left( \frac{{\overline{z}}_r^{*}}{{\overline{\lambda }}_r}\right) +{\overline{\lambda }}_rh_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) = {\overline{z}}_r^{*}{\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})~\forall r\in {\overline{R}}\),

  3. (iii)

    \({\overline{z}}_i^{*}{\mathcal {T}}_{\Omega _i,f_i}^{C_i}({\overline{x}})+ {\overline{z}}_i^{*}f_i^*\left( \frac{1}{{\overline{z}}_i^{*}}{\overline{w}}_i^{*}\right) + \sigma _{\Omega _i}({\overline{w}}^{*}_{i})=\langle {\overline{w}}_i^{*},{\overline{x}}\rangle ~\forall i\in {\overline{I}}\),

  4. (iv)

    \(\sum \limits _{i\in {\overline{I}}}\langle {\overline{w}}_i^{*},{\overline{x}}\rangle =-\sigma _S\left( -\sum \limits _{i\in {\overline{I}}}{\overline{w}}_i^{*}\right) \),

  5. (v)

    \(\sum \limits _{r\in {\overline{R}}} {\overline{\lambda }}_r\le 1,~{\overline{\lambda }}_k> 0,~k\in {\overline{R}},~{\overline{\lambda }}_l=0,~l\notin {\overline{R}},~{\overline{z}}_i^{*}>0,~i\in {\overline{I}},\) and \({\overline{z}}_j^{*}=0,~j\notin {\overline{I}},\)

  6. (vi)

    \(\gamma _{C^0_i}({\overline{w}}_i^{*})\le {\overline{z}}_i^{*},~{\overline{w}}_i^{*}\in X^*,~i\in {\overline{I}}\), and \({\overline{w}}_j^{*}=0_{X^*},~j\notin {\overline{I}}\).

From conditions (i) and (v) we obtain that

$$\begin{aligned}&\max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} =\sum \limits _{r\in {\overline{R}}}\left( {\overline{\lambda }}_rh_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) +{\overline{\lambda }}_ra_r\right) \\&\quad \le \sum \limits _{r\in {\overline{R}}}{\overline{\lambda }}_r\max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} \le \max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} , \end{aligned}$$

which means on the one hand that

$$\begin{aligned} \sum \limits _{r\in {\overline{R}}}{\overline{\lambda }}_r\max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} =\max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} , \end{aligned}$$

i.e. condition (v) can be written as

$$\begin{aligned} \sum \limits _{r\in {\overline{R}}} {\overline{\lambda }}_r=1,~{\overline{\lambda }}_k> 0,~k\in {\overline{R}},~{\overline{\lambda }}_l=0,~l\notin {\overline{R}},~{\overline{z}}_i^{*}>0,~i\in {\overline{I}}, \text { and } {\overline{z}}_j^{*}=0,~j\notin {\overline{I}}, \end{aligned}$$
(8)

and on the other hand that

$$\begin{aligned} \sum \limits _{r\in {\overline{R}}}\left( {\overline{\lambda }}_rh_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) +{\overline{\lambda }}_ra_r\right) = \sum \limits _{r\in {\overline{R}}}{\overline{\lambda }}_r\max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} \end{aligned}$$
(9)

or, equivalently,

$$\begin{aligned} \sum \limits _{r\in {\overline{R}}}{\overline{\lambda }}_r\left[ \max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} - h_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) -a_r\right] =0. \end{aligned}$$
(10)

As the expressions within the brackets in (10) are nonnegative and \({\overline{\lambda }}_r>0\) for \(r\in {\overline{R}}\), it follows that the terms inside the brackets must be equal to zero, more precisely,

$$\begin{aligned} \max \limits _{1\le j\le n}\left\{ h_j\left( {\mathcal {T}}_{\Omega _j,f_j}^{C_j}({\overline{x}})\right) +a_j\right\} = h_r\left( {\mathcal {T}}_{\Omega _r,f_r}^{C_r}({\overline{x}})\right) +a_r \quad \forall r\in {\overline{R}}. \end{aligned}$$
(11)

Further, Theorem 2.1 implies the existence of \({\overline{p}}_i,~{\overline{q}}_i\in X\) such that

$$\begin{aligned} {\mathcal {T}}_{\Omega _i,f_i}^{C_i}({\overline{x}})= \gamma _{C_i}({\overline{x}}-{\overline{p}}_i-{\overline{q}}_i)+f_i({\overline{p}}_i)+\delta _{\Omega _i}({\overline{q}}_i)\quad \forall i=1,\ldots , n. \end{aligned}$$

Employing the condition (iii) one gets

$$\begin{aligned} {\overline{z}}_i^{*}\gamma _{C_i}({\overline{x}}-{\overline{p}}_i-{\overline{q}}_i)+{\overline{z}}_i^{*}f_i({\overline{p}}_i)+ \delta _{\Omega _i}({\overline{q}}_i)+ {\overline{z}}_i^{*}f_i^*\left( \frac{1}{{\overline{z}}_i^{*}}{\overline{w}}_i^{*}\right) + \sigma _{\Omega _i}({\overline{w}}^{*}_{i})=\langle {\overline{w}}_i^{*},{\overline{x}}\rangle , \end{aligned}$$

equivalently writable as

$$\begin{aligned}&\left[ {\overline{z}}_i^{*}\gamma _{C_i}({\overline{x}}-{\overline{p}}_i-{\overline{q}}_i)-\langle {\overline{w}}_i^{*},{\overline{x}}-{\overline{p}}_i-{\overline{q}}_i\rangle \right] +\left[ {\overline{z}}_i^{*}f_i({\overline{p}}_i)+ {\overline{z}}_i^{*}f_i^*\left( \frac{1}{{\overline{z}}_i^{*}}{\overline{w}}_i^{*}\right) -\langle {\overline{w}}_i^{*},{\overline{p}}_i\rangle \right] \nonumber \\&\quad +\left[ \delta _{\Omega _i}({\overline{q}}_i)+\sigma _{\Omega _i}({\overline{w}}^{*}_{i})-\langle {\overline{w}}_i^{*},{\overline{q}}_i\rangle \right] =0,~i\in {\overline{I}}. \end{aligned}$$
(12)

By the Young–Fenchel inequality all the brackets in (12) are nonnegative and must therefore be equal to zero, i.e.

$$\begin{aligned} {\overline{z}}_i^{*}\gamma _{C_i}({\overline{x}}-{\overline{p}}_i-{\overline{q}}_i)= & {} \langle {\overline{w}}_i^{*},{\overline{x}}-{\overline{p}}_i-{\overline{q}}_i\rangle ,\nonumber \\ {\overline{z}}_i^{*}f_i({\overline{p}}_i)+ {\overline{z}}_i^{*}f_i^*\left( \frac{1}{{\overline{z}}_i^{*}}{\overline{w}}_i^{*}\right)= & {} \langle {\overline{w}}_i^{*},{\overline{p}}_i\rangle , \nonumber \\ \delta _{\Omega _i}({\overline{q}}_i)+\sigma _{\Omega _i}({\overline{w}}^{*}_{i})= & {} \langle {\overline{w}}_i^{*},{\overline{q}}_i\rangle ,~i\in {\overline{I}}. \end{aligned}$$
(13)

Now, (13), condition (vi) and Lemma 2.1 (the generalized Cauchy–Schwarz inequality) yield

$$\begin{aligned} {\overline{z}}_i^{*}\gamma _{C_i}({\overline{x}}-{\overline{p}}_i-{\overline{q}}_i)=\langle {\overline{w}}_i^{*},{\overline{x}}-{\overline{p}}_i-{\overline{q}}_i\rangle \le \gamma _{C_i^0}({\overline{w}}_i^*)\gamma _{C_i}({\overline{x}}-{\overline{p}}_i-{\overline{q}}_i)\le {\overline{z}}_i^{*}\gamma _{C_i}({\overline{x}}-{\overline{p}}_i-{\overline{q}}_i), \end{aligned}$$

which means that condition (vi) can be expressed as

$$\begin{aligned} \gamma _{C^0_i}({\overline{w}}_i^{*})={\overline{z}}_i^{*},~{\overline{w}}_i^{*}\in X^*{\setminus } \{0_{X^*}\},~i\in {\overline{I}},\text { and } {\overline{w}}_j^{*}=0_{X^*},~j\notin {\overline{I}}. \end{aligned}$$
(14)

The optimality conditions (i)–(vi), (8), (11) and (14) deliver the desired statement.

(b) All the calculations done in (a) can also be made in the reverse order, thus yielding the conclusion. \(\square \)

Remark 3.5

If we consider the situation when the set-up costs are arbitrary, i.e. \(a_i\) can also be negative, \(i=1, \ldots , n,\) then the conjugate function of f takes the form (see [27, Remark 6])

$$\begin{aligned} f^*(z_1^{*}, \ldots ,z_n^{*})=\min _{\begin{array}{c} \sum \limits _{i=1}^n \lambda _i= 1,~\lambda _i\ge 0,\\ i=1, \ldots , n \end{array}}\left\{ \sum _{i=1}^n [(\lambda _ih_i)^*(z_i^{*})-\lambda _ia_i]\right\} . \end{aligned}$$

As a consequence, the corresponding dual problem turns out to be almost the same as (7), only with the additional constraint \(\sum _{r\in R}\lambda _r=1\), and all the statements given in this subsection can easily be adapted to this general case of arbitrary set-up costs.

3.2 Special case I

We now study the location problem involved in the economic scenario discussed in Remark 3.1 (we set \(C_i=-C_i\)), i.e.

$$\begin{aligned} (P_{\gamma _G,~{\mathcal {T}}})~~\inf _{x\in X}\max _{1\le i\le n}\left\{ {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)\right\} , \end{aligned}$$

and its dual problem (cf. Proposition 3.1, note that \(S=X\), \(a_i=0\) and \(f^*_i=\delta _{G_i^0}\), \(i=1, \ldots , n\))

$$\begin{aligned} (D_{\gamma _G,~{\mathcal {T}}})~~\sup \limits _{\begin{array}{c} z_i^{*}\ge 0,~i=1, \ldots , n,~I=\left\{ i\in \{1, \ldots , n\}:z_i^{*}>0\right\} , \\ w_i^{*}\in X^*,~i\in I,~\gamma _{C_i^0}(w_i^{*})\le z_i^{*},\\ \gamma _{G_i^0}(w_i^{*})\le z_i^{*},i\in I,~\sum \limits _{i\in I}z_i^{*}\le 1,~\sum \limits _{i\in I}w_i^{*}=0_{X^*} \end{array}} \left\{ -\sum _{i\in I}\sigma _{\Omega _i}\left( w_i^{*}\right) \right\} . \end{aligned}$$

Theorem 3.1 yields the following duality statement for the primal–dual pair \((P_{\gamma _G,~{\mathcal {T}}})\)-\((D_{\gamma _G,~{\mathcal {T}}})\).

Theorem 3.3

(Strong duality) Strong duality holds between \((P_{\gamma _G,~{\mathcal {T}}})\) and \((D_{\gamma _G,~{\mathcal {T}}})\), i.e. \(v(P_{\gamma _G,~{\mathcal {T}}}) = v(D_{\gamma _G,~{\mathcal {T}}})\), and the dual problem has an optimal solution.

The necessary and sufficient optimality conditions for the primal–dual pair of optimization problems \((P_{\gamma _G,~{\mathcal {T}}})-(D_{\gamma _G,~{\mathcal {T}}})\) can be obtained by using the same ideas as in Theorem 3.2.

Theorem 3.4

(Optimality conditions) (a) Let \({\overline{x}}\in X\) be an optimal solution to the problem \((P_{\gamma _G,~{\mathcal {T}}})\). Then there exists an optimal solution \(({\overline{z}}^*_1,\ldots ,{\overline{z}}_n^*,{\overline{w}}_1^{*}, \ldots ,{\overline{w}}_n^{*})\) to \((D_{\gamma _G,~{\mathcal {T}}})\) with the corresponding index set \({\overline{I}} \subseteq \{1, \ldots , n\}\) such that

  1. (i)

    \(\max \limits _{1\le j\le n}\left\{ {\mathcal {T}}_{\Omega _j,\gamma _{G_j}}^{C_j}({\overline{x}})\right\} =\sum \limits _{i\in {\overline{I}}}{\overline{z}}_i^{*}{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}({\overline{x}})\),

  2. (ii)

    \({\overline{z}}_i^{*}{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}({\overline{x}})+ \sigma _{\Omega _i}({\overline{w}}^{*}_{i})=\langle {\overline{w}}_i^{*},{\overline{x}}\rangle \quad \forall i\in {\overline{I}}\),

  3. (iii)

    \(\sum \limits _{i\in {\overline{I}}} {\overline{w}}_i^{*}=0_{X^*}\),

  4. (iv)

    \(\max \limits _{1\le j\le n}\left\{ {\mathcal {T}}_{\Omega _j,\gamma _{G_j}}^{C_j}({\overline{x}})\right\} = {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}({\overline{x}}) \quad \forall i\in {\overline{I}}\),

  5. (v)

    \(\sum \limits _{i\in {\overline{I}}} {\overline{z}}^{*}_i=1,~{\overline{z}}_i^{*}>0,~i\in {\overline{I}},\) and \({\overline{z}}_j^{*}=0,~j\notin {\overline{I}},\)

  6. (vi)

    \(\gamma _{C^0_i}({\overline{w}}_i^{*})={\overline{z}}_i^{*},~{\overline{w}}_i^{*}\in X^*{\setminus }\{0_{X^*}\},~\gamma _{G^0_i}({\overline{w}}_i^{*})\le \gamma _{C^0_i}({\overline{w}}_i^{*}),~i\in {\overline{I}}\).

(b) If there exists \({\overline{x}}\in X\) such that for some \(({\overline{z}}^*_1,\ldots ,{\overline{z}}_n^*,{\overline{w}}_1^{*}, \ldots ,{\overline{w}}_n^{*})\) and the corresponding index set \({\overline{I}}\) the conditions (i)–(vi) are fulfilled, then \({\overline{x}}\) is an optimal solution to \((P_{\gamma _G,~{\mathcal {T}}})\), \(({\overline{z}}^*_1,\ldots ,{\overline{z}}_n^*,{\overline{w}}_1^{*}, \ldots ,{\overline{w}}_n^{*})\) is an optimal solution to \((D_{\gamma _G,~{\mathcal {T}}})\) and \(v(P_{\gamma _G,~{\mathcal {T}}}) = v(D_{\gamma _G,~{\mathcal {T}}})\).

Proof

As \(h_i^*=\delta _{(-\infty ,1]}\) for all \(i=1, \ldots , n\), one has from the optimality condition (ii) of Theorem 3.2 that \({\overline{z}}_r^{*}{\mathcal {T}}_{\Omega _r,\gamma _{G_r}}^{C_r}({\overline{x}})= {\overline{\lambda }}_r{\mathcal {T}}_{\Omega _r,\gamma _{G_r}}^{C_r}({\overline{x}})\) for all \(r\in {\overline{R}}\), which in turn yields that \({\overline{I}}={\overline{R}}\) and \({\overline{\lambda }}_i={\overline{z}}_i^*\) for all \(i\in {\overline{I}}\) (as \(0<{\overline{z}}_r^{*}\le {\overline{\lambda }}_r\) and \({\overline{I}}\subseteq {\overline{R}}\)). Furthermore, as \(f_i^*=\delta _{G_i^0}\), it follows by the optimality conditions (iii) and (vii) of Theorem 3.2 that \(\gamma _{G^0_i}({\overline{w}}_i^{*})\le \gamma _{C^0_i}({\overline{w}}_i^{*})\) for all \(i\in {\overline{I}}\). Combining these facts with the optimality conditions of Theorem 3.2 yields the desired statement. \(\square \)

We use the optimality conditions listed in Theorem 3.4 to provide a more exact characterization of the optimal solutions of the optimization problem \((P_{\gamma _G, {\mathcal {T}}})\).

Theorem 3.5

Let \(\cap _{i\in {\overline{I}}}\Omega _i=\emptyset \), \(0\in \text {int }G_i\), \(C_i^0\cap G_i\cap \text {dom}\,\sigma _{\Omega _i}\ne \emptyset \) for all \(i\in {\overline{I}}\), and \({\overline{x}}\in X\) be an optimal solution to the optimization problem \((P_{\gamma _G,~{\mathcal {T}}})\). If \(({\overline{z}}^*_1,\ldots ,{\overline{z}}_n^*,{\overline{w}}_1^{*}, \ldots ,{\overline{w}}_n^*) \in \mathbb {R}^n_+\times (X^*)^n\) is an optimal solution to \((D_{\gamma _G,~{\mathcal {T}}})\) with the corresponding \({\overline{I}} \subseteq \{1, \ldots , n\}\), then

$$\begin{aligned} {\overline{x}}\in \bigcap \limits _{i\in {\overline{I}}}\left[ \partial \left( v(D_{\gamma _G,~{\mathcal {T}}})\gamma _{C_i^0}\right) ({\overline{w}}_i^*)+\partial \sigma _{\Omega _i}({\overline{w}}_i^*)\right] . \end{aligned}$$

Proof

From \(0_X\in \text {int }C_i\) and \(0_X\in \text {int }G_i\) it follows that \(0_{X^*}\in \text {int }C_i^0\) and \(0_{X^*}\in \text {int }G^0_i\), so that \(\gamma _{C^0_i}\) and \(\gamma _{G_i^0}\) are continuous for all \(i\in {\overline{I}}\). Hence, Theorem 2.1 secures the existence of \(\phi _i\in X\) and \(\psi _i\in \Omega _i\) such that \({\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}({\overline{x}})=\gamma _{C_i}({\overline{x}}-\phi _i-\psi _i)+\gamma _{G_i}(\phi _i)\), \(i\in {\overline{I}}\). Further, we have by the optimality conditions (ii) and (vi) of Theorem 3.4

$$\begin{aligned}&(\gamma _{C_i}({\overline{x}}-\phi _i-\psi _i)+\gamma _{G_i}(\phi _i))\gamma _{C^0_i}({\overline{w}}_i^{*})+ \sigma _{\Omega _i}({\overline{w}}^{*}_{i})=\langle {\overline{w}}_i^{*},{\overline{x}}\rangle \\&\quad \Leftrightarrow \left[ \gamma _{C_i}({\overline{x}}-\phi _i-\psi _i)\gamma _{C^0_i}({\overline{w}}_i^{*})-\langle {\overline{w}}_i^{*},{\overline{x}}-\phi _i-\psi _i\rangle \right] +\left[ \gamma _{G_i}(\phi _i)\gamma _{C^0_i}({\overline{w}}_i^{*})-\langle {\overline{w}}_i^{*},\phi _i\rangle \right] \\&\qquad +\left[ \sigma _{\Omega _i}({\overline{w}}^{*}_{i})-\langle {\overline{w}}_i^{*},\psi _i\rangle \right] =0,\quad i\in {\overline{I}}, \end{aligned}$$

from which follows with the Young–Fenchel inequality that

$$\begin{aligned}&{\overline{x}}-\phi _i-\psi _i \in \partial \left( \gamma _{C_i}({\overline{x}}-\phi _i-\psi _i)\gamma _{C^0_i}\right) ({\overline{w}}_i^{*}), \end{aligned}$$
(15)
$$\begin{aligned}&\phi _i\in \partial \left( \gamma _{G_i}(\phi _i)\gamma _{C^0_i}\right) ({\overline{w}}_i^{*}),\end{aligned}$$
(16)
$$\begin{aligned}&\psi _i\in \partial \sigma _{\Omega _i}({\overline{w}}^{*}_{i}),\quad i\in {\overline{I}}. \end{aligned}$$
(17)

If \(\gamma _{C_i}({\overline{x}}-\phi _i-\psi _i)+\gamma _{G_i}(\phi _i)=0\), \(i\in {\overline{I}}\), then we have by (15), (16) and (17) that \({\overline{x}}\in \partial \sigma _{\Omega _i}({\overline{w}}^{*}_{i})\) for all \(i\in {\overline{I}}\), such that \({\overline{x}}\in \Omega _i\) for all \(i\in {\overline{I}}\), which contradicts our assumption.

If there exists \(i\in {\overline{I}}\) such that \(\gamma _{C_i}({\overline{x}}-\phi _i-\psi _i)=0\), then \(v(D_{\gamma _G,~{\mathcal {T}}})=\gamma _{G_i}(\phi _i)>0\) and we get by (15), (16) and (17) that

$$\begin{aligned} {\overline{x}}-\psi _i\in \partial \delta _{X^*}({\overline{w}}_i^*)+\partial \left( \gamma _{G_i}(\phi _i)\gamma _{C^0_i}\right) ({\overline{w}}_i^{*})= & {} \{0_{X^*}\}+ \gamma _{G_i}(\phi _i)\partial \gamma _{C^0_i}({\overline{w}}_i^{*})\\= & {} v(D_{\gamma _G,~{\mathcal {T}}})\partial \gamma _{C^0_i}({\overline{w}}_i^{*}). \end{aligned}$$

If there exists \(i\in {\overline{I}}\) such that \(\gamma _{C_i}({\overline{x}}-\phi _i-\psi _i)>0\) and \(\gamma _{G_i}(\phi _i)=0\), then it follows in a similar way by (15), (16) and (17) that

$$\begin{aligned} {\overline{x}}-\psi _i\in v(D_{\gamma _G,~{\mathcal {T}}})\partial \gamma _{C^0_i}({\overline{w}}_i^{*}). \end{aligned}$$

Finally, if there exists \(i\in {\overline{I}}\) such that \(\gamma _{C_i}({\overline{x}}-\phi _i-\psi _i)>0\) and \(\gamma _{G_i}(\phi _i)>0\), then one has by (15), (16) and (17) that

$$\begin{aligned} {\overline{x}}-\psi _i\in \partial \left( (\gamma _{C_i}({\overline{x}}-\phi _i-\psi _i)+\gamma _{G_i}(\phi _i))\gamma _{C^0_i}\right) ({\overline{w}}_i^{*})=v(D_{\gamma _G,~{\mathcal {T}}})\partial \gamma _{C^0_i}({\overline{w}}_i^{*}). \end{aligned}$$

In summary, we have \({\overline{x}}-\psi _i\in v(D_{\gamma _G,~{\mathcal {T}}})\partial \gamma _{C^0_i}({\overline{w}}_i^{*})\), which implies that \({\overline{x}}\in v(D_{\gamma _G,~{\mathcal {T}}})\partial \gamma _{C^0_i}({\overline{w}}_i^{*})+\partial \sigma _{\Omega _i}({\overline{w}}^{*}_{i}) \hbox { for all } i\in {\overline{I}}\).\(\square \)

Remark 3.6

Let \({\mathcal {H}}\) be a real Hilbert space, \(\beta _i>0\), \(p_i\in {\mathcal {H}}\), \(C_i=\{x\in {\mathcal {H}}:\beta _i\Vert x\Vert _{{\mathcal {H}}}\le 1\}\), \(\gamma _{G_i}\) a norm and \(\Omega _i=\{p_i\}\), \(i=1, \ldots , n\), with \(p_1,\ldots ,p_n\) distinct. Then one has by Theorem 3.5

$$\begin{aligned} {\overline{x}}=\frac{v(D_{\gamma _G,~{\mathcal {T}}})}{\beta _i\Vert {\overline{w}}_i^*\Vert _{{\mathcal {H}}}}{\overline{w}}_i^*+p_i~\quad \forall i\in {\overline{I}}. \end{aligned}$$

Note that if \(v(P_{\gamma _G,~{\mathcal {T}}})=0\), then \(\gamma _{C_i}({\overline{x}}-p_i-{\overline{z}}_i)+\gamma _{G_i}({\overline{z}}_i)=0\) for all \(i=1,\ldots ,n\), which means that \({\overline{z}}_i=0_{{\mathcal {H}}}\) and \({\overline{x}}=p_i\) for all \(i=1,\ldots ,n\), i.e. one gets a contradiction, since the points \(p_1,\ldots ,p_n\) are distinct. Therefore, taking into consideration that \(v(P_{\gamma _G,~{\mathcal {T}}}) >0\) and the strong duality statement, one gets \(v(D_{\gamma _G,~{\mathcal {T}}})>0\), and from the optimality condition (iii) of Theorem 3.4 it follows that

$$\begin{aligned} \sum _{i\in {\overline{I}}}\frac{\beta _i\Vert {\overline{w}}_i^*\Vert _{{\mathcal {H}}}}{v(D_{\gamma _G,~{\mathcal {T}}})}({\overline{x}}-p_i)= \sum _{i\in {\overline{I}}} {\overline{w}}_i^*=0_{{\mathcal {H}}}\Leftrightarrow {\overline{x}}=\frac{1}{\sum \limits _{i\in \overline{I}}\beta _i\Vert {\overline{w}}_i^*\Vert _{{\mathcal {H}}}}\sum _{i\in {\overline{I}}}\beta _i\Vert {\overline{w}}_i^*\Vert _{{\mathcal {H}}}p_i. \end{aligned}$$
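For illustration purposes we note that this recovery of the primal solution from a dual one is immediate to implement. The following matlab sketch assumes that the columns of P and W contain the points \(p_i\) and the components \({\overline{w}}_i^{*}\) of the dual solution for \(i\in {\overline{I}}\), that betas collects the weights \(\beta _i\) and that vD stores the dual optimal value; these names are our own choices and are not used elsewhere in the paper.

```matlab
% Minimal sketch: recover xbar in the setting of Remark 3.6 from a dual solution.
% P(:,i) = p_i, W(:,i) = wbar_i^* (i in Ibar), betas(i) = beta_i, vD = v(D).
coef = betas(:)' .* vecnorm(W);          % beta_i*||wbar_i^*||, one entry per i in Ibar
xbar = P * coef' / sum(coef);            % weighted average of the points p_i
for i = 1:size(W,2)                      % cross-check with the pointwise formula above
    xi = vD/(betas(i)*norm(W(:,i)))*W(:,i) + P(:,i);
    fprintf('i = %d, deviation from xbar: %.2e\n', i, norm(xi - xbar));
end
```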

Remark 3.7

Let \(\cap _{i\in {\overline{I}}}\Omega _i=\emptyset \), \(0\in \text {int }G_i\) and \(\gamma _{C_i^0}(x)=\gamma _{G_i^0}(x)=0\) if and only if \(x=0_X\), \(i=1, \ldots , n\).

(i) Following Proposition 3.2, the dual problem \(({\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})\) can be rewritten as

$$\begin{aligned} ({\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})~~\sup \limits _{\begin{array}{c} y_i^{*}\in X^*,~i=1, \ldots , n,~\sum \limits _{i=1}^ny_i^{*}=0_{X^*},\\ \sum \limits _{i=1}^n\max \left\{ \gamma _{C_i^0}(y_i^*),~\gamma _{G_i^0}(y_i^*)\right\} \le 1 \end{array}} \left\{ -\sum _{i=1}^n\sigma _{\Omega _i}\left( y_i^{*}\right) \right\} , \end{aligned}$$

consequently \(v(P_{\gamma _G,~{\mathcal {T}}})=v({\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})\).

(ii) As the Slater constraint qualification corresponding to \(({\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})\) is fulfilled (for instance for \(y_i^*=0_{X^*}\), \(i=1, \ldots , n\)), strong duality holds between it and its Lagrange dual problem, which can be reduced after some calculations to

$$\begin{aligned} (D{\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})~~\inf \limits _{\lambda \ge 0,~x\in X}\left\{ \lambda +\sum \limits _{i=1}^n\sup \limits _{y_i^*\in X^*}\left\{ \langle x,y_i^*\rangle -\lambda \max \left\{ \gamma _{C_i^0}(y_i^*),~\gamma _{G_i^0}(y_i^*)\right\} -\sigma _{\Omega _i}(y_i^*)\right\} \right\} . \end{aligned}$$
(18)

Since \(\lambda =0\) implies, taking into consideration that \(\cap _{i=1}^n \Omega _i=\emptyset \), that the value of the objective function of \((D{\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})\) is \(+\infty \), one can write \(\lambda >0\) in the constraints of \((D{\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})\). Moreover, since \(0_{X^*}\in \text {dom}\, \gamma _{C_i^0}\cap \text {dom}\, \gamma _{G_i^0}\cap \text {dom}\,\sigma _{\Omega _i}\) and \(\sigma _{\Omega _i}\) is continuous for all \(i=1, \ldots , n\), [6, Theorem 3.5.8.(a)] yields

$$\begin{aligned}&\sup \limits _{y_i^*\in X^*}\left\{ \langle x,y_i^*\rangle -\lambda \max \left\{ \gamma _{C_i^0}(y_i^*),~\gamma _{G_i^0}(y_i^*)\right\} -\sigma _{\Omega _i}(y_i^*)\right\} \nonumber \\&\quad =\min _{y_i\in \Omega _i}\left\{ \lambda \max \left\{ \gamma _{C_i^0}(\cdot ),~\gamma _{G_i^0}(\cdot )\right\} ^*\left( \frac{1}{\lambda }(x-y_i)\right) \right\} . \end{aligned}$$
(19)

(iii) For any \(i=1, \ldots , n\), the conjugate of \(\max \left\{ \gamma _{C_i^0}(\cdot ),~\gamma _{G_i^0}(\cdot )\right\} \) from (19) becomes

$$\begin{aligned} \max \left\{ \gamma _{C_i^0}(\cdot ),~\gamma _{G_i^0}(\cdot )\right\} ^*(x) =\sup _{\begin{array}{c} x^*\in X^*,~t\ge 0,\\ \gamma _{C_i^0}(x^*)\le t,~\gamma _{G_i^0}(x^*)\le t \end{array}}\{\langle x,x^*\rangle -t\},~i=1, \ldots , n. \end{aligned}$$
(20)

As the Slater constraint qualification for the problem in the right-hand side of (20) is obviously fulfilled, one obtains via strong Lagrange duality

$$\begin{aligned} \max \left\{ \gamma _{C_i^0}(\cdot ),~\gamma _{G_i^0}(\cdot )\right\} ^*(x)=\min _{\begin{array}{c} \alpha \ge 0,~\beta \ge 0,\\ \alpha +\beta \le 1 \end{array}}\left( \alpha \gamma _{C_i^0}+\beta \gamma _{G_i^0}\right) ^*(x). \end{aligned}$$
(21)

Note that a more general formula for this conjugate can be found in [7]. Recall that \(0_{X}\in \text {int }C_i\) and \(0_{X}\in \text {int }G_i\), which implies that \(0_{X^*}\in \text {int }C_i^0\) and \(0_{X^*}\in \text {int }G_i^0\) and thus, \(\text {dom}\, \gamma _{C_i^0}=\text {dom}\, \gamma _{G_i^0}=X^*\). Hence, we have \(0\cdot \gamma _{C_i^0}=0\cdot \gamma _{G_i^0}=\delta _{X^*}\). We apply [6, Theorem 3.5.8.(a)] to the formula in the right-hand side of (21), where the minimum is assumed to be attained at \(({\bar{\alpha }}, {\bar{\beta }})\).

If \({\bar{\alpha }}=0\) and \({\bar{\beta }}>0\), then we have

$$\begin{aligned}&\min _{0\le \beta \le 1}\left( \delta _{X^*}+\beta \gamma _{G_i^0}\right) ^*(x)=\min _{0< \beta \le 1}\left\{ \delta _{\{0_X\}}(x-z_i)+\beta \gamma _{G_i^0}^*\left( \frac{1}{\beta }z_i\right) \right\} \nonumber \\&\quad =\left\{ \begin{array}{l@{\quad }l} 0,&{} \text {if } 0< {\bar{\beta }}\le 1,~\gamma _{G_i}(x)\le {\bar{\beta }}, \\ + \infty , &{}\text {otherwise} \end{array} \right. =\left\{ \begin{array}{ll} 0,&{}\quad \text {if } \gamma _{G_i}(x)\le 1\\ + \infty , &{}\quad \text {otherwise}. \end{array} \right. \end{aligned}$$
(22)

If \({\bar{\alpha }} >0\) and \({\bar{\beta }}=0\), then one gets similarly that

$$\begin{aligned}&\min _{0\le \alpha \le 1}\left( \alpha \gamma _{C_i^0}+\delta _{X^*}\right) ^*(x)=\left\{ \begin{array}{l@{\quad }l} 0,&{} \text { if } \gamma _{C_i}(x)\le 1\\ + \infty , &{}\text { otherwise}. \end{array} \right. \end{aligned}$$
(23)

Finally, when \({\bar{\alpha }}>0\) and \({\bar{\beta }}>0\), then

$$\begin{aligned}&\min _{\begin{array}{c} \alpha \ge 0,~\beta \ge 0,\\ \alpha +\beta \le 1 \end{array}}\left( \alpha \gamma _{C_i^0}+\beta \gamma _{G_i^0}\right) ^*(x)=\min _{z_i\in X} \left\{ {\bar{\alpha }}\gamma _{C_i^0}^*\left( \frac{1}{{\bar{\alpha }}}(x-z_i)\right) +{\bar{\beta }}\gamma _{G_i^0}^*\left( \frac{1}{{\bar{\beta }}}z_i\right) \right\} \nonumber \\&\quad =\left\{ \begin{array}{l@{\quad }l} 0,&{} \text { if } \gamma _{C_i}(x-z_i)+\gamma _{G_i}(z_i)\le 1,\\ + \infty , &{}\text { otherwise}. \end{array} \right. \end{aligned}$$
(24)

As \(\gamma _{C_i^0}(x)=\gamma _{G_i^0}(x)=0\Leftrightarrow x=0_X\), it follows from (22), (23) and (24) that

$$\begin{aligned} \min _{\begin{array}{c} \alpha \ge 0,~\beta \ge 0,\\ \alpha +\beta \le 1 \end{array}}\left( \alpha \gamma _{C_i^0}+\beta \gamma _{G_i^0}\right) ^*(x)\!=\!\left\{ \begin{array}{l@{\quad }l} 0,&{} \text { if } \gamma _{C_i}(x-z_i)+\gamma _{G_i}(z_i)\le 1,\nonumber \\ + \infty , &{} \text {otherwise}, \end{array} \right. ~ \quad i=1, \ldots , n.\nonumber \\ \end{aligned}$$
(25)

(iv) Bringing (18), (19) and (25) together allows us consecutively to reformulate the Lagrange dual problem \((D{\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})\) as

$$\begin{aligned} \min _{\begin{array}{c} \lambda>0,~x\in X,~y_i\in \Omega _i,~z_i\in X,\\ \gamma _{C_i}(x-y_i-z_i)+\gamma _{G_i}(z_i)\le \lambda ,~i=1, \ldots , n \end{array}}\lambda= & {} \min _{\begin{array}{c} \lambda>0,~x\in X,\\ \min \limits _{y_i\in \Omega _i,~z_i\in X}\left\{ \gamma _{C_i}(x-y_i-z_i)+\gamma _{G_i}(z_i)\right\} \le \lambda ,~i=1, \ldots , n \end{array}}\lambda \\= & {} \min _{\begin{array}{c} \lambda >0,~x\in X,\\ {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)\le \lambda ,~i=1, \ldots , n \end{array}}\lambda =\min _{x\in X}\max _{1\le i\le n}\left\{ {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)\right\} . \end{aligned}$$

This shows on the one hand that the optimization problem \((P_{\gamma _G,~{\mathcal {T}}})\) has an optimal solution and on the other hand that the Lagrange dual variables \(\lambda >0\) and \(x\in X\) deliver the optimal objective value and an optimal solution to the problem \((P_{\gamma _G,~{\mathcal {T}}})\), respectively. Furthermore, this fact also implies that the relation between the primal problem \((P_{\gamma _G,~{\mathcal {T}}})\), its dual problem \(({\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})\) and its bidual problem \((D{\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})\) is completely symmetric under the considered hypotheses.

Remark 3.8

As noted in Remark 3.2, the problem \((P_{\gamma _G,~{\mathcal {T}}})\) can also be written as

$$\begin{aligned} (P_{\gamma _G,~{\mathcal {T}}})~~\inf _{\begin{array}{c} x\in X,~t\in {\mathbb {R}},\\ {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)\le t,~i=1, \ldots , n \end{array}}t =\inf _{\begin{array}{c} x\in X,~t\in {\mathbb {R}},\\ (x,t)\in \text {epi}\, {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i},~i=1, \ldots , n \end{array}}t =\inf _{x\in X,~t\in {\mathbb {R}}}\left\{ t+\sum _{i=1}^n\delta _{\text {epi}\,{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}}(x,t)\right\} , \end{aligned}$$

to which one can assign the corresponding Fenchel dual problem that can be reduced to

$$\begin{aligned} (D^F_{\gamma _G,~{\mathcal {T}}})&\sup _{\begin{array}{c} x^*_i\in X^*,t^*_i\in {\mathbb {R}},~i=1, \ldots , n,\\ \sum \limits _{i=1}^nx_i^*=0_{X^*},~\sum \limits _{i=1}^nt_i^*=-1 \end{array}}\left\{ -\sum _{i=1}^n\sigma _{\text {epi}{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}}(x_i^*,t_i^*)\right\} . \end{aligned}$$

Now, let us take a careful look at

$$\begin{aligned} \sigma _{\text {epi}{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}}(x^*_i,t_i^*)=\sup _{(x,t)\in X\times {\mathbb {R}},~ {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)\le t}\{\langle x^*_i,x\rangle +t_i^*t\}, \end{aligned}$$
(26)

and, as for fixed \((x^*_i,t_i^*)\in X^*\times {\mathbb {R}}\) the Slater constraint qualification is obviously fulfilled for the problem in the right-hand side of (26), one has

$$\begin{aligned} \sigma _{\text {epi}{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}}(x^*_i,t_i^*)=\min _{\lambda _i\ge 0}\sup _{(x,t)\in X\times {\mathbb {R}}}\left\{ \langle x_i^*,x\rangle +t_i^*t-\lambda _i\left( {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)-t\right) \right\} ,~i=1, \ldots , n. \end{aligned}$$

If for some \(i\in \{1, \ldots , n\}\), \(\lambda _i=0\), then

$$\begin{aligned} \sigma _{\text {epi}{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}}(x^*_i,t_i^*)= \left\{ \begin{array}{l@{\quad }l} 0,&{} \text {if } x_i^*=0_{X^*},~t_i^*=0\\ + \infty , &{} \text {otherwise}, \end{array} \right. \end{aligned}$$

otherwise, i.e. when the minimum is attained at some \(\lambda _i>0\), one has for the corresponding \(i\in \{1, \ldots , n\}\),

$$\begin{aligned} \sigma _{\text {epi}{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}}(x^*_i,t_i^*)= & {} \min _{\lambda _i>0}\left\{ \lambda _i\sup _{x\in X}\left\{ \left\langle \frac{1}{\lambda _i}x_i^*,x\right\rangle -{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)\right\} +\sup _{t\in {\mathbb {R}}}\{t(t_i^*+\lambda _i)\}\right\} \\= & {} \min _{\begin{array}{c} \lambda _i>0,~t_i^*=-\lambda _i,\\ \gamma _{C_i^0}(x_i^*)\le \lambda _i,~\gamma _{G_i^0}(x_i^*)\le \lambda _i \end{array}}\left\{ \sigma _{\Omega _i}(x_i^*)\right\} . \end{aligned}$$

Therefore, the Fenchel dual problem transforms to

$$\begin{aligned} (D^F_{\gamma _G,~{\mathcal {T}}})~~\sup \limits _{\begin{array}{c} \lambda _i\in {\mathbb {R}},~x_i^{*}\in X^*,~ \gamma _{C_i^0}(x_i^*)\le \lambda _i,~\gamma _{G_i^0}(x_i^*)\le \lambda _i, \\ i=1, \ldots , n,~\sum \limits _{i=1}^n\lambda _i= 1,~\sum \limits _{i=1}^nx_i^{*}=0_{X^*} \end{array}} \left\{ -\sum _{i=1}^n\sigma _{\Omega _i}\left( x_i^{*}\right) \right\} . \end{aligned}$$

Setting \(x_j^*=w_j^{*}\), \(j\in I\), and \(\lambda _i=z_i^{*}\), \(i=1, \ldots , n\), allows us to write the Fenchel dual problem as

$$\begin{aligned} (D^F_{\gamma _G,~{\mathcal {T}}})~~\sup \limits _{\begin{array}{c} z_i^*\in {\mathbb {R}},~w_i^{*}\in X^*,~ \gamma _{C_i^0}(w_i^*)\le z^*_i,~\gamma _{G_i^0}(w_i^*)\le z_i^*, \\ i=1, \ldots , n,~\sum \limits _{i=1}^nz_i^{*}= 1,~\sum \limits _{i=1}^nw_i^{*}=0_{X^*} \end{array}} \left\{ -\sum _{i=1}^n\sigma _{\Omega _i}\left( w_i^{*}\right) \right\} . \end{aligned}$$

Moreover, one has by weak duality \(v(D^F_{\gamma _G,~{\mathcal {T}}})\le v(D_{\gamma _G,~{\mathcal {T}}})= v({\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})=v(P_{\gamma _G,~{\mathcal {T}}})\).

But this approach has two drawbacks. First, for strong duality between \((P_{\gamma _G,~{\mathcal {T}}})\) and \((D^F_{\gamma _G,~{\mathcal {T}}})\) one needs to verify the fulfillment of a regularity condition (see, for instance, [6, Theorem 3.2.8]). Second, this Fenchel dual cannot be easily reduced to an optimization problem of the form \(({\widetilde{D}}_{\gamma _G,~{\mathcal {T}}})\).

Remark 3.9

A further dual problem of interest is the (direct) Lagrange dual to \((P_{\gamma _G, \mathcal {T}})\)

$$\begin{aligned}&(D^L_{\gamma _G,~{\mathcal {T}}})\sup _{\lambda _i\ge 0,~i=1, \ldots , n}\inf _{(x,t)\in X\times {\mathbb {R}}}\left\{ t+\sum _{i=1}^n\lambda _i\left( {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)-t\right) \right\} \\&\quad =\sup _{\lambda _i\ge 0,~i=1, \ldots , n}\left\{ \inf _{x\in X}\left\{ \sum _{i=1}^n\lambda _i{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)\right\} -\sup _{t\in {\mathbb {R}}}\left\{ t\left( 1-\sum _{i=1}^n\lambda _i\right) \right\} \right\} \\&\quad =\sup _{\lambda _i\ge 0,~i=1, \ldots , n,~\sum \limits _{i=1}^n\lambda _i=1}\inf _{x\in X}\left\{ \sum _{i=1}^n\lambda _i{\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)\right\} . \end{aligned}$$

The Slater condition is fulfilled for the problem \(({\widetilde{P}}_{\gamma _G,~{\mathcal {T}}})\), i.e. strong duality holds for the primal–dual pair \(({\widetilde{P}}_{\gamma _G,~{\mathcal {T}}})\)-\((D^L_{\gamma _G,~{\mathcal {T}}})\), and from the optimality conditions of Theorem 3.7 it follows that \(\overline{\lambda }\in {\mathbb {R}}^n_+\) with \({{\overline{\lambda }}}_i=(1/\beta _i)\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}\), \(i\in {\overline{I}}\), and \({{\overline{\lambda }}}_j=0\), \(j\notin {\overline{I}}\), is an optimal solution to the Lagrange dual \((D^L_{\gamma _G,~{\mathcal {T}}})\).

3.3 Special case II

In this part of the paper we analyze the special case where \(S=X\), \(a_i=0\), \(h_i(x):=x+\delta _{{\mathbb {R}}_+}(x)\), \(x\in {\mathbb {R}}\), \(f_i(x):=\delta _{L_i}(x)\) and \(L_i\subseteq X\) is a nonempty, closed and convex set for all \(i=1, \ldots , n\), so that the minmax location problem \((P_{h,{\mathcal {T}}}^{S})\) turns into

$$\begin{aligned} (P_{{\mathcal {T}}})~~\inf _{x\in X}\max _{1\le i\le n}\left\{ {\mathcal {T}}_{\Omega _i,\delta _{L_i}}^{C_i}(x)\right\} . \end{aligned}$$

Remark 3.10

By construction \(v(P_{{\mathcal {T}}})>0\). If \(X={\mathbb {R}}^m\) and the gauges are taken to be the corresponding Euclidean norms, then the problem \((P_{{\mathcal {T}}})\) can be seen as finding a point \({\overline{x}}\in {\mathbb {R}}^m\) such that the maximal distance to its Euclidean projections onto the target sets \(\Omega _i+L_i\), \(i=1, \ldots , n\), is minimal. If \(n=3\), \(X={\mathbb {R}}^2\), \(\Omega _i=\{0_{{\mathbb {R}}^2}\}\) and \(L_i=\{x\in {\mathbb {R}}^2:\Vert x-p_i\Vert \le a_i \}\), where \(a_i>0\) and \(p_i\in {\mathbb {R}}^2\), \(i=1,2,3\), then this problem is also known as the classical Apollonius problem (see [4, 13, 18]).
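In the latter setting one has \({\mathcal {T}}_{\Omega _i,\delta _{L_i}}^{C_i}(x)=\max \{\Vert x-p_i\Vert -a_i,0\}\), the Euclidean distance from x to the ball \(L_i\), so the objective of \((P_{{\mathcal {T}}})\) is easy to evaluate. The following matlab sketch, with illustrative data chosen by us, does exactly this; since the objective is nonsmooth, the direct search in the last line is meant only as a crude sanity check on a tiny instance and not as the solution method advocated in this paper.

```matlab
% Sketch: objective of (P_T) in the Apollonius setting of Remark 3.10,
% i.e. the largest Euclidean distance from x to the balls B(p_i, a_i).
% The data p and a below are purely illustrative.
p = [0 4 1; 0 0 3];                       % columns p_1, p_2, p_3 in R^2
a = [1 0.5 0.75];                         % radii a_i > 0
T = @(x) max(max(vecnorm(x - p) - a, 0)); % max_i dist(x, B(p_i, a_i))
xApprox = fminsearch(T, [0; 0]);          % crude check only; the objective is nonsmooth
```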

The corresponding dual problem \((D_{{\mathcal {T}}})\) to \((P_{{\mathcal {T}}})\) becomes, via Remark 3.4,

$$\begin{aligned} (D_{{\mathcal {T}}})~~\sup \limits _{\begin{array}{c} z_i^{*}\ge 0,~i=1, \ldots , n, \\ I=\left\{ i\in \{1, \ldots , n\}:z_i^{*}>0\right\} ,~w_i^{*}\in X^*, i\in I,\\ \gamma _{C_i^0}(w_i^{*})\le z_i^{*},i\in I,~\sum \limits _{i\in I}z_i^{*}\le 1,~\sum \limits _{i\in I}w_i^{*}=0_{X^*} \end{array}} \left\{ -\sum _{i\in I}\left[ \sigma _{L_i}\left( w_i^{*}\right) +\sigma _{\Omega _i}\left( w_i^{*}\right) \right] \right\} . \end{aligned}$$

In addition to this dual problem, we consider the following one, which is equivalent to it in the sense that they share the same optimal objective value (this can be proven similarly to [27, Theorem 9]),

$$\begin{aligned} ({\widetilde{D}}_{{\mathcal {T}}})~~\sup \limits _{\begin{array}{c} y_i^{*}\in X^*,~i=1, \ldots , n, I=\left\{ i\in \{1, \ldots , n\}:\gamma _{C_i^0}(y_i^{*})>0\right\} ,\\ \sum \limits _{i\in I}\gamma _{C_i^0}(y_i^{*})\le 1,~\sum \limits _{i\in I}y_i^{*}=0_{X^*} \end{array}} \left\{ -\sum _{i\in I}\left[ \sigma _{L_i}\left( y_i^{*}\right) +\sigma _{\Omega _i}\left( y_i^{*}\right) \right] \right\} . \end{aligned}$$

Remark 3.11

Considering \(({\widetilde{D}}_{{\mathcal {T}}})\) in a finitely dimensional setting as a minimization problem, the following economic interpretation arises, where the objective function can be seen as a cost function. The components of the dual variables \(y_i^*\in {\mathbb {R}}^m\), \(i=1, \ldots , n\), express the expected expenditures on public goods and services, where i can be identified as one of n locations. More precisely, every location i has its own vector of m expenditures. Examples of public goods and services are, for instance, parks, police stations, fire departments or highways. If a component of such a vector is zero, then the market (or the citizens) of this location is (are) saturated regarding this good or service, and if a component is negative, then the market is supersaturated. The constraint \(\sum _{i=1}^n \gamma _{C_i^0}(y_i^{*})\le 1\) then defines the limitation of the budget for the public goods and services, while the constraint \(\sum _{i=1}^ny_i^*=0_{{\mathbb {R}}^m}\) describes the substitution character of the goods and services. The latter means that if the market i has expected expenditures on a specific good or service, then these are taken from another location which is supersaturated. Therefore, the dual problem \(({\widetilde{D}}_{{\mathcal {T}}})\) can be understood as a cost minimization problem of the government of n locations, which has to find the optimal allocation of public goods and services \(({\overline{y}}_1^*, \ldots ,{\overline{y}}_n^*)\) for the n locations (which can be districts, towns or federal states) such that all expected expenditures can be financed, the citizens' demands for public goods and services are met and the costs are minimal.

Another scenario appears by considering the dual problem \(({\widetilde{D}}_{{\mathcal {T}}})\) as a cost minimization problem of the World Health Organization (WHO), where the components of a vector \(y_i^*\in {\mathbb {R}}^m\) represent the expected expenditures on medical treatment and health care for m diseases in a region i of the world, \(i=1, \ldots , n\). Here, too, the constraint \(\sum _{i=1}^n \gamma _{C_i^0}(y_i^{*})\le 1\) characterizes the budget restrictions of the WHO. Moreover, if in a region i no medical treatment is required for a disease \(j\in \{1, \ldots ,m\}\) and medical staff and products are no longer needed there (which means that the associated component \(y^*_{ij}\) is negative), for instance because this disease was eradicated there, then these expenditures can be reallocated to other regions which need medical treatment for the disease j. It is important not to waste any expenditures, therefore their sum must be zero, i.e. the constraint \(\sum _{i=1}^ny_i^*=0_{{\mathbb {R}}^m}\) must be fulfilled. For an economic interpretation of \((P_{{\mathcal {T}}})\) we refer to [4].

The following statement is then a direct consequence of Theorem 3.1.

Theorem 3.6

(Strong duality) Strong duality holds between \((P_{{\mathcal {T}}})\) and \(({\widetilde{D}}_{{\mathcal {T}}})\), i.e. \(v(P_{{\mathcal {T}}}) = v(\widetilde{D}_{\mathcal {T}})\), and the dual problem has an optimal solution \({\overline{y}}^{*}\in (X^*)^n\).

The necessary and sufficient optimality conditions for the primal–dual pair of optimization problems \((P_{{\mathcal {T}}})-({\widetilde{D}}_{{\mathcal {T}}})\) can be derived by Theorem 3.6 similarly to the ones in Theorem 3.2.

Theorem 3.7

(Optimality conditions) (a) Let \({\overline{x}}\in X\) be an optimal solution to the problem \((P_{{\mathcal {T}}})\). Then there exists an optimal solution \({\overline{y}}^{*} \in (X^*)^n\) to \(({\widetilde{D}}_{{\mathcal {T}}})\) with the corresponding index set \({\overline{I}} \subseteq \{1, \ldots , n\}\) such that

  1. (i)

    \(\max \limits _{1\le j\le n}\left\{ {\mathcal {T}}_{\Omega _j,\delta _{L_j}}^{C_j}({\overline{x}})\right\} =\sum \limits _{i\in {\overline{I}}}\gamma _{C_i^0}(\overline{y}_i^*){\mathcal {T}}_{\Omega _i,\delta _{L_i}}^{C_i}({\overline{x}}),\)

  2. (ii)

    \(\gamma _{C_i^0}(\overline{y}_i^*){\mathcal {T}}_{\Omega _i,\delta _{L_i}}^{C_i}({\overline{x}})+ \sigma _{L_i}\left( {\overline{y}}_i^{*}\right) +\sigma _{\Omega _i}\left( {\overline{y}}_i^{*}\right) =\langle {\overline{y}}_i^{*},{\overline{x}}\rangle \), \(i\in {\overline{I}}\),

  3. (iii)

    \(\sum \limits _{i\in {\overline{I}}}{\overline{y}}_i^{*}=0_{X^*}\),

  4. (iv)

    \(\sum \limits _{i\in {\overline{I}}}\gamma _{C_i^0}(\overline{y}_i^*)=1,~{\overline{y}}_i^{*}\in X^*{\setminus }\{0_{X^*}\},~i\in {\overline{I}},\) and \({\overline{y}}_i^{*}=0_{X^*},~i\notin {\overline{I}},\)

  5. (v)

    \({\mathcal {T}}_{\Omega _i,\delta _{L_i}}^{C_i}({\overline{x}})=\max \limits _{1\le j\le n}\left\{ {\mathcal {T}}_{\Omega _j,\delta _{L_j}}^{C_j}({\overline{x}})\right\} ,~ i\in {\overline{I}}\).

(b) If there exists \({\overline{x}\in X}\) such that for some \(({\overline{y}}_1^{*}, \ldots ,{\overline{y}}_n^{*}) \in (X^*)^n\) with the corresponding index set \({\overline{I}}\) the conditions (i)–(v) are fulfilled, then \({\overline{x}}\) is an optimal solution to \((P_{{\mathcal {T}}})\), \(({\overline{y}}_1^{*}, \ldots ,{\overline{y}}_n^{*})\) is an optimal solution to \(({\widetilde{D}}_{{\mathcal {T}}})\) and \(v(P_{{\mathcal {T}}}) = v({\widetilde{D}}_{{\mathcal {T}}})\).

The next statement makes use of Theorem 3.7 in order to provide an exact characterization of the optimal solutions of the optimization problem \((P_{{\mathcal {T}}})\).

Theorem 3.8

Let \({\overline{x}}\in X\) be an optimal solution to the optimization problem \((P_{{\mathcal {T}}})\). If \(({\overline{y}}_1^{*}, \ldots ,{\overline{y}}_n^*) \in (X^*)^n\) is an optimal solution to \(({\widetilde{D}}_{{\mathcal {T}}})\) with the corresponding \({\overline{I}} \subseteq \{1, \ldots , n\}\), then

$$\begin{aligned} {\overline{x}}\in \bigcap \limits _{i\in {\overline{I}}}\left[ \partial \left( v({\widetilde{D}}_{{\mathcal {T}}})\gamma _{C_i^0}\right) ({\overline{y}}_i^*)+\partial \sigma _{L_i}({\overline{y}}_i^*)+ \partial \sigma _{\Omega _i}({\overline{y}}_i^*)\right] . \end{aligned}$$

The proof of Theorem 3.8 is analogous to the one of Theorem 3.5, so we skip it.

In the rest of the paper we assume that \(X={\mathcal {H}}\), where \({\mathcal {H}}\) is a real Hilbert space, \(\beta _i>0\) and \(\gamma _{C_i}(\cdot )=\beta _i\Vert \cdot \Vert _{{\mathcal {H}}}\). Note that in this situation \(C_i=\{x\in {\mathcal {H}}:\beta _i\Vert x\Vert _{{\mathcal {H}}}\le 1\}\), \(X^*={\mathcal {H}}\) and \(\gamma _{C_i^0}(\cdot )=(1/\beta _i)\Vert \cdot \Vert _{{\mathcal {H}}}\), \(i=1, \ldots , n\).
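To fix ideas in this Hilbert space setting, note that checking feasibility and evaluating the objective of a candidate point for \(({\widetilde{D}}_{{\mathcal {T}}})\) is cheap. The following matlab sketch does so for the purely illustrative choice in which every \(\Omega _i=\{q_i\}\) is a singleton and every \(L_i\) is the ball of radius \(a_i\) around \(p_i\), so that \(\sigma _{\Omega _i}(y^*)=\langle y^*,q_i\rangle \) and \(\sigma _{L_i}(y^*)=\langle y^*,p_i\rangle +a_i\Vert y^*\Vert _{{\mathcal {H}}}\); all variable names are our own choices.

```matlab
% Sketch: feasibility and objective value of a candidate point Y for (Dtilde_T)
% when Omega_i = {q_i} and L_i = B(p_i, a(i)) (illustrative choice only).
% Y(:,i) = y_i^*, Q(:,i) = q_i, P(:,i) = p_i, betas(i) = beta_i.
gC0  = vecnorm(Y) ./ betas(:)';                        % gamma_{C_i^0}(y_i^*) = ||y_i^*||/beta_i
feas = sum(gC0) <= 1 + 1e-10 ...                       % sum_i gamma_{C_i^0}(y_i^*) <= 1
       && norm(sum(Y, 2)) <= 1e-10;                    % sum_i y_i^* = 0
obj  = -sum(dot(Y, P + Q)) - sum(a(:)' .* vecnorm(Y)); % -sum_i [sigma_{L_i} + sigma_{Omega_i}](y_i^*)
```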

Corollary 3.1

Let \({\overline{x}}\in {\mathcal {H}}\) be an optimal solution to the optimization problem \((P_{{\mathcal {T}}})\). If \(({\overline{y}}_1^{*}, \ldots ,{\overline{y}}_n^*) \in \mathcal {H}^n\) is an optimal solution to \(({\widetilde{D}}_{{\mathcal {T}}})\) with the corresponding \({\overline{I}} \subseteq \{1, \ldots , n\}\), then there exist \(\phi _i\in \Omega _i\) and \(\psi _i\in L_i\) fulfilling \(\sigma _{L_i}\left( {\overline{y}}_i^{*}\right) =\langle {\overline{y}}_i^{*},\psi _i\rangle \) and \(\sigma _{\Omega _i}\left( {\overline{y}}_i^{*}\right) =\langle {\overline{y}}_i^{*},\phi _i\rangle \), i.e. \(\psi _i\in \partial \sigma _{L_i}({\overline{y}}_i^{*})\) and \(\phi _i\in \partial \sigma _{\Omega _i}({\overline{y}}_i^{*})\), \(i\in {\overline{I}}\), such that

$$\begin{aligned} {\overline{x}}=\frac{1}{\sum \limits _{i\in {\overline{I}}}\beta _i\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}}\sum \limits _{i\in {\overline{I}}}\beta _i\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}(\phi _i+\psi _i). \end{aligned}$$

Proof

By Theorem 3.8 it holds for each \(i\in {\overline{I}}\)

$$\begin{aligned} {\overline{x}}\in \partial \left( v({\widetilde{D}}_{{\mathcal {T}}})\frac{1}{\beta _i}\Vert \cdot \Vert _{{\mathcal {H}}}\right) ({\overline{y}}_i^*)+\partial \sigma _{L_i}({\overline{y}}_i^*)+ \partial \sigma _{\Omega _i}({\overline{y}}_i^*), \end{aligned}$$

which means that there exist \(\psi _i\in \partial \sigma _{L_i}({\overline{y}}_i^*)\) and \(\phi _i\in \partial \sigma _{\Omega _i}({\overline{y}}_i^*)\) such that

$$\begin{aligned} {\overline{x}}-\phi _i-\psi _i\in v({\widetilde{D}}_{{\mathcal {T}}})\frac{1}{\beta _i}\partial \left( \Vert \cdot \Vert _{{\mathcal {H}}}\right) ({\overline{y}}_i^*). \end{aligned}$$

As \({\overline{y}}_i^*\ne 0_{{\mathcal {H}}}\) and \(v({\widetilde{D}}_{{\mathcal {T}}})>0\) (cf. Remark 3.10 and Theorem 3.6), it follows that

$$\begin{aligned} \frac{\beta _i}{v({\widetilde{D}}_{{\mathcal {T}}})}({\overline{x}}-\phi _i-\psi _i)=\frac{1}{\Vert {\overline{y}}_i^*\Vert _{{\mathcal {H}}}}{\overline{y}}_i^*,~i\in {\overline{I}}. \end{aligned}$$
(27)

Now, we take the sum over all \(i\in {\overline{I}}\) in (27) and get that

$$\begin{aligned} \frac{1}{v({\widetilde{D}}_{{\mathcal {T}}})}\sum \limits _{i\in {\overline{I}}}\beta _i\Vert {\overline{y}}_i^*\Vert _{{\mathcal {H}}}{\overline{x}}=\frac{1}{v({\widetilde{D}}_{{\mathcal {T}}})}\sum \limits _{i\in {\overline{I}}}\beta _i\Vert {\overline{y}}_i^*\Vert _{{\mathcal {H}}}(\phi _i+\psi _i)+\sum \limits _{i\in {\overline{I}}}{\overline{y}}_i^*. \end{aligned}$$
(28)

From the optimality condition (iii) of Theorem 3.7 it follows that the last term in (28) is equal to zero, which finally yields

$$\begin{aligned} {\overline{x}}=\frac{1}{\sum \limits _{i\in {\overline{I}}}\beta _i\Vert {\overline{y}}_i^*\Vert _{{\mathcal {H}}}}\sum \limits _{i\in {\overline{I}}}\beta _i\Vert {\overline{y}}_i^*\Vert _{{\mathcal {H}}}(\phi _i+\psi _i). \end{aligned}$$
(29)

\(\square \)

Remark 3.12

If \(\Omega _i={\mathcal {H}}\) and \(L_i=\{p_i\}\), \(i=1, \ldots , n\), where \(p_1, \ldots ,p_n\) are distinct points in \({\mathcal {H}}\), then (see [27, Corollary 1])

$$\begin{aligned} {\overline{x}}=\frac{1}{\sum \limits _{i\in {\overline{I}}}\beta _i\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}}\sum _{i\in {\overline{I}}}\beta _i\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}p_i. \end{aligned}$$

Remark 3.13

Note that one can prove, in the same way as in the proof of [27, Corollary 2], that for a feasible solution \(y^*\) to \(({\widetilde{D}}_{{\mathcal {T}}})\) it holds

$$\begin{aligned} \Vert y_i^*\Vert _{{\mathcal {H}}}\le \frac{\beta _s\beta _i}{\beta _s+\beta _i},~i\in I, \end{aligned}$$

where \(\beta _s:=\max _{1\le i\le n}\{\beta _i\}\).

Remark 3.14

Under the assumption that \(\beta _1= \ldots =\beta _n=1\) and \(\cap _{i=1}^n(\Omega _i+L_i)=\emptyset \), no nonzero component of an optimal solution \(\overline{y}^*\) to \(({\widetilde{D}}_{{\mathcal {T}}})\) is a positive multiple of another one. To see this, let us assume that there exist \(i,j\in {\overline{I}}\), \(i\ne j\), and \(k_j>0\) such that \({\overline{y}}_i^{*}= k_j {\overline{y}}_j^{*}\). Further, it holds by (27) that

$$\begin{aligned} {\overline{y}}_i^{*}=\frac{\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}}{v({\widetilde{D}}_{{\mathcal {T}}})}({\overline{x}}-\phi _i-\psi _i)=k_j {\overline{y}}_j^{*}=k_j\frac{\Vert {\overline{y}}_j^{*}\Vert _{{\mathcal {H}}}}{v({\widetilde{D}}_{{\mathcal {T}}})}({\overline{x}}-\phi _j-\psi _j), \end{aligned}$$

from which follows that

$$\begin{aligned} {\overline{x}}-\phi _i-\psi _i =k_j\frac{\Vert {\overline{y}}_j^{*}\Vert _{{\mathcal {H}}}}{\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}}({\overline{x}}-\phi _j-\psi _j), \end{aligned}$$

thus one gets that \({\overline{x}}-\phi _i-\psi _i={\overline{x}}-\phi _j-\psi _j \Leftrightarrow \phi _i+\psi _i=\phi _j+\psi _j\), which contradicts the assumption that \(\cap _{i=1}^n(\Omega _i+L_i)=\emptyset \). Therefore, \({\overline{y}}_i^{*}\ne k_j {\overline{y}}_j^{*}\), \(k_j>0\), \(i\ne j\), for all \(i,j\in {\overline{I}}\).

Remark 3.15

(i) Clearly, if \(\partial \sigma _{\Omega _i+L_i}({\overline{y}}_i^{*})\) is a singleton for some \(i\in {\overline{I}}\) (which is the situation when, for instance, the set \(\Omega _i+L_i\) is strictly convex or its support function is Gâteaux-differentiable at \({\overline{y}}_i^{*}\)), then the optimal solution \({\overline{x}}\) of \((P_{{\mathcal {T}}})\) can be determined immediately by the formula (27), i.e.

$$\begin{aligned} {\overline{x}}=\frac{v({\widetilde{D}}_{{\mathcal {T}}})}{\beta _i\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}}{\overline{y}}_i^{*}+\phi _i+\psi _i. \end{aligned}$$

(ii) Recall moreover that \(v({\widetilde{D}}_{{\mathcal {T}}})=v(P_{{\mathcal {T}}})\) due to Theorem 3.6. For instance, take for an index \(i\in {\overline{I}}\) the sets \(\Omega _i:=\{0_{{\mathcal {H}}}\}\) and \(L_i:=\{x\in {\mathcal {H}}:\Vert x-p_i\Vert _{{\mathcal {H}}}\le a_i\}\), where \(p_i\in {\mathcal {H}}\) and \(a_i>0\), then

$$\begin{aligned}&\psi _i\in \partial \sigma _{L_i}({\overline{y}}_i^{*})\Leftrightarrow \psi _i\in \partial \left( a_i\Vert \cdot \Vert _{{\mathcal {H}}}\right) ({\overline{y}}_i^{*})+p_i\Leftrightarrow \frac{1}{a_i}(\psi _i-p_i)\in \partial \left( {\Vert }\cdot {\Vert }_{{\mathcal {H}}}\right) ({\overline{y}}_i^{*})\\&\quad \Leftrightarrow \frac{1}{a_i}(\psi _i-p_i)=\frac{1}{\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}}{\overline{y}}_i^{*}\Leftrightarrow \psi _i=\frac{a_i}{\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}}{\overline{y}}_i^{*}+p_i, \end{aligned}$$

hence

$$\begin{aligned} {\overline{x}}=\left( \frac{v({\widetilde{D}}_{{\mathcal {T}}})}{\beta _i}+ a_i\right) \frac{1}{\Vert {\overline{y}}_i^{*}\Vert _{{\mathcal {H}}}}{\overline{y}}_i^{*}+p_i. \end{aligned}$$

(iii) Let us consider another situation where \({\mathcal {H}}={\mathbb {R}}^m\), \(\Vert \cdot \Vert _{\infty }\) is the \(\infty \)-norm, \(\Vert \cdot \Vert _1\) is the \(l_1\)-norm, \(\Omega _i:=\{0_{{\mathbb {R}}^m}\}\) and \(L_i:=\{x\in {\mathbb {R}}^m:\Vert x-p_i\Vert _{\infty }\le a_i\}\). Further, let \({\overline{x}}=({\overline{x}}_1, \ldots ,{\overline{x}}_m)^T\in {\mathbb {R}}^m\), \({\overline{y}}_i^{*}=({\overline{y}}_{i1}^*, \ldots ,{\overline{y}}_{im}^*)^T\in {\mathbb {R}}^m\), \(\psi _i=(\psi _{i1}, \ldots ,\psi _{im})^T\in {\mathbb {R}}^m\) and \(p_i=(p_{i1}, \ldots ,p_{im})^T\in {\mathbb {R}}^m\). One has

$$\begin{aligned}&\psi _i\in \partial \sigma _{L_i}({\overline{y}}_i^{*})=\partial \left( a_i\Vert \cdot \Vert _1\right) ({\overline{y}}_i^{*})+p_i\nonumber \\&\quad \Leftrightarrow \frac{1}{a_i}(\psi _i-p_i)\in \partial (\Vert \cdot \Vert _1)({\overline{y}}_i^{*})\Leftrightarrow \frac{1}{a_i}(\psi _{ij}-p_{ij})\in {\left\{ \begin{array}{ll} \{1\}, &{} \text {if }~ {\overline{y}}_{ij}^*>0,\\ \{-1\}, &{} \text {if }~{\overline{y}}_{ij}^*<0,\\ {[}-1,1], &{}\text {if }~{\overline{y}}_{ij}^*=0, \end{array}\right. } \end{aligned}$$
(30)

\(i\in {\overline{I}}\), \(j=1, \ldots ,m\). Now, we define the index set \(J=\{j\in \{1, \ldots ,m\}:{\overline{y}}_{ij}^*=0 \text { for all } i\in {\overline{I}}\}\); then it follows for \(j\in J\) from (30) that

$$\begin{aligned} \frac{1}{a_i}(\psi _{ij}-p_{ij})\in [-1,1] \Leftrightarrow \psi _{ij}\in p_{ij}+\left[ -a_i, a_i\right] . \end{aligned}$$
(31)

Combining (27) and (31) implies that under the corresponding hypotheses it holds

$$\begin{aligned} {\overline{x}}_j\in \bigcap \limits _{i\in {\overline{I}}}\left( p_{ij}+\left[ -a_i, a_i\right] \right) ,~j\in J. \end{aligned}$$

Otherwise, if \(j\notin J\), there exists \(i\in {\overline{I}}\) such that \({\overline{y}}_{ij}^*\ne 0\) and by (30) it holds

$$\begin{aligned}&\psi _{ij} ={\left\{ \begin{array}{ll} a_i+p_{ij}, &{} \text {if }~{\overline{y}}_{ij}^*>0,\\ -a_i+p_{ij}, &{} \text {if }~{\overline{y}}_{ij}^*<0, \end{array}\right. } \end{aligned}$$
(32)

and hence, one gets by (27) and (32) that

$$\begin{aligned}&{\overline{x}}_j ={\left\{ \begin{array}{ll} \frac{v({\widetilde{D}}_{{\mathcal {T}}})}{\beta _i\Vert {\overline{y}}_{i}^*\Vert }{\overline{y}}_{ij}^*+a_i+p_{ij}, &{} \text {if } {\overline{y}}_{ij}^*>0,\\ \frac{v({\widetilde{D}}_{{\mathcal {T}}})}{\beta _i\Vert {\overline{y}}_{i}^*\Vert }{\overline{y}}_{ij}^*-a_i+p_{ij}, &{} \text {if } {\overline{y}}_{ij}^*<0. \end{array}\right. } \end{aligned}$$

(iv) Finally, when \(m=2\), \(|{\overline{I}}|>2\) and \(\beta _1= \ldots =\beta _n=1\), it follows by Remark 3.14 that \(J=\emptyset \) and hence, there exist \(i,j\in {\overline{I}}\) with \({\overline{y}}^*_{i1}\ne 0\) and \({\overline{y}}^*_{j2}\ne 0\) such that

$$\begin{aligned} {\overline{x}}_1 ={\left\{ \begin{array}{ll} \frac{v({\widetilde{D}}_{{\mathcal {T}}})}{\Vert {\overline{y}}_{i}^*\Vert }{\overline{y}}_{i1}^*+a_i+p_{i1}, &{} \text {if } {\overline{y}}_{i1}^*>0,\\ \frac{v({\widetilde{D}}_{{\mathcal {T}}})}{\Vert {\overline{y}}_{i}^*\Vert }{\overline{y}}_{i1}^*-a_i+p_{i1}, &{} \text {if } {\overline{y}}_{i1}^*<0, \end{array}\right. } \quad \text { and }\quad {\overline{x}}_2 ={\left\{ \begin{array}{ll} \frac{v({\widetilde{D}}_{{\mathcal {T}}})}{\Vert {\overline{y}}_{j}^*\Vert }{\overline{y}}_{j2}^*+a_j+p_{j2}, &{} \text {if } {\overline{y}}_{j2}^*>0,\\ \frac{v({\widetilde{D}}_{{\mathcal {T}}})}{\Vert {\overline{y}}_{j}^*\Vert }{\overline{y}}_{j2}^*-a_j+p_{j2}, &{} \text {if } {\overline{y}}_{j2}^*<0. \end{array}\right. } \end{aligned}$$
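The componentwise recovery described in parts (iii) and (iv) above translates directly into code. The following matlab sketch (with variable names of our own choosing; the columns of Y are the components \({\overline{y}}_i^{*}\), \(i\in {\overline{I}}\)) applies (27) and (32); for coordinates \(j\in J\) it leaves \({\overline{x}}_j\) undetermined, as (31) only locates it in an intersection of intervals.

```matlab
% Sketch of the componentwise recovery in Remark 3.15(iii) (our own code):
% Omega_i = {0}, L_i the ||.||_inf-ball of radius a(i) around p_i,
% Y(:,i) = ybar_i^* for i in Ibar, P(:,i) = p_i, betas(i) = beta_i, vD = v(Dtilde_T).
[m, ~] = size(Y);
xbar = nan(m, 1);
for j = 1:m
    i = find(Y(j,:) ~= 0, 1);                % some i in Ibar with ybar_{ij}^* ~= 0
    if ~isempty(i)                           % case j not in J, see (32)
        xbar(j) = vD/(betas(i)*norm(Y(:,i)))*Y(j,i) + sign(Y(j,i))*a(i) + P(j,i);
    end                                      % for j in J, (31) only gives an interval
end
```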

Remark 3.16

The optimality conditions of Theorem 3.7 allow us to give the following geometric interpretation of the set of optimal solutions to the conjugate dual problem \(({\widetilde{D}}_{{\mathcal {T}}})\) in the situation when \({\mathcal {H}}={\mathbb {R}}^m\). When \(\gamma _{C_i}(\cdot )=\Vert \cdot \Vert \), one gets from condition (ii) of Theorem 3.7 via the Young–Fenchel inequality \( \Vert {\overline{y}}_i^{*}\Vert \Vert {\overline{x}}-\phi _i-\psi _i\Vert =\langle {\overline{y}}_i^{*},{\overline{x}}-\phi _i-\psi _i\rangle \), which means that the vectors \({\overline{y}}_i^{*}\) and \({\overline{x}}-\phi _i-\psi _i\), \(i\in {\overline{I}}\), are parallel and directed towards \({\overline{x}}\). Further, from the optimality condition (v) of Theorem 3.7 one gets that \(i\in {\overline{I}}\), i.e. \({\overline{y}}_i^{*}\ne 0_{{\mathcal {H}}}\), if the points \(\phi _i+\psi _i\in \Omega _i+L_i\) lie on the boundary of the ball centered at \({\overline{x}}\) with radius \(v({\widetilde{D}}_{{\mathcal {T}}})\). If \(i\notin {\overline{I}}\), i.e. \({\overline{y}}_i^{*}=0_{{\mathcal {H}}}\), then the points \(\phi _i+\psi _i\in \Omega _i+L_i\) lie on the boundary of a ball centered at \({\overline{x}}\) with radius \(t<v({\widetilde{D}}_{{\mathcal {T}}})\). Hence, the vectors \({\overline{y}}_i^{*}\), \(i\in {\overline{I}}\), can be interpreted as force vectors, which pull the sets \(\Omega _i+L_i\), \(i\in {\overline{I}}\), in the direction of the center, the gravity point \({\overline{x}}\), in order to reduce the minimal time needed to reach the farthest set(s) (see Fig. 2).

We close this section with a statement which gives a formula for the projection operator onto the epigraph of the maximum of norms, which is needed for our numerical experiments.

Theorem 3.9

Let \(\gamma _C:{\mathcal {H}}_1\times \ldots \times {\mathcal {H}}_n\rightarrow {\mathbb {R}}\) be defined by \(\gamma _C(x_1,\ldots ,x_n):=\max _{1\le i\le n}\{(\Vert x_i\Vert _{{\mathcal {H}}_i})/w_i\}\), with \(w_i>0\), \(i=1, \ldots , n\). Then it holds

$$\begin{aligned} {{\,\mathrm{P}\,}}_{\text {epi}\gamma _C}(x_1,\ldots ,x_n,\xi )=\left\{ \begin{array}{l@{\quad }l} (x_1,\ldots ,x_n,\xi ),&{} \text {if }\max \limits _{1\le i\le n}\left\{ \frac{1}{w_i}\Vert x_i\Vert _{{\mathcal {H}}_i}\right\} \le \xi ,\\ (0_{{\mathcal {H}}_1},\ldots ,0_{{\mathcal {H}}_n},0),&{} \text {if } \xi <0\text { and }\sum \limits _{i=1}^n w_i\Vert x_i\Vert _{{\mathcal {H}}_i}\le -\xi ,\\ ({\overline{y}}_1,\ldots ,{\overline{y}}_n,{\overline{\theta }}),&{}\text {otherwise}, \end{array} \right. \end{aligned}$$

where

$$\begin{aligned} {\overline{y}}_i=x_i-\frac{\max \{\Vert x_i\Vert _{{\mathcal {H}}_i}-({\overline{\kappa }}+\xi )w_i,0\}}{\Vert x_i\Vert _{{\mathcal {H}}_i}}x_i,~i=1,\ldots ,n,\text { and }{\overline{\theta }}=\frac{\sum \nolimits _{i=k+1}^nw^2_i\tau _i+\xi }{\sum \nolimits _{i=k+1}^nw_i^2+1} \end{aligned}$$

with

$$\begin{aligned} {\overline{\kappa }}=\frac{\sum \nolimits _{i=k+1}^nw_i^2\tau _i-\xi \sum \nolimits _{i=k+1}^nw_i^2}{\sum \nolimits _{i=k+1}^n w_i^2+1} \end{aligned}$$
(33)

and \(k\in \{0,1,\ldots ,n-1\}\) is the unique integer such that \(\tau _k-\xi \le {\overline{\kappa }}\le \tau _{k+1}-\xi \), where the values \(\tau _0,\ldots ,\tau _n\) are defined by \(\tau _0:=0\) and \(\tau _i:=(\Vert x_i\Vert _{{\mathcal {H}}_i})/w_i\), \(i=1,\ldots ,n\), and are assumed to be arranged in ascending order.
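Since this projection has to be evaluated repeatedly in the numerical experiments, we also illustrate the formula of Theorem 3.9 with a minimal matlab sketch; the function name, the cell-array interface and the brute-force search for the index k below are our own choices.

```matlab
function [Y, theta] = proj_epi_max_norms(X, w, xi)
% Sketch of the projection formula of Theorem 3.9.
% X is a cell array with X{i} a real vector (standing for H_i), w(i) > 0, xi real.
n   = numel(X);
w   = w(:);
nrm = cellfun(@norm, X); nrm = nrm(:);            % ||x_i||
tau = nrm ./ w;                                   % tau_i = ||x_i||/w_i
if max(tau) <= xi                                 % (x,xi) already lies in epi gamma_C
    Y = X; theta = xi; return;
end
if xi < 0 && sum(w .* nrm) <= -xi                 % projection is the origin
    Y = cellfun(@(v) zeros(size(v)), X, 'UniformOutput', false);
    theta = 0; return;
end
[tauS, idx] = sort(tau);                          % taus in ascending order ...
wS   = w(idx);                                    % ... with the matching weights
tauS = [0; tauS];                                 % prepend tau_0 = 0
for k = 0:n-1                                     % locate the piece containing kappa
    w2    = wS(k+1:end).^2;
    kappa = (sum(w2 .* tauS(k+2:end)) - xi*sum(w2)) / (sum(w2) + 1);   % formula (33)
    if tauS(k+1) - xi <= kappa && kappa <= tauS(k+2) - xi
        break;
    end
end
theta = kappa + xi;
Y = cell(size(X));
for i = 1:n                                       % shrink each block towards the origin
    if nrm(i) > 0
        Y{i} = X{i} - max(nrm(i) - theta*w(i), 0)/nrm(i) * X{i};
    else
        Y{i} = X{i};
    end
end
end
```

After sorting the values \(\tau _i\), the loop over k could be replaced by a cumulative-sum search, so that the dominating cost is the sorting step; we keep the direct loop here only to make the structure of the formula visible.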

Proof

As \(C=\{(x_1,\ldots ,x_n)\in {\mathcal {H}}_1\times \ldots \times {\mathcal {H}}_n:\max _{1\le i\le n}\{(1/w_i)\Vert x_i\Vert _{{\mathcal {H}}_i}\}\le 1\}\) (see [26, Remark 1]), [28, Corollary 2.5] reveals that

$$\begin{aligned} {{\,\mathrm{P}\,}}_{\text {epi}\gamma _C}(x_1,\ldots ,x_n,\xi )={\left\{ \begin{array}{ll} (x_1,\ldots ,x_n,\xi ),&{} \text {if } \max \limits _{1\le i\le n}\{\frac{1}{w_i}\Vert x_i\Vert _{{\mathcal {H}}_i}\}\le \xi ,\\ ({\overline{y}}_1,\ldots ,{\overline{y}}_n,{\overline{\theta }}),&{} \text {otherwise,} \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} ({\overline{y}}_1,\ldots ,{\overline{y}}_n)=(x_1,\ldots ,x_n)-{\overline{\kappa }}{{\,\mathrm{P}\,}}_{C^0}\left( \frac{1}{{\overline{\kappa }}}(x_1,\ldots ,x_n)\right) ,~{\overline{\theta }}={\overline{\kappa }}+\xi \text { and } {\overline{\kappa }}>0. \end{aligned}$$

By [29, Lemma 4.5] the polar set of C is \(C^0=\{(x_1,\ldots ,x_n)\in {\mathcal {H}}_1\times \ldots \times {\mathcal {H}}_n:\sum _{i=1}^nw_i\Vert x_i\Vert _{{\mathcal {H}}_i}\le 1\}\) and from [28, Lemma 1.1] we derive that \({{\,\mathrm{P}\,}}_{C^0}\left( \frac{1}{{\overline{\kappa }}}(x_1,\ldots ,x_n)\right) =\frac{1}{{\overline{\kappa }}}(x_1,\ldots ,x_n)\) if \(\sum _{i=1}^nw_i\Vert x_i\Vert _{{\mathcal {H}}_i}\le {\overline{\kappa }}\), i.e. \(({\overline{y}}_1,\ldots ,{\overline{y}}_n)=(0_{{\mathcal {H}}_1},\ldots ,0_{{\mathcal {H}}_n})\), which implies that \(\gamma _C({\overline{y}}_1,\ldots ,{\overline{y}}_n)=0={\overline{\theta }}={\overline{\kappa }}+\xi \) and hence, \({\overline{\kappa }}=-\xi \).

Otherwise, one has by [28, Lemma 1.1]

$$\begin{aligned} {{\,\mathrm{P}\,}}_{C^0}\left( \frac{1}{{\overline{\kappa }}}(x_1,\ldots ,x_n)\right) =({\overline{z}}_1,\ldots ,{\overline{z}}_n)\in {\mathcal {H}}_1\times \ldots \times {\mathcal {H}}_n, \end{aligned}$$

where

$$\begin{aligned} {\overline{z}}_i=\frac{\max \{\Vert x_i\Vert _{{\mathcal {H}}_i}-{\overline{\kappa }}{\overline{\mu }}w_i,0\}}{{\overline{\kappa }}\Vert x_i\Vert _{{\mathcal {H}}_i}}x_i,~i=1,\ldots ,n, \end{aligned}$$

and \({\overline{\mu }}>0\) is a solution of the equation [see (11) of the proof of [28, Lemma 1.1]]

$$\begin{aligned} \sum _{i=1}^nw_i\max \{\Vert x_i\Vert _{{\mathcal {H}}_i}-{\overline{\kappa }}{\overline{\mu }}w_i,0\}={\overline{\kappa }}. \end{aligned}$$
(34)

Therefore, it follows

$$\begin{aligned} {\overline{y}}_i=x_i-\frac{\max \{\Vert x_i\Vert _{{\mathcal {H}}_i}-{\overline{\kappa }}{\overline{\mu }}w_i,0\}}{\Vert x_i\Vert _{{\mathcal {H}}_i}}x_i= \frac{\Vert x_i\Vert _{{\mathcal {H}}_i}-\max \{\Vert x_i\Vert _{{\mathcal {H}}_i}-{\overline{\kappa }}{\overline{\mu }}w_i,0\}}{\Vert x_i\Vert _{{\mathcal {H}}_i}}x_i,~i=1,\ldots ,n, \end{aligned}$$

and as \({\overline{y}}_i=x_i\), i.e. \(\Vert {\overline{y}}_i\Vert _{{\mathcal {H}}_i}=\Vert x_i\Vert _{{\mathcal {H}}_i}\), when \(\Vert x_i\Vert _{{\mathcal {H}}_i}-{\overline{\kappa }}{\overline{\mu }}w_i\le 0\), while \({\overline{y}}_i=({\overline{\kappa }}{\overline{\mu }}w_i/\Vert x_i\Vert _{{\mathcal {H}}_i})x_i\), i.e. \(\Vert {\overline{y}}_i\Vert _{{\mathcal {H}}_i}={\overline{\kappa }}{\overline{\mu }}w_i\), when \(\Vert x_i\Vert _{{\mathcal {H}}_i}-{\overline{\kappa }}{\overline{\mu }}w_i> 0\), \(i=1,\ldots ,n\), we obtain

$$\begin{aligned} \gamma _C({\overline{y}}_1,\ldots ,{\overline{y}}_n)=\max _{1\le i\le n}\left\{ \frac{1}{w_i}\Vert {\overline{y}}_i\Vert _{{\mathcal {H}}_i}\right\} ={\overline{\kappa }}{\overline{\mu }}={\overline{\kappa }}+\xi . \end{aligned}$$
(35)

Bringing (34) and (35) together yields

$$\begin{aligned} \sum \limits _{i=1}^nw_i\max \left\{ \Vert x_i\Vert _{{\mathcal {H}}_i}-({\overline{\kappa }}+\xi )w_i,0\right\} ={\overline{\kappa }}. \end{aligned}$$
(36)

Clearly, if \(\Vert x_i\Vert _{{\mathcal {H}}_i}-\xi w_i\le 0\) for all \(i=1,\ldots ,n\), i.e. \(\max _{1\le i\le n}\{\Vert x_i\Vert _{{\mathcal {H}}_i}/w_i\}\le \xi \), then \(\Vert x_i\Vert _{{\mathcal {H}}_i}-\xi w_i-{\overline{\kappa }} w_i\le 0\) for all \(i=1,\ldots ,n\), and one gets by (36) that

$$\begin{aligned} {\overline{\kappa }}=\sum \limits _{i=1}^nw_i\max \left\{ \Vert x_i\Vert _{{\mathcal {H}}_i}-({\overline{\kappa }}+\xi )w_i,0\right\} =0, \end{aligned}$$

which means that \({\overline{y}}_i=x_i\) for all \(i=1,\ldots ,n\), and \({\overline{\theta }}=\xi \).

Now, we define the function \(g:{\mathbb {R}}\rightarrow {\mathbb {R}}\) by

$$\begin{aligned} g(\kappa )=\sum _{i=1}^nw_i^2\max \left\{ \tau _i-(\kappa +\xi ),0\right\} -\kappa . \end{aligned}$$
(37)

Note that there exists \(i\in \{1,\ldots ,n\}\) such that \(\tau _i>0\) and, since the first case of the statement is excluded here, \(\tau _n=\max _{1\le i\le n}\tau _i>\xi \), and so,

$$\begin{aligned} g(\tau _n-\xi )=\sum \limits _{i=1}^nw_i^2\max \left\{ \tau _i-\tau _n,0\right\} -(\tau _n-\xi )=-(\tau _n-\xi )<0. \end{aligned}$$

Moreover, as g is a piecewise linear and nonincreasing function, one has to find, similarly to [28, Corollary 2.1], the unique integer \(k\in \{0,1,\ldots ,n-1\}\) such that \(g(\tau _k-\xi )\ge 0\) and \(g(\tau _{k+1}-\xi )\le 0\). This leads to

$$\begin{aligned} \sum \limits _{i=k+1}^nw^2_i\tau _i-\xi \sum \limits _{i=k+1}^nw_i^2-{\overline{\kappa }}\left( \sum \limits _{i=k+1}^nw_i^2+1\right) =0\Leftrightarrow {\overline{\kappa }}=\frac{\sum \nolimits _{i=k+1}^nw_i^2\tau _i-\xi \sum \nolimits _{i=k+1}^nw_i^2}{\sum \nolimits _{i=k+1}^n w_i^2+1} \end{aligned}$$

and hence, \({\overline{\theta }}={\overline{\kappa }}+\xi =(\sum _{i=k+1}^nw_i^2\tau _i+\xi )/(\sum _{i=k+1}^nw_i^2+1)\).\(\square \)

Remark 3.17

In [8] the formula in the previous theorem was given for the case where \({\mathcal {H}}_i={\mathbb {R}}\), \(i=1,\ldots ,n\), in other words, where \(\gamma _C\) is the weighted \(l_{\infty }\)-norm.

Remark 3.18

The proof of the previous theorem allows us to construct an algorithm to determine \({\overline{\kappa }}\) of Theorem 3.9.

Algorithm

1. If \(\max \limits _{1\le i\le n}\left\{ \frac{1}{w_i}\Vert x_i\Vert _{{\mathcal {H}}_i}\right\} \le \xi \), then \({\overline{\kappa }}=0\).

2. If \(\xi <0\) and \(\sum \limits _{i=1}^n w_i\Vert x_i\Vert _{{\mathcal {H}}_i}\le -\xi \), then \({\overline{\kappa }}=-\xi \).

3. Otherwise, define \(\tau _0:=0\), \(\tau _i:=\Vert x_i\Vert _{{\mathcal {H}}_i}/w_i\), \(i=1,\ldots ,n\), and sort \(\tau _0,\ldots ,\tau _n\) in ascending order.

4. Determine the values of g defined in (37) at \(\kappa =\tau _i-\xi \), \(i=0,\ldots ,n\).

5. Find the unique \(k\in \{0,\ldots ,n-1\}\) such that \(g(\tau _k-\xi )\ge 0\) and \(g(\tau _{k+1}-\xi )\le 0\).

6. Calculate \({\overline{\kappa }}\) by (33) (a sketch implementing these steps is given below).
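
The following is a minimal MATLAB sketch of the steps above and of the formulae of Theorem 3.9; the function name and interface are illustrative only and do not reproduce the code used in Sect. 4.

```matlab
function [y, theta] = proj_epi_max_of_norms(x, w, xi)
% Sketch (ours, not the paper's code) of the projection from Theorem 3.9:
% x is a cell array with the vectors x_1,...,x_n, w a vector of positive
% weights, xi a scalar; returns the projection (y_1,...,y_n,theta).
n  = numel(x);
nx = cellfun(@norm, x);  nx = nx(:);  w = w(:);
tau = nx ./ w;                                  % tau_i = ||x_i||/w_i
if max(tau) <= xi                               % step 1
    y = x; theta = xi; return;
end
if xi < 0 && sum(w .* nx) <= -xi                % step 2
    y = cellfun(@(v) zeros(size(v)), x, 'UniformOutput', false);
    theta = 0; return;
end
[ts, ord] = sort(tau);  ws = w(ord);            % step 3: sort ascending
tsort = [0; ts];                                % tau_0 = 0
g = @(kappa) sum(w.^2 .* max(tau - (kappa + xi), 0)) - kappa;   % (37)
k = 0;                                          % steps 4-5: bracket the root of g
while k < n && ~(g(tsort(k+1) - xi) >= 0 && g(tsort(k+2) - xi) <= 0)
    k = k + 1;
end
S2 = sum(ws(k+1:end).^2);                       % step 6: formula (33)
S1 = sum(ws(k+1:end).^2 .* ts(k+1:end));
kappa = (S1 - xi * S2) / (S2 + 1);
theta = kappa + xi;
y = cell(size(x));
for i = 1:n
    if nx(i) > 0
        y{i} = x{i} - max(nx(i) - (kappa + xi) * w(i), 0) / nx(i) * x{i};
    else
        y{i} = x{i};                            % zero block stays zero
    end
end
end
```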

4 Numerical experiments

The aim of this section is to numerically solve several types of concrete location problems and their associated duals discussed in the previous section, as well as to discuss the results generated via matlab. Here we set \(X={\mathbb {R}}^d\) and used for our numerical experiments a PC with an Intel Core i5-2400 CPU at 3.10 GHz and 8 GB RAM. Note that in the previous sections we considered very general frameworks for the theoretical investigations; however, in order to get closer to real-world applications, our numerical experiments are performed in finite-dimensional spaces.

First, we consider a location problem of the type analyzed in Sect. 3.3. To solve this kind of location problem and its dual, rewritten as unconstrained optimization problems, we implemented in matlab the parallel splitting algorithm presented in [3, Proposition 27.8]. Note also that other recent proximal splitting methods could prove to be suitable for these problems, too, and comparisons of their performances on the problems we solve below could constitute an interesting follow-up of our investigations.

Theorem 4.1

(Parallel splitting algorithm) Let n be an integer such that \(n\ge 2\) and \(f_i:{\mathbb {R}}^d\rightarrow \overline{{\mathbb {R}}}\) be a proper, lower semicontinuous and convex function for \(i=1,\ldots ,n\). Suppose that the problem

$$\begin{aligned} (P^{DR})~~\min _{x \in {\mathbb {R}}^d}\left\{ \sum _{i=1}^n f_i(x)\right\} \end{aligned}$$

has at least one solution and that \(\text {dom}\, f_1\cap \bigcap _{i=2}^n\text {int }\text {dom}\, f_i\ne \emptyset \). Let \((\mu _k)_{k\in {\mathbb {N}}}\) be a sequence in [0, 2] such that \(\sum _{k\in {\mathbb {N}}}\mu _k(2-\mu _k)=+\infty \), let \(\nu >0\), and let \((x_{i,0})_{i=1}^n\in {\mathbb {R}}^d\times \ldots \times {\mathbb {R}}^d\). Set

(iterative scheme of the parallel splitting method, see [3, Proposition 27.8]; a sketch of the update rule is given after the theorem)

Then \((r_k)_{k\in {\mathbb {N}}}\) converges to an optimal solution to \((P^{DR})\).
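
Since the iterative scheme is only displayed schematically above, we add a minimal MATLAB sketch of the update rule; the averaging, proximal and relaxation steps below reflect our reading of the parallel splitting (product-space Douglas-Rachford) method and should be checked against [3, Proposition 27.8], and the function handles and names are our own.

```matlab
function r = parallel_splitting(proxes, x0, nu, mu, maxit)
% Sketch of the parallel splitting iteration behind Theorem 4.1 (assumed
% update rule, not a quotation of [3]). proxes{i}(z, nu) should return
% prox_{nu*f_i}(z); x0 is a cell array with the starting points x_{i,0};
% nu > 0; mu in (0,2) is kept constant here instead of a sequence mu_k.
n = numel(proxes);
x = x0;
for k = 1:maxit
    r = x{1};                                % r_k = (1/n) * sum_i x_{i,k}
    for i = 2:n, r = r + x{i}; end
    r = r / n;
    y = cell(1, n);
    q = zeros(size(r));
    for i = 1:n
        y{i} = proxes{i}(x{i}, nu);          % proximal step on each summand f_i
        q = q + y{i};
    end
    q = q / n;                               % average of the proximal points
    for i = 1:n
        x{i} = x{i} + mu * (2*q - r - y{i}); % relaxed reflection step
    end
end
end
```

In our setting each \(f_i\) is, for instance, the indicator function of an epigraph or of an affine set, so that the corresponding proximal mapping reduces to a projection such as the one from Theorem 3.9.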

Note that for this purpose it is necessary to bring the location problem and its dual problem into the form of an unconstrained optimization problem whose objective function is a sum of proper, lower semicontinuous and convex functions. Following these ideas, we rewrite the location problem (see Remark 3.7)

$$\begin{aligned}&(P_{\gamma _G,~{\mathcal {T}}})\inf _{x\in X}\max _{1\le i\le n}\left\{ {\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}(x)\right\} =\min _{\begin{array}{c} t> 0,~x\in {\mathbb {R}}^d,~z_i\in {\mathbb {R}}^d,\\ \gamma _{C_i}(x-p_i-z_i)+\gamma _{G_i}(z_i)\le t,~i=1,\ldots ,n \end{array}}t\\&\quad = \min _{\begin{array}{c} t>0,~x\in {\mathbb {R}}^d,~z_i\in {\mathbb {R}}^d,~\alpha _i\ge 0,~\beta _i\ge 0,\\ \gamma _{C_i}(x-p_i-z_i)\le \alpha _i,~\gamma _{G_i}(z_i)\le \beta _i,\\ \alpha _i+\beta _i=t,~i=1,\ldots ,n \end{array}}t=\min _{\begin{array}{c} t>0,~x\in {\mathbb {R}}^d,~z_i\in {\mathbb {R}}^d,~\alpha _i\ge 0,~\beta _i\ge 0,\\ (x-p_i-z_i, \alpha _i)\in \text {epi}\,\gamma _{C_i},~(z_i,\beta _i)\in \text {epi}\,\gamma _{G_i},\\ \alpha _i+\beta _i=t,~i=1,\ldots ,n \end{array}}t, \end{aligned}$$

where \(C_i\) and \(G_i\) are closed and convex subsets of \({\mathbb {R}}^d\) with \(0_{{\mathbb {R}}^d}\in \text {int }C_i,~0_{{\mathbb {R}}^d}\in \text {int }G_i\), and \(\Omega _i=\{p_i\}\) with \(p_i\in {\mathbb {R}}^d\), \(i=1,\ldots ,n\), as follows

$$\begin{aligned} (P_{\gamma _G,~{\mathcal {T}}})~~ \min _{\begin{array}{c} t>0,~x\in {\mathbb {R}}^d,~z_i\in {\mathbb {R}}^d,\\ \alpha _i\ge 0,~\beta _i\ge 0,~i=1,\ldots ,n \end{array}}&\left\{ t+\sum _{i=1}^n\left[ \delta _{\text {epi}\gamma _{C_i}}(x-p_i-z_i,\alpha _i)\right. \right. \\&\left. +\delta _{\text {epi}\gamma _{G_i}}(z_i,\beta _i)\right] +\delta _{H}(\alpha ,\beta ,t)\Bigg \}, \end{aligned}$$

where \(\alpha =(\alpha _1,\ldots ,\alpha _n)^\top \), \(\beta =(\beta _1,\ldots ,\beta _n)^\top \) and \(H=\{(\alpha ,\beta ,t)^\top :\alpha _i+\beta _i=t,~i=1,\ldots ,n\}\). Similarly, the dual problem

$$\begin{aligned} (D_{\gamma _G,~{\mathcal {T}}})~~\max \limits _{\begin{array}{c} z_i^{*}\ge 0,~w_i^{*}\in {\mathbb {R}}^d, \gamma _{C_i^0}(w_i^{*})\le z_i^{*},\\ \gamma _{G_i^0}(w_i^{*})\le z_i^{*},~i=1,\ldots ,n,~\sum \limits _{i=1}^nz_i^{*}\le 1,~\sum \limits _{i=1}^nw_i^{*}=0_{{\mathbb {R}}^d} \end{array}} \left\{ -\sum _{i=1}^n\sigma _{\Omega _i}\left( w_i^{*}\right) \right\} , \end{aligned}$$

can be equivalently rewritten as

$$\begin{aligned} (D_{\gamma _G,~{\mathcal {T}}})~-\min _{\begin{array}{c} z_i^*\ge 0,~w_i^*\in {\mathbb {R}}^d\\ i=1,\ldots ,n \end{array}}&\left\{ \sum _{i=1}^n\left[ p_i^\top w_i^* +\delta _{\text {epi}\gamma _{C_i^0}}(w_i^*,z_i^*)+\delta _{\text {epi}\gamma _{G_i^0}}(w_i^*,z_i^*)\right] \right. \\&\quad +\, \delta _D(z^*)+\delta _E(w^*)\Bigg \}, \end{aligned}$$

where \(z^*=(z_1^*,\ldots ,z_n^*)^\top \), \(w^*=(w_1^*,\ldots ,w_n^*)\), \(D=\{z^*\in {\mathbb {R}}_+^n:\sum _{i=1}^nz_i^*\le 1\}\) and \(E=\{w^*\in {\mathbb {R}}^d\times \ldots \times {\mathbb {R}}^d:\sum _{i=1}^nw_i^*=0_{{\mathbb {R}}^d}\}\). For both these optimization problems the nonnegativity constraints can be omitted, because they are implicitly contained in the indicator functions of the epigraphs of gauges appearing in the objective functions, and one can verify that the hypotheses of Theorem 4.1 are fulfilled. Moreover, for the full implementation of this algorithm one also requires formulae for the proximal mappings of the functions involved in the objective functions of the primal and the dual problem, which can be found, for instance, in [3, 28].
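
Two of the projections needed in the dual problem have particularly simple forms; the following MATLAB sketch (our own, under the assumption that the standard sorting-based simplex-projection routine may be used for the set D) illustrates them.

```matlab
function zp = project_onto_D(z)
% Sketch of the projection onto D = {z in R^n_+ : sum_i z_i <= 1}.
z  = z(:);
zp = max(z, 0);
if sum(zp) <= 1, return; end              % clipping already lands in D
u   = sort(z, 'descend');                 % otherwise project onto {z >= 0, sum z = 1}
css = cumsum(u);
rho = find(u - (css - 1) ./ (1:numel(z))' > 0, 1, 'last');
zp  = max(z - (css(rho) - 1) / rho, 0);
end

function Wp = project_onto_E(W)
% Projection onto E = {(w_1,...,w_n) : sum_i w_i = 0}: subtract the mean.
% W is a d x n matrix whose columns are the blocks w_i.
Wp = W - repmat(mean(W, 2), 1, size(W, 2));
end
```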

Example 4.1

Take \(\gamma _{C_i}=\Vert \cdot \Vert \) and \(\gamma _{G_i}=\Vert \cdot \Vert _{\infty }\), \(i=1,\ldots ,5\), and set \(d=2\), \(\Omega _i=\{p_i\}\), \(i=1,\ldots ,5\), with \(p_1=(-8,-9)^T\), \(p_2=(10,0)^T\), \(p_3=(11,5)^T\), \(p_4=(-12,10)^T\) and \(p_5=(4,13)^T\). To compute the required proximal and projection points in the primal and dual problem we used Theorem 3.9, [28, Lemma 2.1], [28, Lemma 2.2], [28, Corollary 2.1], [3, Proposition 23.32] and [3, Example 28.14 (iii)]. We ran our matlab programs for various step sizes \(\nu \), always chose the origin as the starting point and set the initialization parameters to the value 1. The best performance results of our tests are illustrated in Table 1. As stopping criterion for the iteration of both programs we used the values \(\varepsilon _1 =10^{-4}\) and \(\varepsilon _2 =10^{-8}\), which define the maximum allowed distance from the optimal solution. matlab computed for the location problem (primal) an optimal location at \({\overline{x}}=(-0.5,2.2878)^T\) with

$$\begin{aligned}&{\overline{z}}_1=(5.9851,5.9851)^T,~{\overline{z}}_2=(-3.3359, 3.0337)^T,~{\overline{z}}_3=(-3.7516,-2.7122)^T,\\&{\overline{z}}_4=(7.7122,-7.7122)^T,~{\overline{z}}_5=(-3.4694,-3.4402)^T,\\&({\overline{\alpha }}_1,\ldots ,{\overline{\alpha }}_5)^T= (5.5149,7.6765,7.7484,3.7878,7.6526)^T,\\&({\overline{\beta }}_1,\ldots ,{\overline{\beta }}_5)^T= (5.9851,3.8235,3.7516,7.7122,3.8474)^T, \end{aligned}$$

where the optimal objective value was \(v(P_{\gamma _G,~{\mathcal {T}}})=11.5\). Note that the optimal solution of the location problem is not unique and may differ for each chosen step size and starting point.

For the dual problem the following optimal solution was computed: \({\overline{w}}_1^*={\overline{w}}_2^*={\overline{w}}_5^*=(0,0)^T\), \({\overline{w}}_3^*=(-0.5,0)^T\) and \({\overline{w}}_4^*=(0.5,0)^T\), with the objective function value \(v(D_{\gamma _G,~{\mathcal {T}}})=11.5\), i.e. \(v(P_{\gamma _G,~{\mathcal {T}}})=v(D_{\gamma _G,~{\mathcal {T}}})\). Note that, similarly to Remark 3.16, one can understand the vectors \({\overline{w}}_i^*\), \(i=1,\ldots ,n\), as force vectors fulfilling the optimality conditions of Theorem 3.4 and increasing the maximum norm balls centered at the given points and the Euclidean norm balls centered at \({\overline{x}}\) until their intersection is non-empty. In particular, it follows from the optimality conditions (iv) and (vi) that an index i belongs to the optimal index set \({\overline{I}}\) if the value of the associated extended minimal time function \({\mathcal {T}}_{\Omega _i,\gamma _{G_i}}^{C_i}\) at \({\overline{x}}\) is equal to \({\overline{\lambda }}\), which is exactly the case when the corresponding vector \({\overline{w}}_i^*\) is different from the zero vector (in our example \({\overline{I}}=\{3,4\}\), see Fig. 1). At this point it is also important to mention that, for a better visualization, we multiplied the vectors characterizing the optimal solution of the dual problem by 3 in all figures.

Table 1 Performance evaluation for 5 points in \({\mathbb {R}}^2\)
Fig. 1

Visualization of the optimal solutions of the location problem \((P_{\gamma _G,~{\mathcal {T}}})\) and its dual problem \(({D}_{\gamma _G,~{\mathcal {T}}})\)

One can note in Table 1 that the primal method needs fewer iterations, while the dual method generates a solution within the maximum bound from the optimal solution faster. By using the formula from Remark 3.6, the optimal location can be determined immediately:

$$\begin{aligned} {\overline{x}}=\frac{1}{0.5+0.5}\left( 0.5\cdot (11,5)^T+0.5\cdot (-12,10)^T\right) =(-0.5,7.5)^T. \end{aligned}$$

We also considered primal and dual problems defined by 20 given points. The computational information can be seen in Table 2 and paints a similar picture to the previous situation. If we increase the accuracy to \(\varepsilon _2 = 10^{-8}\), the dual method is faster than the primal method, which could be particularly beneficial for location problems with a large number of given points.

The second scenario of our numerical approach relies on the location problems discussed in Sect. 3.3. In this situation the location problem can be rewritten as follows

$$\begin{aligned}&(P_{{\mathcal {T}}})~~\min _{x\in {\mathbb {R}}^d}\max _{1\le i\le n}\left\{ {\mathcal {T}}_{\Omega _i,\delta _{L_i}}^{C_i}(x)\right\} =\min _{\begin{array}{c} t\ge 0,~x\in {\mathbb {R}}^d,\\ \min \limits _{y_i\in \Omega _i,~z_i\in L_i}\left\{ \gamma _{C_i}(x-y_i-z_i)\right\} \le t,\\ i=1,\ldots ,n \end{array}}t=\min _{\begin{array}{c} t\ge 0,~x\in {\mathbb {R}}^d,~y_i\in \Omega _i,~z_i\in L_i,\\ (x-y_i-z_i,t)\in \text {epi}\, \gamma _{C_i},~i=1,\ldots ,n \end{array}}t\\&\quad =\min _{\begin{array}{c} t\ge 0,~x,y_i,z_i\in {\mathbb {R}}^d,\\ i=1,\ldots ,n \end{array}}\left\{ t+\sum _{i=1}^n\left[ \delta _{\text {epi}\gamma _{C_i}}(x-y_i-z_i,t)+\delta _{\Omega _i}(y_i)+\delta _{L_i}(z_i)\right] \right\} , \end{aligned}$$
Table 2 Performance evaluation for 20 points in \({\mathbb {R}}^2\)

where \(C_i,~L_i\subseteq {\mathbb {R}}^d\) are closed and convex sets with \(0_{{\mathbb {R}}^d}\in \text {int }C_i\) and \(\Omega _i\subseteq {\mathbb {R}}^d\) are convex and compact sets, \(i=1,\ldots ,n\), and likewise one gets for its dual problem

$$\begin{aligned}&({\widetilde{D}}_{{\mathcal {T}}})\max \limits _{\begin{array}{c} y_i^{*}\in {\mathbb {R}}^d,~i=1,\ldots ,n, \\ \sum \limits _{i=1}^n\gamma _{C_i^0}(y_i^*)\le 1,~\sum \limits _{i=1}^n y_i^{*}=0_{{\mathbb {R}}^d} \end{array}} \left\{ -\sum _{i=1}^n\left[ \sigma _{L_i}\left( y_i^{*}\right) +\sigma _{\Omega _i}\left( y_i^{*}\right) \right] \right\} \\&\quad = -\min \limits _{y_i^{*}\in {\mathbb {R}}^d,~i=1,\ldots ,n}\left\{ \sum _{i=1}^n\left[ \sigma _{L_i}\left( y^{*}_i\right) +\sigma _{\Omega _i}\left( y^{*}_i\right) \right] +\delta _{F}(y^*)+\delta _{E}(y^*)\right\} , \end{aligned}$$

where \(y^*=(y_1^*,\ldots ,y_n^*)\), \(E=\{y^*\in {\mathbb {R}}^d\times \ldots \times {\mathbb {R}}^d:\sum _{i=1}^ny_i^*=0_{{\mathbb {R}}^d}\}\) and \(F=\{y^*\in {\mathbb {R}}^d\times \ldots \times {\mathbb {R}}^d: \sum _{i=1}^n\gamma _{C_i^0}(y_i^*)\le 1\}\). The nonnegativity constraint of \((P_{{\mathcal {T}}})\) can be omitted, because it is implicitly contained in some of the indicator functions from the objective function, and one can then verify that the hypotheses of Theorem 4.1 are fulfilled.
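
When \(\gamma _{C_i}=\Vert \cdot \Vert \), as in the examples below, one also has \(\gamma _{C_i^0}=\Vert \cdot \Vert \), and the projection onto F can be reduced block-wise to the projection onto D sketched earlier; the following MATLAB fragment illustrates this reduction (our own derivation, to be compared with [28, Lemma 1.1]).

```matlab
function Yp = project_onto_F(Y)
% Sketch of the projection onto F = {y : sum_i ||y_i|| <= 1} for Euclidean
% gauges. Y is a d x n matrix whose columns are the blocks y_i^*; the block
% norms are projected onto D and each block is rescaled accordingly.
nrm = sqrt(sum(Y.^2, 1));
if sum(nrm) <= 1, Yp = Y; return; end
t  = project_onto_D(nrm(:));          % reuse the routine sketched above
Yp = Y;
for i = 1:size(Y, 2)
    if nrm(i) > 0
        Yp(:, i) = t(i) / nrm(i) * Y(:, i);
    end
end
end
```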

Example 4.2

Let \(d=2,~p_1=(-8,8)^T,~p_2=(-7,0)^T,~p_3=(-4,-1)^T,~p_4=(2, 0)^T,~p_5=(2, -6)^T,~p_6=(7, 1)^T,~p_7=(6, 5)^T,~a_1=1,~a_2=2,~a_3=3,~a_4=0.5,~a_5=2,~a_6=1,~a_7=1,~b_1=0.5,~b_2=2,~b_3=0.6,~b_4=1,~b_5=1.5,~b_6=1,~b_7=0.5,~\Omega _i=\{x\in {\mathbb {R}}^2:\Vert x-p_i\Vert _{\infty }\le a_i\}\), \(L_i=\{x\in {\mathbb {R}}^2:\Vert x\Vert \le b_i\}\) and \(\gamma _{C_i}=\Vert \cdot \Vert \), \(i=1,\ldots ,7\). Note that in this case \(\sigma _{L_i}(\cdot )=b_i\Vert \cdot \Vert \) and \(\sigma _{\Omega _i}(\cdot )=\langle p_i,\cdot \rangle +a_i\Vert \cdot \Vert _1\), \(i=1,\ldots ,7\). Using the formulae given in [28, Corollary 2.3], [28, Corollary 2.1] and [28, Lemma 1.1] to compute the proximal and projection points regarding the location problem and its dual, we tested various step sizes \(\nu \), where the starting point was always the origin and the initialization parameters were set to the value 1. The best performance results are presented in Table 3 and visualized in Fig. 2 (note that, for a better visualization, we multiplied the vectors characterizing the optimal solution of the dual problem by 3). The stopping criterion for the iteration of both programs was given by the values \(\varepsilon _1=10^{-4}\) and \(\varepsilon _2=10^{-8}\), the maximum bounds from the optimal solution. The optimal location we obtained is \({\overline{x}}=(-1.0765,3.7039)^T\) and the optimal objective value is \(v(P_{{\mathcal {T}}})=6.2788\). Let us remark that the optimal solution of the location problem is not unique and may differ for each chosen step size. The optimal solution of the dual problem was found at \({\overline{y}}^*_1=(0.4072,-0.2266)^T\), \({\overline{y}}^*_2={\overline{y}}^*_3={\overline{y}}^*_4=(0,0)^T\), \({\overline{y}}^*_5=(-0.0186,0.1330)^T\), \({\overline{y}}^*_6=(-0.3886,0.0936)^T\) and \({\overline{y}}^*_7=(0,0)^T\), while the objective function value was \(v(\widetilde{D}_{\mathcal {T}})=6.2788\), which means that \(v(P_{{\mathcal {T}}})=v(\widetilde{D}_{\mathcal {T}})\). In Table 3 one can note that the dual method needed less CPU time as well as fewer iterations to determine a solution which is within the maximum bound from the optimal solution, compared to the method which solves the location problem directly. The optimal location can be reconstructed by using the formulae given in Remark 3.15.

Table 3 Performance evaluation for 7 points in \({\mathbb {R}}^2\)
Fig. 2

Visualization of the optimal solutions of the location problem \((P_{{\mathcal {T}}})\) and its dual problem \(({\widetilde{D}}_{{\mathcal {T}}})\)

Setting \(L_i=\{0_{{\mathbb {R}}^d}\}\), \(i=1,\ldots , n\), we were able to compare our two methods with the one presented in [17, Theorem 4.1] (or [13, Theorem 4.69]) that employs the subgradient method for solving the following generalized Sylvester problem

$$\begin{aligned} (P_{{\mathcal {T}}})&\min _{x\in {\mathbb {R}}^d}\max _{1\le i\le n}\left\{ {\mathcal {T}}_{\Omega _i,\delta _{\{0_{{\mathbb {R}}^d}\}}}^{C_i}(x)\right\} =\min _{\begin{array}{c} t\ge 0,~x,y_i\in {\mathbb {R}}^d,\\ i=1,\ldots ,n \end{array}}\left\{ t+\sum _{i=1}^n\left[ \delta _{\text {epi}\gamma _{C_i}}(x-y_i,t)+\delta _{\Omega _i}(y_i)\right] \right\} . \end{aligned}$$

The corresponding dual problem then reads as follows

$$\begin{aligned}&({\widetilde{D}}_{{\mathcal {T}}}) \max \limits _{\begin{array}{c} y_i^{*}\in {\mathbb {R}}^d,~i=1,\ldots ,n, \\ \sum \limits _{i=1}^n\gamma _{C_i^0}(y_i^*)\le 1,~\sum \limits _{i=1}^n y_i^{*}=0_{{\mathbb {R}}^d} \end{array}} \left\{ -\sum _{i=1}^n\sigma _{\Omega _i}\left( y_i^{*}\right) \right\} \\&\quad =-\min \limits _{\begin{array}{c} y_i^{*}\in {\mathbb {R}}^d,\\ i=1,\ldots ,n \end{array}}\left\{ \sum _{i=1}^n\sigma _{\Omega _i}\left( y^{*}_i\right) +\delta _{F}(y^*)+\delta _{E}(y^*)\right\} . \end{aligned}$$

Theorem 4.2

(cf. [13, Theorem 4.69]) Let \({\mathcal {H}}={\mathbb {R}}^m\), fix \(x_1\in {\mathbb {R}}^m\) and define the sequences of iterates by

$$\begin{aligned} x_{k+1}:=x_k-\alpha _k v_k,~k\in {\mathbb {N}}, \end{aligned}$$

where \(\{\alpha _k\}\) are positive numbers, and where

$$\begin{aligned} v_k\in {\left\{ \begin{array}{ll} \{0_{{\mathbb {R}}^m}\}, &{} \text {if } x_k \in \Omega _i+L_i, \\ {[}-\partial \Vert \cdot \Vert (w_k-x_k)]\cap N_{\Omega _i+L_i}(w_k), &{} \text {if } x_k \notin \Omega _i+L_i, \end{array}\right. } \end{aligned}$$

where \(w_k= {{\,\mathrm{P}\,}}_{\Omega _{i} + L_i}(x_k)\)  for some \(i\in I=\left\{ j\in \{1,\ldots ,n\}:{\mathcal {T}}_{\Omega _j,\delta _{L_j}}^{C_j}(x_k)=\max \limits _{1\le l\le n}\left\{ {\mathcal {T}}_{\Omega _l,\delta _{L_l}}^{C_l}(x_k)\right\} \right\} \). Define the value sequence

$$\begin{aligned} V_k:=\min _{1\le j\le k}\left\{ \max \limits _{1\le l\le n}\left\{ {\mathcal {T}}_{\Omega _l,\delta _{L_l}}^{C_l}(x_j)\right\} \right\} . \end{aligned}$$

If the sequence \(\{\alpha _k\}\) satisfies \(\sum _{k=1}^\infty \alpha _k=\infty \) and \(\sum _{k=1}^\infty \alpha _k^2<\infty \), then \(\{V_k\}\) converges to the optimal value \({\overline{V}}\) and \(\{x_k\}\) converges to an optimal solution \({\overline{x}}\) to the problem \((P_{{\mathcal {T}}})\).
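
To make the comparison below easier to reproduce, we also give a minimal MATLAB sketch of this subgradient iteration for the concrete setting of Example 4.3 below (boxes \(\Omega _i\), \(L_i=\{0_{{\mathbb {R}}^d}\}\), Euclidean gauges and step size \(\alpha _k=1/k\)); the function name and interface are ours and serve only as an illustration.

```matlab
function x = sylvester_subgradient(P, a, x0, maxit)
% Sketch of the subgradient iteration of Theorem 4.2 for Omega_i =
% {x : ||x - p_i||_inf <= a_i}, L_i = {0} and gamma_{C_i} = ||.||.
% P: d x n matrix whose columns are the centers p_i, a: radii, x0: start.
[~, n] = size(P);
x = x0;
for k = 1:maxit
    dist = zeros(1, n);
    W = zeros(size(P));
    for i = 1:n
        W(:, i) = min(max(x, P(:, i) - a(i)), P(:, i) + a(i));  % P_{Omega_i}(x)
        dist(i) = norm(x - W(:, i));                            % distance to Omega_i
    end
    [dmax, imax] = max(dist);          % (one of) the farthest set(s)
    if dmax == 0, break; end           % x already lies in the farthest set
    v = (x - W(:, imax)) / dmax;       % subgradient direction as in Theorem 4.2
    x = x - (1/k) * v;                 % step size alpha_k = 1/k
end
end
```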

Example 4.3

(cf. [17, Example 4.3]) Let \(d=2,~p_1=(-8,8)^T,~p_2=(-7,0)^T,~p_3=(-4,-1)^T,~p_4=(2, 0)^T,~p_5=(2, -6)^T,~p_6=(7, 1)^T,~p_7=(6, 5)^T,~a_1=1,~a_2=2,~a_3=3,~a_4=0.5,~a_5=2,~a_6=1,~a_7=1,~\Omega _i=\{x\in {\mathbb {R}}^2:\Vert x-p_i\Vert _{\infty }\le a_i\},~L_i=\{0_{{\mathbb {R}}^2}\},~\gamma _{C_i}=\Vert \cdot \Vert ,~i=1,\ldots ,7\). We tested the subgradient method for the sequence \(\alpha _{k_1}=1/k\), which was also used in [17], as well as for the sequence \(\alpha _{k_2}=1/\sqrt{k+1}\) (see [20, Section 3.2.3]). Note that the subgradient method is simple to implement and can be employed for solving minmax location problems generated by various norms and generalized distances. The algorithms considered in this paper may seem at first glance to be more complicated, also due to the necessity of determining some epigraphical projections; however, as seen below, they work faster and cheaper and, taking for instance the results from Sect. 3, they can be employed for solving minmax location problems generated by various norms and generalized distances as well.

Fig. 3

Visualization of the optimal solutions of the location problem \((P_{{\mathcal {T}}})\) and its dual problem \(({\widetilde{D}}_{{\mathcal {T}}})\)

For our numerical experiments we again used for all programs the origin as starting point and, as stopping criterion for the iteration, the values \(\varepsilon _1=10^{-4}\) and \(\varepsilon _2=10^{-8}\). In particular, we again set the initialization parameters of the parallel splitting algorithms to the value 1 and tested these methods for various step sizes \(\nu \). The best performance results were reached for the step sizes \(\nu =24\) and \(\nu =0.076\) for the location problem (primal) and its associated dual problem, respectively. The optimal location was found to be \({\overline{x}}=(-1.0556,3.0556)^T\) and for the optimal objective value we got \(v(P_{{\mathcal {T}}})=7.1340\). The optimal solution of the dual problem was determined as \({\overline{y}}_1^*=(0.3755,-0.2491)^T,~{\overline{y}}_2^*={\overline{y}}_3^*={\overline{y}}_4^*={\overline{y}}_7^*=(0,0)^T,~{\overline{y}}_5^*=(-0.0295,0.1974)^T,~ {\overline{y}}_6^*=(-0.3459,0.0518)^T\), with the objective value \(v({\widetilde{D}}_{{\mathcal {T}}})=7.1340\), i.e. \(v(P_{{\mathcal {T}}})=v({\widetilde{D}}_{{\mathcal {T}}})\) (see Fig. 3). As Tables 4 and 5 demonstrate, the dual method performs once again very well, especially for the accuracy \(\varepsilon _2 = 10^{-8}\). While the subgradient method is the fastest one for the accuracy \(\varepsilon _1 = 10^{-4}\) and the sequence \(\alpha _{k_1}=1/k\), it did not reach the precision \(\varepsilon _2 = 10^{-8}\) after 500,000 iterations.

In Tables 6 and 7 we present the computational results obtained while solving a location problem defined by 50 points, where the dual method performs once again very well and is faster than the primal method. For the sequence \(\alpha _{k_1}=1/k\) the subgradient method did not reach machine precision after 500,000 iterations, whereas the sequence \(\alpha _{k_2}=1/\sqrt{k+1}\) performs surprisingly well for \(\varepsilon _2=10^{-8}\). Hence, for the accuracy \(\varepsilon _2 = 10^{-8}\) the sequence \(\alpha _{k_2}= 1/\sqrt{k+1}\) is the optimal strategy for the subgradient method, as was also shown in [20, Section 3.2.3] under the additional assumption that the objective function is Lipschitz continuous.

Table 4 Performance evaluation for 7 points in \({\mathbb {R}}^2\) with \(\varepsilon _1=10^{-4}\)
Table 5 Performance evaluation for 7 points in \({\mathbb {R}}^2\) with \(\varepsilon _2=10^{-8}\)
Table 6 Performance evaluation for 50 points in \({\mathbb {R}}^2\) with \(\varepsilon _1=10^{-4}\)
Table 7 Performance evaluation for 50 points in \({\mathbb {R}}^2\) with \(\varepsilon _2=10^{-8}\)

Next, we present an example in the three-dimensional space, where we compared the two parallel splitting algorithms for the location problem and its dual, respectively.

Example 4.4

Let \(d=3,~p_1=(-8,8,8)^T,~p_2=(-7,0,0)^T,~p_3=(-4,-1,1)^T,~p_4=(2, 0,2)^T,~p_5=(2, -6,2)^T,~p_6=(7, 1,1)^T,~p_7=(6, 5,4)^T,~a_1=\ldots =a_7=0.5,~\Omega _i=\{x\in {\mathbb {R}}^3:\Vert x-p_i\Vert _{\infty }\le a_i\},~L_i=\{0_{{\mathbb {R}}^3}\},~\gamma _{C_i}=\Vert \cdot \Vert ,~i=1,\ldots ,7\). For the numerical tests we used the same values for the initialization parameters, starting point and stopping criterion as in the previous example. The performance results were determined for the step sizes \(\nu =10\) and \(\nu =0.055\) for the location problem and its associated dual problem, respectively, and are presented in Table 8. The optimal location was identified at \({\overline{x}}=(-1.4350,2.2492,4.5693)^T\) and for the optimal objective value we got \(v(P_{{\mathcal {T}}})=8.5408\). The optimal solution of the dual problem was determined as \({\overline{y}}_1^*=(0.3289,-0.2848,-0.1589)^T\), \({\overline{y}}_2^*={\overline{y}}_3^*={\overline{y}}_4^*={\overline{y}}_7^*=(0,0,0)^T\), \({\overline{y}}_5^*=(-0.0997,0.2632,0.0703)^T\) and \({\overline{y}}_6^*=( -0.2292,0.0216,0.0887)^T\), with the objective value \(v({\widetilde{D}}_{{\mathcal {T}}})=8.5408\), i.e. \(v(P_{{\mathcal {T}}})=v({\widetilde{D}}_{{\mathcal {T}}})\) (see Fig. 4).

Table 8 Performance evaluation for 7 points in \({\mathbb {R}}^3\)
Fig. 4

Visualization of the optimal solutions of the location problem \((P_{{\mathcal {T}}})\) and its dual problem \(({\widetilde{D}}_{{\mathcal {T}}})\)

As one can notice in Table 8, the dual algorithm needs roughly half of the CPU time and of the number of iterations required by the primal one to solve the problem.

We close this section with a comparison between the fastest of the solving strategies considered above, i.e. the dual method, and the one introduced in [1], regarding speed and especially precision in high dimensions.

Example 4.5

We consider the location problem \((P_{{\mathcal {T}}})\) as well as its associated dual one \(({\widetilde{D}}_{{\mathcal {T}}})\) in the setting where \(\Omega _i=\{x\in {\mathbb {R}}^d:\Vert x-p_i\Vert _{\infty }\le a_i\}\), \(L_i=\{0_{{\mathbb {R}}^d}\}\) and \(\gamma _{C_i}=\Vert \cdot \Vert \), \(i=1,\ldots ,n\), and compare the dual method with the numerical algorithm built on the log–exponential smoothing technique and Nesterov's accelerated gradient method (log–exp), which was developed in [1] for solving generalized Sylvester problems of the kind of \((P_{{\mathcal {T}}})\).

We implemented this algorithm in matlab and used for our numerical tests the same settings as in [1, Remark 5.1] and [1, Example 6.4] (i.e. \(\varepsilon =10^{-6}\), \({\widetilde{\varepsilon }}=10^{-5}\), \(p_0=5\) and \(\gamma _0=0.5\)). We considered four situations, and the test results are given in Tables 9, 10, 11 and 12. Notice that in all situations the starting point was the origin, the points \(p_1,\ldots ,p_n\) were generated by the command randn, and the corresponding radii \(a_1,\ldots ,a_n\) were given by rand. As mentioned, we are interested in an analysis regarding the precision of these two methods, in the sense that the calculated objective values should be exact up to six decimal places, which is especially important when these calculations are part of a larger problem where the aim is to reduce rounding errors.

So, to guarantee this desired precision, we tested the log–exponential smoothing algorithm in all four situations for various numbers of iterations N. For the scenario in Table 9 we obtained a solution whose objective value was exact up to six decimal places for \(N=35{,}000\) and saved the calculated solution as the optimal solution \({\overline{x}}\) to \((P_{{\mathcal {T}}})\), the corresponding objective value being \(v(P_{{\mathcal {T}}})=3.099896\). Then we ran the log–exponential algorithm a second time and recorded the number of iterations, the time needed to generate a solution which is within the maximum bound of \(10^{-6}\) from the optimal solution \({\overline{x}}\), and the associated objective value. Note that, if we reduce the number of iterations in the log–exponential smoothing algorithm, the speed of generating a solution whose objective value is close to the optimal objective value increases, but the accuracy decreases, i.e. the algorithm then fails to calculate a solution whose objective value is exact up to six decimal places. For the scenarios in Tables 10, 11 and 12 we proceeded in the same manner, and the corresponding values for \(N,v(P_{{\mathcal {T}}})\) and CPU time are also presented.

For the dual method we set N to 100,000 in all situations and saved the determined solutions for the second run, where the number of iterations and the CPU time needed to get a solution within the maximum bound of \(10^{-6}\) were then recorded. The corresponding objective values of the dual problem were also recorded.

As can be seen in Tables 9, 10, 11 and 12, the dual method again performs very well regarding both speed and precision in all four situations, which makes it a good candidate not only for problems where precision is of great importance, but also for problems in high dimensions that appear, for instance, in machine learning. Note also that, if one has an optimal solution to the dual problem, then the optimal solution to the primal one can be reconstructed by using the formulae given in Remark 3.15.

Table 9 Performance evaluation for 10 points in \({\mathbb {R}}^{10}\)
Table 10 Performance evaluation for 50 points in \({\mathbb {R}}^{50}\)
Table 11 Performance evaluation for 100 points in \({\mathbb {R}}^{100}\)
Table 12 Performance evaluation for 100 points in \({\mathbb {R}}^{1000}\)

Remark 4.1

The examples investigated in this section reveal that the origin seems to be a good starting point for running the proposed splitting proximal point method on the dual problem of a given nonlinear minmax location problem. This is actually not very surprising when one analyzes the constraints of the dual problems that do not allow (all) the components of the feasible dual solutions to wander far away from the origin.

5 Conclusions

We investigate nonlinear minmax location problems (that generalize the classical Sylvester problem) formulated by means of an extended perturbed minimal time function introduced in this paper as well. The motivation to study such problems is not only intrinsic but comes from various areas of research and real life, such as geometry, physics, economics or health management, applications from these fields being mentioned in our paper as possible interpretations of our results. A conjugate duality approach based on rewriting the original problems as multi-composed optimization ones is considered, necessary and sufficient optimality conditions being delivered together with characterizations of the optimal solutions in some particular instances. A parallel splitting proximal point algorithm from [3] is then applied on some concrete location problems and on their duals in matlab, delivering optimal solutions to the considered optimization problem faster and with reduced costs than the existing methods in the literature. The tests show that employing the method on the dual is the fastest (and usually also the cheapest) way to solve a given nonlinear minmax location problem. Worth noticing is that this conclusion can be reached regardless of the magnitude of the considered data sets and of the dimension of the involved vectors, suggesting possible employment of the considered method for solving big data problems arising, for instance, in machine learning, by means of support vector techniques. Another idea for future developments of this contribution consists in employing other recent proximal splitting methods for solving nonlinear minmax location problems.