An alternative proof of a PTAS for fixed-degree polynomial optimization over the simplex

Mathematical Programming, Series B. Full Length Paper.
Abstract

The problem of minimizing a polynomial over the standard simplex is one of the basic NP-hard nonlinear optimization problems—it contains the maximum clique problem in graphs as a special case. It is known that the problem allows a polynomial-time approximation scheme (PTAS) for polynomials of fixed degree, which is based on polynomial evaluations at the points of a sequence of regular grids. In this paper, we provide an alternative proof of the PTAS property. The proof relies on the properties of Bernstein approximation on the simplex. We also refine a known error bound for the scheme for polynomials of degree three. The main contribution of the paper is to provide new insight into the PTAS by establishing precise links with Bernstein approximation and the multinomial distribution.

References

  1. Altomare, F., Campiti, M.: Korovkin-Type Approximation Theory and Its Applications. De Gruyter Studies in Mathematics, vol. 17. Walter de Gruyter, Berlin (1994)

  2. Bellare, M., Rogaway, P.: The complexity of approximating a nonlinear program. Math. Program. 69, 429–441 (1995)

  3. Bomze, I.M., de Klerk, E.: Solving standard quadratic optimization problems via semidefinite and copositive programming. J. Global Optim. 24(2), 163–185 (2002)

  4. Bomze, I.M., Gollowitzer, S., Yildirim, E.A.: Rounding on the standard simplex: regular grids for global optimization. J. Global Optim. 59(2–3), 243–258 (2014)

  5. Bos, L.P.: Bounding the Lebesgue function for Lagrange interpolation in a simplex. J. Approx. Theory 38, 43–59 (1983)

  6. Ditzian, Z.: Inverse theorems for multidimensional Bernstein operators. Pac. J. Math. 121(2), 293–319 (1986)

  7. Ditzian, Z.: Best polynomial approximation and Bernstein polynomial approximation on a simplex. Indag. Math. 92, 243–256 (1989)

  8. Ditzian, Z., Zhou, X.: Optimal approximation class for multivariate Bernstein operators. Pac. J. Math. 158, 93–120 (1993)

  9. Johnson, N.L., Kotz, S., Balakrishnan, N.: Discrete Multivariate Distributions. Wiley, New York (1997)

  10. de Klerk, E., den Hertog, D., Elabwabi, G.: On the complexity of optimization over the standard simplex. Eur. J. Oper. Res. 191(3), 773–785 (2008)

  11. de Klerk, E., Laurent, M., Parrilo, P.: A PTAS for the minimization of polynomials of fixed degree over the simplex. Theor. Comput. Sci. 361(2–3), 210–225 (2006)

  12. de Klerk, E., Laurent, M.: Error bounds for some semidefinite programming approaches to polynomial minimization over the hypercube. SIAM J. Optim. 20(6), 3104–3120 (2010)

  13. Knoblauch, A.: Closed-form expressions for the moments of the binomial probability distribution. SIAM J. Appl. Math. 69(1), 197–204 (2008)

  14. Lasserre, J.B.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11, 796–817 (2001)

  15. Motzkin, T.S., Straus, E.G.: Maxima for graphs and a new proof of a theorem of Turán. Can. J. Math. 17, 533–540 (1965)

  16. Nesterov, Y.: Random walk in a simplex and quadratic optimization over convex polytopes. CORE Discussion Paper 2003/71, CORE-UCL, Louvain-la-Neuve (2003)

  17. Nesterov, Y., Wolkowicz, H., Ye, Y.: Semidefinite programming relaxations of nonconvex quadratic optimization. In: Wolkowicz, H., Saigal, R., Vandenberghe, L. (eds.) Handbook of Semidefinite Programming, pp. 361–419. Kluwer Academic Publishers, Norwell (2000)

  18. Sagol, G., Yildirim, E.A.: Analysis of copositive optimization based bounds on standard quadratic optimization. Technical Report, Department of Industrial Engineering, Koc University, Sariyer, Istanbul, Turkey (2013)

  19. Vavasis, S.: Approximation algorithms for concave quadratic programming. In: Floudas, C.A., Pardalos, P. (eds.) Recent Advances in Global Optimization, pp. 3–18. Princeton University Press, Princeton (1998)

  20. Yildirim, E.A.: On the accuracy of uniform polyhedral approximations of the copositive cone. Optim. Methods Softw. 27(1), 155–173 (2012)

Acknowledgments

The first author thanks Immanuel Bomze for providing the reference [9], and Dima Pasechnik for valuable comments on the approach by Nesterov [16]. We also thank two anonymous reviewers for their comments that helped us improve the presentation of the paper.

Correspondence to Zhao Sun.

Appendix: Proof of Theorem 7

In this Appendix, we give a self-contained proof of Theorem 7, which provides an explicit description of the moments of the multinomial distribution in terms of the Stirling numbers of the second kind (and, as a direct application, the explicit formulation of the Bernstein approximation on the simplex from Corollary 4).

Given \(x\in \varDelta _n\), we consider the multinomial distribution with \(n\) categories and \(r\) independent trials, where the probability of drawing the \(i\)-th category is given by \(x_i\). Hence, given \(\alpha \in I(n,r)\), the probability of drawing \(\alpha _i\) times the \(i\)-th category for each \(i\in [n]\) is equal to \({r!\over \alpha !}x^\alpha \) and, for any \(\beta \in {\mathbb N}^n\), the \(\beta \)-th moment of the multinomial distribution is given by

$$\begin{aligned} m^\beta _{(n,r)}:=\sum _{\alpha \in I(n,r)} \alpha ^\beta {r!\over \alpha !} x^\alpha . \end{aligned}$$
(16)

Our objective is to show the following reformulation of the \(\beta \)-th moment in terms of the Stirling numbers of the second kind:

$$\begin{aligned} m^\beta _{(n,r)} = \sum _{\alpha \in {\mathbb N}^n: \alpha \le \beta }r^{\underline{|\alpha |}}x^{\alpha }\prod _{i=1}^nS(\beta _i,\alpha _i). \end{aligned}$$
(17)
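
As a sanity check, identity (17) can be verified numerically. The following Python sketch (not part of the original paper; all helper names are ours) compares both sides of (17) in exact rational arithmetic for a small instance, computing the Stirling numbers via the recurrence (20) proved below.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def stirling2(b, a):
    """Stirling number of the second kind S(b, a), via the recurrence (20)."""
    if b == a == 0:
        return 1
    if b == 0 or a == 0:
        return 0
    return stirling2(b - 1, a - 1) + a * stirling2(b - 1, a)

def falling(r, k):
    """Falling factorial r(r-1)...(r-k+1)."""
    out = 1
    for j in range(k):
        out *= r - j
    return out

def moment_direct(n, r, x, beta):
    """Left-hand side (16): sum over alpha in I(n, r) of alpha^beta (r!/alpha!) x^alpha."""
    total = Fraction(0)
    for alpha in product(range(r + 1), repeat=n):
        if sum(alpha) != r:
            continue
        coef = factorial(r)
        for ai in alpha:
            coef //= factorial(ai)
        term = Fraction(coef)
        for ai, bi, xi in zip(alpha, beta, x):
            term *= ai ** bi * xi ** ai
        total += term
    return total

def moment_stirling(n, r, x, beta):
    """Right-hand side (17): sum over alpha <= beta (componentwise)."""
    total = Fraction(0)
    for alpha in product(*(range(bi + 1) for bi in beta)):
        term = Fraction(falling(r, sum(alpha)))
        for ai, bi, xi in zip(alpha, beta, x):
            term *= xi ** ai * stirling2(bi, ai)
        total += term
    return total

# A point in the simplex Delta_3 and a few test orders beta.
x = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
for beta in [(2, 1, 0), (3, 0, 2), (1, 1, 1)]:
    assert moment_direct(3, 5, x, beta) == moment_stirling(3, 5, x, beta)
print("identity (17) verified for the sampled cases")
```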

Our proof is elementary in the sense that we will obtain the moments of the multinomial distribution using its moment generating function. One of the ingredients which we will use is the fact that the identity (17) holds for the case \(n=2\) of the binomial distribution when \(\beta \in {\mathbb N}^2\) is of the form \(\beta =(\beta _1,0)\). Namely, the following identity is shown in [13] (see Theorem 2.2 and relation (3.1) there).

Lemma 5

[13] Given \(\beta _1\in {\mathbb N}\) and \(x_1\in {\mathbb R}\) such that \(0\le x_1\le 1\), one has

$$\begin{aligned} m^{(\beta _1,0)}_{(2,r)}= \sum _{\alpha _1=0}^{r} \alpha _1^{\beta _1} {r\atopwithdelims ()\alpha _1} x_1^{\alpha _1}(1-x_1)^{r-\alpha _1} = \sum _{\alpha _1=0}^{\beta _1} r^{\underline{\alpha _1}} x_1^{\alpha _1} S(\beta _1,\alpha _1). \end{aligned}$$

This implies that the identity (17) holds for the moments of the multinomial distribution when the order \(\beta \) has a single non-zero coordinate, i.e., \(\beta \) is of the form \(\beta =\beta _ie_i\) with \(\beta _i\in {\mathbb N}\).

Corollary 6

Given \(\beta _i\in {\mathbb N}\) and \(x\in \varDelta _n\), one has

$$\begin{aligned} m^{\beta _ie_i}_{(n,r)} = \sum _{\alpha _i=0}^{\beta _i} r^{\underline{\alpha _i}} x_i^{\alpha _i} S(\beta _i,\alpha _i). \end{aligned}$$

Proof

By (16), we have

$$\begin{aligned} m^{(\beta _ie_i)}_{(n,r)}&= \sum _{\alpha \in I(n,r)} \alpha _i^{\beta _i} {r!\over \alpha !} x^\alpha = \sum _{\alpha _i=0}^r {\alpha _i}^{\beta _i}{r!\over \alpha _i!(r-\alpha _i)!}x_i^{\alpha _i}\left( \sum _{\overline{\alpha }\in I(n-1,r-\alpha _i)}{(r-\alpha _i)!\over {\overline{\alpha }}!}x^{\overline{\alpha }}\right) \\&= \sum _{\alpha _i=0}^r{\alpha _i}^{\beta _i}{r\atopwithdelims ()\alpha _i} x_i^{\alpha _i}\left( \sum _{j\ne i} x_j\right) ^{r-\alpha _i} = \sum _{\alpha _i=0}^r{\alpha _i}^{\beta _i}{r \atopwithdelims ()\alpha _i}x_i^{\alpha _i}(1-x_i)^{r-\alpha _i},\\ \end{aligned}$$

which is equal to \( \sum _{\alpha _i=0}^{\beta _i} r^{\underline{\alpha _i}} x_i^{\alpha _i} S(\beta _i,\alpha _i)\) by Lemma 5.\(\square \)
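
Since the corollary rests on the binomial identity of Lemma 5, that identity admits an equally direct check; a minimal Python sketch under the same conventions as above (helper names ours):

```python
from fractions import Fraction
from math import comb

def stirling2(b, a):
    # recurrence (20), with S(0, 0) = 1 and S(b, 0) = 0 for b >= 1
    if b == a == 0:
        return 1
    if b == 0 or a == 0:
        return 0
    return stirling2(b - 1, a - 1) + a * stirling2(b - 1, a)

def falling(r, k):
    # falling factorial r(r-1)...(r-k+1)
    out = 1
    for j in range(k):
        out *= r - j
    return out

r, beta_i, x_i = 7, 4, Fraction(2, 5)
lhs = sum(a ** beta_i * comb(r, a) * x_i ** a * (1 - x_i) ** (r - a)
          for a in range(r + 1))
rhs = sum(falling(r, a) * x_i ** a * stirling2(beta_i, a)
          for a in range(beta_i + 1))
assert lhs == rhs   # both equal the binomial moment of order beta_i
```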

In order to determine the moments of the multinomial distribution, we use its moment generating function

$$\begin{aligned} t\in {\mathbb R}^n \mapsto M_{x}^{r}(t):=\left( \sum _{i=1}^nx_ie^{t_i}\right) ^r. \end{aligned}$$

Then, for \(\beta \in {\mathbb N}^n\), the \(\beta \)-th moment of the multinomial distribution is equal to the \(\beta \)-th derivative of the moment generating function evaluated at \(t=0\). Namely,

$$\begin{aligned} m^{\beta }_{(n,r)}=\left. {\frac{\partial ^{|\beta |}M_{x}^{r}(t)}{\partial t_1^{\beta _1}\ldots \partial t_{n}^{\beta _n} }}\right| _{t=0}. \end{aligned}$$
(18)
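
Relation (18) can be checked symbolically for small parameters. The sketch below (ours; it uses the sympy library) compares the mixed derivative at \(t=0\) with the direct sum (16). Both sides agree coefficient-wise in \(x\), so no simplex constraint needs to be imposed on the symbols.

```python
import sympy as sp
from itertools import product
from math import factorial

n, r, beta = 3, 4, (2, 1, 0)
t = sp.symbols('t1:4')   # t1, t2, t3
x = sp.symbols('x1:4')   # x1, x2, x3

# moment generating function M_x^r(t)
M = sum(xi * sp.exp(ti) for xi, ti in zip(x, t)) ** r

# beta-th mixed partial derivative, evaluated at t = 0
deriv = M
for ti, bi in zip(t, beta):
    if bi:
        deriv = sp.diff(deriv, ti, bi)
deriv_at_0 = deriv.subs({ti: 0 for ti in t})

# direct moment sum (16)
direct = 0
for alpha in product(range(r + 1), repeat=n):
    if sum(alpha) != r:
        continue
    coef = factorial(r)
    for ai in alpha:
        coef //= factorial(ai)
    term = coef
    for ai, bi, xi in zip(alpha, beta, x):
        term *= ai ** bi * xi ** ai
    direct += term

assert sp.expand(deriv_at_0 - direct) == 0
```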

By Corollary 6 we know that, for any \(\beta _i\in {\mathbb N}\),

$$\begin{aligned} m^{(\beta _ie_i)}_{(n,r)}=\left. {\frac{\partial ^{\beta _i}M_{x}^{r}(t)}{\partial t_i^{\beta _i}}}\right| _{t=0}=\sum _{\alpha _i=0}^{\beta _i}S(\beta _i,\alpha _i)r^{\underline{\alpha _i}}x_i^{\alpha _i}. \end{aligned}$$
(19)

Next we show an analogue of the above relation (19) for the evaluation of the \(\beta _ie_i\)-th derivative of the moment generating function at any point \(t\in {\mathbb R}^n\).

Lemma 6

For \(x\in \varDelta _n, \beta _i\in {\mathbb N}\) and \(t\in {\mathbb R}^n\), one has

$$\begin{aligned} {\frac{\partial ^{\beta _i}M_{x}^{r}(t)}{\partial t_i^{\beta _i}}} =\sum _{\alpha _i=0}^{\beta _i}S(\beta _i,\alpha _i)r^{\underline{\alpha _i}}x_i^{\alpha _i}e^{\alpha _it_i}M_{x}^{r-\alpha _i}(t). \end{aligned}$$

For the proof we will use the following recursive relation for the Stirling numbers of the second kind.

Lemma 7

For any integers \(\beta \ge 0\) and \(\alpha \ge 1\), one has

$$\begin{aligned} S(\beta +1,\alpha )= S(\beta ,\alpha -1) + \alpha S(\beta ,\alpha ). \end{aligned}$$
(20)

Proof

This well-known fact can be easily checked as follows. By definition, \(S(\beta +1,\alpha )\) counts the number of ways of partitioning the set \(\{1,\ldots ,\beta ,\beta +1\}\) into \(\alpha \) nonempty subsets. Considering the last element \(\beta +1\), one can either put it in a singleton subset (so that there are \(S(\beta ,\alpha -1)\) such partitions), or partition \(\{1,\ldots ,\beta \}\) into \(\alpha \) nonempty subsets and then assign the last element \(\beta +1\) to one of them (so that there are \(\alpha S(\beta ,\alpha )\) such partitions). This shows the result.\(\square \)
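
The recurrence (20), together with the boundary values \(S(0,0)=1\) and \(S(\beta ,0)=0\) for \(\beta \ge 1\), translates directly into a memoized implementation; a minimal sketch (ours), with spot checks against classical values:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(b, a):
    """S(b, a) via the recurrence (20), with S(0, 0) = 1 and S(b, 0) = 0 for b >= 1."""
    if b == a == 0:
        return 1
    if b == 0 or a == 0:
        return 0
    return stirling2(b - 1, a - 1) + a * stirling2(b - 1, a)

assert stirling2(4, 2) == 7    # partitions of {1,2,3,4} into two nonempty blocks
assert stirling2(5, 3) == 25
assert sum(stirling2(5, a) for a in range(6)) == 52   # row sum is the Bell number B_5
```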

Proof

(of Lemma 6) To simplify notation we set \(M^r=M_{x}^{r}(t)\). We show the result using induction on \(\beta _i\ge 0\). The result clearly holds for \(\beta _i=0\), and also for \(\beta _i=1\), in which case we have

$$\begin{aligned} \frac{\partial M^r}{\partial t_i}=rx_ie^{t_i}M^{r-1}. \end{aligned}$$
(21)

We now assume that the result holds for \(\beta _i\) and we show that it also holds for \(\beta _i+1\). For this, using the induction assumption, we obtain

$$\begin{aligned} {\frac{\partial ^{\beta _i+1}M^{r}}{\partial t_i^{{\beta _i}+1}}}&= { \partial \over \partial t_i}{ \partial ^{{\beta _i}} M^{r}\over \partial t_i^{{\beta _i}}} = { \partial \over \partial t_i}\left( \sum _{\alpha _i=0}^{\beta _i}S(\beta _i,\alpha _i)r^{\underline{\alpha _i}}x_i^{\alpha _i}e^{\alpha _it_i}M^{r-\alpha _i}\right) \nonumber \\&= \sum _{\alpha _i=0}^{\beta _i}S(\beta _i,\alpha _i)r^{\underline{\alpha _i}}x_i^{\alpha _i} { \partial \over \partial t_i} ( e^{\alpha _it_i}M^{r-\alpha _i}). \end{aligned}$$
(22)

Now, using (21), we can compute the last term as follows:

$$\begin{aligned} { \partial \over \partial t_i} ( e^{\alpha _it_i}M^{r-\alpha _i}) = \alpha _i e^{\alpha _it_i} M^{r-\alpha _i} + (r-\alpha _i)x_i e^{(\alpha _i+1)t_i} M^{r-\alpha _i-1}. \end{aligned}$$

Plugging this into relation (22), we deduce

$$\begin{aligned} {\frac{\partial ^{\beta _i+1}M^{r}}{\partial t_i^{{\beta _i}+1}}}&= \sum _{\alpha _i=0}^{\beta _i} \alpha _iS(\beta _i,\alpha _i) r^{\underline{\alpha _i}} x_i^{\alpha _i} e^{\alpha _it_i} M^{r-\alpha _i}\\&\quad +\sum _{\alpha _i=0}^{\beta _i} S(\beta _i,\alpha _i) r^{\underline{\alpha _i+1}} x_i^{\alpha _i+1} e^{(\alpha _i+1)t_i} M^{r-\alpha _i-1}\\&= \sum _{\alpha _i=0}^{\beta _i} \alpha _iS(\beta _i,\alpha _i) r^{\underline{\alpha _i}} x_i^{\alpha _i} e^{\alpha _it_i} M^{r-\alpha _i} +\sum _{\alpha _i'=1}^{\beta _i+1} S(\beta _i,\alpha '_i-1) r^{\underline{\alpha '_i}} x_i^{\alpha '_i} e^{\alpha '_it_i} M^{r-\alpha '_i}\\&= \sum _{\alpha _i=0}^{\beta _i} (\underbrace{\alpha _i S(\beta _i,\alpha _i)+ S(\beta _i,\alpha _i-1)}_{= S(\beta _i+1,\alpha _i) \ \text { by } (20)}) r^{\underline{\alpha _i}} x_i^{\alpha _i} e^{\alpha _it_i} M^{r-\alpha _i}\\&\quad + r^{\underline{\beta _i+1}} x_i^{\beta _i+1} e^{(\beta _i+1)t_i}M^{r-\beta _i-1}\\&= \sum _{\alpha _i=0} ^{\beta _i+1} S(\beta _i+1,\alpha _i) r^{\underline{\alpha _i}} x_i^{\alpha _i} e^{\alpha _it_i} M^{r-\alpha _i}, \end{aligned}$$

which concludes the proof.\(\square \)
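
Lemma 6 can likewise be checked symbolically for small parameters; the following sketch (ours) compares the \(\beta _i\)-th derivative in \(t_1\) with the closed form of the lemma:

```python
import sympy as sp

r, beta_i = 5, 3
t = sp.symbols('t1:4')
x = sp.symbols('x1:4')

def M(power):
    # moment generating function raised to the given power
    return sum(xi * sp.exp(ti) for xi, ti in zip(x, t)) ** power

def stirling2(b, a):
    if b == a == 0:
        return 1
    if b == 0 or a == 0:
        return 0
    return stirling2(b - 1, a - 1) + a * stirling2(b - 1, a)

def falling(m, k):
    out = 1
    for j in range(k):
        out *= m - j
    return out

# derivative in t_1 (i.e. i = 1) versus the closed form of Lemma 6
lhs = sp.diff(M(r), t[0], beta_i)
rhs = sum(stirling2(beta_i, a) * falling(r, a) * x[0] ** a
          * sp.exp(a * t[0]) * M(r - a)
          for a in range(beta_i + 1))
assert sp.simplify(sp.expand(lhs - rhs)) == 0
```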

We now extend the result of Lemma 6 to an arbitrary derivative of the moment generating function.

Theorem 9

For any \(x\in \varDelta _n, \beta \in {\mathbb N}^n\) and \(t\in {\mathbb R}^n\), one has

$$\begin{aligned} {\frac{\partial ^{|\beta |}M_{x}^{r}(t)}{\partial t_1^{\beta _1}\ldots \partial t_{n}^{\beta _n} }} =\sum _{\alpha \in {\mathbb N}^n: \alpha \le \beta }r^{\underline{|\alpha |}}x^{\alpha }M_{x}^{r-|\alpha |}(t) \left( \prod _{i=1}^ne^{\alpha _it_i}S(\beta _i,\alpha _i)\right) . \end{aligned}$$

Proof

We show the result using induction on the size \(k\) of the support of \(\beta \), i.e., \(k=|\{i\in [n]:\beta _i\ne 0\}|\). The result clearly holds for \(k=0\) and, for \(k=1\), it holds by Lemma 6. We now assume that the result holds for \(k\) and show that it also holds for \(k+1\). For this, consider the vectors \(\beta '=(\beta _1,\ldots ,\beta _k,\beta _{k+1},0,\ldots ,0)\) and \(\beta =(\beta _1,\ldots ,\beta _k,0,0,\ldots ,0)\in {\mathbb N}^n\), where \(\beta _1,\ldots ,\beta _{k+1}\ge 1\). By the induction assumption we know that

$$\begin{aligned} \frac{\partial ^{|\beta |}M^r}{\partial t_1^{\beta _1}\ldots \partial t_{k}^{\beta _k}}= \sum _{0\le \alpha \le \beta }r^{\underline{|\alpha |}}x^{\alpha }M^{r-|\alpha |} \left( \prod _{i=1}^ne^{\alpha _it_i}S(\beta _i,\alpha _i)\right) , \end{aligned}$$
(23)

setting again \(M^r=M^r_x(t)\) for simplicity. Using (23), we obtain

$$\begin{aligned} \frac{\partial ^{|\beta '|}M^r}{\partial t_1^{\beta _1}\ldots \partial t_{k+1}^{\beta _{k+1}}}&= {\partial ^{\beta _{k+1}} \over \partial t_{k+1}^{\beta _{k+1}}} \frac{\partial ^{|\beta |}M^r}{\partial t_1^{\beta _1}\ldots \partial t_{k}^{\beta _{k}}}\nonumber \\&= {\partial ^{\beta _{k+1}} \over \partial t_{k+1}^{\beta _{k+1}}} \left( \sum _{0\le \alpha \le \beta }r^{\underline{|\alpha |}}x^{\alpha }M^{r-|\alpha |} \left( \prod _{i=1}^ne^{\alpha _it_i}S(\beta _i,\alpha _i)\right) \right) . \end{aligned}$$
(24)

Note that \(\alpha _{k+1}=0\) since \(\alpha _{k+1}\le \beta _{k+1}\) and \(\beta _{k+1}=0\). Hence, \(M^{r-|\alpha |}\) is the only term containing the variable \(t_{k+1}\) and thus (24) implies

$$\begin{aligned} \frac{\partial ^{|\beta '|}M^r}{\partial t_1^{\beta _1}\ldots \partial t_{k+1}^{\beta _{k+1}}}&= \sum _{0\le \alpha \le \beta }r^{\underline{|\alpha |}}x^{\alpha } \left( \prod _{i=1}^ne^{\alpha _it_i}S(\beta _i,\alpha _i)\right) {\partial ^{\beta _{k+1}}M^{r-|\alpha |}\over \partial t_{k+1}^{\beta _{k+1}}}. \end{aligned}$$
(25)

We now use Lemma 6 to compute the last term:

$$\begin{aligned} {\partial ^{\beta _{k+1}}M^{r-|\alpha |}\over \partial t_{k+1}^{\beta _{k+1}}}= \sum _{\theta _{k+1}=0}^{\beta _{k+1}} S(\beta _{k+1},\theta _{k+1})(r-|\alpha |)^{\underline{\theta _{k+1}}}x_{k+1}^{\theta _{k+1}}e^{\theta _{k+1}t_{k+1}}M^{r-|\alpha |-\theta _{k+1}}. \end{aligned}$$
(26)

Plugging (26) into (25) we obtain

$$\begin{aligned} \frac{\partial ^{|\beta '|}M^r}{\partial t_1^{\beta _1}\ldots \partial t_{k+1}^{\beta _{k+1}}}&= \sum _{0\le \alpha \le \beta }r^{\underline{|\alpha |}}x^{\alpha } \left( \sum _{\theta _{k+1}=0}^{\beta _{k+1}} S(\beta _{k+1},\theta _{k+1})(r-|\alpha |)^{\underline{\theta _{k+1}}}x_{k+1}^{\theta _{k+1}}e^{\theta _{k+1}t_{k+1}}M^{r-|\alpha |-\theta _{k+1}}\right) \\&\quad \times \left( \prod _{i=1}^ne^{\alpha _it_i}S(\beta _i,\alpha _i)\right) \\&= \sum _{0\le \alpha \le \beta } \sum _{\theta _{k+1}=0}^{\beta _{k+1}} r^{\underline{|\alpha |+\theta _{k+1}}}x^{\alpha +e_{k+1}\theta _{k+1}} S(\beta _{k+1},\theta _{k+1})e^{\theta _{k+1}t_{k+1}}M^{r-(|\alpha |+\theta _{k+1})}\\&\quad \times \left( \prod _{i=1}^ne^{\alpha _it_i}S(\beta _i,\alpha _i)\right) \\&= \sum _{0\le \alpha '\le \beta '} r^{\underline{|\alpha '|}}x^{\alpha '}M^{r-|\alpha '|} \left( \prod _{i=1}^ne^{\alpha _i't_i}S(\beta _i',\alpha _i')\right) , \end{aligned}$$

after setting \(\alpha '=\alpha +e_{k+1}\theta _{k+1}\). This concludes the proof of Theorem 9. \(\square \)
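
As with Lemma 6, Theorem 9 admits a symbolic spot check for a mixed multi-index \(\beta \); a sketch (ours):

```python
import sympy as sp
from itertools import product

r, beta = 4, (2, 1, 1)
t = sp.symbols('t1:4')
x = sp.symbols('x1:4')

def M(power):
    return sum(xi * sp.exp(ti) for xi, ti in zip(x, t)) ** power

def stirling2(b, a):
    if b == a == 0:
        return 1
    if b == 0 or a == 0:
        return 0
    return stirling2(b - 1, a - 1) + a * stirling2(b - 1, a)

def falling(m, k):
    out = 1
    for j in range(k):
        out *= m - j
    return out

# left-hand side: beta-th mixed partial derivative of M_x^r(t)
lhs = M(r)
for ti, bi in zip(t, beta):
    if bi:
        lhs = sp.diff(lhs, ti, bi)

# right-hand side: the closed form of Theorem 9
rhs = 0
for alpha in product(*(range(bi + 1) for bi in beta)):
    term = falling(r, sum(alpha)) * M(r - sum(alpha))
    for ai, bi, xi, ti in zip(alpha, beta, x, t):
        term *= xi ** ai * sp.exp(ai * ti) * stirling2(bi, ai)
    rhs += term

assert sp.simplify(sp.expand(lhs - rhs)) == 0
```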

About this article

Cite this article

de Klerk, E., Laurent, M. & Sun, Z. An alternative proof of a PTAS for fixed-degree polynomial optimization over the simplex. Math. Program. 151, 433–457 (2015). https://doi.org/10.1007/s10107-014-0825-6
