
Block-Coordinate Gradient Descent Method for Linearly Constrained Nonsmooth Separable Optimization

Journal of Optimization Theory and Applications

Abstract

We consider the problem of minimizing the weighted sum of a smooth function f and a convex function P of n real variables, subject to m linear equality constraints. We propose a block-coordinate gradient descent method for solving this problem, with the coordinate block chosen by a Gauss-Southwell-q rule based on sufficient predicted descent. We establish global convergence to first-order stationarity for this method and, under a local error bound assumption, a linear rate of convergence. If f is convex with Lipschitz continuous gradient, then the method terminates in O(n^2/ε) iterations with an ε-optimal solution. If P is separable, then the Gauss-Southwell-q rule is implementable in O(n) operations when m = 1 and in O(n^2) operations when m > 1. In the special case of support vector machine training, for which f is convex quadratic, P is separable, and m = 1, this complexity bound is comparable to the best known bound for decomposition methods. If f is convex, then, by gradually reducing the weight on P to zero, the method can be adapted to solve the bilevel problem of minimizing P over the set of minima of f + δ_X, where δ_X is the indicator function of X, the closure of the feasible set. This has application in finding the least 1-norm solution of maximum-likelihood estimation problems.
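To make the method concrete, the following is a minimal sketch of one iteration, not the authors' implementation, specialized to the case m = 1 with the constraint sum(x) = const (i.e. a is the all-ones vector) and P(x) = ||x||_1 with weight lam, so that each coordinate block is a pair (i, j) moved by (t, -t) and feasibility is preserved. The names solve_pair_subproblem and bcgd_step, the fixed curvature estimate h (standing in for the paper's diagonal Hessian approximation), and the unit step length (in place of the paper's Armijo-type line search) are assumptions of this sketch; block selection is done by brute force in O(n^2) for clarity, whereas the abstract notes the Gauss-Southwell-q rule can be implemented in O(n) when m = 1.

```python
import numpy as np

def solve_pair_subproblem(c, h, xi, xj, lam):
    """Exactly minimize phi(t) = c*t + h*t^2 + lam*(|xi+t| + |xj-t|),
    a convex piecewise quadratic with breakpoints at t = -xi and t = xj."""
    phi = lambda t: c * t + h * t * t + lam * (abs(xi + t) + abs(xj - t))
    candidates = [-xi, xj]                  # the two breakpoints
    for s1 in (-1.0, 1.0):                  # sign of xi + t on a smooth piece
        for s2 in (-1.0, 1.0):              # sign of xj - t on that piece
            t = -(c + lam * (s1 - s2)) / (2.0 * h)
            if (xi + t) * s1 >= 0 and (xj - t) * s2 >= 0:
                candidates.append(t)        # stationary point lies on its piece
    return min(candidates, key=phi)

def bcgd_step(x, grad_f, h, lam):
    """One iteration for min f(x) + lam*||x||_1 s.t. sum(x) = const.
    Blocks are coordinate pairs (i, j) moved by (t, -t), which preserves
    the single equality constraint. Selection is a brute-force O(n^2)
    Gauss-Southwell-q rule: pick the block whose subproblem predicts
    the most descent q_J."""
    g = grad_f(x)
    n = len(x)
    best_q, best = 0.0, None
    for i in range(n):
        for j in range(i + 1, n):
            c = g[i] - g[j]                 # directional gradient along (t, -t)
            t = solve_pair_subproblem(c, h, x[i], x[j], lam)
            q = (c * t + h * t * t          # quadratic model decrease ...
                 + lam * (abs(x[i] + t) + abs(x[j] - t)
                          - abs(x[i]) - abs(x[j])))   # ... plus change in P
            if q < best_q:
                best_q, best = q, (i, j, t)
    if best is None:
        return x, 0.0                       # no predicted descent: stationary
    i, j, t = best
    x = x.copy()
    x[i] += t
    x[j] -= t                               # sum(x) is unchanged
    return x, best_q

# Toy run: f(x) = 0.5*||x - b||^2, whose diagonal curvature is exactly h = 1,
# so the unit step above is the exact block minimizer.
b = np.array([3.0, -1.0, 0.5, 2.0])
x = np.zeros(4)                             # feasible start: sum(x) = 0
for _ in range(100):
    x, q = bcgd_step(x, lambda z: z - b, h=1.0, lam=0.1)
    if q > -1e-12:
        break
print(x, x.sum())                           # sum stays 0 throughout
```

In the toy run, h = 1 equals the true diagonal curvature of f, so the unit step exactly minimizes each block subproblem; with general f, the paper's Armijo-type line search would replace the unit step. The bilevel adaptation mentioned above would correspond to repeating such runs while driving lam toward zero.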



Author information

Correspondence to P. Tseng.

Additional information

Communicated by A. Miele.

This research was supported by the National Science Foundation, Grant No. DMS-0511283.


About this article

Cite this article

Tseng, P., Yun, S. Block-Coordinate Gradient Descent Method for Linearly Constrained Nonsmooth Separable Optimization. J Optim Theory Appl 140, 513–535 (2009). https://doi.org/10.1007/s10957-008-9458-3
