2021 | OriginalPaper | Chapter

Efficiency of Coordinate Descent Methods for Structured Nonconvex Optimization

Abstract

We propose novel coordinate descent (CD) methods for minimizing nonconvex functions comprising three terms: (i) a continuously differentiable term, (ii) a simple convex term, and (iii) a concave and continuous term. First, by extending randomized CD to nonsmooth nonconvex settings, we develop a coordinate subgradient method that randomly updates block-coordinate variables using a block composite subgradient mapping. The method converges asymptotically to critical points, with a proven sublinear convergence rate under a suitable optimality measure. Second, we develop a randomly permuted CD method with two alternating steps: linearizing the concave part and cycling through the variables. We prove asymptotic convergence to critical points and establish a sublinear complexity rate for objectives with both smooth and concave components. Third, we develop a randomized proximal difference-of-convex (DC) algorithm in which each DC subproblem is solved inexactly by accelerated coordinate descent (ACD). Convergence is guaranteed with only a few ACD iterations per DC subproblem, and a complexity bound is established for identifying certain approximate critical points. Fourth, we extend the third method to minimize smooth and composite weakly convex functions, and show its advantages over gradient methods for ill-conditioned nonconvex functions, namely weakly convex functions with a high ratio of Lipschitz constant to negative curvature. Finally, an empirical study on sparsity-inducing learning models demonstrates that CD methods are superior to gradient-based methods for certain large-scale problems.
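
To make the three-term structure concrete, the following is a minimal sketch, not the authors' algorithm, of a randomly permuted CD pass for an objective F(x) = f(x) + r(x) + h(x): here f is a least-squares term, r an l1 penalty, and h an illustrative concave quadratic -(mu/2)||x||^2. The function names (rp_cd_dc, soft_threshold), the choice of h, and all parameter values are assumptions made for the example; each outer pass linearizes the concave term and then sweeps the coordinates in a fresh random order.

```python
import numpy as np

# Illustrative sketch of randomly permuted coordinate descent for
#   F(x) = 0.5 * ||A x - b||^2  +  lam * ||x||_1  +  (-mu/2) * ||x||^2
# (smooth + simple convex + concave). The concave part is linearized once
# per pass; each coordinate update is then a prox-gradient (soft-threshold) step.

def soft_threshold(z, tau):
    """Proximal operator of tau * |.| (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def rp_cd_dc(A, b, lam=0.1, mu=0.05, n_passes=50, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    col_norm2 = (A ** 2).sum(axis=0)           # coordinate-wise Lipschitz constants of f
    resid = A @ x - b                           # maintained residual A x - b
    for _ in range(n_passes):
        g_h = -mu * x                           # gradient of the concave part, frozen for this pass
        for j in rng.permutation(n):            # cycle through coordinates in random order
            L_j = col_norm2[j] + 1e-12
            grad_j = A[:, j] @ resid + g_h[j]   # partial gradient of f plus linearized h
            x_new = soft_threshold(x[j] - grad_j / L_j, lam / L_j)
            resid += A[:, j] * (x_new - x[j])   # update residual incrementally
            x[j] = x_new
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 500))
    x_true = np.zeros(500)
    x_true[:10] = 1.0
    b = A @ x_true + 0.01 * rng.standard_normal(200)
    x_hat = rp_cd_dc(A, b)
    print("nonzeros recovered:", int((np.abs(x_hat) > 1e-3).sum()))
```

Because the residual A x - b is updated incrementally, each coordinate step costs O(m) rather than a full gradient evaluation, which is what makes CD-style updates attractive for the large-scale sparse problems discussed above.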

Metadata
Title: Efficiency of Coordinate Descent Methods for Structured Nonconvex Optimization
Authors: Qi Deng, Chenghao Lan
Copyright Year: 2021
DOI: https://doi.org/10.1007/978-3-030-67664-3_5
