
2020 | Book

Accelerated Optimization for Machine Learning

First-Order Algorithms

Authors: Prof. Zhouchen Lin, Dr. Huan Li, Dr. Cong Fang

Publisher: Springer Singapore

About this book

This book on optimization includes forewords by Michael I. Jordan, Zongben Xu and Zhi-Quan Luo. Machine learning relies heavily on optimization to train its models, and first-order optimization algorithms are the mainstream approaches. Accelerating first-order optimization algorithms is therefore crucial for the efficiency of machine learning.

Written by leading experts in the field, this book provides a comprehensive introduction to, and state-of-the-art review of, accelerated first-order optimization algorithms for machine learning. It covers deterministic and stochastic algorithms, synchronous and asynchronous variants, and unconstrained and constrained problems, both convex and non-convex. Offering a rich blend of ideas, theories, and proofs, the book is up-to-date and self-contained. It is an excellent reference for practitioners seeking faster optimization algorithms, as well as for graduate students and researchers who want to grasp the frontiers of optimization in machine learning quickly.

Table of Contents

Frontmatter
Chapter 1. Introduction
Abstract
This chapter gives several examples of optimization problems in machine learning and briefly overviews the representative works on accelerated first-order algorithms. It also gives a brief introduction to the content of the monograph.
Zhouchen Lin, Huan Li, Cong Fang
Chapter 2. Accelerated Algorithms for Unconstrained Convex Optimization
Abstract
This chapter reviews the representative accelerated first-order algorithms for deterministic unconstrained convex optimization. We start by introducing the accelerated methods for smooth problems with Lipschitz continuous gradients, then concentrate on the methods for composite problems, in particular the case where the proximal mapping and the gradient are computed inexactly, and finally introduce acceleration for highly smooth problems whose higher-order derivatives are Lipschitz continuous. This chapter also covers the smoothing technique for nonsmooth problems, the restart technique for non-strongly convex problems, and an explanation of the mechanism of acceleration from the variational perspective. (An illustrative sketch of the basic accelerated gradient step follows this entry.)
Zhouchen Lin, Huan Li, Cong Fang
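
To make the flavor of this chapter concrete, below is a minimal NumPy sketch of Nesterov's accelerated gradient method for an L-smooth convex objective. The function name, the least-squares example, and the iteration count are illustrative assumptions, not the book's exact algorithm or notation.

```python
import numpy as np

def nesterov_accelerated_gradient(grad, x0, L, n_iters=1000):
    """Sketch of Nesterov's accelerated gradient method for an L-smooth
    convex objective; `grad(x)` returns the gradient at x."""
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(n_iters):
        x_next = y - grad(y) / L                              # gradient step from the extrapolated point
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0    # momentum parameter update
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)      # momentum extrapolation
        x, t = x_next, t_next
    return x

# Hypothetical usage: least squares 0.5*||Ax - b||^2, whose gradient is L-Lipschitz with L = ||A||_2^2
A = np.random.randn(50, 20)
b = np.random.randn(50)
L = np.linalg.norm(A, 2) ** 2
x_star = nesterov_accelerated_gradient(lambda x: A.T @ (A @ x - b), np.zeros(20), L)
```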
Chapter 3. Accelerated Algorithms for Constrained Convex Optimization
Abstract
This chapter reviews the representative accelerated algorithms for deterministic constrained convex optimization. We overview the accelerated penalty method, the accelerated Lagrange multiplier method, and the accelerated augmented Lagrange multiplier method. In particular, we concentrate on two widely used algorithms, namely the alternating direction method of multipliers (ADMM) and the primal-dual method. For ADMM, we study four scenarios: the generally convex and nonsmooth case, the strongly convex and nonsmooth case, the generally convex and smooth case, and the strongly convex and smooth case. We also introduce its non-ergodic accelerated variant. For the primal-dual method, we study three scenarios: both functions are generally convex, both are strongly convex, and one is generally convex while the other is strongly convex. Finally, we introduce the Frank–Wolfe algorithm under the assumption of a strongly convex constraint set. (An illustrative ADMM sketch follows this entry.)
Zhouchen Lin, Huan Li, Cong Fang
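
As a concrete instance of ADMM, below is a minimal NumPy sketch of plain (non-accelerated) ADMM for the lasso problem. The problem choice, function name, penalty parameter, and iteration count are illustrative assumptions; the chapter's accelerated variants build on this basic scheme.

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, n_iters=200):
    """Plain ADMM sketch for the lasso: min_x 0.5*||Ax - b||^2 + lam*||z||_1
    subject to x - z = 0, written with the scaled dual variable u."""
    n = A.shape[1]
    x = z = u = np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    M = np.linalg.inv(AtA + rho * np.eye(n))                  # cached inverse for the x-update
    for _ in range(n_iters):
        x = M @ (Atb + rho * (z - u))                         # x-update: quadratic subproblem
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # z-update: soft-thresholding
        u = u + x - z                                         # dual (multiplier) update
    return z
```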
Chapter 4. Accelerated Algorithms for Nonconvex Optimization
Abstract
This chapter reviews the representative accelerated algorithms for deterministic unconstrained nonconvex optimization. We start by introducing the proximal gradient method with momentum for general nonconvex problems, including its convergence to critical points and its convergence rates under the Kurdyka–Łojasiewicz condition. Then we study the provably faster convergence to critical points of the accelerated gradient method equipped with negative curvature exploitation. Finally, we introduce accelerated algorithms that converge faster to approximate local optima by escaping saddle points quickly. (An illustrative sketch of a momentum-based proximal gradient step follows this entry.)
Zhouchen Lin, Huan Li, Cong Fang
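
To give a flavor of momentum-based proximal gradient methods for nonconvex problems, below is a minimal NumPy sketch of an accelerated proximal gradient iteration with a monotone safeguard for min f(x) + lam*||x||_1, where f may be nonconvex with L-Lipschitz gradient. The function names, the l1 regularizer, and the safeguard rule are illustrative assumptions rather than the book's exact algorithm.

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t*||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def monotone_apg(f, grad_f, x0, L, lam, n_iters=500):
    """Accelerated proximal gradient sketch with a monotone safeguard:
    the extrapolated step is kept only if it yields a lower objective than
    a plain proximal gradient step from the current iterate."""
    g = lambda x: lam * np.abs(x).sum()
    x = x_prev = np.asarray(x0, dtype=float)
    t_prev, t = 0.0, 1.0
    for _ in range(n_iters):
        y = x + (t_prev / t) * (x - x_prev)                   # momentum extrapolation
        z = prox_l1(y - grad_f(y) / L, lam / L)               # proximal step from y
        v = prox_l1(x - grad_f(x) / L, lam / L)               # safeguard step from x
        x_prev = x
        x = z if f(z) + g(z) <= f(v) + g(v) else v            # keep the better point
        t_prev, t = t, (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0
    return x
```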
Chapter 5. Accelerated Stochastic Algorithms
Abstract
This chapter reviews the representative accelerated stochastic algorithms, categorized into the individually convex case (IC, each component function is convex), the individually nonconvex case (INC, each component function can be nonconvex but their sum is convex), the nonconvex case, the constrained case, and the infinite (online) case. For IC problems, we review accelerated stochastic coordinate descent, variance reduction and its acceleration, and the black-box acceleration framework known as Catalyst. For INC problems, we introduce the results obtained by combining variance reduction with Catalyst. For nonconvex problems, we introduce a method named SPIDER. For constrained problems, we introduce the accelerated stochastic ADMM. For the infinite case, we show that the momentum technique can enlarge the mini-batch size. (An illustrative variance-reduction sketch follows this entry.)
Zhouchen Lin, Huan Li, Cong Fang
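
To illustrate the variance-reduction idea that the chapter's accelerated methods build on, below is a minimal NumPy sketch of basic (non-accelerated) SVRG for a finite-sum problem. The function name, step size, and inner-loop length are illustrative assumptions; the accelerated variants discussed in the chapter refine this template.

```python
import numpy as np

def svrg(grad_i, x0, n, step, n_epochs=20, m=None):
    """Basic SVRG sketch for min (1/n) * sum_i f_i(x); grad_i(x, i) returns
    the gradient of the i-th component function at x."""
    x = np.asarray(x0, dtype=float)
    m = m or 2 * n                                            # inner-loop length (a common choice)
    for _ in range(n_epochs):
        x_ref = x.copy()                                      # snapshot point
        full_grad = np.mean([grad_i(x_ref, i) for i in range(n)], axis=0)
        for _ in range(m):
            i = np.random.randint(n)
            # variance-reduced stochastic gradient estimate
            v = grad_i(x, i) - grad_i(x_ref, i) + full_grad
            x = x - step * v
    return x
```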
Chapter 6. Accelerated Parallel Algorithms
Abstract
This chapter reviews accelerated parallel algorithms. We first introduce accelerated asynchronous algorithms, including accelerated asynchronous gradient descent and accelerated asynchronous coordinate descent. Then we concentrate on accelerated distributed algorithms, categorized by centralized and decentralized topologies. For both topologies, we introduce the communication-efficient accelerated stochastic dual coordinate ascent. In particular, we concentrate on the stochastic variant, in which each agent uses only part of its samples at each iteration. (An illustrative decentralized-optimization sketch follows this entry.)
Zhouchen Lin, Huan Li, Cong Fang
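
For readers unfamiliar with the decentralized setting, below is a minimal NumPy sketch of plain (non-accelerated) decentralized gradient descent over a network described by a doubly stochastic mixing matrix. The function name and parameters are illustrative assumptions; the accelerated, communication-efficient methods in the chapter improve on this template.

```python
import numpy as np

def decentralized_gd(local_grads, W, x0, step, n_iters=200):
    """Plain decentralized gradient descent sketch: agent i keeps its own
    iterate (row i of X), averages with its neighbours through the doubly
    stochastic mixing matrix W, and then takes a local gradient step."""
    n_agents = len(local_grads)
    X = np.tile(np.asarray(x0, dtype=float), (n_agents, 1))  # one row of X per agent
    for _ in range(n_iters):
        G = np.array([local_grads[i](X[i]) for i in range(n_agents)])
        X = W @ X - step * G                                  # mix with neighbours, then step
    return X.mean(axis=0)                                     # consensus estimate
```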
Chapter 7. Conclusions
Abstract
This chapter briefly discusses the practical choice between accelerated and non-accelerated algorithms. It then discusses a future direction: boosting the convergence of optimization algorithms using machine learning techniques, i.e., learning-based optimization, which we encourage readers to pursue.
Zhouchen Lin, Huan Li, Cong Fang
Backmatter
Metadata
Title
Accelerated Optimization for Machine Learning
Authors
Prof. Zhouchen Lin
Dr. Huan Li
Dr. Cong Fang
Copyright Year
2020
Publisher
Springer Singapore
Electronic ISBN
978-981-15-2910-8
Print ISBN
978-981-15-2909-2
DOI
https://doi.org/10.1007/978-981-15-2910-8
