On the stable equilibrium points of gradient systems

https://doi.org/10.1016/j.sysconle.2006.01.002Get rights and content

Abstract

This paper studies the relations between the local minima of a cost function f and the stable equilibria of the gradient descent flow of f. In particular, it is shown that, under the assumption that f is real analytic, local minimality is necessary and sufficient for stability. Under the weaker assumption that f is indefinitely continuously differentiable, local minimality is neither necessary nor sufficient for stability.

Introduction

Gradient flows are useful in solving various optimization-related problems. Recent examples deal with principal component analysis [21], [15], optimal control [20], [9], balanced realizations [7], ocean sampling [3], noise reduction [16], pose estimation [4] or the Procrustes problem [18]. The underlying idea is that the gradient-descent flow will converge to a local minimum of the cost function. It is, however, well known that this property does not hold in general: the initial condition can, e.g. belong to the stable manifold of a saddle point. Not as well known is the fact that, even assuming that the cost function is a C function, the local minima of the cost function are not necessarily stable equilibria of the gradient-descent system, and vice versa. The main purpose of this paper is to shed some light on this issue.

Specifically, let f be a real, continuously differentiable function on Rn and consider the continuous-time gradient-descent systemx˙(t)=-f(x(t)),where f(x) denotes the Euclidean gradient of f at x. Define stability and minimality in the standard way:

Definition 1

A point zRn is a local minimum of f if there exists ε>0 such that f(x)f(z) for all x such that x-z<ε. If f(x)>f(z) for all x such that 0<x-z<ε, then z is a strict local minimum of f. An equilibrium point z of (1) is (Lyapunov) stable if, for each ε>0, there is δ=δ(ε)>0 such that x(0)-z<δx(t)-z<εt0.It is asymptotically stable if it is stable, and δ can be chosen such that x(0)<δlimtx(t)=z.

Then we have:

Proposition 2

(i) There exist a function fC and a point zRn such that z is a local minimum of f and z is not a stable equilibrium point of (1). (ii) There exist a function fC and a point zRn such that z is not a local minimum of f and z is a stable equilibrium point of (1).

The proof given in Section 2 consists in producing functions f that satisfy the required properties.

After smoothness, the next stronger condition one may think of imposing on the cost function f is real analyticity (a real function is analytic if it possesses derivatives of all orders and agrees with its Taylor series in the neighbourhood of every point). The main result of this paper is that under the analyticity assumption, local minimality becomes a necessary and sufficient condition for stability.

Theorem 3 Main result

Let f be real analytic in a neighbourhood of zRn. Then, z is a stable equilibrium point of (1) if and only if it is a local minimum of f.

The proof of this theorem, given in Section 3, relies on an inequality by Łojasiewicz that yields bounds on the length of solution curves of the gradient system (1).

Moreover, we give in Section 4 a complete characterization of the relations between (isolated, strict) local minima and (asymptotically) stable equilibria for gradient flows of both C and analytic cost functions. Final remarks are presented in Section 5.

Section snippets

Smooth cost function

In this section we prove Proposition 2. Consider f:RnR defined byf(x,y)=11+x2g(y)h(y),whereg(y)=e-1/y2ify0,0ify=0,and h(y)=y2+1+sin1y2ify0,1ify=0.This function is qualitatively illustrated in Fig. 1. We show that this function f satisfies the properties of point (i) of Proposition 2 with z=(0,0). It is routine to check that fC, and it is clear that the origin is a local minimum of f, since f is nonnegative and f(0)=0. The gradient system (1) becomesx˙=2x(1+x2)2g(y)h(y),y˙=-11+x2g(y)y3m(y),

Analytic cost function

This section is dedicated to proving Theorem 3. We assume throughout, without loss of generality, that f:RnR is analytic on an open set U containing the origin, that f(0)=0 and that f(0)=0, and we study the stability of the equilibrium point 0 of the gradient system (1).

The proof relies on the following fundamental property of analytic functions.

Lemma 4 Łojasiewicz's inequality

Let f be a real analytic function on a neighbourhood of z in Rn. Then there are constants c>0 and ρ[0,1) such that f(x)c|f(x)-f(z)|ρin some

Strict minimality and asymptotic stability

The previous results were concerned with (simple) Lyapunov stability and (nonstrict) minimality. In this section, we also consider asymptotic stability and strict minimality. The relations between Lyapunov stability, asymptotic stability, and various notions of minimality are displayed in Fig. 2, with the following notation (see Definition 1 for details). LM: local minimum; SLM: strict local minimum; ILM: isolated local minimum; LMICP: local minimum and isolated critical point; SE: stable

Final remarks

For a general cost function fCp, p{2,3,}{}, the classical way of studying the stability of an equilibrium point (say x=0) of the gradient-descent flow (1) is to consider the Hessian of f at x=0. If the Hessian is positive definite, then x=0 is a local minimum and an isolated critical point of f (LCICP in Fig. 2); it follows from Fig. 2 that the origin is asymptotically stable. But the converse is not true, as the simple example f(x)=x4 shows (the origin is asymptotically stable but the

Acknowledgements

The authors thank A.L. Tits and R. Sepulchre for several useful comments. Part of this work was done while the first author was visiting the second author at the Laboratoire de Mathématiques de l’Université de Savoie. The hospitality of the members of the Laboratory is gratefully acknowledged.

References (21)

There are more references available in the full text version of this article.

Cited by (127)

  • Łojasiewicz gradient inequalities for polynomial functions and some applications

    2022, Journal of Mathematical Analysis and Applications
View all citing articles on Scopus

This work was initiated while the first author was a Research Fellow with the Belgian National Fund for Scientific Research (Aspirant du F.N.R.S.) at the University of Liège. This work was supported in part by the School of Computational Science of Florida State University through a postdoctoral fellowship, and by Microsoft Research through a Research Fellowship at Peterhouse, Cambridge.

View full text