Elsevier

Information Processing Letters

Volume 114, Issues 1–2, January–February 2014, Pages 38-41
Information Processing Letters

Fitness levels with tail bounds for the analysis of randomized search heuristics

https://doi.org/10.1016/j.ipl.2013.09.013Get rights and content

Highlights

  • We supplement the fitness-level method from the analysis of randomized search heuristics with tail bounds.

  • The upper tail improves an existing result considerably and the lower tail is new.

  • As a proof of concept, we apply the new bounds to analyze randomized local search on a standard benchmark problem.

Abstract

The fitness-level method, also called the method of f-based partitions, is an intuitive and widely used technique for the running time analysis of randomized search heuristics. It was originally defined to prove upper and lower bounds on the expected running time. Recently, upper tail bounds were added to the technique; however, these tail bounds only apply to running times that are at least twice as large as the expectation.

We remove this restriction and supplement the fitness-level method with sharp tail bounds, including lower tails. As an exemplary application, we prove that the running time of randomized local search on OneMax is sharply concentrated around nlnn0.1159...n.

Introduction

The running time analysis of randomized search heuristics, including evolutionary algorithms, ant colony optimization and particle swarm optimization, is a vivid research area where many results have been obtained in the last 15 years. Different methods for the analysis were developed as the research area grew. For an overview of the state of the art in the area see the books by Auger and Doerr [1], Neumann and Witt [8] and Jansen [4].

The fitness-level method, also called the method of f-based partitions, is a classical and intuitive method for running time analysis, first formalized by Wegener [11]. It applies to the case that the total running time of a search heuristic can be represented as (or bounded by) a sum of geometrically distributed waiting times, where the waiting times account for the number of steps spent on certain levels of the search space. Wegener [11] presented both upper and lower bounds on the running time of randomized search heuristics using the fitness-level method. The lower bounds relied on the assumption that no level was allowed to be skipped. Sudholt [10] significantly relaxed this assumption and presented a very general lower-bound version of the fitness-level method that allows levels to be skipped with some probability.

Only recently, the focus in running time analysis turned to tail bounds, also called concentration inequalities. Zhou, Luo, Lu, Han [12] were the first to add tail bounds to the fitness-level method. Roughly speaking, they prove w.r.t. the running time T that Pr(T>2E(T)+2δh)=eδ holds, where h is the worst-case expected waiting time over all fitness levels and δ>0 is arbitrary. An obvious open question was whether the factor 2 in front of the expected value could be “removed” from the tail bound, i.e., replaced with 1; Zhou et al. [12] only remark that the factor 2 can be replaced with 1.883.

In this article, we give a positive answer to this question and supplement the fitness-level method also with lower tail bounds. Roughly speaking, we prove in Section 2 that Pr(TE(T)δ)eδ2/(2s) and Pr(TE(T)+δ)eδ4min{δs,h}, where s is the sum of the squares of the waiting times over all fitness levels. We apply the technique to a classical benchmark problem, more precisely to the running time analysis of randomized local search (RLS) on OneMax in Section 3, and prove a very sharp concentration of the running time around nlnn0.1159...n. We finish with some conclusions and a pointer to related work.

Section snippets

New tail bounds for fitness levels

Miscellaneous authors [6] on the Internet discussed tail bounds for a special case of our problem, namely the coupon collector problem (Motwani and Raghavan [7, Chapter 3.6]). Inspired by this discussion, we present our main result in Theorem 1 below. It applies to the scenario that a random variable (e.g., a running time) is given as a sum of geometrically distributed independent random variables (e.g., waiting times on fitness levels). A concrete application will be presented in Section 3.

Theorem 1

Let

Application to RLS on OneMax

As a proof of concept, we apply Theorem 2 to a classical benchmark problem in the analysis of randomized search heuristics, more precisely the running time of RLS on OneMax. RLS is a well-studied randomized search heuristic, defined in Algorithm 1. The function ONEMAX :{0,1}nR is defined by ONEMAX (x1,,xn)=x1++xn, and the running time is understood as the first hitting time of the all-ones string (plus 1 to count the initialization step). We obtain a very sharp concentration, as stated in

Conclusions

We have supplemented upper and lower tail bounds to the fitness-level method. The lower tails are novel contributions and the upper tails improve an existing result from the literature significantly. As a proof of concept, we have applied the fitness levels with tail bounds to the analysis of RLS on OneMax and obtained a very sharp concentration result.

If the stochastic process under consideration is allowed to skip fitness levels, which is often the case with globally searching algorithms such

Acknowledgement

The author thanks Per Kristian Lehre for useful discussions.

References (12)

There are more references available in the full text version of this article.

Cited by (52)

  • Error analysis of elitist randomized search heuristics

    2021, Swarm and Evolutionary Computation
View all citing articles on Scopus
View full text