
11. Heuristics Design


Abstract

The last chapter of the book gives some advice on developing heuristics. It discusses the difficulty of modeling the problem and gives an example of decomposing the problem into a chain of more manageable sub-problems. Secondly, it proposes an approach for the design of a specific heuristic. Finally, some techniques for parameter tuning and comparing the efficiency of algorithms are reviewed.

11.1 Problem Modeling

Now that we have reviewed the key ingredients of heuristics, let us try to propose an approach to designing one. The first thing to determine is whether a heuristic approach is absolutely necessary. Indeed, the "no free lunch" theorem [5] tells us that no optimization heuristic outperforms all others! This result follows from the fact that, among the infinite variety of instance data, most have no exploitable structure. Any heuristic, no matter how sophisticated, therefore makes choices that turn out to be unfortunate for some data sets. Among the infinite number of imaginable heuristics, there is at least one that avoids these inappropriate choices.
If one has to solve a concrete optimization problem, it is therefore necessary to "cheat" by exploiting the structure of the instances actually encountered. Examples are the set bipartition and knapsack problems discussed in Sect. 1.2.3.5: if we know that the data do not contain large values, it is useless to design a heuristic, because this type of problem can be solved exactly by dynamic programming.
The way in which the problem is modeled is crucial to its successful resolution. Especially when dealing with a concrete logistics problem, it can be time-consuming and tedious to capture all the wishes of a manager who is not used to the narrow view of an optimization engineer, who thinks in terms of variables, objective function, and constraints.
To do this, one must first identify the variables of the problem: one must determine what can be modified and what is fixed and part of the data. Then, the constraints must be discussed, specifying those that are hard and must be respected for a solution to be operational. Often, constraints presented as indispensable correspond rather to good practices, from which it is possible to depart from time to time. These soft constraints are generally integrated into the objective with a penalty factor. As we have seen with the Lagrangian relaxation technique, hard constraints can also be introduced into the objective, but probably with a higher penalty weighting. Finally, in practice, there are several objectives to be optimized. However, a manager may not be very happy if provided with a huge set of Pareto-optimal solutions (see Problem 11.1) and has to examine all of them before choosing one. It is easier to ask the manager to prioritize the objectives. It then remains to be seen whether these objectives should be treated hierarchically (the optimum of the highest-priority objective is sought before optimizing the next one) or by scalarization (all the objectives are aggregated into one, with weights reflecting their priorities).
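To make these two treatments concrete, here is a minimal sketch, assuming minimization and three objectives listed by decreasing priority; the weights are purely illustrative:

    def lexicographic_better(a, b, eps=1e-9):
        """Hierarchical treatment: a beats b if it is better on the highest
        priority objective, ties being broken by the next objective."""
        for fa, fb in zip(a, b):  # a, b: tuples of objective values
            if fa < fb - eps:
                return True
            if fa > fb + eps:
                return False
        return False  # equal on all objectives

    def scalarized(objectives, weights=(100.0, 10.0, 1.0)):
        """Scalarization: aggregate all objectives into a single one, with
        weights reflecting their priority."""
        return sum(w * f for w, f in zip(weights, objectives))

With scalarization, a large gain on a low-priority objective can compensate a small loss on a high-priority one, which the hierarchical treatment never allows.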
Once the problem has been properly identified, the designer of a heuristic algorithm must choose a model that is appropriate for the solution method. The following section illustrates two remarkably similar models of the same problem that can lead to the design of very different algorithms.

11.1.1 Model Choice

To reconstruct an unknown genetic sequence, a DNA microarray chip, able to react to all k-nucleotide sequences, is exposed to the gene to be discovered. Once revealed, this chip indicates all the subsequences of k nucleotides present in the gene to be analyzed.
The data can be modeled using de Bruijn graphs. These graphs represent the superposition of symbol chains.
A first model associates an arc with each k-nucleotide detected. Thus, the 3-nucleotide AAC is represented by an arc connecting the vertices AA → AC, due to the overlap on the middle A. If m is the number of k-nucleotides detected, we obtain a graph with m arcs. The reconstruction problem is to find an Eulerian path (passing through all the arcs) in this graph. This problem is easy: it can be solved in linear time.
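As an illustration of this first model, the following sketch builds one arc per detected k-nucleotide and searches for an Eulerian path with Hierholzer's algorithm; the k-mers of the example are illustrative:

    from collections import defaultdict

    def eulerian_path(kmers):
        """De Bruijn graph with one arc per k-mer; vertices are (k-1)-mers.
        Returns the vertex sequence of an Eulerian path, or None."""
        graph = defaultdict(list)
        out_deg, in_deg = defaultdict(int), defaultdict(int)
        for kmer in kmers:
            u, v = kmer[:-1], kmer[1:]  # AAC gives the arc AA -> AC
            graph[u].append(v)
            out_deg[u] += 1
            in_deg[v] += 1
        start = next(iter(graph))  # default start, for an Eulerian cycle
        for v in set(out_deg) | set(in_deg):
            if out_deg[v] - in_deg[v] == 1:  # mandatory start of an open path
                start = v
        stack, path = [start], []
        while stack:  # Hierholzer: follow unused arcs, backtrack when stuck
            if graph[stack[-1]]:
                stack.append(graph[stack[-1]].pop())
            else:
                path.append(stack.pop())
        path.reverse()
        return path if len(path) == len(kmers) + 1 else None  # all arcs used?

    path = eulerian_path(["AAC", "ACA", "CAC", "ACG"])
    if path:
        print(path[0] + "".join(v[-1] for v in path[1:]))  # prints AACACG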
The other model associates a vertex with each k-nucleotide detected. An arc connects two vertices if the associated k-nucleotides have a common subsequence of k − 1 nucleotides. For instance, if the 3-nucleotides AAC, ACA, and ACG are detected, then both arcs AAC → ACA and AAC → ACG are present in the graph, due to the common AC overlap. The graph is a directed version of the line graph of the previous representation. The reconstruction problem is to discover a Hamiltonian path (passing once through each vertex). This second modeling thus requires solving an NP-complete problem.
That said, this second model is not necessarily to be discarded in practice. Indeed, a concrete sequencing problem is likely to possess peculiarities and constraints that might be more difficult to deal with using the first model. For example, a genetic sequence may include repeated subsequences, a more or less reliable quantification of the number of times a k-nucleotide appears, etc.
Figure 11.1 shows the graphs that can be constructed with these two models for a gene that has activated 11 subsequences of 3-nucleotides. In this example, it is not possible to unambiguously reconstruct the gene.
The choice of a model is often delicate and depends on the techniques to be implemented. During the design of a heuristic, it is frequent to realize that another model is more appropriate. In any case, it is worth keeping in mind that good solutions are located at the boundary between feasible and infeasible ones. To illustrate the point: the 2-opt neighborhood for the TSP is restricted to examining feasible solutions, and although its moves constitute the primary operations of the Lin-Kernighan neighborhood, it is much less efficient. The success of the latter is undoubtedly due to the fact that it examines reference structures that are not tours. In a way, feasible solutions are approached from outside the domain of definition.
When a few constraints are numerical, the possibility of implementing the Lagrangian relaxation technique discussed in Sect. 2.8 should be studied. By adjusting the value of the penalties associated with the violation of the relaxed constraints, the heuristic focuses its search in the vicinity of the boundary of feasible solutions.
Conversely, artificial constraints can also be added to implement the diversification technique outlined in Sect. 9.2. For example, the quality of the solutions generated by Algorithms 2.7 and 2.8 for clustering can be significantly improved by adding a soft constraint imposing that the groups contain the same number of elements. This constraint is relaxed and introduced into the fitness function with a penalty parameter that decreases during the iterations, much like the temperature in simulated annealing. Thus, by performing a few more iterations than the original method, the solution is perturbed by shrinking the largest groups and growing the very small ones. At the end of the algorithm, the penalties being very low, we return to the initial objective, but the centers have had the opportunity to relocate. The local optimum thus obtained is considerably better than that of the original method, which does not move far enough away from the initial random solution.
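A minimal sketch of such a decaying soft constraint, where base_cost stands for the original clustering objective and the initial penalty p0 and decay rate alpha are illustrative values:

    def penalized_fitness(base_cost, cluster_sizes, iteration, p0=1000.0, alpha=0.99):
        """Original objective plus a penalty on group size imbalance; the
        penalty factor decreases geometrically, like an SA temperature, so
        that the initial objective is recovered at the end of the run."""
        target = sum(cluster_sizes) / len(cluster_sizes)  # ideal common size
        penalty = p0 * alpha ** iteration
        imbalance = sum(abs(s - target) for s in cluster_sizes)
        return base_cost + penalty * imbalance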

11.1.2 Decomposition into a Series of Sub-problems

Another step in modeling is to assess whether it is possible to apply the divide and conquer principle. Rather than trying to design a complex problem model with multiple interrelationships, one can try breaking it down into a series of more easily addressed sub-problems.
An example is the vehicle routing problem and its extension where several warehouses must also be positioned. Rather than solving the positioning of the warehouses and the construction of the routes simultaneously, one can first group the customers into natural clusters. The routes can then be created in a second step and, finally, the warehouses positioned. Figure 11.2 illustrates this decomposition into a succession of sub-problems. With this approach, it was possible to handle instances with millions of elements with a reasonable computational effort. At the time these results were obtained, the instances in the literature were 1000 to 10,000 times smaller.

11.2 Algorithmic Construction

Once the problem has been modeled and a fitness function chosen, the construction of an algorithm can start. The first step is to construct a solution; Chapter 4 suggests diverse ideas for realizing it. Beforehand, if the instances are large, another kind of partitioning should be considered: partitioning the data.

11.2.1 Data Slicing

A partition of the problem items should be considered if the volume of data is very large or if a procedure of high algorithmic complexity is to be implemented. For the location-routing problem discussed in the previous section, a proximity graph presented in Sect. 6.3.1 was used. The size of the clusters created was chosen according to the problem data, so that their volume was close to a (small) multiple of the volume of the vehicles. The routing heuristic was also designed in anticipation of optimizing a few tours at a time. It may therefore be appropriate to slice the data even for relatively small instances.

11.2.2 Local Search Design

Purely constructive heuristic algorithms rarely produce solutions of acceptable quality for difficult optimization problems. Even when combined with learning methods, such as pheromone trails in artificial ant colonies or population management in genetic algorithms, convergence can be slow and the resulting solutions of insufficient quality. That said, some software libraries can automatically generate a genetic algorithm once provided with a procedure decoding a solution from a binary vector and computing the fitness function. In such situations, the coding phase of the algorithm can be significantly reduced.
Generally, it is essential to move on to the next phase and devise a neighborhood for the problem to be solved. The success of a heuristic algorithm frequently depends on the design of the envisaged neighborhood(s). Metaheuristics are sometimes described as processes guiding a local search. This explains the relative length of Chap. 5 and the richness of the various techniques for limiting or extending neighborhoods. However, not everyone possesses the genius of Lin and Kernighan for designing such an efficient ejection chain. Perhaps the traveling salesman problem is an exception, as it does not require relying on other concepts to achieve excellent quality solutions.
As it is usually not possible to find a neighborhood with all the appropriate characteristics (connectivity, small diameter, fast evaluation, etc.), a local search often uses several different neighborhoods. Each of them corrects a weakness of the others.
This implies thinking about strategies for their use: one must decide whether the local search should evaluate all these neighborhoods at each iteration, or whether phases of intensification, using one neighborhood, should alternate with phases of diversification, using the others.
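One classical strategy of the first kind is a variable neighborhood descent; a minimal sketch, assuming each neighborhood is a function enumerating the neighbor solutions:

    def variable_neighborhood_descent(solution, cost, neighborhoods):
        """Scan the neighborhoods in order and return to the first one after
        each improvement, so that each neighborhood can correct the
        weaknesses of the others."""
        improved = True
        while improved:
            improved = False
            for neighborhood in neighborhoods:
                for neighbor in neighborhood(solution):
                    if cost(neighbor) < cost(solution):
                        solution = neighbor  # first-improvement policy
                        improved = True
                        break
                if improved:
                    break  # restart from the first neighborhood
        return solution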

11.3 Heuristics Tuning

Theoretically, a programmer who is not an expert on the problem to be solved should be able to design a heuristic based on the key principles of metaheuristics discussed in the previous chapters. In practice, during the algorithm development, the programmer is going to gain some experience on how to achieve good solutions for the specific type of data to be processed.
Indeed, the most time-consuming work in the design of a heuristic algorithm consists in trying to understand why the constructive method goes wrong and produces outliers, why the "brilliant" neighborhood does not give the predicted results, or why the "infallible" learning method fails…

11.3.1 Instance Selection

The design of a heuristic algorithm begins with the selection of the problem instances that it should be able to successfully tackle. Indeed, the “no free lunch” theorem tells us it is an illusion to try to design a universally successful method.
When addressing an industrial problem, once we have managed to translate the customer’s wishes into a model that seems reasonable, we still need to obtain concrete numerical data. This sometimes problematic phase can require a disproportionate investment, especially if the model developed is too complex, with poor decomposition and not sufficiently focused on the core of the problem. In practice, it frequently occurs that constraints described as essential are not imperative but result from a fear of altering too drastically the current operating mode and habits. Conversely, the first feedback from solutions provided by a heuristic algorithm may also reveal constraints that have not been explicitly stated. In this case, it is necessary to go back to the modeling stage and repeat an iteration…
For an academic problem, there are usually libraries with abundant numerical data. In this case, a selection of the instances must be made so as not to invest an inordinate amount of time tuning the algorithm. To check the proper implementation of the developed heuristic, a first selection should include moderate-size instances for which an optimal or a very good solution is known.
The instance selection should also highlight the pathological cases for the developed heuristic. This choice must also be governed by the relevance of these examples to practical cases. One example is the satisfiability problem with 3 literals per clause (3SAT). If the number of randomly generated clauses is large compared to the number of variables, then the instances are easy: the probability that a feasible assignment of the variables exists tends very quickly to 0. Conversely, if the number of clauses is small compared to the number of variables, then the examples are equally easy: the probability that a feasible assignment exists tends very quickly to 1. It has been determined that the transition between intractable and simple instances occurs when the number of clauses is about 4.24 times the number of variables. This result is interesting in itself, but even if an efficient heuristic is developed for this type of instance, there is no guarantee that it will be efficient for practical applications.
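For the record, generating such instances near the transition is straightforward; a sketch, with clauses encoded as lists of signed variable indices:

    import random

    def random_3sat(n_vars, ratio=4.24, seed=1):
        """Random 3SAT instance near the phase transition: about ratio * n_vars
        clauses of 3 distinct variables, each negated with probability 1/2."""
        rng = random.Random(seed)
        return [[v if rng.random() < 0.5 else -v
                 for v in rng.sample(range(1, n_vars + 1), 3)]
                for _ in range(round(ratio * n_vars))]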
Finally, results for problem instances with very diverse characteristics should be separately reported. Indeed, multiplying the number of favorable (e.g., very small) or unfavorable instances would lead to biased results.

11.3.2 Graphical Representation

When possible, a graphical representation of the solutions helps to perceive how a heuristic works and to correct its flaws. Indeed, it is common to imagine wrong causes for poor results. By visualizing the output of the method as something other than a series of numbers, it is sometimes very simple to explain this poor performance.
For some problems, a graphical representation is natural, as for the Euclidean TSP. This certainly explains the excellent efficiency of the heuristic and exact methods that have been developed for this problem.

11.3.3 Parameter and Option Tuning

The most time-consuming step in the design of a heuristic is the selection of its ingredients and parameter tuning. When developing a heuristic, one desires it to provide results of the highest possible quality on as wide an instance range as possible.
Initially, the programmer can proceed intuitively to find parameter values that fulfill this purpose. Typically, a small instance is chosen, and the heuristic is executed while varying parameter values or changing options; the most promising evolutions are favored. Put differently, tuning consists in applying a heuristic to another problem, whose variables are the parameter values and whose objective function is the result of running the program implementing the algorithm under development.
In principle, the search space for tuning variables is considerably smaller than that of the instance to be solved. If not, the question arises as to the relevance of a heuristic whose design could be more complicated than the problem to be solved.
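In its simplest form, such a meta-level search can be a random sampling of parameter values; a minimal sketch, where run_heuristic and the parameter names are hypothetical:

    import random

    def tune(run_heuristic, n_trials=50, seed=42):
        """Naive random-search tuning: sample parameter settings and keep the
        best; run_heuristic(params) is assumed to return the value obtained
        on a representative instance set (lower is better)."""
        rng = random.Random(seed)
        best_params, best_value = None, float("inf")
        for _ in range(n_trials):
            params = {"tabu_tenure": rng.randint(5, 50),  # hypothetical names
                      "iterations": rng.choice([1000, 5000, 10000])}
            value = run_heuristic(params)
            if value < best_value:
                best_params, best_value = params, value
        return best_params, best_value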
Several illustrations in this book provide the results of extensive experiments on the influence of the value of one or two parameters of some heuristics. For example, Fig. 9.3 shows that for a TSP instance with n = 127 cities, the appropriate combinations of the parameters \(d_{min}\) and \(\Delta\) seem to satisfy \(\Delta + 2d_{min} = n\), provided that one performs 10n iterations of the taboo search.
These results are not intended to provide definitive values. They are presented so that the reader can get an idea of the appropriate values, but they are not necessarily generalizable. The production of such a figure requires a disproportionate effort (more than 10,000 executions of the heuristic and then production of the diagram) compared to the information that can be obtained. However, it does allow us to observe significant random fluctuations in the results obtained.
If the heuristic has more than half a dozen parameters and options, a rough intuitive tuning is likely to be biased:
  • Given the effort involved, few alternatives are tested.
  • The instance set is limited.
  • The heuristic only works well on a limited instance set.
  • Outstanding or bad results focus attention.
  • Results are neither reproducible nor statistically supported.
It is therefore recommended to use automated methods to calibrate the parameters, providing them with a sufficiently representative instance set. They have the advantage of not being subjective: they neither focus on a very favorable instance set nor leave out a very unfavorable instance. Among parameter tuning software, one can cite, among others, iRace, proposed by [2].

11.3.4 Measure Criterion

The design of heuristics is habitually a multi-objective process. Indeed, the vast majority of the framework algorithms discussed in Part III of this book include a parameter that directly influences the number of repetitions of a general loop. Consequently, one can choose the computational effort to solve a problem quite freely. Furthermore, the quality of the solution produced depends directly on this effort. An extreme case is given by simulated annealing, which almost certainly produces the best possible solution, provided that one accepts an infinite effort!
A compromise must therefore be achieved between the computational time and the quality of the solutions produced.

11.3.4.1 Success Rate

A first measure of the quality of a heuristic is its success rate in producing target solutions. These may be the optimum, if known, or solutions of a given quality. If the value of the optimum is unknown, a bound can be derived using a relaxation, from which a certain deviation can be accepted.
The simplest case for comparing success rates occurs when the heuristics have no parameters, or when the parameters have been fixed. In this case, we want to answer the question: does heuristic \(\mathcal {A}\) find more target solutions than heuristic \(\mathcal {B}\)? The answer is unambiguous: we run \(\mathcal {A}\) and \(\mathcal {B}\) on the same set of instances and count the respective numbers of successes. Obviously, for this to make sense, the instances must be chosen prior to the experiment, and not according to the results obtained by one or the other method.
As in any experiment of this type, a subsidiary question must be answered: is the observed difference in success rates significant? Indeed, if the heuristics include a random component, or if the instance set is randomly selected, the difference could be due to chance and not to a distinct solving performance between the heuristics.
In this case, a statistical test can be carried out, with the null hypothesis that both methods have exactly the same probability p of success [4]. To conduct such a test, the independence of the experiments should be guaranteed. Under these conditions, relatively few numerical experiments can reveal a significant difference.
Table 11.1 provides the values for which it can be stated with 99% confidence that one proportion is significantly higher than another. This table can be used as follows: suppose we want to compare the \(\mathcal {A}\) and \(\mathcal {B}\) methods on instances drawn at random (e.g., TSPs with 100 cities uniformly generated in the unit square). Suppose that the \(\mathcal {B}\) method was able to find a solution of given quality only twice out of n_b = 5 runs. Suppose that, out of n_a = 10 runs of the \(\mathcal {A}\) method, 9 achieved this quality. In the corresponding row and column of Table 11.1, we find the pair (10, 2). In other words, a proportion of at least 10/10 should have been reached to conclude that the \(\mathcal {A}\) method is superior to the \(\mathcal {B}\) method.
Table 11.1  Pairs (a, b) for which a success rate ≥ a/n_a is significantly higher than a rate ≤ b/n_b, for a confidence level of 99%. Each line of a row corresponds to a fixed number b of successes of the second method; a blank cell means that no success rate out of n_a is significant.

      n_a:  2      3      4      5      6      7      8      9      10     11     12     13     14     15
n_b= 2                                         (7,0)  (8,0)  (9,0)  (10,0) (11,0) (12,0) (12,0) (13,0) (14,0)
n_b= 3                    (4,0)  (5,0)  (6,0)  (7,0)  (7,0)  (8,0)  (9,0)  (9,0)  (10,0) (11,0) (11,0) (12,0)
                                                                                  (12,1) (13,1) (14,1) (15,1)
n_b= 4             (3,0)  (4,0)  (5,0)  (5,0)  (6,0)  (6,0)  (7,0)  (8,0)  (8,0)  (9,0)  (9,0)  (10,0) (11,0)
                                        (6,1)  (7,1)  (8,1)  (9,1)  (10,1) (11,1) (11,1) (12,1) (13,1) (14,1)
n_b= 5             (3,0)  (4,0)  (4,0)  (5,0)  (5,0)  (6,0)  (6,0)  (7,0)  (7,0)  (8,0)  (8,0)  (9,0)  (9,0)
                                 (5,1)  (6,1)  (7,1)  (7,1)  (8,1)  (9,1)  (10,1) (10,1) (11,1) (12,1) (12,1)
                                                             (9,2)  (10,2) (11,2) (12,2) (13,2) (14,2) (14,2)
n_b= 6             (3,0)  (3,0)  (4,0)  (4,0)  (5,0)  (5,0)  (6,0)  (6,0)  (7,0)  (7,0)  (8,0)  (8,0)  (9,0)
                          (4,1)  (5,1)  (6,1)  (6,1)  (7,1)  (8,1)  (8,1)  (9,1)  (9,1)  (10,1) (11,1) (11,1)
                                        (6,2)  (7,2)  (8,2)  (9,2)  (10,2) (10,2) (11,2) (12,2) (13,2) (13,2)
                                                                           (11,3) (12,3) (13,3) (14,3) (15,3)
n_b= 7      (2,0)  (3,0)  (3,0)  (4,0)  (4,0)  (5,0)  (5,0)  (5,0)  (6,0)  (6,0)  (7,0)  (7,0)  (7,0)  (8,0)
                          (4,1)  (5,1)  (5,1)  (6,1)  (6,1)  (7,1)  (8,1)  (8,1)  (9,1)  (9,1)  (10,1) (10,1)
                                 (5,2)  (6,2)  (7,2)  (8,2)  (8,2)  (9,2)  (10,2) (10,2) (11,2) (12,2) (12,2)
                                                      (8,3)  (9,3)  (10,3) (11,3) (12,3) (12,3) (13,3) (14,3)
                                                                                         (13,4) (14,4) (15,4)
n_b= 8      (2,0)  (3,0)  (3,0)  (4,0)  (4,0)  (4,0)  (5,0)  (5,0)  (5,0)  (6,0)  (6,0)  (6,0)  (7,0)  (7,0)
                   (3,1)  (4,1)  (4,1)  (5,1)  (6,1)  (6,1)  (7,1)  (7,1)  (8,1)  (8,1)  (9,1)  (9,1)  (10,1)
                          (4,2)  (5,2)  (6,2)  (6,2)  (7,2)  (8,2)  (8,2)  (9,2)  (10,2) (10,2) (11,2) (12,2)
                                        (6,3)  (7,3)  (8,3)  (9,3)  (9,3)  (10,3) (11,3) (12,3) (12,3) (13,3)
                                                             (9,4)  (10,4) (11,4) (12,4) (13,4) (14,4) (14,4)
                                                                                                       (15,5)
n_b= 9      (2,0)  (3,0)  (3,0)  (3,0)  (4,0)  (4,0)  (4,0)  (5,0)  (5,0)  (5,0)  (6,0)  (6,0)  (6,0)  (7,0)
                   (3,1)  (4,1)  (4,1)  (5,1)  (5,1)  (6,1)  (6,1)  (7,1)  (7,1)  (8,1)  (8,1)  (9,1)  (9,1)
                          (4,2)  (5,2)  (6,2)  (6,2)  (7,2)  (7,2)  (8,2)  (9,2)  (9,2)  (10,2) (10,2) (11,2)
                                 (5,3)  (6,3)  (7,3)  (8,3)  (8,3)  (9,3)  (10,3) (10,3) (11,3) (12,3) (12,3)
                                               (7,4)  (8,4)  (9,4)  (10,4) (11,4) (11,4) (12,4) (13,4) (14,4)
                                                                    (10,5) (11,5) (12,5) (13,5) (14,5) (15,5)
n_b=10      (2,0)  (3,0)  (3,0)  (3,0)  (4,0)  (4,0)  (4,0)  (4,0)  (5,0)  (5,0)  (5,0)  (6,0)  (6,0)  (6,0)
                   (3,1)  (4,1)  (4,1)  (5,1)  (5,1)  (5,1)  (6,1)  (6,1)  (7,1)  (7,1)  (8,1)  (8,1)  (9,1)
                          (4,2)  (5,2)  (5,2)  (6,2)  (6,2)  (7,2)  (8,2)  (8,2)  (9,2)  (9,2)  (10,2) (10,2)
                                 (5,3)  (6,3)  (7,3)  (7,3)  (8,3)  (9,3)  (9,3)  (10,3) (10,3) (11,3) (12,3)
                                        (6,4)  (7,4)  (8,4)  (9,4)  (9,4)  (10,4) (11,4) (12,4) (12,4) (13,4)
                                                      (8,5)  (9,5)  (10,5) (11,5) (12,5) (13,5) (13,5) (14,5)
                                                                                  (12,6) (13,6) (14,6) (15,6)
n_b=11      (2,0)  (3,0)  (3,0)  (3,0)  (3,0)  (4,0)  (4,0)  (4,0)  (5,0)  (5,0)  (5,0)  (5,0)  (6,0)  (6,0)
                   (3,1)  (4,1)  (4,1)  (4,1)  (5,1)  (5,1)  (6,1)  (6,1)  (6,1)  (7,1)  (7,1)  (8,1)  (8,1)
                   (3,2)  (4,2)  (5,2)  (5,2)  (6,2)  (6,2)  (7,2)  (7,2)  (8,2)  (8,2)  (9,2)  (9,2)  (10,2)
                          (4,3)  (5,3)  (6,3)  (6,3)  (7,3)  (8,3)  (8,3)  (9,3)  (9,3)  (10,3) (11,3) (11,3)
                                 (5,4)  (6,4)  (7,4)  (8,4)  (8,4)  (9,4)  (10,4) (10,4) (11,4) (12,4) (12,4)
                                               (7,5)  (8,5)  (9,5)  (10,5) (10,5) (11,5) (12,5) (13,5) (13,5)
                                                             (9,6)  (10,6) (11,6) (12,6) (13,6) (14,6) (14,6)
                                                                                         (13,7) (14,7) (15,7)
Put differently, there is a probability p of success such that in more than 1% of cases, we can observe nine successes out of ten or two successes out of five. This result is counterintuitive, as the observed success rates vary by a factor larger than two.
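One plausible reading of this test is the following: under the null hypothesis of a common success probability p, the probability of simultaneously observing at least a successes out of n_a and at most b out of n_b must remain below 1% whatever p. A sketch reproducing the pairs of Table 11.1 under this assumption (the exact test of [4] may differ in its details):

    from math import comb

    def binom_ge(n, k, p):
        """P[Bin(n, p) >= k]."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

    def binom_le(n, k, p):
        """P[Bin(n, p) <= k]."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

    def min_significant_a(n_a, n_b, b, alpha=0.01, grid=1000):
        """Smallest a such that observing >= a/n_a and <= b/n_b together has
        probability below alpha for every common success probability p."""
        for a in range(b + 1, n_a + 1):
            worst = max(binom_ge(n_a, a, i / grid) * binom_le(n_b, b, i / grid)
                        for i in range(grid + 1))
            if worst <= alpha:
                return a
        return None  # blank cell: no rate out of n_a is significant

    print(min_significant_a(10, 5, 2))  # prints 10, the pair (10,2) of Table 11.1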
The situation becomes more complex if the success rate depends on the computational effort devoted to reaching a target solution. One possibility is to plot the proportion of successes versus the effort (time-to-target plot, or TTT-plot). Figure 11.3 illustrates this for three different heuristics on a small TSP instance. In this figure, one iteration corresponds to a call to Code 12.3. The generation time of the starting solution has been neglected; the latter is generated either purely at random, with an artificial pheromone matrix, or with a randomized greedy algorithm. We also ignore the fact that the local search may take more or less time to complete, depending on the starting solution.
The success rate curve for the method repeating local searches from randomly generated solutions was obtained by estimating the probability of a successful run. This estimation required 100,000 executions of the method, of which only 14 reached the target. The success rate after x executions was therefore estimated as \(1 - (1 - 0.00014)^x\). However, this mode of representation is questionable, as the target value and the instance chosen can greatly influence the results. Moreover, the compared methods should require an approximately identical computational effort per iteration.

11.3.4.2 Computational Time Measure

Whenever possible, one should favor an absolute measure of the computational effort, for example, by counting iterations. Obviously, one must specify what can influence the computational effort, typically the size of the data to be processed. The algorithmic complexity of an iteration should therefore be indicated.
This complexity is sometimes not clearly identifiable, or its theoretical expression has little to do with practical observations. The simplex algorithm for linear programming has already been mentioned: it can theoretically perform an exponential number of pivots but, in practice, stops after an almost linear number of steps.
To compare the speed of heuristics, one is sometimes forced to use a relative measure, the computational time. For the same algorithm using the same data structures with the same algorithmic complexity, the computation time depends on many factors, among which:
  • The programming language used for its implementation
  • The hardware (processor, memory, etc.)
  • The programming style
  • The interpreter or compiler
  • The interpretation or compilation options
  • The operating system
  • Running the system in energy-saving mode
  • Other independent processes running in parallel
  • The BIOS configuration
Achieving reliable computing times can be a challenge. For example, the motherboards of personal computers are often configured at the factory to run in "turbo" mode. In practice, when the processor is not heavily used, the clock frequency drops, which reduces energy consumption. When intensive computations start, the first iterations can therefore take much longer than the following ones, although they perform the same number of operations. The maximum clock frequency may also depend on whether a laptop runs on battery or on an external power supply.
Thus, a factor of 2 can indeed be observed for the execution of a procedure on two machines with the same hardware (or even on the same machine). The factor can exceed 100 when comparing two similar implementations written in different programming languages.
To obtain meaningful results, it is frequently necessary to repeat runs for the same set of parameters if the measured times are less than 1 second. In all cases, it should be kept in mind that the computational time remains a relative measure. What is important is the evolution of time according to the characteristics of the problems being solved. One essential characteristic is the data size.
In Fig. 11.4, we have plotted the running time of some codes proposed in this book as a function of the number of cities in the problem. This figure uses two logarithmic scales. In this way, a polynomial growth of the computational time is represented, asymptotically, by a straight line. The slope of this line indicates the degree of the polynomial. It can be noted that all methods behave polynomially.
The reader who wishes to reproduce this figure should be able to do so without too much difficulty using the codes provided. The degree of the polynomials observed is likely to be close to that presented, but the time scale may be very different depending on the programming language and the configuration of the computer used.
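To estimate the degree empirically, one can time runs for increasing sizes and fit a straight line in log-log coordinates; a minimal sketch, timing a stand-in task since the book's codes are not reproduced here:

    import math
    import random
    import time

    def log_log_slope(algorithm, sizes, repetitions=5):
        """Least-squares slope of (log n, log time): an estimate of the
        degree of polynomial growth of algorithm(n)."""
        xs, ys = [], []
        for n in sizes:
            best = float("inf")
            for _ in range(repetitions):  # keep the best run to damp timing noise
                start = time.perf_counter()
                algorithm(n)
                best = min(best, time.perf_counter() - start)
            xs.append(math.log(n))
            ys.append(math.log(best))
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        return num / sum((x - mx) ** 2 for x in xs)

    # Sorting grows in n log n, so the observed slope should be slightly above 1
    print(log_log_slope(lambda n: sorted(random.random() for _ in range(n)),
                        [10_000, 30_000, 100_000, 300_000]))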

11.3.4.3 Solution Quality Measure

Comparing the quality of solutions produced by fully determined heuristics (with no free parameters) is relatively simple. Each heuristic is run on one or more instance sets with similar characteristics, and the values of the objective function produced are recorded. The most standard measure is certainly the average. However, knowing whether the averages of the values produced by two heuristics differ significantly requires a statistical test. If we can reasonably assume that the values produced are normally distributed, there are standard tests that are relatively simple to implement, such as Student's t-test. If more than two methods are to be compared, a more elaborate test, typically an analysis of variance (ANOVA), must be conducted.
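With hypothetical objective values, these tests take a few lines using a standard statistics library such as SciPy:

    from scipy import stats

    a = [412, 405, 419, 408, 431, 407]  # objective values of heuristic A
    b = [401, 399, 420, 402, 404, 398]  # objective values of heuristic B
    c = [395, 410, 403, 399, 401, 405]  # objective values of heuristic C
    print(stats.ttest_ind(a, b))        # Student's t-test for two methods
    print(stats.f_oneway(a, b, c))      # one-way ANOVA for three methods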
If the values produced do not follow a normal distribution, particular caution should be observed, as a single measure can significantly alter an average. In this case, it is more reliable to use another measure, for example, the median, with Friedman’s analysis of variance. With this test, a rank is assigned for each run of a method and the null hypothesis states that all samples are from a population with the same median.
Bootstrapping is a very general statistical technique that is fairly simple to implement and is particularly suitable for comparing methods for which it is not reasonable to obtain a large number of runs. The estimation of a quantity such as the mean, median, or their confidence interval is done by drawing a large number of samples from the small number of observations made. The quantity to be estimated for each of these samples is calculated, and the mean of the samples provides an estimator of this quantity. To obtain a confidence interval, it is sufficient to identify the quantiles of the resampling distribution.
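A minimal sketch of a bootstrap confidence interval for the median, with illustrative objective values from a dozen runs:

    import random

    def bootstrap_ci(observations, n_samples=10_000, confidence=0.95, seed=0):
        """Resample with replacement, compute the median of each sample, and
        read the confidence interval off the quantiles of the resampling
        distribution (the upper median is used for even sample sizes)."""
        rng = random.Random(seed)
        n = len(observations)
        medians = sorted(
            sorted(rng.choice(observations) for _ in range(n))[n // 2]
            for _ in range(n_samples))
        lo = int((1 - confidence) / 2 * n_samples)
        hi = int((1 + confidence) / 2 * n_samples) - 1
        return medians[lo], medians[hi]

    print(bootstrap_ci([412, 405, 419, 408, 431, 407, 410, 409, 415, 406, 422, 411]))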
When the heuristics are not completely determined, for example, if one wishes to provide a more complete picture of the evolution of the results as a function of the computational effort, the statistical tests mentioned above must be repeated for each computational effort. There are convenient tools to perform this automatically and provide diagrams with confidence intervals.
Figure 11.5 illustrates the evolution of the objective function for the same methods as in Fig. 11.3. As all three methods were run on the same machine, it is possible to compare them on the basis of computational time. This figure gives a double scale for the abscissa: the computational time and the number of calls to the descent to a local optimum. For this diagram, the reference scale is time; the iteration scale refers to the first method, GRASP-PR.
This allows observing an increase in time of a few percent for the execution of the path relinking method, and a decrease for the fast ant system, because the solutions generated with the pheromone trails are closer to local optima, which speeds up the descent method. This diagram presents a significantly different insight into the behavior of these methods. Indeed, a misinterpretation of Fig. 11.3 would suggest that up to 1000 iterations, the FANT method is better than GRASP-PR and that, beyond that, the latter is the best. Repeating the improvement method from random solutions performs much worse.
Figure 11.5 shows that up to about 50 iterations, the 3 methods do not produce solutions of statistically different value. Only from 300 iterations onward can we clearly state that multiple descents are less efficient than GRASP-PR. The curves of the latter two methods cross at several points, but it is not possible to state that one produces solutions with a significantly lower average value than the other for a number of iterations less than 300. For more details on the generation of this type of diagrams, the reader can refer to [3] and to [1] for bootstrapping techniques.
Problems
11.1
Data Structure Options for Multi-objective Optimization
It is argued in Sect. 5.5.3 that using a simple linked list to store the Pareto set may be inefficient. Is the more complicated KD-tree implementation really justified? To answer this question, evaluate the number of solutions produced by the Pareto local search Code 12.8, as well as the number of times a neighbor solution has to be compared to one of the solutions stored in this set. Deleting an element from a KD-tree can also be costly, as a whole subtree has to be examined, which can lead to cascading deletions. With a linked list, deleting a given element is done in constant time. Also assess the extra work involved.
11.2
Comparison of a True Simulated Annealing and a kind of SA with Systematic Neighborhood Evaluation
Compare the simulated annealing Code 7.1 and the noising method Code 7.2 when executed under the following conditions: instance with 50 cities and random distances uniformly generated between 1 and 99 (call to the rand_sym_matrix function); start with a random solution (rand_permutation function); initial temperature: tour length/50; final temperature: tour length/2500; α = 0.999.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
References
1. Efron, B., Tibshirani, R.J.: An Introduction to the Bootstrap. Chapman and Hall (1994)
2. López-Ibáñez, M., Dubois-Lacoste, J., Pérez Cáceres, L., Birattari, M., Stützle, T.: The irace package: Iterated racing for automatic algorithm configuration. Operations Research Perspectives 3, 43–58 (2016)
3. Taillard, É.D.: Tutorial: Few guidelines for analyzing methods. In: Metaheuristic International Conference (MIC'05) Proceedings, Wien, Austria (2005)
5. Wolpert, D.H., Macready, W.G.: No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation 1(1), 67–82 (1997)
Metadata
Title: Heuristics Design
Author: Éric D. Taillard
Copyright year: 2023
DOI: https://doi.org/10.1007/978-3-031-13714-3_11