
5. Local Search


Abstract

Improvement methods constitute the backbone of most metaheuristics. These methods repeatedly perform slight, local modifications on a current solution to the problem. Hence, for any solution, a set of neighbor solutions must be defined. Clearly, the definition of this set depends on the problem modeling. However, a natural neighborhood may turn out to be either too small to lead to quality solutions or too large, inducing prohibitive computation times. Various approaches have been proposed to enlarge the neighborhood, such as the filter and fan method or ejection chains. For reducing the neighborhood size, typical strategies are granular search and candidate lists.
Improvement techniques are a cornerstone in the design of heuristics. As will be seen later, most metaheuristics incorporate a local search.
By examining the solutions produced by greedy constructive heuristics like those presented in the previous chapter, we immediately notice that they are not optimal. For example, the solution of the Euclidean traveling salesman problem obtained by the nearest neighbor heuristic, shown in Fig. 4.3, has intersecting edges, which is obviously suboptimal. Indeed, it is possible to replace the two intersecting edges by two others whose total length is lower while preserving a tour. Replacing two edges by two shorter ones can then be repeated until the solution can no longer be improved by this process, as shown in Fig. 5.1.

5.1 Local Search Framework

The general idea is therefore to start from a solution obtained with a constructive method and to improve it locally. The process is repeated until no further improvement is achieved. This framework is well known in continuous optimization, where gradient methods seek an optimum of a differentiable function. The gradient concept for finding an improving direction does not exist in discrete optimization; it is replaced by the definition of "minor" changes of the solution, called moves, or by the concept of neighborhood.
Algorithm 5.1: General framework of a local improvement method. It assumes that a solution to the problem is available, as well as a method able to generate, from any solution, a number of other ones
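The pseudocode figure is not reproduced here; the following is a minimal Python sketch of this framework, under the assumption of a cost function f to minimize and a generator neighbors(s) enumerating N(s). The names are illustrative, not the book's own.

    # Minimal sketch of the general improvement framework (hypothetical names):
    # repeatedly scan the neighborhood and move to an improving neighbor.
    def local_improvement(s, f, neighbors):
        improved = True
        while improved:
            improved = False
            for t in neighbors(s):          # enumerate N(s)
                if f(t) < f(s):             # improving neighbor found
                    s, improved = t, True
                    break                   # restart from the new solution
        return s                            # s is a local optimum w.r.t. N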
To be able to apply Algorithm 5.1, it is necessary to define how to obtain the neighbor solutions. Formally, a neighborhood set N(s) ⊂ S must be defined for any solution s ∈ S. Therefore, the search for a modification of s implies enumerating the solutions of N(s) to extract one of them, s′, which is better than s.
A convenient way to define a neighborhood N(s) is to specify the modifications, commonly called the moves, that can be applied to the solution s. In the example of Fig. 5.1 for the TSP, a move m can be specified by a pair of cities [i, j]. It consists in replacing the edges \([i, s_i]\) and \([j, s_j]\) by the edges [i, j] and \([s_i, s_j]\), where \(s_i\) and \(s_j\) are, respectively, the cities that follow i and j in the solution s.
This neighborhood, replacing two edges by two others, is known in the literature as 2-exchange or 2-opt [2]. The set M(s) of 2-opt moves that can be applied to the solution s can be formally defined by \(M(s) = \{[i, j] \mid i, j \in s,\ i \neq j,\ j \neq s_i,\ i \neq s_j\}\). Applying a move m ∈ M(s) to the solution s is sometimes noted s ⊕ m.
The definition of the neighborhood follows from the definition of the set of moves: \(N(s) = \{s' \mid s' = s \oplus m,\ m \in M(s)\}\). The size of the 2-opt neighborhood is \(|N(s)| = \Theta(|s|^2)\). The application of an improvement method to the TSP can therefore reasonably be achieved by enumerating the neighbor solutions. This enumeration can follow two policies: either the first improvement or the best improvement.
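As an illustration, here is a hedged sketch of how a 2-opt move and its cost variation might be coded on a tour stored as a Python list, where tour[i+1] plays the role of the successor \(s_i\); the distance matrix d and the function names are assumptions for the example.

    # Hypothetical sketch: applying a 2-opt move on a tour stored as a list.
    def apply_2opt(tour, i, j):
        """Replace edges (tour[i], tour[i+1]) and (tour[j], tour[j+1])
        by (tour[i], tour[j]) and (tour[i+1], tour[j+1]), with i < j."""
        tour[i + 1 : j + 1] = reversed(tour[i + 1 : j + 1])

    def delta_2opt(d, tour, i, j):
        """Cost variation of the move, computed in O(1) from a distance matrix d."""
        n = len(tour)
        a, sa = tour[i], tour[i + 1]
        b, sb = tour[j], tour[(j + 1) % n]
        return d[a][b] + d[sa][sb] - d[a][sa] - d[b][sb]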

5.1.1 First Improvement Heuristic

With this policy, the current solution is changed as soon as an improving move is identified. The neighborhood is therefore not thoroughly examined at each iteration. This aggressive policy allows a solution to be improved quickly. It generally leads to a greater number of changes to the initial solution than the best improvement policy. The framework of the first improvement method is provided by Algorithm 5.2.
Algorithm 5.2: Framework of the first improvement heuristic

5.1.2 Best Improvement Heuristic

With the best improvement policy, the neighborhood is thoroughly examined at each iteration. The best neighbor solution identified becomes the current one for the subsequent iteration. Algorithm 5.3 formalizes this policy.
Algorithm 5.3: Framework of the best improvement method
This policy performs more work between consecutive changes to the solution. The improvements are therefore larger and fewer in number. It is less frequently used in metaheuristics than the first improvement policy. We will see later that taboo search is based on this framework. Indeed, this technique tries to learn how to modify a solution smartly. It is therefore appropriate to examine the neighborhood thoroughly rather than rushing to the first small improvement encountered.
However, for some problems, this policy is worth exploiting. For the quadratic assignment problem, it can be shown that evaluating a single neighbor solution takes a time proportional to \(O(n)\), whereas the whole set of \(O(n^2)\) neighbor solutions can be evaluated in \(O(n^2)\), that is, in amortized constant time per neighbor. With the same computational effort, the best improvement policy therefore evaluates more solutions.
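To make the \(O(n)\) evaluation concrete, here is a hedged sketch for the symmetric quadratic assignment problem (symmetric flow matrix w and distance matrix d, zero diagonals); the names are illustrative.

    # Hedged sketch for a symmetric QAP: the cost variation of swapping the
    # locations of facilities r and s is computable in O(n). Memorizing
    # these deltas between iterations is what brings the re-evaluation of
    # the whole O(n^2) swap neighborhood down to O(n^2).
    def qap_swap_delta(w, d, p, r, s):
        """O(n) cost variation of exchanging p[r] and p[s];
        p[i] is the location assigned to facility i."""
        return 2 * sum((w[r][k] - w[s][k]) * (d[p[s]][p[k]] - d[p[r]][p[k]])
                       for k in range(len(p)) if k not in (r, s))

This value can be checked against a brute-force recomputation of the full objective on small instances.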
Sometimes, the set N(s) is so large that its enumeration is done either implicitly, to extract the best neighbor solution, or heuristically; we will come back to this in Chap. 6, in particular with the large neighborhood search technique of Sect. 6.4.1.
Code 5.1 implements the best improvement framework for the TSP. The algorithm seeks the best replacement of two edges by two others.
Code 5.1 tsp_2opt_best.py Implementation of a best improvement method for the TSP with 2-opt neighborhood
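The printed listing is not reproduced here. The following is a plausible sketch of such a procedure, not the book's exact code, assuming a distance matrix d and a tour given as a list of cities.

    # Hedged reconstruction of a 2-opt best improvement local search.
    def tsp_2opt_best(d, tour):
        n = len(tour)
        while True:
            best_delta, best_move = 0, None
            for i in range(n - 2):                 # scan the whole neighborhood
                for j in range(i + 2, n if i > 0 else n - 1):
                    a, sa = tour[i], tour[i + 1]
                    b, sb = tour[j], tour[(j + 1) % n]
                    delta = d[a][b] + d[sa][sb] - d[a][sa] - d[b][sb]
                    if delta < best_delta:         # strictly improving move
                        best_delta, best_move = delta, (i, j)
            if best_move is None:                  # the tour is 2-optimal
                return tour
            i, j = best_move                       # apply the best move found
            tour[i + 1 : j + 1] = reversed(tour[i + 1 : j + 1])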

5.1.3 Local Optima

By applying Algorithm 5.1 or one of its variants (5.2 or 5.3), a locally optimal solution is obtained with respect to the neighborhood used. Indeed, there is no guarantee that the returned solution is the best one for the given instance. Globally optimal solutions are thus contrasted with those that are only locally optimal. It should be emphasized that a local optimum relative to one neighborhood is not necessarily a local optimum for another neighborhood (see Problem 5.1). A plateau is a set of neighboring solutions all having the same value. Figure 5.2 illustrates the notions of local optimum, plateau, and global optimum.
Objective functions such as \(\min (\max (\dots ))\) generate many plateaus comprising many solutions. Their optimization with a local search is therefore difficult. Indeed, all the solutions of a plateau are local optima.
However, for some problems, this framework provides globally optimal solutions. These include, in particular, the shortest path problem with the Bellman-Ford Algorithm 2.4 and linear programming with the simplex algorithm.
Since the set of solutions to a combinatorial problem is finite and Algorithms 5.2 and 5.3 only modify the solution when it is strictly improved, we deduce that these algorithms terminate in finite time. However, their computation time is not necessarily polynomial, even when the neighborhood size is. In practice, as with the simplex algorithm, such degeneration is not observed.
Figure 5.3 shows the evolution of the average computing time for two methods applied to the TSP (the best improvement and first improvement policies) as a function of the number of cities. The starting solution is randomly generated, and the instances are selected from the 2D Euclidean ones of the TSPLIB [11].

5.1.3.1 TSP 3-Opt

The beginning of this chapter presents a local search for the TSP based on replacing two edges by two others. It is naturally possible to define other neighborhoods, for instance, replacing three edges (or arcs, in the case of an asymmetric problem) by three others. Figure 5.4 illustrates this type of move, called 3-opt.
Code 5.2 tsp_3opt_first.py Implementation of an improvement method based on the 3-opt neighborhood. In practice, its complexity in \(\Omega(n^3)\) is excessive for tackling instances with several hundred cities
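As the listing itself is not reproduced, here is a hedged sketch of a first improvement 3-opt search that preserves the travel direction of the three sub-paths; the representation (tour as a list) and names are assumptions.

    # Hedged sketch of a first improvement 3-opt local search preserving
    # the direction of the three sub-paths (not the book's listing).
    def tsp_3opt_first(d, tour):
        n, improved = len(tour), True
        while improved:
            improved = False
            for i in range(n - 2):
                for j in range(i + 1, n - 1):
                    for k in range(j + 1, n):
                        a, b, c = tour[i], tour[j], tour[k]
                        sa, sb, sc = tour[i+1], tour[j+1], tour[(k+1) % n]
                        # replace (a,sa), (b,sb), (c,sc) by (a,sb), (c,sa), (b,sc)
                        delta = (d[a][sb] + d[c][sa] + d[b][sc]
                                 - d[a][sa] - d[b][sb] - d[c][sc])
                        if delta < 0:
                            # reconnect: segment sb..c now precedes sa..b
                            tour[i+1 : k+1] = tour[j+1 : k+1] + tour[i+1 : j+1]
                            improved = True
        return tour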
An attractive property of this neighborhood is that it does not change the travel direction of the paths between the three nodes whose successors are modified. In the case of symmetric problems, there are several ways to reconnect the three sub-paths (see Problem 5.4).
Representing a solution by a permutation s whose element \(s_i\) indicates the city visited just after city i, a 3-opt move can be implemented in constant time. Checking that a solution is 3-optimal takes \(O(n^3)\). Hence, without neighborhood reduction techniques, only relatively small instances can be managed. Code 5.2 implements a local search for the TSP with 3-opt moves. It applies the first improvement policy.

5.1.3.2 TSP Or-Opt

Another type of neighborhood, proposed by Or [8], consists in modifying the visiting order of a few consecutive cities. The idea is to examine whether it is worthwhile to place three successive cities somewhere else in the current tour.
The originality of the method proposed by Or is to exploit several neighborhoods. Once it is no longer possible to improve the solution by moving three consecutive cities, we try to move only two. As soon as moving two cities improves the solution, we return to the first neighborhood. When the solution is locally optimal with respect to both neighborhoods, we try changing the position of a single city. Figure 5.5 illustrates a possible Or-opt move.
This neighborhood is distinct from a restriction of the 3-opt neighborhood in which one sub-path would be limited to 3, 2, or 1 cities. Indeed, the Or neighborhood also tries reversing the direction of the displaced sub-path. Testing whether a tour is Or-optimal takes \(\Theta(n^2)\) time.

5.1.3.3 Data Structure for TSP 2-Opt

The 2-opt neighborhood reverses the direction of a sub-path. Visually, this seems innocuous for a symmetric instance, but it is not so for the computer representation of a solution. A data structure inspired by the work of [9] enables performing a 2-opt move in constant time. For each city, an array stores both adjacent cities: the components at indices 2i and 2i + 1 provide the two cities adjacent to city i.
An array t with 2n entries represents a solution. Initially, \(t_{2i}/2\) provides the number of the city succeeding i and \((t_{2i+1} - 1)/2\) the number of the city preceding i. A 2-opt move consists in modifying four values of the array t. This can be realized in constant time. Figure 5.6 illustrates the operating principle of this data structure.
Code 5.3 initializes an array t implementing this data structure from a given tour. The latter is provided as the list of the cities successively visited (and not as the successor of each city).
Code 5.4 implements a first improvement heuristic based on this principle. It uses the shift operator i >> 1 to quickly evaluate the expression i/2, which is the number of the city corresponding to index i. The exclusive-or operator i ^ 1 quickly computes the expression i+1-2*(i%2), giving the index used to access the adjacent city.
Code 5.3 build_2opt_data_structure.py Implementation and initialization of the data structure presented in Fig. 5.6. The tour is provided as the sequence of the cities to visit. The data structure allows performing a 2-opt move in constant time
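Since the listing is not reproduced, here is one plausible realization consistent with the description above. The encoding of the four updated entries is our own reconstruction and may differ in detail from the book's code: position 2i encodes "city i, forward direction", position 2i+1 "city i, backward direction", and t[p] is the next position when traversing the tour from position p.

    # Hedged reconstruction of the data structure of Fig. 5.6.
    def build_2opt_data_structure(tour):
        n = len(tour)
        t = [0] * (2 * n)
        for k in range(n):
            i, succ, pred = tour[k], tour[(k + 1) % n], tour[k - 1]
            t[2 * i] = 2 * succ          # t[2i]/2 is the successor of i
            t[2 * i + 1] = 2 * pred + 1  # (t[2i+1]-1)/2 is the predecessor of i
        return t

    def apply_2opt_move(t, i, j):
        """Replace edges (i, s_i) and (j, s_j) by (i, j) and (s_i, s_j),
        assuming j is reached after i when traversing forward. Only four
        entries of t change; the sub-path reversal is implicit."""
        si, sj = t[2 * i] >> 1, t[2 * j] >> 1
        t[2 * i] = 2 * j + 1          # after i (forward) comes j, backward
        t[2 * si + 1] = 2 * sj        # after s_i (backward) comes s_j, forward
        t[2 * j] = 2 * i + 1          # reverse traversal: after j comes i
        t[2 * sj + 1] = 2 * si        # reverse traversal: after s_j comes s_i

    def tour_from(t, city=0):
        """Walk the structure and return the tour as a list of cities."""
        p, tour = 2 * city, []
        while True:
            tour.append(p >> 1)       # p >> 1 is the city number
            p = t[p]
            if p >> 1 == city:
                return tour

    t = build_2opt_data_structure([0, 1, 2, 3, 4])
    apply_2opt_move(t, 0, 2)          # remove (0,1) and (2,3), add (0,2) and (1,3)
    print(tour_from(t))               # [0, 2, 1, 3, 4]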
Code 5.4 tsp_2opt_first.py Implementation of a first improvement heuristic for the TSP based on the 2-opt neighborhood. This implementation exploits the data structure shown in Fig. 5.6 for performing the moves in constant time

5.1.4 Neighborhood Properties

A neighborhood connects various solutions of the problem and can thus be represented by a graph: the vertices are the solutions, and an edge connects two neighboring solutions. The edges can be directed if the moves are not immediately reversible. An efficient neighborhood, or the graph representing it, should possess certain properties.

5.1.4.1 Connectivity

The connectivity property of a neighborhood stipulates that, from any feasible solution, at least one globally optimal solution can be reached. In other words, there must be a path from any vertex to a vertex representing an optimal solution. This path is not necessarily monotonically improving.

5.1.4.2 Low Diameter

A neighborhood is expected to allow discovering an optimal solution in a few steps. By definition, the diameter of a graph is the maximum length of a shortest path connecting two vertices. The graph associated with the neighborhood should therefore have a low diameter.

5.1.4.3 Low Ruggedness

A neighborhood should induce as few local optima as possible and a strong correlation between the values of neighboring solutions. The ideal would be a single local optimum, which would then be the global optimum, reachable from any starting solution. This property is certainly not satisfied for intractable problems with polynomially sized neighborhoods. Nevertheless, finding a neighborhood adequate for the problem under consideration is essential for the success of an improvement method.
For problems like the TSP, many neighborhoods have been devised, some being remarkably effective for obtaining excellent solutions. This can most likely be explained by the visual nature of the problem, which considerably helps us decide what modifications to make to a solution to improve it. For other problems, it is challenging to imagine neighborhoods, and those sometimes lead to "egg carton"-type landscapes, very poorly suited to optimization.
One possibility for smoothing a rugged landscape is to modify the fitness function. The flying elephant method of [13] exploits this trick for the optimization of continuous functions. The |x| terms of the objective function are replaced by \(\sqrt {x^2 + \tau ^2}\) and the terms of the type max(0, x) by \((x + \sqrt {x^2 + \tau ^2})/2\). As the parameter τ tends toward 0, the modified fitness function approaches the original objective. Flying elephants is a metaphor drawing an analogy with a large round object dropped on a rugged field: such an object rolls further, down to a lower altitude, than a small one that stops in the first small basin.
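A small numeric illustration of this smoothing follows; the function names and sample values are ours, not the method's.

    # Smooth surrogates of |x| and max(0, x), as described above.
    import math

    def smooth_abs(x, tau):
        return math.sqrt(x * x + tau * tau)             # surrogate of |x|

    def smooth_relu(x, tau):
        return (x + math.sqrt(x * x + tau * tau)) / 2   # surrogate of max(0, x)

    for tau in (1.0, 0.1, 0.01):
        print(tau, smooth_abs(0.5, tau), smooth_relu(-0.5, tau))
    # As tau -> 0, the values converge to |0.5| = 0.5 and max(0, -0.5) = 0,
    # while the surrogates stay differentiable at the kink.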

5.1.4.4 Small Size

Since improvement methods are based on the systematic evaluation of neighbor solutions, the neighborhood size should not be excessively large. For instance, the 2-opt neighborhood for the TSP is in \(O(n^2)\). This allows addressing problems with several thousand cities, which is not possible with the 3-opt neighborhood, in \(O(n^3)\).

5.1.4.5 Fast Evaluation

Finally, the algorithmic complexity of evaluating the quality of the neighbor solutions should be as low as possible. For an asymmetric traveling salesman problem, the advantage of the small size of the 2-opt neighborhood is negated by the fact that a constant-time evaluation of the cost is no longer possible. In this case, evaluating the larger 3-opt neighborhood could be faster. As shown in Fig. 5.7, part of the tour is reversed by a 2-opt move. Hence, the ruggedness of the 2-opt neighborhood is also higher than that of the 3-opt one for highly asymmetric problems.

5.2 Neighborhood Limitation

Typically, the size of a neighborhood grows quadratically or cubically with the size of the problem instance. As local searches require repeatedly evaluating the entire neighborhood, the computations become prohibitive as the instance size increases. Various techniques have been proposed to limit this computational growth.

5.2.1 Candidate List

A first idea is to hypothesize that a move favorable for a solution will remain good for similar solutions. A general method for limiting the computational effort is first to evaluate all the moves applicable to a given solution. A selection of the best ones is stored in a candidate list. For a number of iterations, only the moves contained in the list are evaluated. Periodically, the whole neighborhood must be evaluated again: the solution is likely to have changed considerably since the candidate list was built, so moves that were unfavorable may have become interesting and vice versa.

5.2.1.1 Candidate List for the Euclidean TSP

In the case of the TSP, the evaluation of a move is independent of the solution. A candidate list of moves therefore does not need to be periodically rebuilt, but a mechanism is required to detect whether a given move is valid; for instance, a move can create two or more sub-cycles. If the TSP cities lie in the Euclidean plane, a Delaunay triangulation can be built in \(O(n \log n)\).
The candidate moves only consider the edges present in the triangulation. It can be proved that a Delaunay triangulation has Θ(n) edges and an average vertex degree not exceeding six. Hence, the size of this limited neighborhood is in Θ(n). Empirical observation reveals that almost all the edges of an optimal tour belong to the Delaunay triangulation. This is illustrated in Fig. 5.8.
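A hedged sketch of this reduction is given below, using the scipy library to build the triangulation; restricting 2-opt moves to pairs of cities joined by these edges is then straightforward. The helper name is ours.

    # Extracting candidate edges from a Delaunay triangulation.
    import numpy as np
    from scipy.spatial import Delaunay

    def delaunay_candidate_edges(points):
        """points: (n, 2) array of city coordinates; returns a set of edges."""
        edges = set()
        for triangle in Delaunay(points).simplices:   # triangles as index triples
            for a in range(3):
                u, v = triangle[a], triangle[(a + 1) % 3]
                edges.add((min(u, v), max(u, v)))
        return edges

    cities = np.random.rand(100, 2)
    print(len(delaunay_candidate_edges(cities)))      # Theta(n) edges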
Unfortunately, this technique solely applies to Euclidean problems, since the construction of a Delaunay triangulation relies on geometric properties. A more general neighborhood limitation technique uses a form of learning with solutions (see Chap. 10) or vocabulary building (see Sect. 8.2).
The idea behind this technique is to generate a number of solutions and to limit the neighborhood to moves involving only elements appearing in these solutions. These solutions need not be of exceptional quality, but they must be structurally diverse while sharing portions with good solutions to the problem. Also, obtaining them should not require an excessive computational effort. Chapter 6 shows how to proceed for large instances.

5.2.1.2 TSP Neighborhood Limitation with 1-Trees

Another general neighborhood limitation technique for the TSP, proposed by Helsgaun [5], uses the concept of 1-tree (see Sect. 3.1.1.2 presenting a Lagrangian relaxation for the TSP). In short, the idea is to compute the extra cost incurred if the edge (i, j) were forced to be part of the 1-tree. The neighborhood reduction proceeds by keeping, for each vertex, a few adjacent edges (typically five) with the lowest extra costs. A beneficial side effect of this edge length modification technique is to lower the ruggedness of the neighborhood.
Its algorithmic complexity depends on the construction of a minimum spanning tree. As seen in Sect. 2.1, this complexity depends on the number of edges. For large instances, it is necessary to first reduce this number, for example, with the tour merging technique presented above. Both neighborhood reduction techniques have similarities with granular search.
Granular search consists in eliminating a priori solutions with certain characteristics. Illustrating this on the vehicle routing problem, one can assume that good solutions will not include a path directly connecting distant customers. For this problem, [12] proposed to ignore the solutions involving trips between two clients whose length exceeds a threshold. This threshold is set to β times the average trip length of a solution obtained with a fast constructive heuristic, where β is a parameter generally slightly greater than 1, called the granularity of the local search. However, the paths between the depot and the customers are retained, whatever their length.
A similar technique has been used extensively for the TSP. Instead of considering a complete graph where each city is connected to all the others, each city is only connected to its p closest neighbors, with p limited to a few dozen. Thus, the size of a 2-opt neighborhood is \(n \cdot p\) instead of \(n^2\). The quality loss of the solutions obtained with such a neighborhood reduction is often negligible.
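Building such candidate lists is simple; the sketch below is an illustrative helper, not the book's code.

    # Building the p-nearest-neighbor candidate lists used to reduce
    # the 2-opt neighborhood.
    import math

    def nearest_neighbor_lists(coords, p):
        """coords: list of (x, y); returns, for each city, its p closest cities."""
        n = len(coords)
        lists = []
        for i in range(n):
            others = sorted((j for j in range(n) if j != i),
                            key=lambda j: math.dist(coords[i], coords[j]))
            lists.append(others[:p])
        return lists
    # A restricted 2-opt scan then only tries moves [i, j] with j in lists[i],
    # giving n * p candidate moves instead of n^2.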
On the other hand, implementing this idea is not trivial. First, the reduced graph where each node is connected to its p nearest neighbors may not be connected. It is therefore necessary to add longer edges so that the graph contains at least one cycle passing through all the nodes. Second, the local search implementation is more complex.
For instance, the data structure shown in Sect. 5.1.3.3 for the 2-opt neighborhood cannot be used directly. Indeed, it is fast to determine the city \(s_i\) succeeding city i and a city j close to i (there are only p candidates), but it is not possible to immediately identify the city \(s_j\) succeeding j when following the tour in the direction i → \(s_i\).
In the same way, for the 3-opt neighborhood, we can quickly detect three cities i, j, and k that are close to each other and candidates for a 3-opt move. But we cannot determine in constant time whether, starting from i, the tour visits city j before city k. Indeed, if the 3-opt move (i, j, k) is feasible for a solution, the move (i, k, j) is not: it creates three sub-tours.

5.3 Neighborhood Extension

There are problems for which it is challenging to imagine reasonable neighborhoods that substantially modify a solution. Hence, the following question arises: how to implement substantial changes to a solution on the basis of a simple and limited neighborhood?
Let us remark that there is no contradiction in both limiting the size of a simple neighborhood with the techniques described above (to eliminate the moves that will never lead to good solutions) and extending this limited neighborhood with the techniques presented in this section.

5.3.1 Filter and Fan

To construct an extended neighborhood \(N_e\) from the definition of a small set M of moves applicable to a solution, we can consider k successive modifications: \(N_e(s) = \{s' \mid s' = s \oplus m_1 \oplus \dots \oplus m_k,\ m_1, \dots, m_k \in M(s)\}\). The size of \(N_e\) increases exponentially with k. To avoid such a growth, we can use the beam search strategy presented in Sect. 4.4.1, adapted to a local search rather than a constructive method.
This technique, proposed by Glover [4], is called the filter and fan strategy. Each level only retains the p best neighbor solutions; some of them may be worse than the starting solution. Their neighbors are then evaluated before repeating the process up to level k. Thus, at each step, up to k modifications are made according to the original neighborhood. It should be noted that the best solution encountered while evaluating the extended neighborhood is not necessarily one of the last level. This process is illustrated in Fig. 5.9.
A variant of the filter and fan search does not retain a static number p of neighbor solutions at each level, but all the improving neighbor solutions. The neighbor ultimately retained at level 1 is then not the one leading to the best solution up to level k, but the one that most opens the fan, that is to say, the solution of level k − 1 having the most improving solutions at level k.
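A generic, hedged sketch of the basic strategy follows; the function names are ours, and the toy problem at the end (minimizing a quadratic over bit flips) is purely illustrative.

    # Filter and fan: at each of the k levels, keep only the p best
    # neighbors of the retained solutions, and return the best solution
    # met anywhere in the fan.
    def filter_and_fan(s, f, moves, apply, p, k):
        best, level = s, [s]
        for _ in range(k):
            fan = [apply(t, m) for t in level for m in moves(t)]
            if not fan:
                break
            fan.sort(key=f)
            level = fan[:p]                      # filter: p best per level
            if f(level[0]) < f(best):            # best solution met in the fan
                best = level[0]
        return best

    # Toy usage: solutions are bit tuples, a move flips one bit.
    f = lambda s: (sum(s) - 3) ** 2
    moves = lambda s: range(len(s))
    apply = lambda s, i: s[:i] + (1 - s[i],) + s[i + 1:]
    print(filter_and_fan((0,) * 8, f, moves, apply, p=4, k=3))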

5.3.2 Ejection Chain

Another way to build a large neighborhood from basic modifications is to pass through infeasible solutions. The name ejection chain was proposed by Glover [3]. It combines and generalizes ideas from different sources, the most famous certainly being the Lin and Kernighan neighborhood for the TSP.
A starting solution is transformed into an object called a reference structure. The latter is not a proper solution, but it can easily be transformed either into other reference structures or into feasible solutions. The starting solution is disrupted by the ejection of one of its components, yielding a reference structure that can in turn be transformed by the ejection of another component. This chain of ejections ends either when a solution better than the starting one has been identified or when all the elements to eject have been tried.
If an improving solution is discovered, the process is reiterated from it. Otherwise, the chain is initiated again by trying to eject another item from the initial solution. The process stops when all possible chain initializations have been tried in vain. To prevent an endless process, it is forbidden either to reinsert in the reference structure an item previously ejected or to propagate the chain by ejecting an element that was just added to the reference structure.

5.3.2.1 Lin-Kernighan Neighborhood

One of the most effective neighborhoods for the TSP is due to Lin and Kernighan [7]. It is based on an ejection chain. The initial tour is transformed into a path by removing an edge [a, b]. This path is then transformed into a reference structure consisting of a path attached to a cycle by adding an edge [b, d]. The removal of the edge [c, d], which belongs to the cycle of the reference structure, transforms the reference structure into another path.
The latter is either transformed into a tour by adding the edge [a, c] or into another reference structure by adding another edge incident to c. This node then plays the role of b in the previous step. Figure 5.10 illustrates the operating principle of this process.
The Lin and Kernighan local search is presented in Algorithm 5.4.
Algorithm 5.4: Ejection chain (Lin and Kernighan) for the TSP
The ejection chain mechanism may seem artificial. However, it achieves relatively complex modifications and can improve solutions that are already fairly good, as illustrated in Fig. 5.11.
A basic Lin and Kernighan implementation is given in Code 12.3. Much more elaborate, highly efficient implementations are due to [1] and [6].

5.4 Using Several Neighborhoods or Models

A local optimum is relative to a given neighborhood structure. Hence, it is possible to use multiple neighborhoods simultaneously. For instance, a 2-optimal TSP tour is not necessarily 3-optimal.
Once a 2-optimal solution has been found, it can potentially be improved with a method using a 3-opt neighborhood (see, for instance, Fig. 9.4). Similarly, a 3-optimal solution is not necessarily 2-optimal. We can therefore alternate the improvement methods as long as the solution found is not a local optimum with respect to all the neighborhood structures considered. The reader can verify this fact by running Code 12.7.
Finally, let us mention that one can switch from one model of the problem to another. To the extent that the neighborhood structures differ for the various models, it is equally possible to alternate improvement methods using different models. This technique may be inapplicable as is, since a feasible solution for one model can be infeasible for another. In this case, repair methods should be provided when changing models. This implies the process is no longer a strict improvement method. A corollary is that the search could enter an infinite cycle: repairing a solution obtained with a first model and then improving it with a second model can cancel out the improvements obtained with the first one.

5.5 Multi-Objective Local Search

Various relatively simple local search techniques have been proposed for multi-objective optimization. We examine two approaches that are not difficult to implement and produce good solutions.

5.5.1 Scalarizing

A technique mentioned in Sect. 3.2.1 for multi-objective optimization is scalarizing. It aggregates the objectives by associating a weight \(w_i\) with the ith one. By providing a vector of weights and transmitting the resulting scalar function to a local search, we can thus obtain an approximation of a supported solution. By varying the weight vector, we can discover further such approximations. However, this technique produces at best one solution approximating the Pareto set for each local search run.
Algorithm 5.5: Framework of an improvement method for multi-objective optimization. The user must provide a parameter \(I_{max}\) giving the number of scalarizations. Here, the weights are simply randomly generated
Without much more programming effort, it is possible to obtain a better approximation of the Pareto set by transmitting all the objectives to the local search. Hence, we can check, for each neighbor solution, whether it improves the Pareto set approximation. An elementary implementation of this principle is presented in Algorithm 5.5. Figure 5.12 illustrates the behavior of Algorithm 5.5 for three different scalarizations.
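A hedged sketch in the spirit of Algorithm 5.5 follows; the names (objectives, neighbors, i_max) are illustrative, and a plain list is used for the front, a choice revisited in Sect. 5.5.3.

    # Scalarized local searches sharing one Pareto front approximation:
    # every neighbor examined is checked against the front.
    import random

    def dominates(u, v):                 # minimization on every objective
        return all(a <= b for a, b in zip(u, v)) and u != v

    def update_front(front, z):
        """Insert objective vector z unless a member dominates it."""
        if any(dominates(f, z) for f in front):
            return front
        return [f for f in front if not dominates(z, f)] + [z]

    def mo_local_search(s, objectives, neighbors, i_max):
        front = update_front([], objectives(s))
        for _ in range(i_max):           # i_max random scalarizations
            w = [random.random() for _ in objectives(s)]
            improved = True
            while improved:
                improved = False
                for t in neighbors(s):
                    z = objectives(t)
                    front = update_front(front, z)   # check every neighbor
                    if sum(wi * zi for wi, zi in zip(w, z)) < \
                       sum(wi * zi for wi, zi in zip(w, objectives(s))):
                        s, improved = t, True
                        break
        return front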
5.5.2 Pareto Local Search

Algorithm 5.6: Framework of the Pareto local search for multi-objective optimization. The interest of the method is that it requires no parameter
An alternative approach to scalarizing is the Pareto local search [10]. The idea is to start with any solution to the problem, which constitutes the first, poor-quality estimate of the Pareto set. While the estimated Pareto set is not stabilized, all the neighbor solutions of the estimated Pareto set are generated and used to update it.
Recursive procedures allow expressing the method very concisely, as shown by Algorithm 5.6. Code 5.5 implements a Pareto local search for the TSP.
However, the method can be sped up by not starting with an arbitrary solution, but by calling it several times with good solutions obtained by scalarizing the objectives. Indeed, very effective methods usually exist for solving the mono-objective versions. This allows us to immediately obtain supported solutions.
For instance, polynomial algorithms exist for the linear assignment problem, while the multi-objective version is NP-hard. Starting the Pareto local search with a reliable estimate of the Pareto set limits the number of updates, and the procedure stops faster. This also avoids too deep recursive calls, which could overflow the recursion stack.
Code 5.5 tsp_3opt_pareto.py Implementation of the Pareto local search for the TSP, based on 3-opt moves. This function calls another one (Code 12.6) that updates the Pareto set with each neighbor of the provided solution. The data structure used to store the Pareto set is discussed in Sect. 5.5.3

5.5.3 Data Structures for Multi-Objective Optimization

Both techniques presented above for multi-objective optimization can be relatively inefficient if no adequate data structure is used to store the Pareto set. Indeed, evaluating the objectives of a neighbor solution often takes constant time, that is, a few nanoseconds on current computers. Checking whether a solution is dominated can consume much more time than computing its objectives. Using a simple list to store the p solutions of the Pareto set may slow down the search by a factor proportional to p.
It is not uncommon for the Pareto set to contain thousands of solutions. Hence, an appropriate data structure for testing the dominance of a solution should not require a computational time growing linearly with the Pareto set size.

5.5.3.1 Array

Assuming the K objectives take a limited number of different integer values, a (K − 1)-dimensional array is a simple and extremely efficient data structure for storing the Pareto set.
The size of this array in each dimension is given by the number of distinct values the corresponding objective can take. A cell of this array stores the best value found for the Kth objective. For a bi-objective problem, we have a one-dimensional array. For instance, if we know that the first objective can vary from 2 to 13 and we have identified the solutions with objectives (2, 27), (4, 24), (6, 23), (7, 21), and (11, 17), the array contains [27, 27, 24, 24, 23, 21, 21, 21, 21, 17, 17, 17]. After the discovery of a solution (5, 20), the array is updated to [27, 27, 24, 20, 20, 20, 20, 20, 20, 17, 17, 17].
This data structure is limiting, because the objectives are not necessarily integers or may take an unreasonably large number of different values. However, when this data structure is usable in practice, it is incomparably fast, since the domination status of a solution is known in constant time.
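A hedged sketch of the bi-objective version described above follows (both objectives minimized, the first taking integer values in [lo, hi]); the class and method names are ours.

    # Array structure for a bi-objective Pareto set with O(1) domination test.
    class ParetoArray:
        def __init__(self, lo, hi):
            self.lo = lo
            self.best = [float("inf")] * (hi - lo + 1)  # best 2nd objective

        def dominated(self, z1, z2):        # O(1) domination test
            return self.best[z1 - self.lo] <= z2

        def insert(self, z1, z2):
            if self.dominated(z1, z2):
                return False
            for k in range(z1 - self.lo, len(self.best)):
                if self.best[k] <= z2:      # stop at a better or equal value
                    break
                self.best[k] = z2           # new best from this z1 value on
            return True

    # Reproducing the example from the text:
    pa = ParetoArray(2, 13)
    for z in [(2, 27), (4, 24), (6, 23), (7, 21), (11, 17), (5, 20)]:
        pa.insert(*z)
    print(pa.best)   # [27, 27, 24, 20, 20, 20, 20, 20, 20, 17, 17, 17]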

5.5.3.2 KD-Tree

In the general case, a data structure whose query time grows slowly with the number p of stored elements is the KD-tree. It is a binary search tree in which a node at depth d discriminates the other elements of the tree on dimension \(d \bmod K\). Code 12.4 presents the basic procedures for including a new element in a KD-tree and inspecting all the elements stored in the tree.
Removing a given node from a KD-tree is a tricky procedure to program. Figure 5.13 gives an example of updating a KD-tree after the discovery of a new efficient solution. Unlike in a one-dimensional binary search tree, the rightmost node of the left subtree or the leftmost node of the right subtree cannot, in general, replace the removed node. Indeed, a KD-tree discriminates on a different dimension at each level. So, a walk through both subtrees is required to find the replacing node. The latter is itself recursively replaced, until reaching a leaf, which is simply eliminated.
Code 12.5 implements a removal procedure for a given node of a KD-tree. Finally, to use a KD-tree to update a Pareto set, a function must find a point of the tree that dominates a tentative solution, if any. For this, it is necessary to check whether the KD-tree possesses a point lying between the ideal point and the tentative solution. The ideal point is the one whose value on each dimension is the optimum of the corresponding objective, taken separately.
In practice, the ideal point is unknown, but a coarse approximation suffices, for instance, \((-\infty, \dots, -\infty)\) if all the objectives must be minimized. If the KD-tree contains points in the hyper-rectangle delimited by \((-\infty, \dots, -\infty)\) and the tentative solution, then these points dominate the tentative solution, which is thus discarded. Otherwise, the tentative solution may dominate others, which must be removed from the KD-tree. The tentative solution is then added to the KD-tree. Code 12.6 updates a Pareto set in this way when attempting to include a new point.
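The sketch below is a minimal, hedged illustration of this dominance query (minimization on all K objectives): insertion into a KD-tree and a search for a stored point lying in the box between \((-\infty, \dots, -\infty)\) and the candidate. It is not the book's Codes 12.4 to 12.6; removal is omitted.

    # Minimal KD-tree sketch for dominance queries.
    class Node:
        def __init__(self, point):
            self.point, self.left, self.right = point, None, None

    def insert(node, point, depth=0):
        if node is None:
            return Node(point)
        k = depth % len(point)                    # discriminating dimension
        if point[k] < node.point[k]:
            node.left = insert(node.left, point, depth + 1)
        else:
            node.right = insert(node.right, point, depth + 1)
        return node

    def find_dominating(node, cand, depth=0):
        """Return a stored point dominating cand, or None."""
        if node is None:
            return None
        if all(a <= b for a, b in zip(node.point, cand)):
            return node.point
        k = depth % len(cand)
        found = find_dominating(node.left, cand, depth + 1)
        if found is None and node.point[k] <= cand[k]:
            # the right subtree may still hold points below cand on dim k
            found = find_dominating(node.right, cand, depth + 1)
        return found

    root = None
    for p in [(2, 27), (4, 24), (6, 23), (7, 21), (11, 17)]:
        root = insert(root, p)
    print(find_dominating(root, (6, 25)))   # (4, 24) dominates (6, 25)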
Problems
5.1
Local Minima
Figure 5.2 shows the local optima of a function of a discrete variable x relative to a neighborhood consisting in changing x by one unit. On this figure, locate the local optima relative to an asymmetric neighborhood consisting in either adding 4 to x or subtracting 3 from x. Does this neighborhood have the connectivity property?
5.2
Minimizing an Explicit Function
An integer function of integer variables, \(f: [-7, 6] \times [-6, 7] \to [-10, 650]\), is explicitly given in Table 5.1. We seek the minimum of this function by applying a local search with the first improvement policy, starting from the solution (6, 7) of value 650 and from the solution (6, −6) of value 510. It is assumed that the moves consist in modifying the value of one variable by one unit. The moves are checked in the order (+1, 0), (0, +1), (−1, 0), and (0, −1). Next, apply a local search with the best improvement policy, starting from the solution (−7, 7) of value 248 and from the solution (−7, −6) of value 92.
Table 5.1
Integer function f(x, y) explicitly given

 y\x    −7   −6   −5   −4   −3   −2   −1    0    1    2    3    4    5    6
   7   248  216  210  222  230  234  256  304  336  372  428  495  585  650
   6   193  175  157  166  174  181  215  249  295  329  382  454  539  597
   5   138  144  126  116  124  150  184  194  250  305  361  425  480  566
   4   123   89   85   97  105  109  129  179  209  246  302  368  458  525
   3    92   58   70   70   78   94   98  148  168  223  282  339  413  510
   2    68   34   46   46   54   70   74  124  144  199  258  315  388  486
   1    51   17   14   25   33   38   57  107  136  174  230  296  386  454
   0    18   25    5   −4    3   29   65   74  131  185  240  305  361  445
  −1    27    6  −10    0    8   13   46   83  126  160  213  284  371  429
  −2    33    0   −3    8   15   20   39   89  118  156  212  278  368  436
  −3    33   12   −4    6   14   19   52   89  132  166  219  290  377  435
  −4    30   37   17    7   15   41   77   86  143  197  252  317  373  457
  −5    69   35   32   43   51   56   75  125  154  192  248  314  404  472
  −6    92   58   70   70   78   94   98  148  168  223  282  339  412  510
5.3
2-Opt and 3-Opt Neighborhood Properties
Show the following properties of the 2-opt and 3-opt neighborhoods:
  • The inversion of two cities or a 3-opt move can be obtained by a succession of 2-opt moves.
  • The 2-opt and 3-opt neighborhoods have the connectivity property.
Provide an upper bound on the diameter of these neighborhoods.
5.4
3-Opt for Symmetric TSP
Section 5.1.3.1, introducing the 3-opt neighborhood, shows one possibility for replacing three arcs with three other arcs. This possibility respects the direction of travel of the three sub-paths located between the modified arcs. In the case of a symmetric problem where one accepts changing the direction of travel of some sub-paths, how many possibilities are there to replace three edges with three other edges while keeping an admissible tour?
5.5
4- and 5-Opt
For an asymmetric TSP, how many ways are there to replace four arcs with four other arcs while maintaining the direction of travel of the sub-paths? Same question for replacing five arcs.
5.6
Comparing 2-Opt Best and First
How many moves must be tested to prove that a TSP tour with n cities is 2-optimal? Empirically evaluate the number of repetitions of the external loop. Provide this number as a function of the problem size for both procedures tsp_2opt_first and tsp_2opt_best. Analyze the difference when the procedures start either from the nearest neighbor solution or from a random one. Consider examples of Euclidean problems, randomly and uniformly generated in a square. Explain the results.
5.7
3-Opt Candidate List
To limit the size of a TSP neighborhood, only the 40 shortest arcs incident to each city are considered. With such a limitation, what is the complexity of verifying whether a solution is 3-optimal? Is a special data structure required to achieve this minimal computational complexity? Is the neighborhood generated by such a limitation connected?
5.8
VRP Neighborhoods
Suggest four different neighborhoods for the vehicle routing problem. Give the size of these neighborhoods as a function of the number n of customers and the number m of tours. Specify whether these neighborhoods have the connectivity property, depending on the problem modeling.
5.9
Steiner Tree Neighborhood
Two solution modelings have been proposed in Sect. 2.1.2 for the Steiner tree problem. Suggest neighborhoods adapted to each of these modelings.
5.10
Ejection Chain for the VRP
Propose a technique based on ejection chains for the vehicle routing problem. Specify how to initialize the chain, how to propagate it, and how to stop it. Estimate the computational complexity of evaluating an ejection chain for a solution with n customers and m tours.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
References
4. Glover, F.: A template for scatter search and path relinking. In: Hao, J.K., Lutton, E., Ronald, E., Schoenauer, M., Snyers, D. (eds.) Artificial Evolution, Lecture Notes in Computer Science, vol. 1363, pp. 13–54 (1998). https://doi.org/10.1007/BFb0026589
6. Helsgaun, K.: Using POPMUSIC for candidate set generation in the Lin-Kernighan-Helsgaun TSP solver. Department of Computer Science, Roskilde University, Denmark (2018)
8. Or, I.: Traveling salesman-type combinatorial problems and their relation to the logistics of regional blood banking. Ph.D. Thesis, Northwestern University (1976)
10. Paquete, L., Chiarandini, M., Stützle, T.: Pareto local optimum sets in the biobjective traveling salesman problem: An experimental study. In: Gandibleux, X., Sevaux, M., Sörensen, K., T'kindt, V. (eds.) Multiobjective Optimisation. Lecture Notes in Economics and Mathematical Systems, vol. 535, pp. 177–200. Springer, Berlin (2004). https://doi.org/10.1007/978-3-642-17144-4_7