The Traveling Salesman Problem in Theory & Practice

The Traveling Salesman Problem in Theory & Practice Lecture 10: The Cutting Plane and Branch-and-Bound Approaches to TSP Optimization 1 April 2014 David S. Johnson dstiflerj@gmail.com http://davidsjohnson.net Seeley Mudd 523, Tuesdays and Fridays

Outline • Cutting Planes • Branch and Bound • Student Presentation by Arka Bhattacharya, on “An O(log n/log log n) randomized approximation algorithm for the ATSP,” by Asadpour, Goemans, Madry, Gharan, and Saberi, SODA 2010.

Optimization, The Early Days [Applegate, Bixby, Chvátal, & Cook, The Traveling Salesman Problem, 2006] In the 1940s and 1950s, even 10-city instances were state-of-the-art. • Computers were much slower than today, if you had access to one at all, and compiler optimization was almost non-existent. • 9!/2 = 181,440 tours was just too daunting, so even pruned exhaustive search over permutations seemed infeasible. The first breakthrough was [Dantzig, Fulkerson, & Johnson, “Solution of a large-scale traveling-salesman problem”, Operations Research 2 (1954), 393-410], which solved a 49-city problem, generated by picking one city from each of the then 48 U.S. states, plus Washington DC, and using the roadmap distances between them. (Actually, they deleted 7 East-coast cities, solved the resulting 42-city instance, and, luckily for them, the optimal tour contained a link from Washington DC to Boston, and the shortest path between those two cities contained all the deleted cities.)

The Cutting Plane Approach Solve the edge-based integer programming formulation for the TSP, as follows: • Start by solving a weak linear programming relaxation. • While the LP solution is not a tour, • Identify a valid inequality that holds for all tours but not for the current solution (a “cutting plane,” or “cut” for short). • Add it to the formulation and re-solve. Dantzig, Fulkerson, and Johnson started with: Minimize $\sum_e d_e x_e$, subject to $\sum_{e \ni c} x_e = 2$ for all cities $c$, where $d_e$ is the length of edge $e$. Note: This can be solved in $O(N^3)$ time by b-matching techniques. Dantzig et al. solved it using the Simplex algorithm. By hand.

The Cutting Plane Approach, Illustrated [Figure: points in $\mathbb{R}^{N(N-1)/2}$ corresponding to tours, with a hyperplane perpendicular to the vector of edge lengths touching the point for the optimal tour.]

The optimal tour is a point on the convex hull of all tours. [Figure label: a facet of the hull.] Note: Subtour constraints are all facets. Unfortunately, the LP relaxation of the TSP, especially when we omit the subtour constraints, can be a very poor approximation to the convex hull of tours. To improve it, add more constraints (“cuts”).

Digression: Solving the Degree-2 Problem in $O(N^3)$ Time [Figure: a graphical TSP instance with edge weights $w_1$, $w_2$, $w_3$, $w_4$.] Replace each city by a pair of cities.

First Idea Replace each edge by four edges, one joining each copy of one endpoint to each copy of the other, all with the original weight. Compute a standard weighted matching. Doesn’t work: the matching can contain multiple copies of the same original edge.

Second Idea Replace each edge by a gadget [Figure: gadget for an edge of weight $w_1$], with all new edges having weight zero except the middle one, which gets the weight of the original edge. If you don’t take the weighted link, then you must take its neighbors. Hence you cannot cover any representative of the original edge’s endpoints with this gadget. If you do take the weighted link, then you must take edges covering representatives of both of the original edge’s endpoints.

Compute a standard weighted matching. The resulting matching corresponds to a set of edges in our original graph that makes every city have degree 2.

Finding Violated Constraints First Choice: Subtour-elimination (subtour) constraints: $\sum_{e=\{u,v\}:\, u \in S,\, v \notin S} x_e \ge 2$ for every nonempty proper subset $S$ of the cities. How to find a violated one: Consider the graph consisting of the edges $e$ with $x_e > 0$. Determine its connected components (takes linear time). If the graph is not connected, we get a violated constraint for each connected component. This is unfortunately not enough: the graph containing all the non-zero edges can be highly connected and still violate a subtour constraint. Dantzig et al. found violated subtour constraints by hand. We can do it in $O(N^4)$ time:

Finding Violated Subtour Constraints • Let $G$ be a graph with the set $C$ of all cities as vertices, and an edge joining cities $a$ and $b$ if and only if $x_{\{a,b\}} > 0$, that edge having weight equal to $x_{\{a,b\}}$. • Pick a city $c$. • If there is a violated subtour constraint, $c$ will be on one side of it and some other city $c'$ will be on the other. • Thus there must be some set $S$ of cities containing $c$ but not $c'$, such that the total weight of edges linking $S$ and $C-S$ is less than 2. • But by the maxflow-mincut theorem, this means that the maximum flow from $c$ to $c'$, where an edge’s capacity equals its weight, will be less than 2. • Such a maxflow problem can be solved in time $O(N^3)$. • To find a violated subtour constraint (if any exists) we need only solve such problems for the fixed city $c$ and all $c' \in C-\{c\}$, that is, $N-1$ maxflow problems, for a total of $O(N^4)$ time.
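The separation procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in any real TSP code: it uses a plain augmenting-path (Edmonds-Karp) maxflow on a dense capacity matrix rather than an $O(N^3)$ algorithm, and the names `min_cut`, `find_violated_subtour`, and the 6-city example are assumptions made for this sketch. Cities are labelled 0..n-1 and `x` maps each edge `frozenset({u, v})` to its LP value.

```python
from collections import deque

def min_cut(n, x, s, t):
    """Return (value, S): the max-flow value from s to t and the source side S
    of a minimum cut, using edge values x as undirected capacities."""
    cap = [[0.0] * n for _ in range(n)]
    for e, val in x.items():
        u, v = tuple(e)
        cap[u][v] += val
        cap[v][u] += val  # undirected edge: same capacity in both directions
    flow = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in range(n):
                if v not in parent and cap[u][v] > 1e-9:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No augmenting path: the cities reached from s form the cut side S.
            return flow, set(parent)
        # Find the bottleneck capacity along the path, then push that much flow.
        push, v = float("inf"), t
        while parent[v] is not None:
            push = min(push, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= push
            cap[v][u] += push
            v = u
        flow += push

def find_violated_subtour(n, x, c=0, eps=1e-9):
    """Return a set S with cut weight < 2 (a violated subtour constraint), or None."""
    for c2 in range(n):
        if c2 == c:
            continue
        value, S = min_cut(n, x, c, c2)
        if value < 2 - eps:
            return S
    return None

# Hypothetical fractional solution: every city has degree 2 and the support
# graph is connected, yet only weight 1 crosses between {0,1,2} and {3,4,5}.
x = {frozenset(e): w for e, w in [
    ((0, 1), 1.0), ((0, 2), 1.0), ((1, 2), 0.5),
    ((3, 4), 1.0), ((3, 5), 1.0), ((4, 5), 0.5),
    ((1, 4), 0.5), ((2, 5), 0.5)]}
S = find_violated_subtour(6, x)  # S == {0, 1, 2}
```

A disconnected support graph is handled by the same routine, since the maxflow between cities in different components is 0.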

Digression: Computing the HK bound in Polynomial Time (by Linear Programming) • Input data: • Number $N$ of cities. • Edge lengths $d_{\{a,b\}}$ for each pair of cities. • Selected city $c_0$. • Variables: • $x_{\{a,b\}}$ for each pair $\{a,b\}$ of cities (from the LP relaxation of the TSP). • $f(a,b,c')$ for each ordered pair $(a,b)$ of distinct cities and each city $c' \ne c_0$. • Goal: Minimize $\sum_{\{a,b\}} d_{\{a,b\}} x_{\{a,b\}}$ • Subject to the following constraints.

Constraints of the HK Linear Program • Variable range constraints: $0 \le z \le 1$, for all variables $z$. • Capacity constraints: $f(a,b,c') \le x_{\{a,b\}}$, for all triples $(a,b,c')$. • Flow conservation constraints: $\sum_b f(b,a,c') = \sum_b f(a,b,c')$, for all $c' \ne c_0$ and all $a \notin \{c_0,c'\}$. • MinCut constraints: $\sum_b f(b,c',c') \ge 2$, for all $c' \ne c_0$. $O(N^3)$ total variables and constraints.
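As a rough check on the $O(N^3)$ figure, the variables and constraints listed on this slide can be counted directly. The range constraints are left out of the count, since they add only two bounds per variable.

```latex
\begin{align*}
\#\,\text{variables} &= \binom{N}{2} + N(N-1)^2
  && \text{($x$ variables plus flow variables $f(a,b,c')$)}\\
\#\,\text{capacity constraints} &= N(N-1)^2
  && \text{(one per triple $(a,b,c')$)}\\
\#\,\text{flow conservation constraints} &= (N-1)(N-2)
  && \text{(one per $c' \ne c_0$ and $a \notin \{c_0,c'\}$)}\\
\#\,\text{MinCut constraints} &= N-1
  && \text{(one per $c' \ne c_0$)}
\end{align*}
```

The dominant terms are the $N(N-1)^2$ flow variables and capacity constraints, which gives the cubic total.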

Back to [DFJ54]: More Cutting Planes Needed After reaching a solution that satisfied all degree-2 and subtour constraints, there were still fractional variable values. [Figure from [ABCC06]: solid edges have $x_e = 1$, dashed edges have $x_e = \tfrac{1}{2}$.]

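Another Facet Class: Comb Inequalities [Figure: a comb with handle $H$ and teeth $T_1, T_2, T_3, \ldots, T_{t-1}, T_t$.] The teeth $T_i$ are disjoint, $t \ge 3$ is odd, and every tooth contains at least one city in $H$ and at least one city not in $H$.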

For $Y$ the handle or a tooth, let $\delta_x(Y)$ be the total value of the edge variables $x_e$ for edges with one endpoint in $Y$ and one outside. • By the subtour inequalities, we must have $\delta_x(Y) \ge 2$ for each such $Y$. • When $x$ corresponds to a tour, $\delta_x(Y)$ must also be even, which is exploited to prove the following comb inequality, which holds whenever $x$ corresponds to a tour: $\delta_x(H) + \sum_{i=1}^{t} \delta_x(T_i) \ge 3t+1$.

Theorem: If $x$ is a 0-1 vector of length $n(n-1)/2$ corresponding to a tour (with $x[i] = 1$ if and only if edge $e_i$ is in the tour) and the sets $H$ and $T_i$ are as described, then $\delta_x(H) + \sum_{i=1}^{t} \delta_x(T_i) \ge 3t+1$. Proof: If $\delta_x(T_i) = 2$, then the tour corresponding to $x$ must contain a Hamilton path through the cities in $T_i$. That path must have at least one edge between a city in $H$ and one not in $H$, and this edge is counted in $\delta_x(H)$. Because the teeth are disjoint, different teeth contribute different edges, so each such tooth accounts for at least 3 in the sum: 2 from $\delta_x(T_i)$ and 1 from $\delta_x(H)$. If $\delta_x(T_i) > 2$, then, since $\delta_x(T_i)$ is even, we must have $\delta_x(T_i) \ge 4$. Suppose that the number of teeth with the latter property is $k$. Then our sum is at least $3(t-k) + 4k = 3t + k$, which is at least $3t + 1$ if $k > 0$. If $k = 0$, then $\delta_x(H) \ge t$. Since $t$ is odd and $\delta_x(H)$ is even, this means $\delta_x(H) \ge t+1$, so the sum is at least $2t + (t+1)$. In both cases we have that our overall sum is at least $3t+1$. QED
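To make the quantities in the theorem concrete, here is a small Python sketch that evaluates $\delta_x$ and the comb inequality. The function names and the 6-city example (two triangles joined by three edges) are assumptions chosen for illustration; the example is not the [DFJ54] instance.

```python
def delta(x, Y):
    """Total value of the edges with exactly one endpoint in the city set Y."""
    return sum(val for e, val in x.items() if len(e & Y) == 1)

def comb_slack(x, handle, teeth):
    """Left side of the comb inequality minus 3t + 1; negative means violated."""
    t = len(teeth)
    lhs = delta(x, handle) + sum(delta(x, T) for T in teeth)
    return lhs - (3 * t + 1)

H = frozenset({1, 2, 3})
teeth = [frozenset({1, 4}), frozenset({2, 5}), frozenset({3, 6})]

# Fractional point: triangle edges at 1/2, the three connecting edges at 1.
# It satisfies every degree-2 constraint and the subtour constraint for H.
frac = {frozenset(e): 0.5 for e in [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6)]}
frac.update({frozenset(e): 1.0 for e in [(1, 4), (2, 5), (3, 6)]})

# A genuine tour 1-2-3-6-5-4-1 on the same cities.
tour = {frozenset(e): 1.0 for e in [(1, 2), (2, 3), (3, 6), (6, 5), (5, 4), (4, 1)]}

print(comb_slack(frac, H, teeth))  # -1.0: the comb inequality is violated (9 < 10)
print(comb_slack(tour, H, teeth))  # 0.0: the tour satisfies it with equality
```

In the tour, tooth {2, 5} has $\delta_x = 4$, which is the $k > 0$ case of the proof.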

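Back to [DFJ54]: Are there any violated combs? [Figure from [ABCC06]: a comb with handle $H$ and teeth $T_1$, $T_2$, $T_3$ in the current LP solution.] $\delta_x(H) = 1 + 1 + 1 = 3$, $\delta_x(T_1) = 1 + \tfrac{1}{2} + \tfrac{1}{2} = 2$, $\delta_x(T_2) = \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} = 2$, $\delta_x(T_3) = 1 + \tfrac{1}{2} + \tfrac{1}{2} = 2$, so with $t = 3$ we get $\delta_x(H) + \sum_{i=1}^{t} \delta_x(T_i) = 9 < 3t+1 = 10$.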

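New LP Solution Are there any more violated combs? [Figure from [ABCC06]: a comb with handle $H$ and teeth $T_1$, $T_2$, $T_3$ in the new LP solution.] $\delta_x(H) = 1 + 1 + \tfrac{1}{2} + \tfrac{1}{2} = 3$, $\delta_x(T_1) = 1 + \tfrac{1}{2} + \tfrac{1}{2} = 2$, $\delta_x(T_2) = \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} = 2$, $\delta_x(T_3) = \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} = 2$, so with $t = 3$ we get $\delta_x(H) + \sum_{i=1}^{t} \delta_x(T_i) = 9 < 3t+1 = 10$. With this extra constraint, the LP yields an optimal tour.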

Finding Violated Combs No general polynomial-time algorithm is known. Not known to be NP-hard either. Many fast heuristics (see [ABCC06] for details and references). Some polynomial-time solvable special cases: • Combs in which all teeth contain exactly two cities. • These are called “blossom inequalities.” • Takes time $O(N^3)$. • See [Padberg & Rao, “Odd minimum cuts and b-matchings,” Math. of O.R. 7 (1982), 67-80]. • Combs with $t$ teeth for fixed $t$. • See [R. Carr, “Separating clique trees and bipartition inequalities having a fixed number of handles and teeth in polynomial time,” Math. of O.R. 22 (1997), 257-265]. • Takes time $O(N^{2t})$, which is not practical even for combs with 3 teeth (and even this assumes that your value of $x$ already satisfies all the subtour constraints).

Generalizations of Combs I: The Clique Tree • A non-empty collection of pairwise-disjoint “handles” $H_1, H_2, \ldots, H_h$. • A set of at least three pairwise-disjoint “teeth” $T_1, T_2, \ldots, T_t$. • These sets satisfy: • Each tooth contains at least one city that is not in any handle. • Each handle intersects an odd number of teeth, with that number being at least 3. • The graph $G$, with a vertex for each handle and each tooth, and an edge between two vertices if the corresponding sets intersect, is a tree. Clique Tree Inequality: $\sum_{i=1}^{h} \delta_x(H_i) + \sum_{j=1}^{t} \delta_x(T_j) \ge 2h + 3t - 1$. These inequalities all correspond to TSP facets [Grötschel & Pulleyblank, 1986]. A violated one can be found in polynomial time for fixed $h$ and $t$ [Carr, 1997].
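The structural conditions in the definition above are easy to check mechanically. The following Python sketch does so for given handle and tooth sets; the function names and the example sets are assumptions made for illustration. It only validates the structure and evaluates the right-hand side, and does not search for a violated inequality.

```python
def is_clique_tree(handles, teeth):
    """Check the clique-tree conditions for lists of city sets."""
    handles = [frozenset(H) for H in handles]
    teeth = [frozenset(T) for T in teeth]
    h, t = len(handles), len(teeth)
    if h < 1 or t < 3:
        return False

    def pairwise_disjoint(sets):
        return all(not (a & b) for i, a in enumerate(sets) for b in sets[i + 1:])

    if not (pairwise_disjoint(handles) and pairwise_disjoint(teeth)):
        return False
    # Each tooth needs a city that lies in no handle.
    in_some_handle = frozenset().union(*handles)
    if any(T <= in_some_handle for T in teeth):
        return False
    # Intersection graph: handles and teeth are each pairwise disjoint,
    # so every edge joins a handle to a tooth.
    edges = [(i, j) for i, H in enumerate(handles)
             for j, T in enumerate(teeth) if H & T]
    for i in range(h):
        k = sum(1 for a, _ in edges if a == i)
        if k < 3 or k % 2 == 0:
            return False
    # A graph on h + t vertices is a tree iff it is connected with h + t - 1 edges.
    if len(edges) != h + t - 1:
        return False
    adj = {("H", i): [] for i in range(h)}
    adj.update({("T", j): [] for j in range(t)})
    for i, j in edges:
        adj[("H", i)].append(("T", j))
        adj[("T", j)].append(("H", i))
    seen, stack = {("H", 0)}, [("H", 0)]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == h + t

def clique_tree_rhs(h, t):
    """Right-hand side 2h + 3t - 1 of the clique tree inequality."""
    return 2 * h + 3 * t - 1

# A comb is the special case h = 1, where the bound reduces to 3t + 1.
comb_ok = is_clique_tree([{1, 2, 3}], [{1, 4}, {2, 5}, {3, 6}])
# Two handles sharing the middle tooth {3, 7, 10}.
two_ok = is_clique_tree([{1, 2, 3}, {7, 8, 9}],
                        [{1, 4}, {2, 5}, {3, 7, 10}, {8, 11}, {9, 12}])
```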

Generalizations of Combs II: Stars and Paths • Start with a set of common-tooth combs $\{H_i, T_1, T_2, \ldots, T_t\}$, $1 \le i \le h$, where • $H_i \subset H_{i+1}$, $1 \le i < h$, and • $H_{i+1} - H_i - \bigcup_j T_j = \emptyset$. • Call a sequence $\{p, p+1, \ldots, p+k\} \subseteq \{1, 2, \ldots, h\}$ a “handle interval” for tooth $T_j$ if it is a maximal such sequence with the following property: $H_p \cap T_j = H_{p+1} \cap T_j = \cdots = H_{p+k} \cap T_j$. [Figure: nested handles $H_1$, $H_2$, $H_3$ and teeth $T_1, \ldots, T_5$. Handle intervals: {1}, {2,3} for $T_1$; {1,2}, {3} for $T_2$; {1}, {2}, {3} for $T_3$; {1}, {2,3} for $T_4$; {1,2,3} for $T_5$.] If each city in a tooth can only be used in one comb, we will need two copies of the teeth $T_1$, $T_2$, and $T_4$, and three copies of $T_5$.

Stars and Paths, Continued • Now suppose we have $\alpha_i$ copies of handle $H_i$ and $\beta_j$ copies of tooth $T_j$. • We get a “Star Inequality” if, for each tooth $T_j$ and each handle interval $I$ for $T_j$, $\sum_{i \in I} \alpha_i \le \beta_j$. • We have a “Path Inequality” if, for each tooth $T_j$ and each handle interval $I$ for $T_j$, $\sum_{i \in I} \alpha_i = \beta_j$. • In both cases, for valid tours, we have $\sum_{i=1}^{h} \alpha_i \delta_x(H_i) + \sum_{j=1}^{t} \beta_j \delta_x(T_j) \ge (t+1) \sum_{i=1}^{h} \alpha_i + 2 \sum_{j=1}^{t} \beta_j$. • Path Inequalities are facet-inducing [Naddef & Pochet, 2001].
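The handle-interval definition and the star/path conditions can be checked with a short Python sketch. The function names and the city labels below are hypothetical; the labels were chosen only so that the handle intervals reproduce the lists given for $T_1, \ldots, T_5$ on the previous slide.

```python
def handle_intervals(handles, tooth):
    """Maximal runs of consecutive (1-based) handle indices whose
    intersections with the tooth are all equal."""
    intervals, run = [], [1]
    for i in range(1, len(handles)):
        if handles[i] & tooth == handles[i - 1] & tooth:
            run.append(i + 1)
        else:
            intervals.append(run)
            run = [i + 1]
    intervals.append(run)
    return intervals

def classify(handles, teeth, alpha, beta):
    """Return 'path', 'star', or None for multiplicities alpha (handles), beta (teeth)."""
    is_path = True
    for tooth, b in zip(teeth, beta):
        for interval in handle_intervals(handles, tooth):
            s = sum(alpha[i - 1] for i in interval)
            if s > b:
                return None      # star condition fails
            if s != b:
                is_path = False  # star condition holds, but not with equality
    return "path" if is_path else "star"

# Hypothetical labels: "a1" is a city of tooth T1, "b1" of T2, and so on;
# labels ending in "0" lie outside every handle.
H1 = {"a1", "b1", "c1", "d1", "e1"}
H2 = H1 | {"a2", "c2", "d2"}
H3 = H2 | {"b2", "c3"}
handles = [H1, H2, H3]
teeth = [{"a0", "a1", "a2"}, {"b0", "b1", "b2"}, {"c0", "c1", "c2", "c3"},
         {"d0", "d1", "d2"}, {"e0", "e1"}]

intervals = [handle_intervals(handles, T) for T in teeth]
kind = classify(handles, teeth, alpha=[1, 1, 1], beta=[2, 2, 1, 2, 3])
```

With one copy of each handle, the tooth multiplicities (2, 2, 1, 2, 3) match the copy counts quoted on the slide and satisfy the star condition. They do not give a path inequality, because for example the interval {1} of $T_1$ has $\sum \alpha_i = 1 < 2$.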

More Cutting Planes • Domino Parity Cuts [Letchford, 2000] • Bipartition Cuts [Carr, 1997]. • Local Cuts [ABCC, 2006]. • … But the number of facets of the TSP polytope grows exponentially in N, so the cutting plane approach, by itself, runs out of gas quite quickly. Another idea is needed.

Branch & Bound for the TSP Assume we have an algorithm $A_{LB}$ that computes a lower bound on the TSP length when certain edges are fixed (forced to be in the tour, or forced not to be in the tour), and which, for some subproblems, may produce a tour as well. • Start with an initial heuristic-created “champion” tour $T_{UB}$, an upper bound $UB = \mathrm{length}(T_{UB})$ on the optimal tour length, and a single “live” subproblem in which no edge is fixed. • While there is a live subproblem, pick one, say subproblem $P$, and apply algorithm $A_{LB}$ to it, obtaining a lower bound $LB$. • If $LB < UB$: • Pick an edge $e$ that is unfixed in $P$ and create two new subproblems as its children, one with $e$ forced to be in the tour, and one with $e$ forced not to be in the tour. • If algorithm $A_{LB}$ produced a tour $T$, and $\mathrm{length}(T) < UB$: • Set $UB = \mathrm{length}(T)$ and $T_{UB} = T$. • Delete all subproblems with current $LB \ge UB$, as well as their children and their ancestors that no longer have any live children. • Otherwise ($LB \ge UB$), delete subproblem $P$ and all its ancestors that no longer have live children. • Halt. Our current champion is an optimal tour.
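The scheme above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the LP-based or one-tree-based bound used in practice: the stand-in for $A_{LB}$ is the simple bound "half the sum, over all cities, of the two cheapest usable incident edges (forced edges first)", the initial champion is the tour 0, 1, ..., n-1, and pruning is done lazily when a subproblem is picked rather than by explicit deletion. When the bound happens to produce a tour, its length equals LB, so no children are created for that subproblem. All function names and the example instance are hypothetical.

```python
import itertools
import math

def tour_length(order, d):
    """Length of the closed tour visiting the cities in the given order."""
    n = len(order)
    return sum(d[frozenset((order[i], order[(i + 1) % n]))] for i in range(n))

def as_tour(n, edges):
    """Return a city order if the edges form one cycle through all n cities, else None."""
    adj = {v: [] for v in range(n)}
    for e in edges:
        u, v = tuple(e)
        adj[u].append(v)
        adj[v].append(u)
    if any(len(nbrs) != 2 for nbrs in adj.values()):
        return None
    order, prev, cur = [0], None, 0
    while True:
        nxt = adj[cur][0] if adj[cur][0] != prev else adj[cur][1]
        if nxt == 0:
            break
        order.append(nxt)
        prev, cur = cur, nxt
    return order if len(order) == n else None

def lower_bound(n, d, forced, forbidden):
    """Stand-in for A_LB: returns (LB, tour order or None); LB is inf if infeasible."""
    total, picked = 0, set()
    for v in range(n):
        mine = [e for e in forced if v in e]
        free = sorted((frozenset((u, v)) for u in range(n) if u != v), key=d.get)
        free = [e for e in free if e not in forced and e not in forbidden]
        need = 2 - len(mine)
        if need < 0 or len(free) < need:
            return math.inf, None
        mine += free[:need]           # forced edges plus the cheapest free ones
        total += sum(d[e] for e in mine)
        picked.update(mine)
    # If only n distinct edges were picked, every edge was picked by both of
    # its endpoints, so the picked edges have length exactly total / 2.
    order = as_tour(n, picked) if len(picked) == n else None
    return total / 2, order

def branch_and_bound(n, d):
    all_edges = sorted(d, key=d.get)
    best_order = list(range(n))                  # initial champion tour
    ub = tour_length(best_order, d)
    live = [(frozenset(), frozenset())]          # (forced, forbidden) edge sets
    while live:
        forced, forbidden = live.pop()
        lb, order = lower_bound(n, d, forced, forbidden)
        if lb >= ub:
            continue                             # prune this subproblem
        if order is not None:
            ub, best_order = lb, order           # new champion
            continue
        unfixed = [e for e in all_edges if e not in forced and e not in forbidden]
        if not unfixed:
            continue                             # all edges fixed, but no tour
        e = unfixed[0]                           # branch on the cheapest unfixed edge
        live.append((forced, forbidden | {e}))
        live.append((forced | {e}, forbidden))
    return ub, best_order

# Hypothetical 6-city instance with Manhattan distances.
pts = [(0, 0), (3, 1), (6, 0), (7, 4), (3, 6), (0, 4)]
d = {frozenset((i, j)): abs(pts[i][0] - pts[j][0]) + abs(pts[i][1] - pts[j][1])
     for i, j in itertools.combinations(range(len(pts)), 2)}
best_len, best_order = branch_and_bound(len(pts), d)
```

A stronger bound (the subtour LP or the Held-Karp one-tree bound) would prune far more subproblems; the control flow would stay the same.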

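[Figure: example branch-and-bound tree. Root (initial LP): UB = 100, LB = 90. Branching on $x_{\{a,b\}}$ (= 0 or 1) gives children with LB = 92 and LB = 93. Further branching on $x_{\{c,d\}}$ and $x_{\{a,c\}}$ gives nodes with LB = 92, 100, 98, and 97; the node with LB = 97 yields a tour, so the new UB is 97. Branching on $x_{\{e,a\}}$ gives nodes with LB = 101 and LB = 100. Opt = 97.]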

First Effective Implementation • A branch-and-bound scheme of Held & Karp that used their iterative one-tree-based approach to approximating the LP relaxation of the TSP (described last week) for computing lower bounds. [Held & Karp, “The traveling-salesman problem and minimum spanning trees: Part II,” Math. Programming 1 (1971), 6-25]. • Beat Dantzig-Fulkerson-Johnson’s record of 49 cities, set 17 years earlier, by solving two 64-city instances (one random Euclidean instance, and one that merged the 42-city reduced instance of Dantzig et al. with 22 random Euclidean cities). • Unlike Dantzig et al., who solved their instance by hand, these were solved by computer (programmed by Linda Ibrahim at UC Berkeley). • Subsequent work along this line pushed the record up to 67 cities. [Camerini, Fratta, & Maffioli, “On improving relaxation methods by modified gradient techniques,” Math. Programming Study 3 (1975), 26-34]. • But again we seem to be hitting limits, and another idea is needed.

Branch & Cut • Combine Branch-and-bound with Cutting Planes. • More details next time.
