1 / 37

An Affine-invariant Approach to Linear Programming

An Affine-invariant Approach to Linear Programming. Mih á ly B á r á sz and Santosh Vempala. Talk outline. Brief history of LP Ideas Algorithm Analysis Conclusion. Brief history of LP. Max c.x s.t. Ax ≤ b. (several equivalent formulations) Simplex method,1947, Dantzig.

ngallegos
Download Presentation

An Affine-invariant Approach to Linear Programming

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. An Affine-invariant Approach to Linear Programming Mihály Bárász and Santosh Vempala

  2. Talk outline • Brief history of LP • Ideas • Algorithm • Analysis • Conclusion

  3. Brief history of LP • Max c.x s.t. Ax ≤ b. (several equivalent formulations) • Simplex method,1947, Dantzig. Many variants. Still most popular LP algorithm. No polynomial guarantee known.

  4. History: polynomial algorithms • Ellipsoid method, Khachiyan 1979. • Interior point method, Karmarkar, 1984. • Nestorov, Nemirovski, Vaidya, Renegar et al., • Perceptron method, Block, Novikoff, Minsky-Papert 62 • Polynomial perceptron, Dunagan-V. 2004 • Random walk method, Bertsimas-V. 2002. • Simplex+Scaling, Kelner-Spielman, 2006. • … All these methods are geometric. They “scale” space for efficiency Complexity depends on #bits in the input.

  5. Strongly polynomial cases • Maximum weight matching in general graphs Edmonds, 1965. • Linear programming in fixed dimension Megiddo, 1984. • Minimum cost flow E. Tardos, 1986. • MDPs with fixed discount rate Ye, 2010. • … These are specialized combinatorial algorithms

  6. Simplex pivot rules • Assume polyhedron is nondegenerate, i.e., each vertex is the intersection of n facets. • Then simplex moves from vertex to vertex, improving objective value. • The rule for determining the next vertex is a “pivot” rule.

  7. Simplex pivot rules Deterministic pivot rules • Dantzig’s largest coefficient rule. Exponential example: Klee and Minty, 1972. • Greatest increase. Example: Jeroslow, 1973. • Bland’s least index rule. Ex: Avis and Chvatal, 1978. • Steepest increase. Ex: Goldfarb and Sit, 1979. • Shadow vertex rule. Ex: Murty, 1980 and Goldfarb, 1983. • These constructions are “variations” of Klee–Minty’s. • They fit a generalization called deformed products defined by Amenta and Ziegler.

  8. Simplex pivot rule • Randomized edge rule: Among all pivots that improve the objective, pick one at random. • Complexity was open for a long time. • Best known bound (G. Kalai), is subexponential. • Now a subexponential lower bound (FHZ10) • “Is there a polynomial simplex pivot rule?” This question has dominated the search for strongly polynomial algorithms.

  9. Polytope diameter • (Hirsch) Diameter of a polytope with m facets is at most m-n (Diameter = Diameter of graph induced by vertices and edges of polytope) • Best known upper bound is super-polynomial (G. Kalai). • Long-standing open problem to improve this.

  10. Hirsch vs Simplex • Complexity of any simplex pivot rule gives an upper bound on diameter of polytope graph. • Thus proving an efficient simplex rule would also be a major combinatorial breakthrough. • Is strongly polynomial LP really a nongeometric, combinatorial question?

  11. But, • [Matousek-Szabo] Take a polytope combinatorially equivalent to a hypercube; orient the edges arbitrarily, so that each face has a unique sink. There exist orientations for which the random edge pivot rule is exponential! (pivot rule runs by picking a random out-edge at each vertex visited) • Other less abstract constructions [E.g. EHRR10] • Does this imply strongly polynomial pivot rules are impossible? NO, because not all orientations are geometrically realizable. • However, it does suggest that the geometry plays an important role.

  12. New Approach • Algorithm will be affine-invariant • So complexity will not depend on how the input is scaled. Two step iteration: • At current vertex, pick a line to travel along in an affine-invariant manner and move along the line. • Go to vertex of at least as high objective value in the face reached.

  13. Step 1: Affine-invariant direction How to pick a direction? • Compute the set of improving rays (edges that lead to vertices of higher objective value) • Take a linear combination of the improving rays. Two candidate rules: • Average of all improving rays (centroid rule) • Random convex combination of improving rays (random rule). Lemma: Both rules are affine-invariant. (i.e. applying an affine transformation before computing the direction gives the same effect as applying it after)

  14. New algorithm, roughly • At current vertex, • pick direction • Follow to get to new face • Go to vertex of at least as high value • Introduces many geometric shortcuts in the polytope. • No bearing on polytope diameter or Hirsch conjecture!

  15. Step 2: go to vertex • How to go to a vertex of at least as high value? • E.g., Follow gradient, keep adding facets hit as equalities till a vertex is reached. • Doing this step arbitrarily can lead to an exponential number of iterations. • So we will go to a vertex in an affine-invariant manner.

  16. Algorithm: AFFINE INPUT: Polyhedron P: Ax ≤ b, objective c, vertex z. OUTPUT: A max objective vertex, or “unbounded” While the current vertex z is not optimal, repeat: • (Initialize) H = indices of active inequalities at z Find edges: For every t in H compute a vector vt s.t. ah.vt = 0 for h H\{t} and at .vt < 0. (vt satisfies all as equalities except at) T = {t H : c.vt≥ 0}, S = H\T.

  17. Algorithm: AFFINE While the current vertex z is not optimal, repeat: 1. (initialize) H = active inequalities at z Find edges at z; let T = improving edges and S = H\T. 2. (Iteration) While T is nonempty, repeat: Compute improving rays: For every t in T compute a vector vt≠ 0 : ah. vt= 0 for h H\{t}, c. vt≥ 0 length of vt = largest value for which z + vt remains feasible. Pick direction: compute a nonnegative combination v of T. Move: Let r be max s.t. z + r v is in P (if no max, return “unbounded”) Move the current point: z := z + r v. Update inequalities: s = inequality that becomes active. t: any index in T such that {ah: h {s} U S U T \ t} is linearly independent. Set S := S U {s}; T := T\{t} and H := S U T.

  18. Algorithm: notes • In Step 2, one new active inequality is added in each iteration; thus a vertex is reached in at most n iterations. • Step 2 can be viewed as a recursive application of the original procedure in lower dimensional faces. • Each iteration: O(mn).

  19. Analysis: How many iterations? • Klee-Minty: n iterations. • Main Theorem: For any polytope that is a deformed product, Algorithm Affine takes at most n outer iterations. (O(n2) total iterations). • Thus, algorithm is efficient on all simplex counterexamples (prior to [FHZ10]).

  20. Metrics • f(P,c,z) = (expected) Max #vertices visited when AFFINE is applied to P,c,z. • f(P) = supc,z f(P,c,z) • h(P,c) = Max length of a directed path in the graph of P, when edges are oriented according to c. • h(P) = supc h(P,c)

  21. Deformed products • P: k-dim polytope Ax ≤ a • V,W: dim polytopes, normally equivalent • linear function

  22. Deformed products

  23. Deformed products

  24. Properties • Q is combinatorially same as P x V. • If q = (y, y v + (1- y w) is a vertex of Q, then y, v, w are vertices of P,V,W resp. with v,w corresponding. • Facets of Q come from P or V/W.

  25. Affine on deformed products Theorem. f(Q) ≤ h(V) + f(P) (max #vertices visited in Q ≤ max directed path length in V + max #vertices visited in P.) Corollary. Let Q be recursively defined, with dim(V) ≤ 2 at each step of the recursion. Then, f(Q) ≤ dim(Q) + |facets(Q)|

  26. Analysis • n = Dim(Q). Intersection of n-1 defining hyperplanes of Q is called a (V,W)-ray if at most dim(V)-1 planes correspond to V,W inequalities. • Claim: The first dim(P) coordinates of a (V,W)-ray are zero.

  27. Partial order For vertex x of Q, Lemma. Let x, y be consecutive vertices visited by the algorithm applied to Q. Then in the partial order induced on the vertices of V by cV unless is already maximal.

  28. Proof of Theorem 1 Theorem. f(Q) ≤ h(V) + f(P) Proof. • #vertices of Q visited before πV(x) is maximal is at most h(V) (no vertex of V is repeated). • After reaching a maximal vertex of V, (V,W)-rays are never improving. • So algorithm proceeds as if in P and takes f(P) additional steps.

  29. Products • Consider Q = P x V (standard product) • Then vertex x of Q can be written as x = (y,v) • Would like to say f(Q) ≤ f(P) + f(V). • But, moving along a ray in Q projects stops after hitting a facet in P or in V, but not both. • How to bound complexity?

  30. Projected AFFINE (for analysis only) Each time a combination is picked in AFFINE, repeat arbitrarily many times: • Stop on the chord at any interior point • Recompute the coefficients and the combination Let g(P,c,z) = (expected) max #vertices visited by Proj. AFFINE Inserted steps are *not* counted

  31. Product polytopes • Q = P x V Theorem. f(Q,c,z) ≤ g(Q,c,z) ≤ g(P,cP,zP) + g(V,cV,zP). Proof idea. • From x = (xp,xv) we go along a ray to x’ on a facet of either P or V, say P. • Projection is a step in P and a partial step in V. • Next step in Q projects to a step of AFFINE in P and a step of Proj. AFFINE in V.

  32. Analysis • Proof also works for mild perturbations of deformed polytopes, which can change the combinatorial structure considerably (thus algorithm is not specifically designed for these counterexample classes). • Algorithm makes heavy use of geometric shortcuts through the interior of the polytope.

  33. Next steps • Polynomial upper bound? (not strongly polynomial, so one could use a geometric scaling type argument showing progress towards optimum). • Counterexample? • Upper bound for combinatorial cubes? • Upper bound for random polytopes? (to get away from the combinatorial structure of known counterexamples) • Complexity for Markov Decision Processes?

  34. Q. Is AFFINE currently the only explicit candidate for strong polynomiality?! (for LP or even MDP?)Thank you!

More Related