
Algorithmic Frontiers of Doubling Metric Spaces


Presentation Transcript


  1. Algorithmic Frontiers of Doubling Metric Spaces Robert Krauthgamer Weizmann Institute of Science Based on joint works with Yair Bartal, Lee-Ad Gottlieb, Aryeh Kontorovich

  2. The Traveling Salesman Problem: Low-dimensionality implies PTAS Robert Krauthgamer Weizmann Institute of Science Joint work with Yair Bartal and Lee-Ad Gottlieb

  3. Traveling Salesman Problem (TSP) • Definition: Given a set of cities (points), find a minimum-length tour that visits all points • Classic, well-studied NP-hard problem • [Karp'72; Papadimitriou-Vempala'06] • Mentioned in a handbook from 1832! • Common benchmark for optimization methods • Many books devoted to TSP… • Numerous variants • Closed/open tour • Multiple tours • Average visit time (repairman) • Etc… (Figure: an optimal tour)

  4. Metric TSP • Basic assumptions on distances • Symmetric • d(x,y) = d(y,x) • Metric • Triangle inequality: d(x,z) ≤ d(x,y) + d(y,z) • Easy 2-approximation via MST • Since OPT ≥ MST • Can do better… • MST + matching gives (3/2)·OPT [Christofides'76] (Figure: an MST)
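The MST-based 2-approximation above fits in a few lines: build a minimum spanning tree, then shortcut a depth-first preorder walk of it into a closed tour; the triangle inequality guarantees the shortcut tour costs at most 2·MST ≤ 2·OPT. A minimal Python sketch for points in the plane (the function names are ours, not the talk's):

```python
import math

def mst_tour(points):
    """Metric-TSP 2-approximation: build an MST with Prim's algorithm, then
    shortcut a depth-first preorder walk of the tree into a closed tour.
    By the triangle inequality, the tour costs at most twice the MST weight."""
    n = len(points)
    d = lambda a, b: math.dist(points[a], points[b])
    in_tree = [False] * n
    parent = [0] * n
    best = [math.inf] * n   # cheapest known edge into the tree, per point
    best[0] = 0.0
    for _ in range(n):  # Prim: repeatedly attach the cheapest outside point
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and d(u, v) < best[v]:
                best[v], parent[v] = d(u, v), u
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[parent[v]].append(v)
    order, stack = [], [0]
    while stack:  # iterative preorder DFS; the visiting order is the tour
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children[u]))
    return order

def tour_length(points, order):
    n = len(order)
    return sum(math.dist(points[order[i]], points[order[(i + 1) % n]])
               for i in range(n))
```

On the unit square the MST weighs 3, so the returned tour is guaranteed to cost at most 6.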

  5. Euclidean TSP • Sanjeev Arora [JACM'98] and Joe Mitchell [SICOMP'99]: Euclidean TSP in fixed dimension admits a PTAS • Find (1+Ɛ)-approximate tour • In time n·(log n)^(Ɛ^−Õ(dimension)), where n = #points • (Extends to other norms) • They were awarded the 2010 Gödel Prize for this discovery

  6. PTAS Beyond Euclidean? • To achieve a PTAS, two properties were assumed • Euclidean space (at least approximately) • Fixed dimension • Are both these assumptions required? • Fixed dimension is necessary • No PTAS for (log n)-dimensional metrics unless P=NP [Trevisan'00] • Is Euclidean necessary? • Consider metric spaces with low intrinsic dimension…

  7. Doubling Dimension • Definition: Ball B(x,r) = all points within distance r from x • The doubling constant (of a metric M) is the minimum value λ > 0 such that every ball can be covered by λ balls of half the radius • First used by [Assouad'83], algorithmically by [Clarkson'97] • The doubling dimension is ddim(M) = log₂ λ(M) [Gupta-K.-Lee'03] • M is called doubling if its doubling dimension is constant • Packing property of doubling spaces • A set with diameter D > 0 and inter-point distance ≥ a contains at most (D/a)^O(ddim) points (Figure: here λ ≤ 7)
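For a finite point set, the definition above can be checked by brute force: for every ball B(x,r), greedily cover its points by half-radius balls centered at data points, and take the worst cover size seen. This is an illustrative sketch only (the greedy cover may overshoot the optimal λ, and the double loop is far from efficient):

```python
import math

def doubling_constant(points):
    """Upper-bound the doubling constant of a finite metric by brute force:
    for every ball B(x, r), greedily cover its points with balls of radius
    r/2 centered at points of the set; report the largest cover size seen.
    Illustrative only: greedy covering can overshoot the optimal lambda."""
    d = math.dist
    radii = {d(p, q) for p in points for q in points if p != q}
    lam = 1
    for x in points:
        for r in radii:
            uncovered = [p for p in points if d(x, p) <= r]  # the ball B(x, r)
            count = 0
            while uncovered:
                c = uncovered[0]  # cover with B(c, r/2); drop what it covers
                uncovered = [p for p in uncovered if d(c, p) > r / 2]
                count += 1
            lam = max(lam, count)
    return lam
```

The doubling dimension estimate is then log₂ of the returned value; for points spread along a line it stays a small constant, as expected.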

  8. Applications of Doubling Dimension • Nearest neighbor search • [K.-Lee'04; Har-Peled-Mendel'06; Beygelzimer-Kakade-Langford'06; Cole-Gottlieb'06] • Spanners, routing • [Talwar'04; Kleinberg-Slivkins-Wexler'04; Abraham-Gavoille-Goldberg-Malkhi'05; Konjevod-Richa-Xia-Yu'07; Gottlieb-Roditty'08; Elkin-Solomon'12] • Distance oracles • [Har-Peled-Mendel'06; Bartal-Gottlieb-Roditty-Kopelowitz-Lewenstein'11] • Dimension reduction • [Bartal-Recht-Schulman'11; Gottlieb-K.'11] • Machine learning and statistics • [Bshouty-Li-Long'09; Gottlieb-Kontorovich-K.'10,'12]

  9. PTAS for Metric TSP? • Does TSP on doubling metrics admit a PTAS? • Arora and Mitchell made strong use of Euclidean properties • "Most fascinating problem left open in this area" [James Lee, tcsmath blog, June '10] • Some attempts • Quasi-PTAS [Talwar'04] (first description of the problem) • Quasi-PTAS for TSP w/neighborhoods [Mitchell'07; Chan-Elbassioni'11] • Subexponential-time approximation scheme, under a weaker assumption [Chan-Gupta'08] • Our result: TSP on doubling metrics admits a PTAS • Find (1+Ɛ)-approximate tour • In time n · 2^O(ddim) · 2^(Ɛ^−Õ(ddim)) · 2^(O(ddim²)·log^(1/2) n) • Euclidean (to compare): n·(log n)^(Ɛ^−Õ(dimension)) • Throughout, think of ddim and Ɛ as constants

  10. Metric Partition • A quadtree-like hierarchy [Bartal'96; Gupta-K.-Lee'03; Talwar'04] • At level i: • Random radii R_i ∈ [2^i, 2·2^i] • Centers are 2^i apart, chosen in arbitrary order
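One level of such a partition can be sketched directly: pick centers greedily so they are more than 2^i apart (a net), then let each center claim every still-unassigned point within an independent random radius R ∈ [2^i, 2·2^i]. A hypothetical minimal implementation for planar tuples (the real construction recurses over all levels to build the hierarchy):

```python
import math
import random

def partition_level(points, level, rng):
    """One level of a random metric partition (sketch): cluster centers form
    a greedy 2^level-net; each center draws an independent random radius
    R in [2^level, 2*2^level] and claims all still-unassigned points within R.
    Returns a dict mapping each point to its cluster center."""
    scale = 2.0 ** level
    centers = []
    for p in points:  # greedy net: centers pairwise more than 2^level apart
        if all(math.dist(p, c) > scale for c in centers):
            centers.append(p)
    cluster = {}
    for c in centers:
        R = rng.uniform(scale, 2.0 * scale)
        for p in points:
            if p not in cluster and math.dist(p, c) <= R:
                cluster[p] = c
    return cluster
```

Because the centers form a net, every point lies within 2^level of some center, and since every radius is at least 2^level, every point ends up assigned to a cluster of radius at most 2·2^level.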

  11. Metric Partition (2) • A quadtree-like hierarchy [Bartal'96; Gupta-K.-Lee'03; Talwar'04] • Recurse to level i−1: • Random radii R_{i−1} ∈ [2^{i−1}, 2·2^{i−1}] • Caveat: log(n) hierarchical levels suffice • Ignore tiny distances < 1/n²

  12. Dense Areas • Key observation: the points (metric space) can be decomposed into sparse areas • Call a level-i ball "dense" if • local tour weight (i.e., inside the R_i-ball) is ≥ R_i/Ɛ • Such a ball can be removed, solving each sub-problem separately • Cost to join tours is relatively small: • only ≈ R_i

  13. Sparsification • Sparse decomposition: • Search the hierarchy bottom-up for dense balls • Remove a dense ball: • The ball is composed of 2^O(ddim) sparse sub-balls • So it's barely dense, i.e., local tour weight ≤ 2^O(ddim)·R_{i−1}/Ɛ • Recurse on the remaining point set • But how do we know the local weight of the tour in a ball? • It can be estimated using the local MST • Modulo caveats like "long" edges… • weight(OPT ∩ B(u,R)) ≤ O(MST(S)) • weight(OPT ∩ B(u,3R)) ≥ Ω(MST(S)) − Ɛ^−O(ddim)·R • Henceforth, we assume the input is sparse

  14. Light Tours • Definition: A tour is (m,r)-light on a hierarchy if it enters each cell (cluster) • At most r times, and • Only via m designated portals • Choose portals as (2^i/M)-net points • Then m = M^O(ddim)
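The portals above are a δ-net at scale δ = 2^i/M, and the packing property of doubling spaces is what bounds their number by m = M^O(ddim). A greedy net is easy to sketch (a minimal illustration; the function name is ours):

```python
import math

def net_points(points, delta):
    """Greedy delta-net: returned centers are pairwise more than delta
    apart (packing), and every input point lies within delta of some
    center (covering). Portals of a level-i cluster are chosen as such
    a net at scale 2^i / M."""
    net = []
    for p in points:
        if all(math.dist(p, q) > delta for q in net):
            net.append(p)
    return net
```

On a unit square with M = 4 (so δ = 1/4), the packing property caps the net size at a constant independent of how densely the square is sampled.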

  15. Optimizing over Light Tours • Theorem [Arora'98; Talwar'04]: Given a hierarchical partition, a minimum-length (m,r)-light tour for it can be computed exactly • In time m^(r·O(ddim)) · n·log n • Via dynamic programming • Join tours for small clusters into a tour for the larger cluster • Typically both m, r ≈ polylog(n/Ɛ), thus m^r ≈ n^polylog(n) (quasi-polynomial)

  16. Better Partitions and Lighter Tours • Our Theorem: For every (optimal) tour T, there is a partition with an (m,r)-light tour T′ such that • M = ddim·log n/Ɛ • m = M^O(ddim) = (log n/Ɛ)^Õ(ddim) • r = Ɛ^−O(ddim) · loglog n • And length(T′) ≤ (1+Ɛ)·length(T) • If the partition were known, then a tour like T′ could be found in time • m^(r·O(ddim)) · n·log n = n · 2^(Ɛ^−Õ(ddim)·loglog²n) • Now m^r ≈ poly(n) • It remains to prove the Theorem, and to show how to find the partition (a bit later)

  17. Constructing Light Tours • Modify a tour T to be (m,r)-light [Arora'98; Talwar'04] • Part I: Focus on m (i.e., net points) • Move cut edges to be incident on net points • Expected cost at one level (for an edge of unit length) • Radius R_{i−1} ≈ 2^{i−1} • Pr[edge is cut] ≤ O(ddim/R_{i−1}) • Expected cost ≤ (R_{i−1}/M)·(ddim/R_{i−1}) = ddim/M = Ɛ/log n • Expected cost per edge over all levels: ≤ log n · Ɛ/log n = Ɛ • We thus constructed a (1+Ɛ)-approximate tour

  18. Constructing Light Tours (2) • Modify a tour to be (m,r)-light [Arora'98; Talwar'04] • Part II: Focus on r (i.e., number of crossing edges) • Reduce the number of crossings • Patching step: Reroute (almost all) crossings back into the cluster • Cost ≈ length of a tour on the patched endpoints ≈ MST of these points • MST Theorem [Talwar'04]: For a set S of points, MST(S) ≤ diam(S)·|S|^(1−1/ddim) • Cost per point ≤ diam(S) / |S|^(1/ddim)
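The MST Theorem is easy to sanity-check numerically: the sketch below computes the exact Euclidean MST weight with Prim's algorithm and compares it against diam(S)·|S|^(1−1/ddim) on a planar grid, taking ddim = 2. (Illustration only; any hidden constant in the theorem is taken as 1 here, which happens to suffice for this example.)

```python
import math

def mst_weight(points):
    """Exact Euclidean MST weight via Prim's algorithm (O(n^2))."""
    n = len(points)
    best = [math.inf] * n   # cheapest known edge into the tree, per point
    best[0] = 0.0
    in_tree = [False] * n
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]  # weight of the edge that attaches u to the tree
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return total
```

For the 5×5 unit grid, the MST consists of 24 unit edges (weight 24), comfortably below the bound diam(S)·|S|^(1/2) = 4√2 · 5 ≈ 28.3.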

  19. Constructing Light Tours (3) • Modify a tour to be (m,r)-light [Arora'98; Talwar'04] • Part II: Focus on r (i.e., number of crossing edges) • Reduce the number of crossings • Expected cost per edge at level i−1 • Radius R_{i−1} ≈ 2^{i−1} • Pr[edge is patched] ≤ Pr[edge is cut] • Expected cost ≤ (R_{i−1}/r^(1/ddim))·(ddim/R_{i−1}) = ddim/r^(1/ddim) • As before, want this to be ≤ Ɛ/log n (because we sum over log n levels) • Could take r = (ddim·log n/Ɛ)^ddim • But the dynamic program runs in time m^r ⟹ QPTAS! [Talwar'04] • Challenge: a smaller value for r

  20. Patching in Sparse Areas • Suppose a tour is q-sparse with respect to the hierarchy • Every R-ball contains weight ≤ qR (for all R = 2^i) • Expectation: a random R-ball cuts weight ≈ qR/R = q • A cluster is formed by cuts from many levels • Expectation: weight ≈ q is cut per level • If r = 2q·loglog n • Expectation: level-(i−1) patching includes edges cut at much higher levels • Charge only the "top" half of the patched edges • Each charged edge costs about 2R_{i−1} • Pr[edge is charged for patching] ≤ Pr[edge is cut at level i+loglog n] ≤ ddim/(R_{i−1}·log n)

  21. Wrapping Up (Patching Sparse Areas) • Modify a tour to be (m,r)-light [Arora'98; Talwar'04] • Part II: Focus on r (i.e., number of crossing edges) • Reduce the number of crossings • Expected cost at level i−1 • Expected cost ≤ (R_{i−1}/r^(1/ddim))·(ddim/(R_{i−1}·log n)) = ddim/(log n · r^(1/ddim)) • As before, want this term to equal Ɛ/log n • Take r = (ddim/Ɛ)^ddim • Obtain a PTAS!

  22. Technical Subtleties • Outstanding problem: • The previous analysis assumed a ball cuts only q edges • True in expectation… not good enough • Solution: try many hierarchies • Choose at random log n radii for each ball and try all their combinations! • WHP, some hierarchy cuts ≈ q edges in every ball • This drives up the runtime of the dynamic program

  23. Algorithmic Frontiers of Doubling Metrics Robert Krauthgamer Weizmann Institute of Science Joint work with Lee-Ad Gottlieb and Aryeh Kontorovich

  24. Machine Learning in Doubling Metrics • Large-margin classification in metric spaces [von Luxburg-Bousquet'04] • Unknown distribution D of labeled points (x,y) ∈ M × {−1,1} • M is a metric space (generalizes ℝ^dim) • Labels are L-Lipschitz: |y_i − y_j| ≤ L·d(x_i,x_j) (generalizes margin) • Resource: a sample of labeled points • Goal: Build a hypothesis f: M → {−1,1} that has (1−ε)-agreement with D • Statistical complexity: How many samples are needed? • Computational complexity: Running time? • Extensions: • A small fraction of labels are wrong (adversarial noise) • Real-valued labels y ∈ [−1,1] (metric regression)

  25. Generalization Bounds • Our approach: Assume M is doubling and use generalized VC-theory [Alon-BenDavid-CesaBianchi-Haussler'97; Bartlett-ShaweTaylor'99] • Example: Earthmover distance (EMD) in the plane between sets of size k has ddim ≤ O(k log k) • Standard algorithm: pick a hypothesis that fits all/most observed samples • Theorem: The class of L-Lipschitz functions has fat-shattering dimension fsdim ≤ (c·L·diam(M))^ddim • Corollary: If f is L-Lipschitz and classifies n samples correctly, then WHP Pr_D[sgn(f(x)) ≠ y] ≤ O(fsdim·(log n)²/n). Similarly, if f correctly classifies all but an η-fraction, then WHP Pr_D[sgn(f(x)) ≠ y] ≤ η + O(fsdim·(log n)²/n)^(1/2) • Bounds incomparable to [von Luxburg-Bousquet'04]

  26. Algorithmic Aspects (noise-free) • Computing a hypothesis f from the samples (x_i,y_i): • Where S+ and S− are the positively and negatively labeled samples • Lemma (Lipschitz extension): If the labels are L-Lipschitz, so is f • Evaluating f(x) requires solving Nearest Neighbor Search • Explains a common classification heuristic, e.g. [Cover-Hart'67] • But might require Ω(n) time… • We show how to use (1+ε)-Nearest Neighbor Search • This can be solved quickly in doubling metrics • We prove a similar generalization bound by sandwiching sgn(f(x))
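The hypothesis f above is a Lipschitz extension of the sample labels (the slide's exact formula was a figure and is not reproduced here); a standard stand-in is the McShane extension f(x) = min_i (y_i + L·d(x, x_i)), which is L-Lipschitz and agrees with the labels on the samples whenever the labels are themselves L-Lipschitz. A sketch under that assumption, evaluated by the naive Ω(n) scan the slide mentions (before any (1+ε)-NNS speedup):

```python
import math

def lipschitz_classifier(samples, L):
    """McShane-style Lipschitz extension of the sample labels:
    f(x) = min_i ( y_i + L * d(x, x_i) ) is L-Lipschitz, and it agrees
    with the labels on the samples when those are L-Lipschitz.
    The returned hypothesis classifies by the sign of f."""
    def f(x):
        return min(y + L * math.dist(x, xi) for xi, y in samples)
    return lambda x: 1 if f(x) >= 0 else -1
```

With two opposite-labeled points at distance 3 and L = 2/3 (so the labels are exactly L-Lipschitz), the classifier reproduces both sample labels and assigns nearby queries to the closer class.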

  27. Extensions (noisy case) 1. A small fraction of labels are wrong (adversarial noise) • How to compute a hypothesis? • Build a bipartite graph (on S+ ∪ S−) of all violations of the Lipschitz condition (an edge between two opposite-labeled points at distance < 2/L) • Compute a minimum vertex cover (or faster: a 2-approximation) 2. Real-valued labels y ∈ [−1,1] (metric regression) • Minimize risk (expected loss) E_{x,y}|f(x)−y| • Extend the statistical framework by similar ideas • But how to compute a hypothesis? • Write an LP: minimize Σ_i |f(x_i)−y_i| subject to |f(x_i)−f(x_j)| ≤ L·d(x_i,x_j) ∀i,j • Reduce #constraints from O(n²) to O(Ɛ^−ddim·n) using a (1+ε)-spanner on the x_i's • Apply a fast approximate LP solver
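The noisy-case cleanup in step 1 can be sketched directly: build the bipartite violation graph and delete a vertex cover of it, so no Lipschitz violation survives among the kept samples. Here the classic maximal-matching 2-approximation stands in for an exact minimum vertex cover:

```python
import math

def denoise(samples, L):
    """Remove label noise as in step 1 above: build the bipartite graph of
    Lipschitz violations (an edge between oppositely labeled points at
    distance < 2/L) and delete a vertex cover of it. The endpoints of a
    greedy maximal matching give the classic 2-approximate vertex cover."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == -1]
    edges = [(p, q) for p in pos for q in neg if math.dist(p, q) < 2 / L]
    cover = set()
    for p, q in edges:  # greedy maximal matching; its endpoints cover all edges
        if p not in cover and q not in cover:
            cover |= {p, q}
    return [(x, y) for x, y in samples if x not in cover]
```

Deleting a vertex cover may remove up to twice as many points as strictly necessary, but the surviving sample set is guaranteed violation-free.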

  28. Conclusion • General paradigm: low-dimensional Euclidean spaces ⟺ doubling metric spaces • Mathematically, the latter is different (a strictly bigger family) • Not even low-distortion embeddings [Laakso'00,'01] • For algorithmic efficiency: strong analogy/similarity • E.g., nearest neighbor search, distributed computing and networking, combinatorial optimization, machine learning • Research directions: • Other computational tasks or application areas? • Particularly in machine learning, data structures • Scenarios where the analogy fails? • E.g. [Indyk-Naor'05], which uses random projections • Other metric models? E.g. hyperbolic…
