
I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis

Pankaj K. Agarwal, Lars Arge, and Ke Yi. Duke University; University of Aarhus.


Presentation Transcript


  1. I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis Pankaj K. Agarwal, Lars Arge, and Ke Yi Duke University University of Aarhus

  2. The Union-Find Problem
     • A universe of N elements: x1, x2, …, xN
     • Initially N singleton sets: {x1}, {x2}, …, {xN}
     • Each set has a representative
     • Maintain the partition under
       • Union(xi, xj): joins the sets containing xi and xj
       • Find(xi): returns the representative of the set containing xi

  3. The Solution
     [Figure: a forest of trees whose roots are the representatives. Union(d, h) is carried out with link-by-rank; Find(n) is carried out with path compression.]
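The classic in-memory structure named on this slide can be sketched as follows. This is a minimal illustration of link-by-rank and path compression, not code from the presentation; the class name `UnionFind` and its method names are chosen here for illustration.

```python
class UnionFind:
    """In-memory union-find with link-by-rank and path compression."""

    def __init__(self, elements):
        # Every element starts as the root (representative) of its own set.
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in self.parent}

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx  # redundant union: both already in the same set
        # Link-by-rank: hang the root of smaller rank under the other root.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx


uf = UnionFind("abdh")
uf.union("d", "h")
print(uf.find("d") == uf.find("h"))  # d and h now share a representative
```

Every `find` and `union` follows parent pointers to arbitrary locations, which is what becomes expensive once the structure no longer fits in memory.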

  4. Complexity
     • O(N α(N)) for a sequence of N union and find operations [Tarjan 75]
       • α(·): inverse Ackermann function (grows very slowly)
       • Optimal in the worst case [Tarjan 79, Fredman and Saks 89]
     • Batched (off-line) version
       • Entire sequence known in advance
       • Can be improved to linear on a RAM [Gabow and Tarjan 85]
       • Not possible on a pointer machine [Tarjan 79]

  5. Simple and Good, as long as … The entire data structure fits in memory

  6. The I/O Model
     • Main memory of size M
     • One I/O transfers B items between memory and disk
     • Disk of infinite size

  7. Our Results
     • An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) expected I/Os
       • Same as sorting
       • Optimal in the worst case
     • A practical algorithm using O(sort(N) log(N/M)) I/Os
     • Applications to terrain analysis
       • Topological persistence: O(sort(N)) I/Os
       • Contour trees: O(sort(N)) I/Os
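To get a feel for the sort(N) bound, the formula can be evaluated for concrete parameters. The sketch below drops the hidden constant factor; the function names and the clamp of the logarithm to at least 1 are assumptions made here so that small inputs still cost one pass.

```python
import math


def scan_ios(n, b):
    """I/Os needed to read N items sequentially, B items per I/O."""
    return math.ceil(n / b)


def sort_ios(n, m, b):
    """(N/B) * log_{M/B}(N/B), with the constant factor omitted."""
    blocks = n / b
    # Assumption: clamp to one pass when the input is tiny.
    passes = max(1.0, math.log(blocks, m / b))
    return blocks * passes


# Hypothetical parameters: 2^30 items, blocks of 2^10 items, memory of 2^20 items.
print(scan_ios(2**30, 2**10), sort_ios(2**30, 2**20, 2**10))
```

With these hypothetical parameters the logarithm is only 2, so sorting costs about two scans of the data, far fewer I/Os than touching the N items one at a time.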

  8. I/O-Efficient Batched Union-Find
     • Assumption: no redundant unions
       • Each union must join two different sets
       • This assumption is removed later
     • Two-stage algorithm
       • Convert to interval union-find
         • Compute an order on the elements such that each union joins two adjacent sets
       • Solve batched interval union-find

  9. Union Graph (Tree if no redundant unions)
     1: Union(d, g)
     2: Union(a, c)
     3: Union(r, b)
     4: Union(a, e)
     5: Union(e, i)
     6: Union(r, a)
     7: Union(a, d)
     8: Union(d, h)
     9: Union(b, f)
     [Figure: equivalent union trees on the nodes r, a, b, …, i; each edge is labeled with the index of its union.]

  10. Transforming the Union Tree
      [Figure: the union tree is transformed step by step.]
      • Weights along each root-to-leaf path decrease

  11. Formulating as a Batched Problem
      [Figure: the original and the transformed union tree.]
      • For each edge, find the lowest ancestor edge with a higher weight
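The query on this slide can be illustrated with a naive in-memory version that simply climbs the tree. This is a sketch under assumptions, not the I/O-efficient method: the function name is made up, each edge is identified by its lower endpoint, a node with no heavier ancestor edge is re-attached to the root, and the tree is the example from slide 9 rooted at r.

```python
def transform_union_tree(parent, weight, root):
    """For each edge (parent[v], v), find the lowest ancestor edge with a
    higher weight and re-attach v below it; fall back to the root if none."""
    new_parent = {}
    for v in parent:
        x = parent[v]
        # Climb while the edge entering x exists and is not heavier than v's edge.
        while x != root and weight[x] <= weight[v]:
            x = parent[x]
        new_parent[v] = x
    return new_parent


# Union tree of slide 9, rooted at r; weight[v] is the index of the union
# that created the edge (parent[v], v).
parent = {"b": "r", "a": "r", "c": "a", "e": "a", "d": "a",
          "f": "b", "g": "d", "h": "d", "i": "e"}
weight = {"b": 3, "a": 6, "c": 2, "e": 4, "d": 7,
          "f": 9, "g": 1, "h": 8, "i": 5}
print(transform_union_tree(parent, weight, "r"))
```

In the result, weights decrease along every root-to-leaf path: for example i (weight 5) moves from below e (weight 4) to below a (weight 6).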

  12. Cast in a Geometry Setting
      [Figure: the union tree and its edges drawn as horizontal segments.]
      • Euler tour
        • x: positions in the tour
        • y: weight
      • In O(sort(N)) I/Os [Chiang et al. 95]

  13. Cast in a Geometry Setting
      [Figure: the union tree and its edges drawn as horizontal segments.]
      • For each edge, find the lowest ancestor edge with a higher weight
      • For each segment, find the shortest segment above and containing it
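The geometric reformulation can be checked on a small tree. The sketch below assigns each edge the span between its two traversals in an Euler tour and then answers the segment query by brute force; the quadratic search stands in for the I/O-efficient batched method and is only an illustration. Function names are made up, and edges are identified by their lower endpoint.

```python
def euler_segments(children, weight, root):
    """Map each edge (u, v) to (x_lo, x_hi, y): its two tour positions and its weight."""
    segments = {}
    pos = 0

    def visit(u):
        nonlocal pos
        for v in children.get(u, []):
            lo = pos          # the tour walks down edge (u, v)
            pos += 1
            visit(v)
            segments[v] = (lo, pos, weight[v])  # the tour walks back up
            pos += 1

    visit(root)
    return segments


def shortest_segment_above(segments):
    """For each segment, the shortest segment that contains it and has a larger y."""
    answer = {}
    for v, (lo, hi, y) in segments.items():
        best = None
        for u, (lo2, hi2, y2) in segments.items():
            if lo2 < lo and hi < hi2 and y2 > y:
                if best is None or hi2 - lo2 < segments[best][1] - segments[best][0]:
                    best = u
        answer[v] = best  # None: no ancestor edge has a higher weight
    return answer


children = {"r": ["b", "a"], "a": ["c", "e", "d"], "b": ["f"],
            "d": ["g", "h"], "e": ["i"]}
weight = {"b": 3, "a": 6, "c": 2, "e": 4, "d": 7,
          "f": 9, "g": 1, "h": 8, "i": 5}
print(shortest_segment_above(euler_segments(children, weight, "r")))
```

Segments of ancestor edges are nested, so the shortest containing segment with a larger y is exactly the lowest ancestor edge with a higher weight.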

  14. Distribution Sweeping
      • M/B vertical slabs
      • [Figure labels: "checked recursively", "checked here"]
      • Total cost: O(sort(N))

  15. In-Order Traversal
      [Figure: the transformed union tree; weights along each root-to-leaf path decrease.]
      • At u, with children u1, …, uk (in increasing order of weight)
        • Recursively visit the subtree at u1
        • Output u
        • For i = 2, …, k: recursively visit the subtree at ui
      • Resulting order: b r c a e i g d h f
      • Claim: this traversal produces the right order
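The traversal can be written down directly. In the sketch below the function name is made up, and the transformed tree is reconstructed from the figure residue of slides 10 and 15, so treat its exact shape as an assumption; on that tree the traversal reproduces the order printed on the slide.

```python
def adjacency_order(children, weight, root):
    """Visit the lightest child's subtree, output the node, then the other children."""
    order = []

    def visit(u):
        kids = sorted(children.get(u, []), key=lambda v: weight[v])
        if kids:
            visit(kids[0])      # subtree at u1
        order.append(u)         # output u
        for v in kids[1:]:      # subtrees at u2, ..., uk
            visit(v)

    visit(root)
    return order


# Transformed union tree (assumed reconstruction); weight[v] labels edge (parent, v).
children = {"r": ["b", "a", "d", "h", "f"], "a": ["c", "e", "i"], "d": ["g"]}
weight = {"b": 3, "a": 6, "c": 2, "e": 4, "d": 7,
          "f": 9, "g": 1, "h": 8, "i": 5}
print(" ".join(adjacency_order(children, weight, "r")))
```

Replaying the nine unions of slide 9 against this order shows that each one joins two sets occupying adjacent intervals, which is the property the second stage relies on.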

  16. Solving Interval Union-Find
      • Union: x: two operands; y: time stamp
      • Find: x: operand; y: time stamp
      • [Figure label: representative]

  17. Solving Interval Union-Find
      • Union: x: two operands; y: time stamp
      • Find: x: operand; y: time stamp
      • Four instances of batched ray shooting: O(sort(N))
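Once the elements are ordered so that every set is an interval, a find reduces to asking how far one can walk left and right from the operand before hitting a boundary that has not yet been merged away. The sketch below does this by direct scanning; it is not the batched ray-shooting procedure of the slides, only an in-memory illustration of the same geometry. The function names, the convention that a find with time stamp t sees unions with stamps smaller than t, and returning the interval instead of a representative are all assumptions made here.

```python
import math


def boundary_times(n, unions):
    """times[p] = time stamp at which the boundary between positions p and p+1
    disappears. unions lists (pos_i, pos_j) pairs; union k has time stamp k."""
    times = [math.inf] * (n - 1)
    for t, (i, j) in enumerate(unions, start=1):
        # A union of two interval sets merges everything between its operands.
        for p in range(min(i, j), max(i, j)):
            times[p] = min(times[p], t)
    return times


def interval_find(times, x, t):
    """Interval (lo, hi) of the set containing position x just before time t."""
    lo = x
    while lo > 0 and times[lo - 1] < t:
        lo -= 1
    hi = x
    while hi < len(times) and times[hi] < t:
        hi += 1
    return lo, hi


# Order b r c a e i g d h f from slide 15, unions from slide 9.
pos = {x: k for k, x in enumerate("brcaeigdhf")}
ops = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
       ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]
times = boundary_times(10, [(pos[x], pos[y]) for x, y in ops])
print(interval_find(times, pos["a"], 5))  # set of a after unions 1 to 4
```

Any fixed rule, such as taking the leftmost element of the returned interval, then yields a consistent representative for the set.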


  19. Handling Redundant Unions
      • Union tree becomes a general graph
      • Compute the minimum spanning tree
        • O(sort(N)) I/Os (randomized) [Chiang et al. 95]
        • O(sort(N) log log B) I/Os (deterministic) [Arge et al. 04]
        • Deterministic O(sort(N)) I/Os if the graph is planar
      • Only MST edges are non-redundant
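Why the MST edges are exactly the non-redundant unions is easiest to see with Kruskal's algorithm, assuming each edge of the union graph is weighted by the time stamp of its union: Kruskal scans edges by increasing weight and keeps an edge only if it joins two different components. The sketch below is an in-memory illustration with a made-up function name; it uses an ordinary union-find internally, which is precisely what the I/O-efficient MST algorithms cited on the slide avoid.

```python
def non_redundant_unions(unions):
    """Return the 1-based time stamps of the unions that join two different sets.

    Equivalent to running Kruskal on the union graph with weight = time stamp.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept = []
    for t, (x, y) in enumerate(unions, start=1):
        rx, ry = find(x), find(y)
        if rx != ry:            # MST edge: joins two components
            parent[rx] = ry
            kept.append(t)
    return kept


print(non_redundant_unions([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]))
```

In this small example the third union is redundant because a and c are already connected through b.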

  20. Applications
      • Topological Persistence
      • Contour Trees

  21. Application: Topological Persistence
      • Introduced by Edelsbrunner et al. 2000
      • Measure importance on a surface
        • Feature extraction
        • Topological de-noising
      • Many applications
        • Surface modeling
        • Shape analysis
        • Terrain analysis
        • Computational biology

  22. Topological Persistence Illustrated

  23. Formulated as Batched Union-Find
      • Represented as a triangulated mesh
      • Consider minimum-saddle pairs
      • On reaching
        • A minimum or maximum: do nothing
        • A regular point u: issue union(u, v) for a lower neighbor v
        • A saddle u: let v and w be nodes from the two connected pieces of u's lower link; issue find(v), find(w), union(u, v), union(u, w)
      • [Figure label: lower link]
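The way minimum-saddle pairs fall out of union and find operations can be sketched on a simplified input. The code below works on a plain graph with distinct vertex heights rather than a triangulated mesh, detects a merge when a vertex touches two different lower components instead of inspecting the lower link, and pairs the saddle with the higher of the two component minima. These simplifications and the function name are assumptions made for illustration; the slides' batched formulation issues the operations first and answers them later.

```python
def minimum_saddle_pairs(heights, edges):
    """Return (minimum, saddle) vertex pairs found by a bottom-up sweep."""
    n = len(heights)
    nbrs = [[] for _ in range(n)]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)

    parent = list(range(n))
    low = list(range(n))  # low[root] = lowest vertex of that component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    seen = [False] * n
    pairs = []
    for u in sorted(range(n), key=lambda v: heights[v]):
        seen[u] = True
        attached = False
        for v in nbrs[u]:
            if not seen[v]:
                continue  # v is higher than u
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            if not attached:
                parent[ru] = rv  # u joins the component below it
                attached = True
            else:
                # u merges two components: the one with the higher minimum ends here.
                if heights[low[ru]] < heights[low[rv]]:
                    older, younger = ru, rv
                else:
                    older, younger = rv, ru
                pairs.append((low[younger], u))
                parent[younger] = older
    return pairs


# A path with heights 1, 3, 0, 5, 2: minima at vertices 0, 2, 4.
print(minimum_saddle_pairs([1, 3, 0, 5, 2], [(0, 1), (1, 2), (2, 3), (3, 4)]))
```

The difference in height between the two vertices of a pair is the persistence of the paired minimum.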

  24. Experiment 1: Random Union-Find
      • 128 MB memory

  25. Experiment 2: Topological Persistence on Terrain Data
      • Neuse River Basin of North Carolina: ~0.5 billion points

  26. Experiment 2: Topological Persistence on Terrain Data
      • 128 MB memory
      • Entire data set (0.5 billion points): the internal-memory (IM) algorithm fails and the external-memory (EM) algorithm takes 10 hours

  27. Contour Trees

  28. Summary
      • An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
        • Optimal in the worst case
      • A practical algorithm using O(sort(N) log(N/M)) I/Os
      • Applications to terrain analysis
        • Topological persistence: O(sort(N)) I/Os
        • Contour trees: O(sort(N)) I/Os
      • Open question
        • On-line case: can we get below O(N α(N)) I/Os?

  29. Thank you!

  30. Previous Results
      • Directly maintain contours
        • O(N log N) time [van Kreveld et al. 97]
        • Needs union-split-find for circular lists
        • Does not extend to higher dimensions
      • Two sweeps by maintaining components, then merge
        • O(N log N) time [Carr et al. 03]
        • Extends to arbitrary dimensions

  31. Join Tree and Split Tree
      [Figure: join tree and split tree on nodes 1 to 9, with the qualified nodes marked.]

  32. Final Contour Tree
      [Figure: join tree, split tree, and contour tree on nodes 1 to 9.]
      • Hard to BATCH!

  33. Another Characterization
      • Let w be the highest node that is a descendant of v in the join tree and an ancestor of u in the split tree; then (u, w) is a contour tree edge
      • Now can BATCH!
      [Figure: join tree, split tree, and contour tree on nodes 1 to 9, with u, v, and w marked.]

  34. Map to Rectangles
      [Figure: join tree and split tree on nodes 1 to 9, with u, v, and w mapped to rectangles.]
      • Can be solved in O(sort(N)) I/Os (practical, too)

  35. Topological Persistence

  36. Label Nodes with Intervals
      [Figure: tree on nodes 1 to 9.]
      • Using Euler tour (O(sort(N)) I/Os)

  37. Map to Rectangles
      [Figure: join tree and split tree on nodes 1 to 9, with u, v, and w mapped to rectangles.]
      • Can be solved in O(sort(N)) I/Os (practical, too)

  38. Formulated as Batched Union-Find
      • Represented as a triangulated mesh
      • Consider minimum-saddle pairs
      • On reaching
        • A minimum or maximum: do nothing
        • A regular point u: issue union(u, v) for a lower neighbor v
        • A saddle u: let v and w be nodes from the two connected pieces of u's lower link; issue find(v), find(w), union(u, v), union(u, w)
      • [Figure label: lower link]

  39. Experiment 1: Random Union-Find

  40. Experiment 2: Topological Persistence on Terrain Data

  41. Experiment 2: Topological Persistence on Terrain Data
