## I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis


Pankaj K. Agarwal, Lars Arge, and Ke Yi (Duke University, University of Aarhus)

**The Union-Find Problem**

- A universe of N elements: x1, x2, …, xN
- Initially N singleton sets: {x1}, {x2}, …, {xN}
- Each set has a representative
- Maintain the partition under
  - Union(xi, xj): joins the sets containing xi and xj
  - Find(xi): returns the representative of the set containing xi

**The Solution**

[Figure: a forest of trees with the representatives at the roots. Union(d, h) is performed with link-by-rank; Find(n) is performed with path compression.]

**Complexity**

- O(N α(N)) for a sequence of N union and find operations [Tarjan 75]
  - α(·): inverse Ackermann function (grows very slowly)
- Optimal in the worst case [Tarjan 79, Fredman and Saks 89]
- Batched (off-line) version
  - Entire sequence known in advance
  - Can be improved to linear on a RAM [Gabow and Tarjan 85]
  - Not possible on a pointer machine [Tarjan 79]

**Simple and Good, as long as …**

The entire data structure fits in memory.

**The I/O Model**

- Main memory of size M
- Disk of infinite size
- One I/O transfers B items between memory and disk

**Our Results**

- An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = $O((N/B)\log_{M/B}(N/B))$ I/Os (expected)
  - Same as sorting
  - Optimal in the worst case
- A practical algorithm using O(sort(N) log(N/M)) I/Os
- Applications to terrain analysis
  - Topological persistence: O(sort(N)) I/Os
  - Contour trees: O(sort(N)) I/Os

**I/O-Efficient Batched Union-Find**

- Assumption: no redundant unions
  - Each union must join two different sets
  - This assumption is removed later
- Two-stage algorithm
  - Convert to interval union-find: compute an order on the elements such that each union joins two adjacent sets
  - Solve batched interval union-find

**Union Graph**

(A tree if there are no redundant unions.)

1. Union(d, g)
2. Union(a, c)
3. Union(r, b)
4. Union(a, e)
5. Union(e, i)
6. Union(r, a)
7. Union(a, d)
8. Union(d, h)
9. Union(b, f)

[Figure: the union graph rooted at r, each edge labeled with the time stamp of its union; two equivalent union trees are shown.]

**Transforming the Union Tree**

[Figure: step-by-step transformation of the union tree.]

After the transformation, weights along every root-to-leaf path decrease.

**Formulating as a Batched Problem**

For each edge, find the lowest ancestor edge with a higher weight.

**Cast in a Geometry Setting**

- Take an Euler tour of the tree: x = position in the tour, y = weight
- Computed in O(sort(N)) I/Os [Chiang et al. 95]
- "For each edge, find the lowest ancestor edge with a higher weight" becomes "for each segment, find the shortest segment above and containing it"

**Distribution Sweeping**

- M/B vertical slabs
- Figure labels: "checked here" and "checked recursively"
- Total cost: O(sort(N))

**In-Order Traversal**

Weights along every root-to-leaf path decrease.

- At u, with children u1, …, uk (in increasing order of weight):
  - Recursively visit the subtree at u1
  - Return u
  - For i = 2, …, k: recursively visit the subtree at ui
- Order produced for the example tree: b r c a e i g d h f
- Claim: this traversal produces the right order

**Solving Interval Union-Find**

- Union: x = the two operands, y = time stamp
- Find: x = the operand, y = time stamp
- Four instances of batched ray shooting: O(sort(N))

[Figure: unions and finds plotted by position and time stamp, with the representative marked.]

**Handling Redundant Unions**

- Union tree becomes a general graph
- Compute the minimum spanning tree
  - O(sort(N)) I/Os (randomized) [Chiang et al. 95]
  - O(sort(N) log log B) I/Os (deterministic) [Arge et al. 04]
  - Deterministic O(sort(N)) I/Os if the graph is planar
- Only MST edges are non-redundant

**Applications**

- Topological Persistence
- Contour Trees

**Application: Topological Persistence**

- Introduced by Edelsbrunner et al. 2000
- Measures importance on a surface
  - Feature extraction
  - Topological de-noising
- Many applications
  - Surface modeling
  - Shape analysis
  - Terrain analysis
  - Computational biology

**Formulated as Batched Union-Find**

- Represented as a triangulated mesh
- Consider minimum-saddle pairs
- When reaching
  - A minimum or maximum: do nothing
  - A regular point u: issue union(u, v) for a lower neighbor v
  - A saddle u: let v and w be nodes from u's two connected pieces in its lower link; issue find(v), find(w), union(u, v), union(u, w)

[Figure: the lower link of a vertex.]

**Experiment 1: Random Union-Find**

128 MB memory.

**Experiment 2: Topological Persistence on Terrain Data**

- Neuse River Basin of North Carolina: ~0.5 billion points
- 128 MB memory
- Entire data set (0.5 billion points): the internal-memory (IM) algorithm fails and the external-memory (EM) algorithm takes 10 hours

**Summary**

- An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = $O((N/B)\log_{M/B}(N/B))$ I/Os
  - Optimal in the worst case
- A practical algorithm using O(sort(N) log(N/M)) I/Os
- Applications to terrain analysis
  - Topological persistence: O(sort(N)) I/Os
  - Contour trees: O(sort(N)) I/Os
- Open question
  - On-line case: can we get below O(N α(N)) I/Os?

**Previous Results**

- Directly maintain contours
  - O(N log N) time [van Kreveld et al. 97]
  - Needs union-split-find for circular lists
  - Does not extend to higher dimensions
- Two sweeps by maintaining components, then merge
  - O(N log N) time [Carr et al. 03]
  - Extends to arbitrary dimensions

**Join Tree and Split Tree**

[Figure: join tree and split tree on nodes 1 to 9, with the qualified nodes marked.]

**Final Contour Tree**

[Figure: join tree, split tree, and the resulting contour tree on nodes 1 to 9.]

Hard to batch!

**Another Characterization**

Let w be the highest node that is a descendant of v in the join tree and an ancestor of u in the split tree; then (u, w) is a contour tree edge. Now we can batch!

[Figure: nodes u, v, and w marked in the join tree, split tree, and contour tree.]

**Map to Rectangles**

[Figure: nodes u, v, and w of the join tree and split tree mapped to rectangles.]

Can be solved in O(sort(N)) I/Os (practical, too).

**Label Nodes with Intervals**

Using an Euler tour (O(sort(N)) I/Os).
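As a point of reference for "The Solution" and "Complexity" slides, the classic in-memory structure (link-by-rank plus path compression) can be sketched as follows. This is the textbook RAM version, not the I/O-efficient algorithm from the talk; the class name `UnionFind` and its method names are illustrative choices, and the example replays the nine unions listed on the "Union Graph" slide.

```python
class UnionFind:
    """Textbook in-memory union-find (illustrative names, not from the slides)."""

    def __init__(self, elements):
        self.parent = {x: x for x in elements}  # each element starts as its own root
        self.rank = {x: 0 for x in elements}    # upper bound on tree height

    def find(self, x):
        # First pass: walk up to the root (the representative).
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, point every visited node at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        """Join the sets of x and y; return False if the union is redundant."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # Link-by-rank: hang the shallower tree under the deeper one.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True


# The nine unions from the "Union Graph" slide, in time-stamp order.
UNIONS = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]

uf = UnionFind("abcdefghir")
results = [uf.union(x, y) for x, y in UNIONS]
```

Every call returns `True` here, which matches the slide's remark that the union graph is a tree when no union is redundant. Each `find` may touch an arbitrary memory location, which is why this structure stops being "simple and good" once it no longer fits in memory.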
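To get a feel for the sort(N) bound quoted under "Our Results", the formula $(N/B)\log_{M/B}(N/B)$ can be evaluated directly. The sketch below ignores constant factors, and the function name `sort_ios` and the sample values of N, M, and B are assumptions chosen only for illustration (they are not the experimental settings from the talk).

```python
import math


def sort_ios(n, m, b):
    """Evaluate (n/b) * log_{m/b}(n/b), the sort(N) bound without constants.

    n: number of items, m: items that fit in memory, b: items per disk block.
    The logarithm is clamped to at least 1 so that inputs fitting in memory
    still cost one scan.
    """
    blocks = n / b
    return blocks * max(1.0, math.log(blocks, m / b))


# Hypothetical machine: 2**30 items, memory for 2**20 items, blocks of 2**10 items.
n, m, b = 2**30, 2**20, 2**10
batched = sort_ios(n, m, b)   # roughly 2 * 2**20, about two million I/Os
unbatched = n                 # one I/O per operation if every access misses memory
```

With these sample values the batched bound is smaller than one I/O per operation by a factor of several hundred, which is the gap the talk is targeting.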
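The "In-Order Traversal" slide can be checked on the running example. The sketch below hard-codes the transformed tree as read off the "Transforming the Union Tree" figure (this reading of the figure is an assumption: r has children b, a, d, h, f with weights 3, 6, 7, 8, 9; a has children c, e, i with weights 2, 4, 5; d has child g with weight 1). It then runs the traversal and verifies that every union from the "Union Graph" slide joins two adjacent intervals in the resulting order. The function names are illustrative, and everything runs in memory.

```python
# Transformed union tree: node -> list of (child, edge weight). Assumed from the figure.
CHILDREN = {
    "r": [("b", 3), ("a", 6), ("d", 7), ("h", 8), ("f", 9)],
    "a": [("c", 2), ("e", 4), ("i", 5)],
    "d": [("g", 1)],
}

UNIONS = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]


def traversal_order(u):
    """Visit the lightest child's subtree, then u, then the remaining subtrees."""
    kids = [c for c, _ in sorted(CHILDREN.get(u, []), key=lambda cw: cw[1])]
    order = []
    if kids:
        order += traversal_order(kids[0])
    order.append(u)
    for c in kids[1:]:
        order += traversal_order(c)
    return order


def unions_join_adjacent(order, unions):
    """Return True if each union, applied in sequence, merges two adjacent intervals."""
    pos = {x: i for i, x in enumerate(order)}
    comp = list(range(len(order)))           # comp[i] = id of the set at position i
    span = {i: (i, i) for i in comp}         # set id -> (first, last) position
    for x, y in unions:
        a, b = comp[pos[x]], comp[pos[y]]
        (lo1, hi1), (lo2, hi2) = sorted([span[a], span[b]])
        if a == b or hi1 + 1 != lo2:
            return False                     # redundant, or the sets are not neighbors
        for i in range(lo1, hi2 + 1):
            comp[i] = a                      # relabel the merged interval
        span[a] = (lo1, hi2)
        del span[b]
    return True


order = traversal_order("r")
ok = unions_join_adjacent(order, UNIONS)
```

On this input `order` spells out b r c a e i g d h f, the sequence shown on the slide, and `ok` is `True`. An arbitrary order such as alphabetical fails the check at the first union, since d and g are not neighbors there.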
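The rules on the "Formulated as Batched Union-Find" slide can be illustrated on a much simpler input. The sketch below makes several assumptions that are not in the slides: it uses a 1D height profile instead of a triangulated mesh (so interior local maxima play the role of saddles), heights are distinct, each set's representative is taken to be its lowest minimum, and when two sets merge the higher of the two minima is paired with the merge point. It is an in-memory illustration, and the function name `minimum_saddle_pairs` is hypothetical.

```python
def minimum_saddle_pairs(heights):
    """Sweep a 1D profile upward and pair each minimum with the point that merges it.

    Returns (minimum height, merge height) pairs; the global minimum stays unpaired.
    Assumes all heights are distinct.
    """
    n = len(heights)
    parent = list(range(n))
    lowest = list(range(n))   # lowest[root] = index of that set's lowest point
    seen = [False] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    pairs = []
    for u in sorted(range(n), key=lambda i: heights[i]):
        seen[u] = True
        for v in (u - 1, u + 1):            # neighbors on the profile
            if 0 <= v < n and seen[v]:      # v is a lower neighbor already swept
                ru, rv = find(u), find(v)
                if ru == rv:
                    continue
                a, b = lowest[ru], lowest[rv]
                young, old = (a, b) if heights[a] > heights[b] else (b, a)
                if young != u:              # two older sets meet at u: record a pair
                    pairs.append((heights[young], heights[u]))
                parent[ru] = rv             # union of the two sets
                lowest[rv] = old
    return pairs


profile = [2, 5, 1, 4, 0, 6, 3]
pairs = minimum_saddle_pairs(profile)
```

For this profile the minima at heights 1, 2, and 3 are merged away at heights 4, 5, and 6 respectively, while the global minimum at height 0 survives. A regular point only issues a union with its single lower neighbor, matching the slide; only a point with two lower pieces triggers the find-find-union-union pattern.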