## Union-Find Problem section 12.9.2

- Given a set {1, 2, …, n} of n elements.
- Initially each element is in a different set: {1}, {2}, …, {n}.
- An intermixed sequence of union and find operations is performed.
- A union operation combines two sets into one.
- Each of the n elements is in exactly one set at any time.
- A find operation returns the set that contains a particular element.

**Using Arrays And Chains**

- See Section 7.7 for applications as well as for solutions that use arrays and chains.
- The best time complexity obtained in Section 7.7 is O(n + u log u + f), where u and f are, respectively, the number of union and find operations that are done.
- Using a tree (not a binary tree) to represent a set, the time complexity becomes almost O(n + f) (assuming at least n/2 union operations).

**Uses of Union/Find**

Task scheduling. A factory has a single machine that is to perform n tasks. Task i has an integer release time $r_i$ and an integer deadline $d_i$. The completion of each task requires one unit of time on this machine. A feasible schedule is an assignment of tasks to time slots on the machine such that task i is assigned to a time slot between its release time and its deadline, and no slot has more than one task assigned to it.

Consider the following four tasks:

| Task | A | B | C | D |
|---|---|---|---|---|
| Release time | 0 | 0 | 1 | 2 |
| Deadline | 4 | 4 | 2 | 3 |

Tasks A and B are released at time 0, etc.

The following task-to-slot assignment is a feasible schedule:

| Time slot | 0 to 1 | 1 to 2 | 2 to 3 | 3 to 4 |
|---|---|---|---|---|
| Task | A | C | D | B |

Intuitive algorithm:

1. Sort the tasks into nonincreasing order of release time.
2. Consider the tasks in this nonincreasing order. For each task, determine the free slot nearest to, but not after, its deadline. If this free slot is before the task's release time, fail. Otherwise, assign the task to this slot.
Can use Union/Find for part 2 of the algorithm:

- Let d denote the latest deadline of any task.
- Usable time slots run from j-1 to j, where 1 <= j <= d.
- Refer to these slots as slots 1 through d.
- For any slot a, define near(a) as the largest j such that j <= a and slot j is free.
- If no such j exists, then near(a) = near(0) = 0.
- Two slots a and b are in the same equivalence class iff near(a) = near(b).
- Beginning: near(a) = a for all slots, and each slot is in its own equivalence class.
- When slot a is assigned a task, near changes for all slots b with near(b) = a.
- The new value of near is near(a-1).
- Thus perform a union on the equivalence classes that currently contain slots a and a-1.
- Find the equivalence class of slot a-1 by doing find(a-1).
- For each equivalence class e, retain in nearest[e] the value of near of its members.
- Get near(a) by doing nearest[find(a)].
- Assume that the equivalence class name is taken to be whatever the find returns.

**Task scheduling Example**

Consider the following four tasks:

| Task | A | B | C | D |
|---|---|---|---|---|
| Release time | 0 | 0 | 1 | 2 |
| Deadline | 4 | 4 | 2 | 3 |

Putting them in nonincreasing order of release time:

| Task | D | C | A | B |
|---|---|---|---|---|
| Deadline | 3 | 2 | 4 | 4 |
| Latest schedule | 2 | 1 | 3 | 3 |

[Figures: each step's slide also draws the equivalence-class trees; only the array values are reproduced here.]

- Initially each slot is in its own class. nearest (indices 0 to 3) as shown on the slide: 0 1 2 3.
- Schedule D in slot 2. All items in class 2 become class 1. nearest as shown on the slide: 0 1 1 3.
- Schedule C in slot 1. All items in class 1 become class 0. Scheduled so far: D in 2, C in 1. nearest as shown on the slide: 0 0 1 3.
- Schedule A in slot 3.
All items in class 3 become class 2. Scheduled so far: D in 2, C in 1, A in 3. nearest (indices 0 to 3) as shown on the slide: 0 0 1 0.

- Schedule B. Its latest slot is 3, which is in equivalence class 0, which has nearest value 0, so B goes in slot 0. Scheduled: D in 2, C in 1, A in 3, B in 0. nearest as shown on the slide: 0 0 0 0.

**Task Scheduling**

Resulting task schedule. The following task-to-slot assignment is a feasible schedule:

| Time slot | 0 to 1 | 1 to 2 | 2 to 3 | 3 to 4 |
|---|---|---|---|---|
| Task | B | C | D | A |

**A Set As A Tree**

- S = {2, 4, 5, 9, 11, 13, 30}
- Some possible tree representations: [Figure: trees on the elements of S.]

**Result Of A Find Operation**

[Figure: a tree representation of S.]

- find(i) is to return the set that contains element i.
- In most applications of the union-find problem, the user does not provide set identifiers.
- The requirement is that find(i) and find(j) return the same value iff elements i and j are in the same set.
- find(i) will return the element that is in the tree root.

**Strategy For find(i)**

[Figure: a tree representation of S.]

- Start at the node that represents element i and climb up the tree until the root is reached.
- Return the element in the root.
- To climb the tree, each node must have a parent pointer.

**Trees With Parent Pointers**

[Figure: trees drawn with parent pointers.]

**Possible Node Structure**

- Use nodes that have two fields: element and parent.
- Use an array table[] such that table[i] is a pointer to the node whose element is i.
- To do a find(i) operation, start at the node given by table[i] and follow parent fields until a node whose parent field is null is reached.
- Return the element in this root node.

**Example**

[Figure: a tree with table[] entries pointing to its nodes. Only some table entries are shown.]

**Better Representation**

[Figure: the same tree with its parent[] array.]

- Use an integer array parent[] such that parent[i] is the element that is the parent of element i.
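The parent[] representation can be illustrated with a small sketch. Everything named here is an assumption made for illustration: the class `ParentArrayDemo`, the helper `pathToRoot`, and the forest built by `exampleForest` are hypothetical and are not the trees drawn on the slides. The sketch follows the convention used by this document's find method, where `parent[i] == 0` marks a root, so elements are numbered from 1.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the parent[] representation; the forest below is a made-up example. */
class ParentArrayDemo {

    /** Returns the elements visited while climbing from i to its root, inclusive. */
    static List<Integer> pathToRoot(int[] parent, int i) {
        List<Integer> path = new ArrayList<>();
        path.add(i);
        while (parent[i] != 0) { // 0 marks a root
            i = parent[i];       // follow the parent pointer
            path.add(i);
        }
        return path;
    }

    /** Hypothetical forest on elements 1..7: one tree rooted at 4, one rooted at 7. */
    static int[] exampleForest() {
        int[] parent = new int[8]; // index 0 is unused
        parent[1] = 2;
        parent[2] = 4;
        parent[3] = 4;
        parent[5] = 7;
        parent[6] = 7;
        // parent[4] and parent[7] stay 0, so 4 and 7 are roots
        return parent;
    }

    public static void main(String[] args) {
        int[] parent = exampleForest();
        System.out.println(pathToRoot(parent, 1)); // prints [1, 2, 4]
        System.out.println(pathToRoot(parent, 6)); // prints [6, 7]
    }
}
```

Compared with the node-and-table structure, a single integer array holds the whole forest, and the last element of each path is the value a find would return.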
**Union Operation**

- union(i,j)
- i and j are the roots of two different trees, i != j.
- To unite the trees, make one tree a subtree of the other.
- parent[j] = i

**Union Example**

[Figure: two trees with roots 7 and 13.]

- union(7,13)

**The Find Method**

    // Return the root of the tree that contains theElement.
    public int find(int theElement) {
        while (parent[theElement] != 0)
            theElement = parent[theElement]; // move up
        return theElement;
    }

**The Union Method**

    public void union(int rootA, int rootB) {
        parent[rootB] = rootA; // rootA and rootB must be distinct roots
    }

**Time Complexity Of union()**

- O(1)
- Time for u unions: O(u)

**Time Complexity of find()**

[Figure: a chain with 5 at the root, then 4, 3, 2, 1.]

- Tree height may equal the number of elements in the tree.
- union(2,1), union(3,2), union(4,3), union(5,4), …
- So the complexity of a single find is O(u).
- Worst case:
  - Do u unions resulting in one tree as in the chain above (height u).
  - Do f finds on the last element.
  - Time is O(uf).

**u Unions and f Find Operations**

- O(u + uf) = O(uf)
- Time to initialize parent[i] = 0 for all i is O(n).
- Total time is O(n + uf).
- Worse than the solution of Section 7.7!
- Back to the drawing board.

**Smart Union Strategies**

[Figure: two trees with roots 7 and 13.]

- union(7,13)
- Which tree should become a subtree of the other?

**Height Rule**

- Make the tree with the smaller height a subtree of the other tree.
- Break ties arbitrarily.

**Weight Rule**

- Make the tree with fewer elements a subtree of the other tree.
- Break ties arbitrarily.

**Implementation**

- The root of each tree must record either its height or the number of elements in the tree.
- When a union is done using the height rule, the height increases only when two trees of equal height are united.
- When the weight rule is used, the weight of the new tree is the sum of the weights of the trees that are united.

**Height Of A Tree**

- If we start with single-element trees and perform unions using either the height or the weight rule, the height of a tree with p elements is at most $\lfloor \log_2 p \rfloor + 1$.
- Proof is by induction on p.
  - Trivial for p = 1.
  - Assume true for all trees with i nodes, i <= p-1.
  - Show for trees with p nodes.
  - Consider the last union operation union(k,j) used to create tree t, which has p nodes.
  - Let m be the number of nodes in tree j and let p-m be the number of nodes in tree k.
  - Assume that 1 <= m <= p/2, so j is made a subtree of k.
  - Therefore the height of t is either the same as that of k or one more than that of j.
  - If the former then, since the number of nodes in k is p-m, by induction the height of t is at most $\lfloor \log_2 (p-m) \rfloor + 1 \le \lfloor \log_2 p \rfloor + 1$.
  - If the latter then, by induction, the height of t is at most $\lfloor \log_2 m \rfloor + 2 \le \lfloor \log_2 (p/2) \rfloor + 2 = \lfloor \log_2 p \rfloor + 1$, since $\log_2 (p/2) = \log_2 p - \log_2 2 = \log_2 p - 1$.

**Sprucing Up The Find Method**

[Figure: a tree containing element 1; a, b, c, d, e, f, and g are subtrees.]

- find(1)
- Do additional work to make future finds easier.

**Path Compaction**

- Make all nodes on the find path point to the tree root.
- find(1) makes two passes up the tree.

**Path Splitting**

- Nodes on the find path point to their former grandparent.
- find(1) makes only one pass up the tree.

**Path Halving**

- The parent pointer in every other node on the find path is changed to the former grandparent.
- find(1) changes half as many pointers.

**Time Complexity**

- Ackermann's function:
  - $A(i,j) = 2^j$, for $i = 1$ and $j \ge 1$
  - $A(i,j) = A(i-1, 2)$, for $i \ge 2$ and $j = 1$
  - $A(i,j) = A(i-1, A(i,j-1))$, for $i, j \ge 2$
- Inverse of Ackermann's function:
  - $\alpha(p,q) = \min\{z \ge 1 \mid A(z, p/q) > \log_2 q\}$, for $p \ge q \ge 1$
  - That is, find the minimum z such that $A(z, p/q) > \log_2 q$.
- Ackermann's function grows very rapidly as i and j are increased.
  - $A(2,1) = A(1,2) = 2^2 = 4$
  - For $j \ge 2$, $A(2,j) = A(1, A(2,j-1)) = 2^{A(2,j-1)}$.
  - So $A(2,2) = 2^{A(2,1)} = 2^4 = 16$.
  - $A(2,3) = 2^{A(2,2)} = 2^{16} = 65{,}536$
  - $A(2,4) = 2^{65{,}536}$
  - $A(2,j) = 2^{2^{2^{\cdots}}}$, a tower with j + 1 levels of twos.
  - $A(2,j)$ for j > 3 is very large.
- The inverse function grows very slowly.
  - $\alpha(p,q) < 5$ until $q = 2^{A(4,1)}$.
  - $A(4,1) = A(2,16) \gg A(2,4)$
- In the analysis of the union-find problem, q is the number, n, of elements; p = n + f; and u >= n/2.
- For all practical purposes, $\alpha(p,q) < 5$.

Theorem 12.2 [Tarjan and Van Leeuwen]. Let T(f,u) be the time required to process any intermixed sequence of f finds and u unions. Assume that u >= n/2. Then

$$a\,\bigl(n + f\,\alpha(f+n, n)\bigr) \le T(f,u) \le b\,\bigl(n + f\,\alpha(f+n, n)\bigr)$$

where a and b are constants. These bounds apply when we start with singleton sets and use either the weight or height rule for unions and any one of the path compression methods for a find.
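The setting of the theorem, a smart union rule combined with a path compression method, can be sketched as follows. This is a minimal sketch under stated assumptions, not the text's implementation: the class name `WeightedUnionFind` and the `weight[]` array are introduced here for illustration, the weight rule and path halving are picked arbitrarily from the options listed above, and `parent[i] == 0` marks a root as in the find method shown earlier. As in the original union method, `union` expects two distinct roots.

```java
import java.util.Arrays;

/** Sketch: union-find on elements 1..n with the weight rule and path halving. */
class WeightedUnionFind {
    private final int[] parent; // parent[i] == 0 marks a root
    private final int[] weight; // weight[r] = number of elements in the tree rooted at r

    WeightedUnionFind(int n) {
        parent = new int[n + 1]; // all zeros: every element starts as a root
        weight = new int[n + 1];
        Arrays.fill(weight, 1);  // every singleton tree has one element
    }

    /** Path halving: every other node on the find path is re-pointed to its grandparent. */
    int find(int e) {
        while (parent[e] != 0 && parent[parent[e]] != 0) {
            parent[e] = parent[parent[e]]; // point e at its former grandparent
            e = parent[e];                 // continue from that grandparent
        }
        // Here e is either the root or a child of the root.
        return parent[e] == 0 ? e : parent[e];
    }

    /** Weight rule: the tree with fewer elements becomes a subtree of the other. */
    void union(int rootA, int rootB) {
        if (weight[rootA] < weight[rootB]) { // make rootA the heavier root
            int tmp = rootA;
            rootA = rootB;
            rootB = tmp;
        }
        parent[rootB] = rootA;
        weight[rootA] += weight[rootB];
    }

    public static void main(String[] args) {
        WeightedUnionFind sets = new WeightedUnionFind(5);
        // Same pairs as the worst-case chain example, but passing roots obtained from find.
        for (int i = 2; i <= 5; i++) {
            sets.union(sets.find(i), sets.find(i - 1));
        }
        System.out.println(sets.find(1) == sets.find(5)); // prints true
        System.out.println(sets.find(5));                 // prints 2
    }
}
```

With the weight rule, the union sequence that produced a chain under the naive union instead attaches every element directly to root 2, so each later find follows at most one pointer.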