
Disjoint Set Structures for Operations over Sets

This recitation discusses the implementation of disjoint set structures for set operations such as finding which set an object belongs to and merging two sets. It includes performance analysis and different approaches to improve efficiency.


Presentation Transcript


  1. Disjoint Set Structures for Operations over Sets (Reference: textbook, pp. 175-180) CS2223 Recitation 3, March 30, 2005, Song Wang

  2. Problem Description • Given: • A set S with N objects, identified by the numbers 1 to N. • Disjoint partitions (subsets) of S: every item belongs to exactly one partition, and no item belongs to more than one. • What to do: • Find: given an object, find which set contains it. • Merge: given two sets, combine them into one set. • Why: • These are basic, frequently used building blocks for set operations such as union and intersection. • Consequently, they matter for many other algorithms, such as finding a minimum spanning tree. (A sketch of what the two operations mean follows below.)
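
The following is a minimal, illustration-only sketch (in Python, which is an assumption; the slides themselves use pseudocode) of what Find and Merge mean, using an explicit list of sets. The example partition Set 1 = {1, 7, 8, 9}, Set 2 = {2, 5}, Set 3 = {3, 4, 6} is the one implied by the arrays on slides 3-6; the remaining slides replace this naive representation with an array/tree representation.

    partitions = [{1, 7, 8, 9}, {2, 5}, {3, 4, 6}]

    def find(x):
        """Return the partition (as a Python set) that contains item x."""
        for p in partitions:
            if x in p:
                return p
        raise ValueError(f"{x} is not in any partition")

    def merge(a, b):
        """Replace the two partitions containing items a and b by their union."""
        pa, pb = find(a), find(b)
        if pa is not pb:                 # already in the same partition: nothing to do
            partitions.remove(pa)
            partitions.remove(pb)
            partitions.append(pa | pb)

    print(find(7))     # the set containing 7, i.e. {1, 7, 8, 9}
    merge(7, 2)        # unite {1, 7, 8, 9} with {2, 5}
    print(find(2))     # now {1, 2, 5, 7, 8, 9}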

  3. Preliminaries • Data structure for a set: a tree. • Each set is represented by its root (parent) node; one natural choice is to use the smallest object in the set as the root. [Figure: items 1-9 drawn as three trees, Set 1, Set 2, and Set 3.]

  4. Preliminaries II • A "degraded" linked list: a single array, indexed 1 to 9, that records only the parent (set label) of each item. [Figure: the array before and after this adaptation.] (A concrete instance of the array is sketched below.)
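
As a concrete illustration (a sketch only; storing the items 1-based with an unused slot 0 is my assumption for readability), the partition Set 1 = {1, 7, 8, 9}, Set 2 = {2, 5}, Set 3 = {3, 4, 6} becomes a single array:

    # set_[i] holds the label of the set containing item i (index 0 unused).
    #   item:     1  2  3  4  5  6  7  8  9
    set_ = [0,    1, 2, 3, 3, 2, 3, 1, 1, 1]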

  5. Solution 1: find1() • The array stores, for each item, the label of its set: Index: 1 2 3 4 5 6 7 8 9 / Array: 1 2 3 3 2 3 1 1 1 • Examples: find1(7) = 1 (item 7 belongs to set 1); find1(2) = 2 (item 2 belongs to set 2).
     Function find1(x)
         return set[x]
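
A runnable sketch of find1 in Python (an illustration under the same assumptions as above, not the textbook's code): since the set label is stored directly, a find is a single array access.

    def find1(set_, x):
        return set_[x]              # the label of x's set is stored at index x

    set_ = [0, 1, 2, 3, 3, 2, 3, 1, 1, 1]   # index 0 unused
    print(find1(set_, 7))   # 1: item 7 belongs to set 1
    print(find1(set_, 2))   # 2: item 2 belongs to set 2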

  6. Solution 1: merge1() • Merge set 1 and set 2 by scanning the whole array and relabeling every item of the larger-numbered set: Before: 1 2 3 3 2 3 1 1 1 (Index: 1 2 3 4 5 6 7 8 9) / After: 1 1 3 3 1 3 1 1 1.
     Procedure merge1(a, b)
         i <- min(a, b)
         j <- max(a, b)
         for k <- 1 to N do
             if set[k] = j then set[k] <- i
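
A runnable sketch of merge1 in Python (same illustrative conventions as above): every occurrence of the larger label is rewritten, which is why a merge costs a full scan.

    def merge1(set_, a, b):
        i, j = min(a, b), max(a, b)
        for k in range(1, len(set_)):   # scan items 1..N
            if set_[k] == j:            # member of the larger-numbered set:
                set_[k] = i             # relabel it with the smaller label

    set_ = [0, 1, 2, 3, 3, 2, 3, 1, 1, 1]
    merge1(set_, 1, 2)
    print(set_[1:])    # [1, 1, 3, 3, 1, 3, 1, 1, 1]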

  7. Performance Analysis of find1() and merge1() • Case study: a sequence of n finds and at most N-1 merges (n comparable to N). • find1 takes constant time: Θ(1). • merge1 takes linear time: Θ(N). • Total: n·Θ(1) + (N-1)·Θ(N) = Θ(N²), i.e. Θ(n²) since n is comparable to N.

  8. Can We Do Better? • Merge set 1 and set 2 by linking one tree to the other, instead of relabeling every element. [Figure: the trees for Set 1, Set 2, and Set 3, before and after merging Set 1 and Set 2.]

  9. Solution 2: merge2() • Merge set 1 and set 2 with a single pointer change: Before: 1 2 3 3 2 3 1 1 1 (Index: 1 2 3 4 5 6 7 8 9) / After: 1 1 3 3 2 3 1 1 1 (item 5 now reaches root 1 through 2).
     Procedure merge2(a, b)
         if a < b then set[b] <- a
         else set[a] <- b
     This guarantees that the root of the merged tree is the smallest label.
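
A runnable sketch of merge2 in Python (illustration only; a and b are assumed to be the roots of the two trees being merged):

    def merge2(set_, a, b):
        # Attach the tree with the larger root label under the smaller root.
        if a < b:
            set_[b] = a
        else:
            set_[a] = b

    set_ = [0, 1, 2, 3, 3, 2, 3, 1, 1, 1]
    merge2(set_, 1, 2)
    print(set_[1:])    # [1, 1, 3, 3, 2, 3, 1, 1, 1]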

  10. Solution 2: find2() • Example: find2(5) = 1, but we must traverse the whole path from node 5 up to the root node 1 (5 -> 2 -> 1). [Figure: the merged tree for Set 1.]
     Function find2(x)
         r <- x
         while set[r] != r do r <- set[r]
         return r
     (Only for a root does r = set[r] hold.)
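
A runnable sketch of find2 in Python (same conventions as the previous sketches): follow parent pointers until reaching a node that is its own parent.

    def find2(set_, x):
        r = x
        while set_[r] != r:     # only a root satisfies set_[r] == r
            r = set_[r]
        return r

    set_ = [0, 1, 1, 3, 3, 2, 3, 1, 1, 1]   # the state after merge2(set_, 1, 2)
    print(find2(set_, 5))   # 1: the path followed is 5 -> 2 -> 1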

  11. Performance Analysis of find2() and merge2() • Case study: a sequence of n finds and at most N-1 merges (n comparable to N). • find2 takes linear time in the worst case: Θ(N). • merge2 takes constant time: Θ(1). • Total: n·Θ(N) + (N-1)·Θ(1) = Θ(N²), i.e. Θ(n²). • No improvement!

  12. What Is the Problem? • The worst case is a linear tree: the sequence merge2(5,6), merge2(4,5), ..., merge2(1,2) produces the chain 6 -> 5 -> 4 -> 3 -> 2 -> 1, so find2(6) must follow N-1 parent links. • The height of the tree is what determines performance. [Figure: the chain of trees produced by this merge sequence.] (The sketch below reproduces this worst case.)
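
A small self-contained Python sketch (merge2 is repeated here so the snippet runs on its own) that builds this worst-case chain and counts the links a find2(6) would follow:

    def merge2(set_, a, b):
        if a < b:
            set_[b] = a
        else:
            set_[a] = b

    N = 6
    set_ = list(range(N + 1))        # every item starts as its own root
    for k in range(N - 1, 0, -1):    # merge2(5,6), merge2(4,5), ..., merge2(1,2)
        merge2(set_, k, k + 1)

    r, links = 6, 0                  # walk the chain as find2(6) would
    while set_[r] != r:
        r, links = set_[r], links + 1
    print(r, links)                  # 1 5: the root, reached after N-1 = 5 links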

  13. How to Avoid a Bad Merge Tree • Merge(1,4): the height of the result depends on which tree becomes a subtree of the other. [Figure: two ways of merging the trees rooted at 1 and 4.]

  14. Who’s whose subtree? • Tree t1 has height h1 and tree t2 has height h2. • If h1 < h2: t1 becomes a subtree of t2 and the merged tree's height is h2. • If h1 > h2: t2 becomes a subtree of t1 and the merged tree's height is h1. • If h1 == h2: either tree becomes a subtree of the other and the merged tree's height is h1 + 1. • Note: the root of the tree is no longer always the smallest node!

  15. Theorem 5.9.1, p. 177 • A tree containing k nodes, built by the merge rule above, has height at most ⌊lg k⌋. • Proof by induction (sketched below).
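
A brief sketch of the induction argument (my paraphrase of the standard proof, not necessarily the textbook's wording): for k = 1 the height is 0 = ⌊lg 1⌋. For k > 1, consider the last merge that formed the tree, joining a tree of a nodes and height h_a with a tree of b nodes and height h_b, where a + b = k and, by induction, h_a ≤ ⌊lg a⌋ and h_b ≤ ⌊lg b⌋. If the heights differ, the merged height is max(h_a, h_b) ≤ ⌊lg max(a, b)⌋ ≤ ⌊lg k⌋. If the heights are equal, the merged height is h_a + 1 ≤ ⌊lg min(a, b)⌋ + 1 = ⌊lg(2·min(a, b))⌋ ≤ ⌊lg(a + b)⌋ = ⌊lg k⌋.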

  16. Solution 3: merge3() (merge by height)
     Procedure merge3(a, b)
         if height[a] = height[b] then
             height[a] <- height[a] + 1
             set[b] <- a
         else if height[a] > height[b] then
             set[b] <- a
         else
             set[a] <- b
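
A runnable sketch of merge3 in Python (illustration only; it assumes a and b are roots and that a separate height array, initialized to 0 for singleton sets, is maintained alongside set_):

    def merge3(set_, height, a, b):
        # Union by height: the shorter tree goes under the taller one, so the
        # height grows only when the two trees have equal height.
        if height[a] == height[b]:
            height[a] += 1
            set_[b] = a
        elif height[a] > height[b]:
            set_[b] = a
        else:
            set_[a] = b

    N = 9
    set_ = list(range(N + 1))        # every item starts as its own root ...
    height = [0] * (N + 1)           # ... in a tree of height 0
    merge3(set_, height, 1, 2)       # equal heights: 2 goes under 1, height[1] becomes 1
    merge3(set_, height, 1, 3)       # 1 is taller: 3 goes under 1, height unchanged
    print(set_[1:4], height[1])      # [1, 1, 1] 1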

  17. Performance Analysis of find2() and merge3() • Case study: a sequence of n finds and at most N-1 merges (n comparable to N). • find2 now takes sub-linear time: Θ(log N) in the worst case, by Theorem 5.9.1. • merge3 takes constant time: Θ(1). • Total: n·Θ(log N) + (N-1)·Θ(1) = Θ(n log n). • Some improvement.

  18. Path Compression in find3() • Intuitive explanation: the more fan-out each node has, the smaller the height of the tree. • Example: find3(20) reattaches every node on the path from 20 to the root directly to the root. [Figure: the tree before and after find3(20).]

  19. Solution 3: find3() (find with path compression)
     Function find3(x)
         r <- x
         while set[r] != r do r <- set[r]
         i <- x
         while i != r do
             j <- set[i]
             set[i] <- r
             i <- j
         return r
     (The first loop traverses the path to find the root; the second loop traverses
     the path again and connects every node on it directly to the root.)
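
A runnable Python sketch of find3 (same array conventions as before). Note that compressing paths does not update the stored heights; after compression they are only upper bounds on the true heights, which is still sufficient for merge3's decisions.

    def find3(set_, x):
        r = x
        while set_[r] != r:     # first traversal: locate the root
            r = set_[r]
        i = x
        while i != r:           # second traversal: point every node on the
            j = set_[i]         # path directly at the root
            set_[i] = r
            i = j
        return r

    # The worst-case chain from slide 12: 6 -> 5 -> 4 -> 3 -> 2 -> 1.
    set_ = [0, 1, 1, 2, 3, 4, 5]
    print(find3(set_, 6))   # 1
    print(set_[1:])         # [1, 1, 1, 1, 1, 1]: the path is now flattened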

  20. Performance Analysis of find3() and merge3() • Case study: a sequence of n finds and at most N-1 merges (n comparable to N). • find3 takes only slightly more than constant time per call, on average over the sequence. • merge3 takes constant time: Θ(1). • Total: close to Θ(n). • The best combination!

  21. Summary • find1() and merge1(): best for find, worst for merge (height = 1, always). • find2() and merge2(): best for merge, worst for find (height = N in the worst case). • find2() and merge3(): a mix of the above (height = lg N in the worst case). • find3() and merge3(): best for both (height stays close to 1).
