- 122 Views
- Uploaded on
- Presentation posted in: General

CHAPTER 8 THE DISJOINT SET ADT

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

CHAPTER 8

THE DISJOINT SET ADT

§1 Equivalence Relations

【Definition】A relationR is defined on a set S if for every pair of elements (a, b), a, b S, a R b is either true or false. If a R b is true, then we say that a is related to b.

【Definition】A relation, ~, over a set, S, is said to be an equivalence relation over S iff it is symmetric, reflexive, and transitive over S.

【Definition】Two members x and y of a set S are said to be in the same equivalence class iff x ~ y.

1/12

Given an equivalence relation ~, decide for any a and b if a ~ b.

§2 The Dynamic Equivalence Problem

〖Example〗Given S = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 } and 9 relations: 124, 31, 610, 89, 74, 68, 35, 211, 1112.

The equivalence classes are { 2, 4, 7, 11,12 }, { 1, 3, 5 }, { 6, 8, 9, 10 }

Algorithm:

{ /* step 1: read the relations in */

Initialize N disjoint sets;

while ( read in a ~ b ) {

if ( ! (Find(a) == Find(b)) )

Union the two sets;

} /* end-while */

/* step 2: decide if a ~ b */

while ( read in a and b )

if ( Find(a) == Find(b) ) output( true );

else output( false );

}

(Union / Find)

Dynamic (on-line)

2/12

§2 The Dynamic Equivalence Problem

10

4

2

6

8

7

1

9

3

5

Elements of the sets: 1, 2, 3, ..., N

Sets : S1, S2, ... ... and SiSj = ( ifi j ) —— disjoint

〖Example〗S1 = { 6, 7, 8, 10 }, S2 = { 1, 4, 9 }, S3 = { 2, 3, 5 }

Note:

Pointers are

from children

to parents

A possible forest representation of these sets

Operations :

(1) Union( i, j ) ::= Replace Si and Sj by S = SiSj

(2) Find( i ) ::= Find the set Sk which contains the element i.

3/12

10

10

10

4

4

4

2

6

6

6

8

8

8

7

7

7

1

1

1

9

9

9

3

5

S1

name[ ]

S2

S3

§3 Basic Data Structure

Union ( i, j )

Idea: Make Sia subtree of Sj , or vice versa. That is, we can set the parent pointer of one of the roots to the other root.

S1S2

S2 S1

Implementation 1:

S2 S1

S2

4/12

§3 Basic Data Structure

void SetUnion ( DisjSet S,

SetType Rt1,

SetType Rt2 )

{ S [ Rt2 ] = Rt1 ; }

SetType Find ( ElementType X,

DisjSet S )

{ for ( ; S[X] > 0; X = S[X] ) ;

return X ;

}

S

10

4

2

[5]

[7]

[8]

[9]

[10]

[3]

[4]

[2]

[1]

[6]

6

8

0

2

2

0

10

4

10

10

0

4

7

1

9

3

5

j

name[k]

S

...

i

Implementation 2: S [ element ] = the element’s parent.

Note: S [ root ] = 0 and set name = root index.

〖Example〗The array representation of the three sets is

Here we use the fact that

the elements are numbered from 1 to N.

Hence they can be used as

indices of an array.

10

( S1S2 S1) S [ 4 ] = 10

Find ( i )

Implementation 1:

Implementation 2:

find ( i ) =

‘S’

5/12

§3 Basic Data Structure

Algorithm using union-find operations

{ Initialize Si = { i } fori = 1, ..., 12 ;

for ( k = 1; k <= 9; k++ ) { /* for each pair i j */

if ( Find( i ) != Find( j ) )

SetUnion( Find( i ), Find( j ) );

}

}

T = ( N2 ) !

That’s not

good.

N

N1

1

Analysis

Practically speaking, union and find are always paired. Thus we consider the performance of a sequence of union-find operations.

Sure. Try this one:

union(2, 1), find(1);

union(3, 2), find(1);

..., ... ;

union(N, N 1), find(1).

Can you think of

a worst case example?

〖Example〗 Given S = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 } and 9 relations: 124, 31, 610, 89, 74, 68, 35, 211, 1112. We have 3 equivalence classes { 2, 4, 7, 11, 12 }, { 1, 3, 5 }, and { 6, 8, 9, 10 }

6/12

Now T = O( N )

for the worst case

example I gave.

1

2

N

§4 Smart Union Algorithms

Union-by-Size

-- Always change the smaller tree

S [ Root ] = – size; /* initialized to be –1 */

【Lemma】Let T be a tree created by union-by-size with N nodes, then

Proof: By induction. (Each element can have its set name changed at most log2N times.)

Time complexity of N Union and M Find operations is now O( N + M log2N ).

Union-by-Height

-- Always change the shallow tree

Please read Figure 8.13 on p.273 for detailed implementation.

7/12

§5 Path Compression

SetType Find ( ElementType X, DisjSet S )

{

if ( S[ X ] <= 0 ) return X;

else return S[ X ] = Find( S[ X ], S );

}

Slower for

a single find, but

faster for a sequence of

find operations.

SetType Find ( ElementType X, DisjSet S )

{ ElementType root, trail, lead;

for ( root = X; S[ root ] > 0; root = S[ root ] )

; /* find the root */

for ( trail = X; trail != root; trail = lead ) {

lead = S[ trail ] ;

S[ trail ] = root ;

} /* collapsing */

return root ;

}

Note: Not compatible with union-by-height since it changes the heights. Just take “height” as an estimated rank.

8/12

§6 Worst Case for

Union-by-Rank and Path Compression

【Lemma (Tarjan)】Let T( M, N ) be the maximum time required to process an intermixed sequence of M N finds and N 1 unions. Then:

k1M ( M, N ) T( M, N ) k2M ( M, N )

for some positive constants k1and k2 .

log* 265536 = 5 since logloglogloglog ( 265536 )= 1

Ackermann’s Function and ( M, N )

http://mathworld.wolfram.com/AckermannFunction.html

O( log*N ) 4

log*N (inverse Ackermann function)

= # of times the logarithm is applied to N until the result 1.

9/12

Home work: p.282 8.7 A formatted version

We have a network of computers and a list of bi-directional connections. Each of these connections allows a file transfer from one computer to another. Is it possible to send a file from any computer on the network to any other?

Input:Input consists of several test cases.For each test case, the first line contains N (<=10,000), the total number of computers in a network. Each computer in the network is then represented by a positive integer between 1 and N.Then in the following lines, the input is given in the format:

I c1 c2where I stands for inputting a connection between c1 and c2; or

C c1 c2where C stands for checking if it is possible to transfer files between c1 and c2; or

Swhere Sstands for stopping this case.

10/12

Output:For each C case, print in one line the word yes or no if it is possible or impossible to transfer files between c1 and c2, respectively. At the end of each case, print in one line The network is connected. if there is a path between any pair of computers; or There are k components. where k is the number of connected components in this network.Print a blank line between test cases.

Sample Input:

3

C 1 2

I 1 2

C 1 2

S

3

I 3 1

I 2 3

C 1 2

S

Sample Output:

no

yes

There are 2 components.

yes

The network is connected.

Must use union-by-size and path compression to obtain a worst-case running time of O( M ( M, N ) ).

11/12

Laboratory Project 3

Attack of Panda

Due: Friday, October 21st, 2011 at 10:00pm

Detailed requirements can be downloaded from http://www.cs.zju.edu.cn/ds/

Don’t forget to sign

you names

and duties at the end of

your report.