New algorithms for enumerating all maximal cliques

Kazuhisa Makino (Osaka University, Japan) and Takeaki Uno (National Institute of Informatics, Japan)
SWAT 2004, 9 July 2004



Background

・Enumeration algorithms have recently attracted interest
・Many nice problems are still unsolved (unlike in ordinary discrete algorithms)
・The recent increase in computing power makes many enumeration problems practically solvable
 ⇒ many applications have appeared, such as genome analysis, data mining, and clustering
・Some (theoretical) algorithms use enumeration as a subroutine (e.g. recognition of perfect graphs)


Background (cont.)

・My institute has 100 informatics researchers
・At least 5 of them (independently) use implementations of enumeration algorithms
・Suppose that there are 100,000 informatics researchers in the world
 ⇒ do 5,000 researchers use enumeration algorithms?


Problems and Results

Problem 1: for a given graph G = (V, E), enumerate all maximal cliques in G
Problem 2: for a given bipartite graph G = (V1 ∪ V2, E), enumerate all maximal bipartite cliques in G
(Problem 2 is a special case of Problem 1)

・We propose algorithms for these problems that reduce the time complexity in both dense and sparse cases
・Computational experiments on random graphs and real-world data


Difficulty

・Consider branch-and-bound type enumeration: divide the maximal cliques into two groups, those including v and those not including v
・If a group includes no maximal clique, cut off the branch
 ⇒ but finding a maximal clique that includes none of the vertices of a given set S is NP-complete
 ⇒ so we cannot cut off the subproblems (branches) that include no maximal clique

(Figure: branching tree that splits first on whether v1 ∈ K, then on whether v2 ∈ K.)


Existing Studies and Ours

O(|V||E|): Tsukiyama, Ide, Ariyoshi & Shirakawa
O(|V||E|), lexicographic order: Johnson, Yannakakis & Papadimitriou
O(a(G)|E|): Chiba & Nishizeki
 (a(G): arboricity of G, with m/(n-1) ≤ a(G) ≤ m^{1/2})
・Many heuristic algorithms in data mining, for the bipartite case

Ours (time per maximal clique):
O(|V|^2.376) (dense case)
O(Δ^4) (sparse case)
O((Δ*)^4 + θ^3) (θ vertices have degree > Δ*)
O(Δ^3) (bipartite case)
O(Δ^2) (bipartite case, using much memory)



Enumeration of Maximal Cliques

・Improved version of the algorithm of Tsukiyama et al.
Idea: construct a route that traverses all maximal cliques
・For a maximal clique K of G = (V, E):
 C(S): the lexicographically maximum maximal clique including a clique S
 K_{≤i}: the vertices of K with indices ≤ i
 i(K): the minimum index i s.t. C(K_{≤i}) = K (then C(K_{≤j}) = K for every j ≥ i)
 parent of a maximal clique K: C(K_{≤i(K)-1})
・The parent is lexicographically larger than K

(Figure: example graph on vertices 1 to 11 showing a maximal clique K and its index i(K). Lexicographically larger: 1,2,3 > 1,2,4 and 1,3,6 > 1,4,5.)
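The definitions of C, i(K) and the parent can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the graph is assumed to be a dict mapping each vertex index to its set of neighbours, the function names (`lex_larger`, `complete`, `parent_index`, `parent`) and the small example graph are hypothetical, and C is computed by the plain greedy scan rather than by any of the faster methods discussed later.

```python
def lex_larger(x, y):
    """True if vertex set x is lexicographically larger than y, i.e. the
    smallest vertex on which the two sets differ belongs to x."""
    diff = set(x) ^ set(y)
    return bool(diff) and min(diff) in x


def complete(adj, seed):
    """C(seed): scan the vertices in index order and add each one that is
    adjacent to every vertex collected so far."""
    clique = set(seed)
    for v in sorted(adj):
        if v not in clique and all(v in adj[u] for u in clique):
            clique.add(v)
    return frozenset(clique)


def parent_index(adj, K):
    """i(K): the smallest i with C(K_{<=i}) == K.
    Returns 0 when K is the root C(empty set)."""
    # K_{<=i} only changes at members of K, so testing 0 and the members suffices.
    for i in [0] + sorted(K):
        if complete(adj, {v for v in K if v <= i}) == K:
            return i
    raise ValueError("K is not a maximal clique")


def parent(adj, K):
    """C(K_{<=i(K)-1}), or None for the root."""
    i = parent_index(adj, K)
    return None if i == 0 else complete(adj, {v for v in K if v < i})


# Hypothetical example graph: triangle 1-2-3 plus the path 3-4-5.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
root = complete(adj, set())
```

On this example graph the root is the triangle, and following `parent` from the clique containing 4 and 5 leads back to it through the clique containing 3 and 4, each step being lexicographically larger than the previous one.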


Graph Representation of Relation

・The parent-child relation is acyclic
 ⇒ its graph representation forms a tree (the enumeration tree)
 ⇒ visit all maximal cliques by depth-first search
・For this we need to find the children of a maximal clique



Child of Maximal Clique

Γ(vi): the vertices adjacent to vi
K[i] = C((K_{≤i} ∩ Γ(vi)) ∪ {vi})
・H is a child of K only if H = K[i] for some i > i(K)
 (K[i] is a child of K iff the parent of K[i] is K)
・For such a child, i(K[i]) = i
・Construct K[i] in O(|E|) time
・Construct the parent in O(|E|) time (O(Δ^2) time)
・Doing this for i = i(K)+1, …, |V| takes O(|V||E|) time
 ⇒ enumeration in O(|V||E|) time per maximal clique

(Figure: example graph on vertices 1 to 11 with a maximal clique K, i(K) = 6.)
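The straightforward O(|V||E|)-per-clique procedure described on this slide can be sketched as below: for each i > i(K) with vi not in K, build K[i], recompute its parent, and keep it only if the parent is K. This is an illustrative sketch under assumptions, not the authors' code: the adjacency-dict representation, the function names and the example graph are hypothetical, and the extra test that i is the parent index of K[i] is added here so that each child is reported once.

```python
def complete(adj, seed):
    """C(seed): greedy scan in index order (lexicographically maximum
    maximal clique that includes seed)."""
    clique = set(seed)
    for v in sorted(adj):
        if v not in clique and all(v in adj[u] for u in clique):
            clique.add(v)
    return frozenset(clique)


def parent_index(adj, K):
    """i(K): smallest i with C(K_{<=i}) == K (0 for the root)."""
    for i in [0] + sorted(K):
        if complete(adj, {v for v in K if v <= i}) == K:
            return i
    raise ValueError("K is not a maximal clique")


def children(adj, K):
    """Yield every child of K, each exactly once."""
    i_K = parent_index(adj, K)
    for i in sorted(adj):
        if i <= i_K or i in K:
            continue
        # K[i] = C((K_{<=i} ∩ Γ(v_i)) ∪ {v_i})
        H = complete(adj, {v for v in K if v <= i and v in adj[i]} | {i})
        # Keep K[i] only if i is its parent index and its parent is K.
        if parent_index(adj, H) == i and complete(adj, {v for v in H if v < i}) == K:
            yield H


def maximal_cliques(adj):
    """Depth-first traversal of the enumeration tree from the root C(empty set)."""
    stack = [complete(adj, set())]
    while stack:
        K = stack.pop()
        yield K
        stack.extend(children(adj, K))


# Hypothetical example graph: triangle 1-2-3 plus the path 3-4-5.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
cliques = list(maximal_cliques(adj))
```

Because the traversal only needs the current clique and the graph, no table of already-found cliques is kept; the parent test alone prevents duplicates.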



Characterization of Child

The parent of K[i] is K ⇔
 (1) no vj, j < i, is adjacent to all vertices in (K_{≤i} ∩ Γ(vi)) ∪ {vi}
 (2) no vj, j < i, is adjacent to all vertices in (K_{≤i} ∩ Γ(vi)) ∪ K_{≤j}
(1) is not satisfied ⇔ K[i] and the parent of K[i] include some vj ∉ K
(2) is not satisfied ⇔ the parent of K[i] includes some vj ∉ K

Example:
 K = {3,4,7,9}
 K[10] = {3,7,10}
 K_{≤5} = {3,4}
 K_{≤7} ∩ Γ(v10) = {3,7}

(Figure: vertices 3, 4, 7, 9 and 10, highlighting (K_{≤10} ∩ Γ(v10)) ∪ {v10}.)


Use of Matrix Multiplication

・Check conditions (1) and (2) by matrix multiplication
 (1) no vj, j < i, is adjacent to all vertices in (K_{≤i} ∩ Γ(vi)) ∪ {vi}
 i-th row of the left matrix ⇒ (K_{≤i} ∩ Γ(vi)) ∪ {vi}
 j-th column of the right matrix ⇒ Γ(vj)
 (i, j) cell of the product ⇒ |((K_{≤i} ∩ Γ(vi)) ∪ {vi}) ∩ Γ(vj)|
 ⇒ test whether this equals |(K_{≤i} ∩ Γ(vi)) ∪ {vi}|
・Condition (2) can be checked in the same way
・Checked in O(|V|^2.376) time ⇒ the time complexity is O(|V|^2.376) for each maximal clique
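As a small illustration of the product test for condition (1), the sketch below builds the two 0/1 matrices described above and compares each cell of the product with the row size. It is only a sketch under assumptions: the function name `condition1_by_product`, the adjacency-dict input with vertices numbered 1..n and the example graph are hypothetical, and the product is computed naively in pure Python, so it shows the test itself rather than the O(|V|^2.376) bound, which needs a fast matrix multiplication routine.

```python
def condition1_by_product(adj, K, n):
    """For every v_i not in K, report whether condition (1) holds, i.e. no
    v_j with j < i is adjacent to all of S_i = (K_{<=i} ∩ Γ(v_i)) ∪ {v_i}."""
    candidates = [i for i in range(1, n + 1) if i not in K]

    # Left matrix: the row for candidate i is the 0/1 indicator of S_i.
    left = []
    for i in candidates:
        S = {v for v in K if v <= i and v in adj[i]} | {i}
        left.append([1 if v in S else 0 for v in range(1, n + 1)])

    # Right matrix: the adjacency matrix, whose j-th column indicates Γ(v_j).
    right = [[1 if u in adj[v] else 0 for u in range(1, n + 1)]
             for v in range(1, n + 1)]

    # Naive product: cell (row of i, j) equals |S_i ∩ Γ(v_j)|.
    product = [[sum(a * b for a, b in zip(row, col)) for col in zip(*right)]
               for row in left]

    result = {}
    for i, row, prod_row in zip(candidates, left, product):
        size = sum(row)  # |S_i|
        # v_j is adjacent to all of S_i exactly when the cell equals |S_i|.
        result[i] = all(prod_row[j - 1] != size for j in range(1, i))
    return result


# Hypothetical example graph: triangle 1-2-3 plus the path 3-4-5.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
holds = condition1_by_product(adj, frozenset({1, 2, 3}), 5)
```

A vertex of S_i can never match its own row size because the graph has no self-loops, so no separate exclusion of the members of S_i is needed.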


Sparse Cases

・If vi is adjacent to no vertex in K:
 K[i] = C((K_{≤i} ∩ Γ(vi)) ∪ {vi}) = C({vi})
 parent of K[i] = C(C({vi})_{≤i-1})
 If C({vi})_{≤i-1} = ∅, the parent of K[i] is K0 (K0 = C(∅), the root)
 If C({vi})_{≤i-1} ≠ ∅, (1) is not satisfied
 ⇒ if K ≠ K0, K[i] is not a child of K
・Since |K| ≤ Δ+1, at most Δ(Δ+1) vertices are adjacent to K
・For each K[i], constructing the parent takes O(Δ^2) time
 ⇒ O(Δ^4) per maximal clique (Δ: max. degree)
・O((Δ*)^4 + |Θ|^3) if partially dense (Δ*: max. degree in V \ Θ)
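The counting argument above says that, for K ≠ K0, only the vertices adjacent to some vertex of K need to be tried as vi. A minimal sketch of that candidate set, assuming an adjacency-dict graph (the function name `candidate_indices` and the example graph are hypothetical):

```python
def candidate_indices(adj, K):
    """Vertices outside K that are adjacent to at least one vertex of K.
    For K other than the root K0, only these can give a child K[i]."""
    return sorted({u for v in K for u in adj[v]} - set(K))


# Hypothetical example graph: triangle 1-2-3 plus the path 3-4-5.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
max_degree = max(len(neighbours) for neighbours in adj.values())
candidates = candidate_indices(adj, {3, 4})
```

The root K0 is the exception: a vertex with no neighbour in K0 can still produce a child of K0, so for the root all indices have to be scanned.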


Bipartite Clique

・Enumerate maximal bipartite cliques in G = (V1 ∪ V2, E)
 (= maximal cliques in G' = (V1 ∪ V2, E ∪ (V1 × V1) ∪ (V2 × V2)))
 ⇒ enumerated in O(|V|^2.376) time for each
・But G' is dense even when the bipartite graph G is sparse
 ⇒ need some improvements for sparse cases

(Figure: bipartite graph with sides V1 and V2.)
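The reduction to G' can be sketched as follows. The function name `to_clique_instance` and the example data are hypothetical; the sketch only builds the adjacency sets of G' so that the growth in edge count is visible.

```python
from itertools import combinations


def to_clique_instance(V1, V2, edges):
    """Build G' = (V1 ∪ V2, E ∪ V1×V1 ∪ V2×V2) as an adjacency dict."""
    adj = {v: set() for v in list(V1) + list(V2)}

    def connect(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for a, b in edges:            # original bipartite edges
        connect(a, b)
    for side in (V1, V2):         # make each side a clique
        for a, b in combinations(side, 2):
            connect(a, b)
    return adj


# Hypothetical sparse bipartite graph with 4 edges.
V1, V2 = [1, 2, 3], [4, 5, 6]
edges = [(1, 4), (2, 4), (2, 5), (3, 6)]
g_prime = to_clique_instance(V1, V2, edges)
edge_count = sum(len(n) for n in g_prime.values()) // 2
```

Each side of size k contributes k(k-1)/2 extra edges, which is why a generic clique algorithm run on G' loses the sparsity of the input.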



Fast Construction of K[i]

・For any maximal bipartite clique K:
 K ∩ V2 = ∩_{v ∈ K∩V1} Γ(v)
 K ∩ V1 = ∩_{v ∈ K∩V2} Γ(v)
・K[i] ∩ V1 for all i are computed in O(Δ^2) time
・K[i] for all i are computed in O(Δ^3) time

(Figure: bipartite graph with V1 = {1, 2, 3, 4} and V2, showing K[v1] and K[v6] obtained from vi.)
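The two identities say that each side of a maximal bipartite clique is the set of common neighbours of the other side. A minimal sketch of how a set of V1 vertices can be completed to such a clique, assuming an adjacency-dict bipartite graph (the names `common_neighbors` and `close` and the example graph are hypothetical, and this is the generic closure computation, not the authors' O(Δ^2) construction):

```python
def common_neighbors(adj, vertices, other_side):
    """Intersection of Γ(v) over the given vertices; for an empty set of
    vertices this is the whole other side."""
    result = set(other_side)
    for v in vertices:
        result &= adj[v]
    return result


def close(adj, X, V1, V2):
    """Return the bipartite clique (A, B) with X ⊆ A ⊆ V1 in which
    B = common neighbours of X and A = common neighbours of B."""
    B = common_neighbors(adj, X, V2)
    A = common_neighbors(adj, B, V1)
    return A, B


# Hypothetical bipartite graph: V1 = {1, 2, 3}, V2 = {4, 5, 6}.
V1, V2 = {1, 2, 3}, {4, 5, 6}
adj = {1: {4, 5}, 2: {4, 5}, 3: {5, 6},
       4: {1, 2}, 5: {1, 2, 3}, 6: {3}}
A, B = close(adj, {1}, V1, V2)
```

Starting from vertex 1 alone, the closure pulls in vertex 2 because it shares both neighbours 4 and 5, and the resulting pair satisfies both identities above.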



Checking the Parent

・Assign small indices to V1 (1, 2, 3, …, |V1|-1, |V1|) and large indices to V2 (|V1|+1, |V1|+2, …)
 K[i] is a child of K ⇔ K[i]_{≤i} = K_{≤i}
 ⇒ checked in O(Δ) time
・Enumerated in O(Δ^3) time for each
・O(Δ^2) by using memory


Computational Experiments

・Randomly generated graphs:
 vertex vi is connected to each of the vertices from v(i-r) to v(i+r) with probability 1/2
・Faster than Tsukiyama's algorithm
・Computation time is linear in the maximum degree
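A generator for this kind of random instance can be sketched as below. The function name `banded_random_graph`, the `seed` parameter and the choice to decide each pair once are assumptions made for the sketch; the slides do not say how the original instances were produced beyond the rule above.

```python
import random


def banded_random_graph(n, r, seed=0):
    """Vertices 1..n; each pair (i, j) with 0 < j - i <= r becomes an edge
    independently with probability 1/2."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(1, n + 1)}
    for i in range(1, n + 1):
        for j in range(i + 1, min(i + r, n) + 1):
            if rng.random() < 0.5:
                adj[i].add(j)
                adj[j].add(i)
    return adj


graph = banded_random_graph(n=50, r=3, seed=1)
```

With this rule the maximum degree is at most 2r, so r directly controls the sparsity parameter Δ used in the complexity bounds.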


Benchmark Problems

・Problem of finding frequent closed item sets in a database
 ⇒ equivalent to maximal bipartite clique enumeration
・Data sets used in KDD Cup (a data mining algorithm competition):
 BMS-WebView1 (from Web-log data): |V| = 60,000, ave. degree 2.5
 BMS-WebView2 (from Web-log data): |V| = 80,000, ave. degree 5
 BMS-POS (from POS data): |V| = 510,000, ave. degree 6
 IBM-Artificial (artificial data): |V| = 100,000, ave. degree 10


Results


Conclusion and Future Work

・Proposed fast algorithms for enumerating
 maximal cliques: O(|V|^2.376), O(Δ^4), O((Δ*)^4 + θ^3)
 maximal bipartite cliques: O(|V|^2.376), O(Δ^3), O(Δ^2)
・Examined benchmark problems from data mining and showed that our algorithms perform well

Future work:
・Can we improve further? What is the difficulty?
・Can we enumerate other maximal (minimal) graph objects?
・Can we apply matrix multiplication to other enumeration problems?
・What can be enumerated efficiently in practice?


Frequent Sets

(Figure: bipartite graph between customers (customer1 to customer4) and items (beer, nappy, milk).)

Input graph: an item and a customer are connected iff the customer purchased the item
In a maximal bipartite clique:
 Customers: have similar preferences
 Items: are frequently purchased together
[Agrawal et al. 96, Zaki et al. 02, Pei 00, Han 00, … ]
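To make the correspondence concrete, the sketch below computes, for an item set, the customers who bought all of it and the closed item set they share; the pair is a maximal bipartite clique of the input graph. The purchase table is invented for illustration (the slide's figure does not survive in this transcript), and the function names `supporters` and `closed_item_set` are hypothetical.

```python
# Hypothetical purchase data: customer -> items bought.
purchases = {
    "customer1": {"beer", "nappy"},
    "customer2": {"beer", "nappy", "milk"},
    "customer3": {"milk"},
    "customer4": {"beer", "nappy"},
}


def supporters(purchases, items):
    """Customers who purchased every item in `items`."""
    return {c for c, bought in purchases.items() if set(items) <= bought}


def closed_item_set(purchases, items):
    """All items shared by every supporter of `items`.
    If nobody bought them all, the item set is returned unchanged."""
    who = supporters(purchases, items)
    if not who:
        return set(items)
    return set.intersection(*(purchases[c] for c in who))


who = supporters(purchases, {"beer"})
closed = closed_item_set(purchases, {"beer"})
```

An item set is frequent when its number of supporters reaches a chosen threshold; here every customer who bought beer also bought nappy, so the closed set of {beer} contains both.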


Few Large Degree Vertices

・Very few vertices (denoted by Θ) have large degrees; all other vertices have degree < Δ'
・Divide the maximal cliques into two groups:
 (a) cliques not included in Θ
 (b) cliques included in Θ
・(a) can be enumerated in O(Δ'^4) time for each
・A maximal clique K of the graph induced by Θ is a maximal clique of G ⇔ K is not included in any clique of (a)
 ⇒ O(|Θ|^3) time for each
⇒ O(Δ'^4 + |Θ|^3) per maximal clique


Avoid Duplications by Using Memory

・We can avoid duplications by storing all maximal bipartite cliques
・Since K ∩ V1 = Γ(K ∩ V2), it suffices to store K ∩ V1 for each K
 1. Get an unprocessed K from memory
 2. Generate all K[i] ∩ V1
 3. Store each K[i] ∩ V1 if it is not in memory
 4. Go to 1 if an unprocessed maximal clique remains
⇒ Enumerated in O(Δ^2) time for each
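Steps 1 to 4 can be sketched as a worklist loop over stored V1-sides. This is a simplified stand-in rather than the authors' procedure: the step "generate all K[i] ∩ V1" is replaced here by intersecting the current V1-side with Γ(w) for every w in V2, which is an assumption of the sketch, and the cliques are taken in the sense of G', so a clique with one empty side may be reported. The names `common_neighbors` and `enumerate_with_memory` and the example graph are hypothetical.

```python
from collections import deque


def common_neighbors(adj, vertices, other_side):
    """Intersection of Γ(v) over the given vertices (whole side if empty)."""
    result = set(other_side)
    for v in vertices:
        result &= adj[v]
    return result


def enumerate_with_memory(adj, V1, V2):
    """Return all pairs (K ∩ V1, K ∩ V2), storing only the V1-sides."""
    start = frozenset(V1)
    memory = {start}             # every V1-side found so far
    queue = deque([start])       # stored but not yet processed
    while queue:                 # step 4: repeat while something is unprocessed
        X = queue.popleft()      # step 1: take an unprocessed entry
        for w in V2:             # step 2: generate the neighbouring V1-sides
            Y = frozenset(X & adj[w])
            if Y not in memory:  # step 3: store only if new
                memory.add(Y)
                queue.append(Y)
    # The V2-side is recovered as the common neighbourhood of the V1-side.
    return [(set(X), common_neighbors(adj, X, V2)) for X in memory]


# Hypothetical bipartite graph: V1 = {1, 2, 3}, V2 = {4, 5, 6}.
V1, V2 = {1, 2, 3}, {4, 5, 6}
adj = {1: {4, 5}, 2: {4, 5}, 3: {5, 6},
       4: {1, 2}, 5: {1, 2, 3}, 6: {3}}
found = enumerate_with_memory(adj, V1, V2)
```

The membership test in step 3 is what removes the need for a parent check, at the cost of keeping every V1-side in memory for the whole run.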

